Overview

Why Cloudera + NVIDIA?

Today, data processing and data engineering have become the world's largest computing segment. Modest improvements in the accuracy of analytics models translate into billions to the bottom line. To build the best models, data scientists toil to train, evaluate, iterate, and retrain for highly accurate results and performant models. With RAPIDS on Cloudera, processes that took days now take minutes, making it easier and faster to build and deploy value-generating models. Enterprises can use GPU-accelerated Apache Spark 3.0 on Cloudera to remove bottlenecks and quickly improve performance, significantly improving time to insight and return on investment.

 

With Cloudera Powered by NVIDIA, enterprises will be able to seamlessly accelerate data analytics on critical applications like Spark 3.0 without any code changes. These breakthroughs will enable companies to analyze data in real time to gain the intelligence needed to navigate evolving customer demands.

– Manuvir Das, Head of Enterprise Computing, NVIDIA

About NVIDIA

NVIDIA pioneered accelerated computing to tackle challenges ordinary computers cannot.

 

Key Highlights
 

CATEGORY

Independent Hardware Vendor (IHV)


Website

Partner website


Partnership Highlights
 
  • Expand AI use cases with a complete production ML toolkit enabled by NVIDIA computing
  • Generate models that produce highly accurate data and insights trusted by the business
  • Operate a fully secure ML environment that can meet evolving requirements
  • Reduce ML training time and model deployment cycles from days to minutes

Reference Architectures

See all for NVIDIA

Joint Solution Overview

Running data science workloads on an accelerated Cloudera greatly improves time to value by enabling data scientists to collaborate in a single, unified, all-inclusive platform for powering any AI use case. With the latest release, accelerated Apache Spark 3.0 workloads now run seamlessly on Cloudera. With GPU acceleration, data science teams can use purpose-built tooling to run agile experimentation, data analytics, and machine learning 10x faster and at lower cost.
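As an illustration of how GPU acceleration can be switched on through configuration rather than application code, here is a minimal Python sketch that assembles Spark settings for the RAPIDS Accelerator for Apache Spark. The property names are those documented by the RAPIDS Accelerator and Apache Spark projects. The helper functions, the default values, and the `etl_job.py` script name are hypothetical, and the exact setup on a given Cloudera release (for example, how the accelerator is installed) is not covered by this page and may differ.

```python
from typing import Dict, List


def rapids_spark_conf(gpus_per_executor: int = 1, tasks_per_gpu: int = 4) -> Dict[str, str]:
    """Build Spark properties that enable the RAPIDS Accelerator (illustrative defaults)."""
    if gpus_per_executor < 1 or tasks_per_gpu < 1:
        raise ValueError("gpus_per_executor and tasks_per_gpu must be >= 1")
    return {
        # Load the RAPIDS Accelerator SQL plugin.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Allow supported SQL/DataFrame operations to run on the GPU.
        "spark.rapids.sql.enabled": "true",
        # GPUs requested per executor (standard Spark resource scheduling).
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU claimed by each task, so several tasks share one GPU.
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }


def as_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render the properties as `--conf key=value` arguments for spark-submit."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


conf = rapids_spark_conf()
submit_args = as_submit_args(conf)
# etl_job.py is a hypothetical, unmodified Spark application.
print(" ".join(["spark-submit", *submit_args, "etl_job.py"]))
```

The application script itself is untouched in this sketch; only the submit-time configuration changes, which is one way to read the "without any code changes" claim.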

Cost-effective NVIDIA infrastructure empowers IT teams to deliver an accelerated Cloudera solution for intuitive, self-service ML—now and into the future. NVIDIA-Certified servers are available from leading OEM server vendors. For companies looking to jumpstart their AI journey, Accelerated Cloudera Starter Solutions are available to confidently deploy scalable hardware and software solutions that securely and optimally run accelerated workloads.

 

Joint Solution Benefits

NVIDIA and Cloudera have tested and benchmarked workloads across a wide range of infrastructure configurations and distilled the results into two simple recommendations:

For companies buying servers dedicated to running Apache Spark for data analytics and ETL in Cloudera, a Cloudera-READY configuration of four NVIDIA-Certified servers with two NVIDIA A30 GPUs per server offers over five times the performance at less than 50% incremental cost compared to modern CPU-only alternatives.

For companies buying servers to run not just Apache Spark but also machine learning on Cloudera, or servers that may be used for other AI-related applications during their lifetime, an AI-READY configuration of four NVIDIA-Certified servers with one NVIDIA A100 GPU per server offers over eight times the performance at less than 50% incremental cost compared to modern CPU-only alternatives. And these numbers are just the Apache Spark benchmarks; acceleration on ML and AI training is even more significant.
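To make the price-performance implication of these two recommendations concrete, the short Python sketch below divides the stated speedup by the stated relative cost. It uses only the boundary figures quoted above (5x and 8x performance, 50% incremental cost), so the results are lower bounds implied by the text, not additional benchmark data. The function name is hypothetical.

```python
def perf_per_cost(speedup: float, incremental_cost: float) -> float:
    """Performance per unit cost relative to a CPU-only baseline of 1.0.

    speedup: performance multiple over the baseline (5.0 means "five times").
    incremental_cost: extra cost as a fraction of the baseline (0.5 means 50%).
    """
    if speedup <= 0 or incremental_cost < 0:
        raise ValueError("speedup must be positive and incremental_cost non-negative")
    return speedup / (1.0 + incremental_cost)


# Boundary values from the text: "over" 5x / 8x at "less than" 50% extra cost,
# so the true ratios are higher than the values computed here.
cloudera_ready = perf_per_cost(5.0, 0.5)
ai_ready = perf_per_cost(8.0, 0.5)

print(f"Cloudera-READY: more than {cloudera_ready:.2f}x performance per unit cost")
print(f"AI-READY:       more than {ai_ready:.2f}x performance per unit cost")
```

Under these figures, the Cloudera-READY configuration delivers more than roughly 3.3 times the Spark performance per unit of spend, and the AI-READY configuration more than roughly 5.3 times.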

Cloudera and NVIDIA:
Predicting customer churn using RAPIDS, Apache Spark, and NVIDIA GPUs

Easily deploy end-to-end data science pipelines on Cloudera running on NVIDIA accelerated infrastructure to improve your data-driven operations.

Ebook

Accelerating Customer Churn Prediction

Datasheet

NVIDIA GPU acceleration on Cloudera

Whitepaper

Turbocharge Your ETL Pipelines With NVIDIA GPUs and Cloudera

Whitepaper

An end-to-end blueprint for churn prediction and modeling

Solution Brief

Accelerate your Cloudera workloads with NVIDIA-certified systems
