Overview
Why Cloudera + NVIDIA?
Today, data processing and data engineering have become the world's largest computing segment. Modest improvements in the accuracy of analytics models translate into billions to the bottom line. To build the best models, data scientists toil to train, evaluate, iterate, and retrain for highly accurate results and performant models. With RAPIDS on Cloudera, processes that took days now take minutes, making it easier and faster to build and deploy value-generating models. Enterprises can leverage GPU-accelerated Apache Spark 3.0 on Cloudera to remove bottlenecks and quickly improve performance, shortening time to insight and increasing return on investment.
About NVIDIA
NVIDIA pioneered accelerated computing to tackle challenges ordinary computers cannot.
Key Highlights
CATEGORY
Independent Hardware Vendor (IHV)
Partnership Highlights
- Expand AI use cases with a complete production ML toolkit enabled by NVIDIA computing
- Generate models that produce highly accurate data and insights trusted by the business
- Operate a fully secure ML environment that can meet evolving requirements
- Reduce ML training time and model deployment time from days to minutes
Reference Architectures
Joint Solution Overview
Running data science workloads on an accelerated Cloudera greatly improves time to value by enabling data scientists to collaborate in a single, unified, all-inclusive platform for powering any AI use case. With the latest release, accelerated Apache Spark 3.0 workloads now run seamlessly on Cloudera. With GPU acceleration, data science teams can use purpose-built tooling for agile experimentation, data analytics, and machine learning that runs 10x faster and at lower cost.
Cost-effective NVIDIA infrastructure empowers IT teams to deliver an accelerated Cloudera solution for intuitive, self-service ML—now and into the future. NVIDIA-Certified servers are available from leading OEM server vendors. For companies looking to jumpstart their AI journey, Accelerated Cloudera Starter Solutions are available to confidently deploy scalable hardware and software solutions that securely and optimally run accelerated workloads.
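As a rough illustration of what enabling GPU-accelerated Apache Spark 3.0 involves, the sketch below assembles the Spark properties commonly used to turn on the RAPIDS Accelerator plugin. The helper names (`rapids_spark_conf`, `as_submit_args`) and the default values are hypothetical and chosen for illustration; the settings required on a given Cloudera deployment are not stated on this page and may differ.

```python
def rapids_spark_conf(gpus_per_executor=1, concurrent_tasks_per_gpu=4):
    """Build Spark properties that enable the RAPIDS Accelerator.

    The default values are illustrative assumptions, not tuned
    recommendations from Cloudera or NVIDIA.
    """
    return {
        # Load the RAPIDS Accelerator SQL plugin into Spark 3.x.
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        # Allow supported SQL/DataFrame operations to run on the GPU.
        "spark.rapids.sql.enabled": "true",
        # Number of GPUs each executor requests from the scheduler.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        # Fraction of a GPU each task claims, so several tasks share one GPU.
        "spark.task.resource.gpu.amount": str(1 / concurrent_tasks_per_gpu),
    }


def as_submit_args(conf):
    """Render a properties dict as spark-submit '--conf key=value' arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(as_submit_args(rapids_spark_conf())))
```

The same properties could instead be set on a Spark session builder or in a cluster-wide configuration; the command-line form is shown only because it is easy to inspect.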
Joint Solution Benefits
NVIDIA and Cloudera have tested and benchmarked workloads across a wide range of infrastructure configurations and distilled the results into two simple recommendations:
For companies buying servers dedicated to running Apache Spark for data analytics and ETL in Cloudera, a Cloudera-READY configuration comprising four NVIDIA-Certified servers with two NVIDIA A30 GPUs per server offers over five times the performance at less than 50% incremental cost compared to modern CPU-only alternatives.
For companies buying servers to run not just Apache Spark but also machine learning on Cloudera, or servers that may be used for other AI-related applications during their lifetime, the recommendation is to upgrade to an AI-READY configuration. Comprising four NVIDIA-Certified servers with one NVIDIA A100 GPU per server, it offers over eight times the performance at less than 50% incremental cost compared to modern CPU-only alternatives. These numbers reflect only the Apache Spark benchmarks; acceleration on ML and AI training is even more significant.
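To put the two recommendations in price-performance terms, the short calculation below divides the quoted speedup by the relative cost. It assumes that "incremental cost" means the added cost relative to a comparable CPU-only configuration and that the speedup applies to the same benchmark; because the page states "over" five and eight times and "less than" 50%, the results are lower bounds rather than measured values. The function name `perf_per_cost` is hypothetical.

```python
def perf_per_cost(speedup, incremental_cost):
    """Performance per unit cost relative to a CPU-only baseline of 1.0.

    speedup:          performance multiple versus the baseline (e.g. 5.0)
    incremental_cost: added cost as a fraction of the baseline (e.g. 0.50)
    """
    return speedup / (1.0 + incremental_cost)


# Boundary values quoted in the text: "over 5x" / "over 8x" at "less than 50%".
cloudera_ready = perf_per_cost(5.0, 0.50)
ai_ready = perf_per_cost(8.0, 0.50)

print(f"Cloudera-READY: at least {cloudera_ready:.2f}x performance per unit cost")
print(f"AI-READY:       at least {ai_ready:.2f}x performance per unit cost")
```

Under these assumptions, the Cloudera-READY configuration delivers at least about 3.33 times the performance per unit cost of the CPU-only baseline, and the AI-READY configuration at least about 5.33 times.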
Cloudera and NVIDIA: Predicting customer churn using RAPIDS, Apache Spark, and NVIDIA GPUs
Easily deploy end-to-end data science pipelines on Cloudera running on NVIDIA accelerated infrastructure to improve your data-driven operations.
Related blog posts
Blog
Cloudera Introduces AI Inference Service With NVIDIA NIM
By Robert Hryniewicz | June 3, 2024
Blog
Announcing Cloudera’s Enterprise Artificial Intelligence Partnership Ecosystem
By Abhas Ricky & Nashua Springberry | December 20, 2023
Blog
Enabling NVIDIA GPUs to accelerate model development in Cloudera Machine Learning
By Pete Ableda | April 10, 2021