Cloudera Tutorials

 

Introduction

 

Experience the benefits of a hybrid cloud solution. Using Cloudera AI (formerly Cloudera Machine Learning) on Cloudera, see how an AI workload performs when run on-premises compared with using computational resources in the cloud.

 

 

Prerequisites

 

 

 

Watch Video

 

The video below provides a brief overview of what is covered in this tutorial:

 

 

Download Assets

 

There are two options for getting the assets for this tutorial:

  1. Download a ZIP file

It contains only the files needed for this tutorial. Unzip tutorial-files.zip and remember its location.

  2. Clone our GitHub repository

It provides assets used in this and other tutorials, organized by tutorial title.
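If you prefer to unpack the ZIP file from a script rather than a file manager, the following is a minimal sketch using only the Python standard library. The function name `extract_assets` and the destination folder are illustrative assumptions, not part of the tutorial assets.

```python
import zipfile
from pathlib import Path


def extract_assets(zip_path, dest_dir):
    """Extract a ZIP archive into dest_dir and return the extracted file paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Directory entries in namelist() end with a slash; keep files only.
    return sorted(dest / name for name in names if not name.endswith("/"))


# Example call (the destination folder name is an assumption):
# extract_assets("tutorial-files.zip", "tutorial-files")
```

The returned paths make it easy to confirm where the files landed before uploading them to a project.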

 

 

Set up Cloudera AI

Provision Machine Learning workspace

 

If your environment doesn’t already have a Machine Learning workspace provisioned, let’s provision it.

Select Machine Learning from the Cloudera home page:

 

cdp-homepage-machine-learning

 

In the ML Workspaces section, select Provision Workspace.

Only two pieces of information are needed to provision an ML workspace: the workspace name and the environment name. For example:

  1. Workspace Name: cml-tutorial
  2. Environment: <your environment name>
  3. Select Provision Workspace

 

cml-workspace-provision

 

Add Resource Profile for additional vCPU / Memory

 

Beginning from the ML Workspaces section, open your workspace by selecting its name, cml-tutorial.

In the Site Administration section, select Runtime/Engine. Create a new resource profile using:

vCPU: 2

Memory (GiB): 16

Select Add

 

cml-create-resource-profile

 

Create Project

 

Beginning from the ML Workspaces section, open your workspace by selecting its name, cml-tutorial.

Select New Project.

Complete the New Project form using:

  1. Project Name: Transfer Learning
  2. Project Description:
    A project showcasing the speed improvements of running heavy AI workloads with GPU resources in the cloud versus on-premises.
  3. Initial Setup: Local Files
    Upload or drag and drop the cml-files folder you downloaded earlier

Select Create Project

 

cml-new-project

 

Run Experiments

 

We will run three experiments to measure the speedup of the AI workload and see the effect GPUs have on training the model.

Beginning from the Projects section, select the project name, Transfer Learning.

In the Experiments section, select Run Experiment and complete the form as follows:

  1. Script: main.py
  2. Kernel: Python 3.8
  3. Edition: Nvidia GPU
  4. Version: 2021.06
  5. Resource Profile: 2 vCPU / 16 GiB Memory, 0 GPUs
  6. Comment: 0 GPU
  7. Select Start Run

 

Similarly, let’s create an experiment using 1 GPU:

  1. Script: main.py
  2. Kernel: Python 3.8
  3. Edition: Nvidia GPU
  4. Version: 2021.06
  5. Resource Profile: 2 vCPU / 16 GiB Memory, 1 GPUs
  6. Comment: 1 GPU
  7. Select Start Run

 

Similarly, let’s create an experiment using 2 GPUs: 

  1. Script: main.py
  2. Kernel: Python 3.8
  3. Edition: Nvidia GPU
  4. Version: 2021.06
  5. Resource Profile: 2 vCPU / 16 GiB Memory, 2 GPUs
  6. Comment: 2 GPU
  7. Select Start Run

 

cml-run-experiments

 

As the experiments complete, you should see roughly an order-of-magnitude difference in training time between the runs with access to GPUs and the run that trains the model on CPU only.

Your results should be similar to:

 

cml-experiment-results

 

The training time for the 0 GPU run should be comparable to training on-premises without GPUs.
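To put a number on the comparison, you can divide the 0 GPU training time by the training time of each other run. The sketch below assumes you copy the durations from the experiment results by hand; the function name `speedup_table` is hypothetical and the timings are placeholders, not measured results.

```python
def speedup_table(train_seconds):
    """Map each run label to its speedup relative to the "0 GPU" baseline."""
    if any(secs <= 0 for secs in train_seconds.values()):
        raise ValueError("training times must be positive")
    baseline = train_seconds["0 GPU"]
    return {label: baseline / secs for label, secs in train_seconds.items()}


# Placeholder timings in seconds, NOT measured results; replace with your own runs.
example_runs = {"0 GPU": 1200.0, "1 GPU": 120.0, "2 GPU": 80.0}
for label, factor in speedup_table(example_runs).items():
    print(f"{label}: {factor:.1f}x")
```

A factor near 10x for a GPU run corresponds to the order-of-magnitude difference described above.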

You can review the output of the Python program, main.py, by selecting a Run ID and then selecting Session.
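When reading a run's output, it can help to confirm how many GPUs the session could actually see. The following is a minimal sketch that assumes the `nvidia-smi` command is available on the PATH of the runtime; it is not taken from main.py, and the function name `visible_gpu_count` is hypothetical.

```python
import shutil
import subprocess


def visible_gpu_count():
    """Return the number of GPUs that nvidia-smi lists, or 0 if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return 0  # no NVIDIA tools on PATH, so treat the session as CPU-only
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return 0
    # `nvidia-smi -L` prints one line per device, each starting with "GPU ".
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


print(f"GPUs visible to this session: {visible_gpu_count()}")
```

If the count does not match the GPUs selected in the resource profile, the training time for that run is unlikely to reflect the intended configuration.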

 

cml-program-output

 

Summary

 

Congratulations on completing the tutorial.

As you’ve now experienced, a hybrid cloud solution lets you use cloud resources only when you need them. In our experiments, GPUs reduced training time by roughly an order of magnitude, so users can spend their time creating value instead of waiting for their model to train.

 

 

