See for yourself how easy it is to use Cloudera Machine Learning (CML) on Cloudera Data Platform Public Cloud (CDP-PC).
In this tutorial, we will create a linear regression model using housing data, deploy the model within CML, and test it using a web application.
In the ML Workspaces section, select Provision Workspace:
Two pieces of information are needed to provision a workspace: Workspace Name and Environment. For example:
Workspace Name: cml-tutorial
Environment: usermarketing
Complete the New Project form using:
Project Name: Predicting House Prices
Project Visibility: Public
Initial Setup: Local
Upload or drag and drop the cml-project-houseprice.zip file you downloaded earlier
Create Project
Now that we have a working environment, let’s create a session in our project. We will use the housing data to train a linear regression model.
Beginning from the Projects section, select the project name, Predicting House Prices.
Select New Session and complete the session form:
Session Name: Home Price Prototype
Editor: Workbench
Kernel: Python 3
Engine Image: Default
Resource Profile: Default (1 vCPU / 2 GiB Memory)
Start Session
Let’s open a terminal window by selecting >_ Terminal Access, then type:
sh cdsw-build.sh
This will install the dependent libraries needed for the project (sklearn, pandas and numpy). Once it completes, close the terminal window.
NOTE: You only need to install dependent libraries once - this step can be skipped in future sessions.
Select the file train-model.py and run the entire program.
Using house_data.csv, the program creates a linear regression model and saves it in a new file called housePredictor.pickle.
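To make the train-and-save step concrete, here is a minimal sketch of the same idea using only the Python standard library. It is not the contents of train-model.py: the tutorial's script uses sklearn with several features, while this sketch fits a one-feature least-squares line by hand. The target column name price and the helper names fit_line and train are assumptions made for illustration.

```python
import csv
import pickle


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def train(csv_path, model_path, feature="sqft_living", target="price"):
    """Fit on one CSV column and pickle the fitted parameters.

    The column name "price" is an assumption; check the header of
    house_data.csv before relying on it.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    xs = [float(row[feature]) for row in rows]
    ys = [float(row[target]) for row in rows]
    slope, intercept = fit_line(xs, ys)
    # Save the fitted parameters so a later process can load them
    with open(model_path, "wb") as f:
        pickle.dump(
            {"feature": feature, "slope": slope, "intercept": intercept}, f
        )
    return slope, intercept


# Example call (file names taken from the tutorial):
# train("house_data.csv", "housePredictor.pickle")
```

The pattern is the part that carries over: fit once inside a session, write the result to a pickle file, and let the deployed model load that file instead of retraining.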
Now that we’ve created our model, we no longer need this session - select Stop to terminate the session.
In the Models section, select New Model to add the model we’ve just created to the project. Complete the form as follows:
Name: HousePredictor
Description: Predicts the price of a home
File: model-wrapper.py
Function: PredictFunc
Example Input:
{ "bathrooms": "2", "bedrooms": "3", "sqft_living": "1800", "sqft_lot": "2200", "floors": "1", "waterfront": "1", "condition": "3" }
Example Output: { "result" : 100000 }
Kernel: Python 3
Engine Profile: Default
Replicas: 1
Select Deploy Model
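The form points CML at a function, PredictFunc, inside model-wrapper.py. The sketch below shows the general shape such a function could take: it receives the request as a dictionary, converts the string values to numbers, and returns a dictionary matching the Example Output. It is not the file shipped with the project. The feature order and the _StandInModel class are assumptions; the stand-in only exists so the sketch runs without housePredictor.pickle and it does not produce real prices.

```python
import pickle

# Assumed feature order; it must match the order used during training.
FEATURES = [
    "bathrooms", "bedrooms", "sqft_living", "sqft_lot",
    "floors", "waterfront", "condition",
]


class _StandInModel:
    """Hypothetical placeholder with a predict() method.

    It simply sums the inputs, so its output is NOT a price.
    """

    def predict(self, rows):
        return [sum(row) for row in rows]


def load_model(path):
    """Load a pickled model, e.g. housePredictor.pickle."""
    with open(path, "rb") as f:
        return pickle.load(f)


# In the project this would be: model = load_model("housePredictor.pickle")
model = _StandInModel()


def PredictFunc(args):
    """Turn the JSON request (a dict of strings) into a prediction."""
    row = [float(args[name]) for name in FEATURES]
    prediction = model.predict([row])[0]
    # float() keeps the result JSON-serializable
    return {"result": float(prediction)}
```

Loading the model once at import time, rather than inside PredictFunc, means each request only pays for the prediction itself.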
Under the Overview and Shell tabs, capture hostURL and accessKey.
As part of the download assets, we provided a folder named cml-webapp-houseprice. Using your favorite editor, modify cml-webapp-houseprice/src/App.js by replacing:
<accessKey> with accessKey
<hostURL> with hostURL
On the command line, change to the cml-webapp-houseprice folder and run the following commands:
npm install
npm start
A new browser window or tab should open automatically at http://localhost:3000. You are encouraged to try different home configurations and see their predicted values.
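The web application is one way to call the deployed model. The same request can also be sent from a short script. The sketch below assumes the model endpoint accepts a JSON body with accessKey and request fields; compare this with the sample request shown on the model's Overview tab before relying on it. The function names build_payload and predict_price are illustrative, and the sketch does not handle any additional authentication your deployment may enforce.

```python
import json
import urllib.request


def build_payload(access_key, features):
    """Encode the request body (assumed shape: accessKey + request)."""
    body = {"accessKey": access_key, "request": features}
    return json.dumps(body).encode("utf-8")


def predict_price(host_url, access_key, features, timeout=30):
    """POST the features to the deployed model and return the parsed JSON."""
    req = urllib.request.Request(
        host_url,
        data=build_payload(access_key, features),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call, using the values captured from the model page:
# reply = predict_price(hostURL, accessKey, {
#     "bathrooms": "2", "bedrooms": "3", "sqft_living": "1800",
#     "sqft_lot": "2200", "floors": "1", "waterfront": "1", "condition": "3",
# })
# print(reply)
```

Keeping the payload construction in its own function makes it easy to inspect exactly what is sent without touching the network.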
Congratulations on completing the tutorial.
While playing with the web application, you may have noticed some interesting price values being predicted. If you are in the mood for a good challenge, modify train-model.py and improve the model.
As you have seen, it is easy to use Cloudera Machine Learning (CML) to deploy your machine learning projects. This is only the beginning - there is so much more to learn.