CCP: Data Scientist Solution Kit

Web Analytics Challenge: Classification, Clustering, and Collaborative Filtering

Get Hands-On with Live Data

The explosion of data is leading to new business opportunities that draw on advanced analytics and require a broader, more sophisticated skill set, including software development, data engineering, math and statistics, subject matter expertise, and fluency in a variety of analytics tools. Brought together by data scientists, these capabilities can lead to deeper market insights, more focused product innovation, faster anomaly detection, and more effective customer engagement for the business.

The Data Science Challenge Solution Kit is your best resource to get hands-on experience with a real-world data science challenge in a self-paced, learner-centric environment. The free solution kit includes a live data set, a step-by-step tutorial, and a detailed explanation of the processes required to arrive at the correct outcomes.

Data Science at Your Desk

The Web Analytics Challenge includes five sections that simulate the experience of exploring, cleaning, and ultimately analyzing web log data. First, you will work through some of the common issues a data scientist encounters with log data and data in JSON format. Second, you will clean and prepare the data for modeling. Third, you will develop an alternative approach to building a classifier, with a focus on data structure and accuracy. Fourth, you will learn how to use tools such as Cloudera ML to discover clusters within a data set. Finally, you will select an optimal recommender algorithm and extract ratings predictions using Apache Mahout.
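To give a flavor of the first section, here is a minimal Python sketch of one common issue with JSON-formatted log data: some lines are truncated or malformed and have to be set aside rather than crash the parse. This is an illustration only, not the kit's own code. The function name `parse_log_lines` and the field names (`user`, `action`, `item`, `rating`) are hypothetical and do not come from the challenge data set.

```python
import json


def parse_log_lines(lines):
    """Parse line-delimited JSON, keeping bad lines aside for inspection.

    Returns (records, rejects), where rejects holds (line_number, text)
    pairs for lines that are not valid JSON objects.
    """
    records, rejects = [], []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # skip blank lines silently
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            rejects.append((number, text))
            continue
        if not isinstance(record, dict):
            # Valid JSON, but not an object (e.g. a bare number or list)
            rejects.append((number, text))
            continue
        records.append(record)
    return records, rejects


# Hypothetical sample: the third line is missing its closing brace.
sample = [
    '{"user": "u1", "action": "play", "item": "i9"}',
    '',
    '{"user": "u2", "action": "rate", "item": "i3", "rating": 4',
    '{"user": "u3", "action": "rate", "item": "i3", "rating": "5"}',
]
records, rejects = parse_log_lines(sample)
```

Keeping the rejected lines with their line numbers, instead of discarding them, makes it easier to decide later whether they can be repaired during the cleaning step.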

Get Hired as a Certified Data Scientist

The Data Science Challenge Solution Kit is also your best resource to prepare and practice for the CCP: Data Scientist Certification. The solution kit will put you on the path to data science expertise and Cloudera Certified Professional status.

