CDP Data Engineer Exam Guide CDP-3002

NOTE: The exam is currently in beta, so small edits may still be made to its content. Passing the beta exam still earns you the certification.

Audience

The exam tests the skills and knowledge that data engineers need to use the Cloudera Data Platform.

It is intended for data engineering professionals who are proficient in designing, developing and optimizing data workflows using Cloudera tools. Candidates should have a strong grasp of data modeling for efficient storage, including formats, partitioning, schema design and Apache Iceberg. They should also have expertise in performance optimization, bottleneck identification, query tuning and resource efficiency, and be proficient in security configuration, monitoring, troubleshooting and cloud integration for Cloudera clusters, mainly using Spark and Airflow.

Exam Details

  • Exam Number: CDP-3002
  • Number of questions: 50
  • Duration: 90 minutes
  • Pass Score: 55%
  • Delivery: online, proctored
  • Please review the system requirements to enable online, proctored testing through QuestionMark
  • Allowed resources: none.
  • You may not use reference materials, white papers, user guides or any other resources during your exam.
  • Support: if you need help, please email us.

Cloudera Skills & Knowledge Measured

This exam measures the skills and knowledge topics listed in Table 1 below, along with the weighting of each topic.

Table 1. Topics and weights (% of exam)

Spark

  • Understand the Fundamentals of Spark on Kubernetes
  • Work with DataFrames
  • Understand Distributed Processing
  • Implement Hive and Spark Integration
  • Understand Distributed Persistence

Weight: 48%

Airflow

  • Implement incremental extraction from source systems in Apache Airflow
  • Use Apache Airflow to schedule ETL pipelines
  • Use Apache Airflow to schedule quality checks
  • Work with DAGs

Weight: 10%

Performance Tuning

  • Know the Basic Tools for (Spark) Performance Tuning
  • Understand the Optimization Framework and Explain Plans
  • Understand Inferring Schemas
  • Improve Join Performance
  • Leverage Caching Data for Reuse
  • Work with Partitioned and Bucketed Tables

Weight: 22%

Deployment

  • Use the API and CLI
  • Work in the Data Engineering Service

Weight: 10%

Iceberg

  • Understand CDP Iceberg

Weight: 10%

Suggested Training

Although optional, the Cloudera Educational Services courses listed below cover some of the topics listed in the table above. Real-world, hands-on experience is highly recommended whether or not you participate in training.

Preparing with Cloudera Data Engineering

Advanced Spark Application Performance Tuning

CDP Iceberg Integration (FREE OnDemand)

Helpful documentation

Data Engineering

Orchestrating workflows and pipelines

Apache Airflow

Iceberg
