Introduction to Apache Pig

The Hadoop Distributed File System (HDFS): what it is and how it works, MapReduce features and how it works with HDFS and the general topology of a Hadoop cluster.

Date: Tuesday, Oct 04 2011


Pig is a simple-to-understand data flow language used in the analysis of large data sets. Pig scripts are automatically converted into MapReduce jobs by the Pig interpreter, so you can analyze the data in a Hadoop cluster even if you aren't familiar with MapReduce. If you want to find out more about Pig use cases, Pig's Language, Pig Latin and the benefits of utilizing Pig, you're in the right place.

Next Steps