New to Data Science
Get started on the path toward becoming a data science practitioner.
A data scientist, as defined by Cloudera's Director of Data Science Josh Wills, is a "peculiar blend of developer and statistician that is capable of turning data into awesome." There are other definitions as well, but the common denominators are great curiosity and a love of data.
As the Hadoop stack becomes the de facto platform for managing Big Data, people with data science skills will be in very high demand. Even CNN calls data science one of the "best new jobs in America."
> See "Data Science" posts on the Cloudera Blog
Background
- What is Data Science? (O'Reilly)
- Data Science: The Sexiest Job of the 21st Century (Harvard Business Review)
- How does Data Science Differ from Traditional Statistical Analysis? (Quora)
- A Very Short History of Data Science (Blog)
- Data Science is the Future of IT (GigaOm)
- Data Science: A Personal History (Presentation by Jeff Hammerbacher)
- Definition of a Data Scientist (Video)
- "Innovation and Data": Jeff Hammerbacher keynote at SAInnovations 2012 (Video)
Training & Certification
- Introduction to Data Science (via Cloudera)
Papers
- "The Future of Data Analysis" (Annals of Mathematical Statistics, 1962)
- "Statistical Modeling: The Two Cultures" (Statistical Science, 2001)
Tutorials & Courses
- Getting Started With Python For Data Science, Kaggle
- Introduction to Data Science, Berkeley
- Data Science and Analytics: Thought Leaders, Berkeley
- Analyzing Big Data with Twitter, Berkeley
- Paradigms for Computing with Data, Stanford
- Machine Learning with Large Datasets, CMU
- How to Process, Analyze and Visualize Data, MIT
- Learning from Data, Caltech
- Introduction to Data Science, Columbia
- Data Science: Large-scale Advanced Data Analysis, University of Florida
- Introduction to Data Science, University of Washington
- On Coursera: Statistics, Data Analysis, and Scientific Computing
Meetups
- Data Science & Business Analytics (Denver)
- Data Science DC (Washington, DC)
- Data Science London
- The Data Scientist (Boston)
- Big Data Science (Fremont, CA)
Books
- Data Analysis with Open Source Tools
- Beautiful Data
- Statistics: The Art and Science of Learning from Data
- Super Crunchers
- The Numerati
- Data Driven
- Data Source Handbook
- Programming Collective Intelligence
- Python for Data Analysis
- Mining the Social Web
- The Visual Display of Quantitative Information
- Mathematical Statistics and Data Analysis
- The Elements of Statistical Learning
- Mining of Massive Datasets
- Data Analysis: What Can Be Learned From the Past 50 Years
- Probably Not
- The Practice of Data Analysis