In this video, Jeff Hammerbacher of Cloudera says that good data scientists are “data rats.” Just as athletes who spend a lot of time in the gym are called “gym rats,” Jeff believes “data rats” need to spend a lot of time with data. A high level of curiosity is also very important.
Jeff also teaches an introductory data science course at Berkeley. In the course, he tries to cover five skills that are not typically covered in an undergraduate curriculum:
- Data Collection and Integration – know how to acquire and integrate data
- Visualization Design – not just chart design but entire dashboard design
- Large-scale Experimentation – rapidly design and deploy features to be tested
- Causal Inference – you don’t get to design the studies; you have to draw conclusions from the data you are given
- Data Products – how to deploy and evaluate a machine learning algorithm