When telling friends and family that I blog about data science, I am frequently asked to explain more. I usually respond with an answer similar to this:
You know the world is generating huge amounts of data every day due to financial transactions, medical records, social networks, and other internet uses. Data Science aims to make better decisions based upon that data. Here are some possibilities. What type of people buy TVs in October? Which patients will get better with this new drug? Who are some other people that you probably already know?
Data Science is all about answering these types of questions with real data instead of assumptions.
I think this explanation could use some refinement. What am I leaving out? What should I remove? How do you explain data science to other people (preferably non-technical or non-data people)?
This is a nice graphic showing where data science is being taught. It appears that data science is being taught all over the country.
Jeff Hammerbacher, founder and Chief Scientist of Cloudera, gives a nice talk about data science. He explains what he has done in the past, and what he plans to do in the future.
It is the second video I have posted recently that emphasizes the importance of data science for more than just advertising. Jeff is getting involved with a medical school to see how data can help.
Note: The video is about 45 minutes, but it contains some really good information.
Code School is offering a course titled Try R. The course is completely free and can be completed online with the interactive tutorial. You will learn by doing. If you have been looking to learn R or need a quick refresher, this is probably a very good option.
I recently read "Big Data Education: 3 Steps Universities must take".
Here are the 3 steps listed:
- Data Science cannot be an undergraduate degree
- A graduate degree should contain math, stats and computer science
Step 2 seems obvious. Math, stats, and computer science are some of the key areas for data science. I would add communication and presentation skills to the list because people with just math, stats, and CS skills are not known to be naturally good communicators. I agree with step 3. More research needs to be done, but most of the research will need to be interdisciplinary. Universities need to put more effort into interdisciplinary research.
Step 1 confused me a bit. The argument was that data science requires too many skills and an applied focus area. Of course a person cannot learn everything about data science in an undergraduate degree. Earning a computer science degree does not mean you will know everything about computer science. It just means you know the fundamentals of algorithms, architecture, and operating systems. You know enough about computer science to understand the field and learn more as you go. I think 4 years should be enough time to do the same for data science.
What are your thoughts?