- Computer scientists discover statistics and find it useful – Ever wonder why computer scientists are getting all the attention for data science? Well, computer scientists stole ideas from statistics. Read the article and it will make more sense.
- Top 3 Myths About Data Science – Here is a highlight of the myths:
- Data science is a field for mathematical geeks.
- Learning a tool is the equivalent of learning data science.
- Data scientists will be replaced by artificial intelligence soon.
- The Big Data Fallacy And Why We Need To Collect Even Bigger Data – More data is not always better, because more data does not necessarily mean more information. Read this for a good description of data vs. information vs. insights.
- MLbase – A distributed machine learning system. Here is an academic paper about the system.
- Predictive Analytics and Machine Learning: An Overview (PDF) – this is a very nice slide deck from IBM
Jeff Hammerbacher, founder and Chief Scientist of Cloudera, gives a nice talk about data science. He explains what he has done in the past, and what he plans to do in the future.
It is the second video I have posted recently that emphasizes the importance of data science for more than just advertising. Jeff is getting involved with a medical school to see how data can help.
Note: The video is about 45 minutes, but it contains some really good information.
Code School is offering a course titled Try R. The course is completely free and can be completed online with the interactive tutorial. You will learn by doing. If you have been looking to learn R or need a quick refresher, this is probably a very good option.
I recently read “Big Data Education: 3 Steps Universities must take.”
Here are the 3 steps listed:
- Data Science cannot be an undergraduate degree
- A graduate degree should contain math, stats and computer science
Step 2 seems obvious. Math, stats, and computer science are some of the key areas for data science. I would add communication and presentation skills to the list because people with just math, stats, and CS skills are not known to be naturally good communicators. I agree with step 3. More research needs to be done, but most of the research will need to be interdisciplinary. Universities need to put more effort into interdisciplinary research.
Step 1 confused me a bit. The argument was that data science requires too many skills and has an applied focus area. Of course a person cannot learn everything about data science in an undergraduate degree. Earning a computer science degree does not mean you will know everything about computer science either. It just means you know the fundamentals of algorithms, architecture, and operating systems. You know enough about computer science to understand the field and learn more as you go. I think 4 years should be enough time to do the same for data science.
What are your thoughts?
The search term “data scientist” has exploded in popularity in the last 18 months. Most of the interest appears to be coming from the U.S., India, and the U.K. Anyhow, I just thought the page was fun to look at.
To start, here is a nice quote from the video. The quote is from Eric Schmidt of Google.
From the dawn of civilization until 2003, humankind generated 5 exabytes of data.
Now we produce 5 exabytes of data every two days.
…and the pace is accelerating
Rick Smolan provides a good talk. He is behind The Human Face of Big Data project. I don’t have a copy of the book, but it looks really intriguing. The talk briefly explains what the book/project is all about.
- Data Visualization
- Interactive Design
- Web Development
The book is in early release and all the sections are available. You are also welcome to comment on any part of the book to help make it better.