The Data Scientific Method

DJ Patil and Josh Elman, both of Greylock Partners, gave an insightful talk at LeWeb London 2012. The most important part was the introduction of the Data Scientific Method.

Data Scientific Method

  1. Start with a Question
  2. Leverage your current data
  3. Create features and run tests
  4. Analyze the results and draw insights
  5. Let the data frame a conversation
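
To make steps 3 and 4 a little more concrete, here is a minimal sketch of what "run tests" and "analyze the results" might look like for a simple A/B test. This is my own illustration and not something shown in the talk: the function name, the choice of a two-proportion z-test, and all of the numbers are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for the control group.
    conv_b / n_b: conversions and visitors for the group that saw the new feature.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, made up purely for illustration.
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value only says the difference is unlikely to be noise. Deciding what to do about it is step 5, where the data frames the conversation.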

How To Learn Data Science? Part 2

Yesterday, I posted about some traditional strategies to acquire data science skills. Today, I will post a nontraditional strategy.

Internet Based

There is a wealth of data science information available on the internet for free. With enough personal motivation, a person could learn all the necessary skills for free (or cheap) online. Coursera is probably a great place to start. There are also other good sites such as Udacity and the Kaggle Wiki, as well as other blogs and websites.

The problem with this approach is knowing exactly what to learn. A course in machine learning is great, but data science is more than just machine learning. How do you know what to learn? It would be really nice to have a collection of data science topics and the associated online training materials.

Would this strategy work for you?

How To Learn Data Science?

Based upon the popularity of a previous post about a certificate program from the University of Washington, it appears that many people are interested in learning the skills necessary to become a data scientist. Thus, I decided to compile a list of some of the possible learning strategies.

Traditional College Education

The most obvious path would be to study at a traditional college or university. Colleges and universities are starting to notice the demand for data science skills, and many colleges are currently offering programs to prepare someone as a data scientist. This path is safe and predictable. Do the homework, complete the courses, and get the degree or certificate. Most people are familiar with the process, and it offers few surprises. The problems here are the costs, lack of flexibility, and time involved.

Corporate Training

Companies are now starting to offer training programs for data science. EMC is leading the way in this category with its data science training program. Cloudera also offers lots of training related to Hadoop and big data. Wolfram offers data science training with Mathematica. One of the problems with this category is the cost. Another problem is that companies tend to teach and promote their own products. This may leave the student with numerous gaps in the full data science spectrum.

Your Thoughts?

What are your thoughts about the above approaches? What are the positives and negatives? Also, later this week I will be posting some less-traditional approaches to learning data science.

Network World Article: Could data scientist be your next job?

Sandra Gittlen wrote a very nice article for Network World titled "Could data scientist be your next job?" She did an excellent job explaining the problem with defining exactly what a data scientist is. She also interviewed a couple of people leading the push to get universities to provide data scientist training. According to her article and many others on the internet, it is a great time to be learning data science skills. Companies cannot find enough data scientists. Plus, Sandra was kind enough to interview me for the article.

Data Science Training Program in New York

If you are in New York City or the surrounding area and you want to learn data science, this post is for you. General Assembly (a technology, design, and entrepreneurship campus in New York City) is running a 12-week Intensive Program in Data Science. The course consists of lectures (twice a week), labs, homework, and a comprehensive project. The instructors are Max Shron of OkCupid fame and Ryan Witt, founder of Opani. The course does cost $3,000, but that seems like a fair price for the knowledge gained and a certificate.

Are you aware of any other training programs like this?