Category Archives: Learn Data Science

This is a category for all things related to learning data science.

Make 2013 Your Year for Learning Data Science

While the buzz around big data might die down a bit in 2013, the main ideas of data science are not going away. Being able to make smart decisions based on data is a concept that is here to stay.

In fact, 2013 might be the best year ever to start learning data science skills. Colleges and universities are opening new data science and analytics programs. There are numerous MOOCs available, or you can follow along with a Coursera curriculum. Plus, there are thousands of blog posts, wikis, and other articles covering various data science topics. If you want to learn data science, the timing could not be much better.

Happy New Year and Happy Learning!

5 Free Programming Languages for Data Science

  1. R There is a package for nearly any algorithm you will ever need. That is where R really excels. It is widely used and has a strong community. The only slight downside (in my opinion) is the cumbersome syntax.
  2. Python A very good language for beginning programmers. The syntax is quite readable and intuitive. With the NumPy and SciPy packages, Python has many of the tools and algorithms necessary to do data science.
  3. Octave Octave was created to be very similar to the commercial product MATLAB. Octave is used and highly recommended in Dr. Andrew Ng’s Coursera machine learning course.
  4. Java While I don’t read a lot about people using Java for quickly testing new statistical models, a couple of the larger open-source data science products, Hadoop and Storm among them, are built with Java. Plus, Java has libraries for just about everything, and it has proved itself to be a fairly decent production environment.
  5. Julia This is the newcomer on the list. Julia claims to have really great performance along with built-in support for parallelism and cloud computing. I am not too familiar with Julia, but it will be interesting to see how the Julia community grows over the coming months and years. Julia is currently lacking some of the libraries/algorithms that the others on the list support.

Top 5 Data Science Guys

  1. DJ Patil DJ is one of the most recognizable faces in data science. He is constantly speaking at conferences and appearing in news articles. He is currently Data Scientist in Residence at Greylock Partners. Also, he helped create the term data scientist.
  2. Jeff Hammerbacher Jeff helped build the first data science team at Facebook. Then he moved on to co-found Cloudera. Now he is helping a medical school perform better data analysis. Along with DJ, he helped to coin the term data scientist.
  3. Drew Conway A PhD student at NYU and Scientist at IA Ventures, Drew is very active in the data science world. He speaks at conferences, co-authored a book on machine learning, creates Venn diagrams, and more.
  4. Jake Porway Jake is a former data scientist with the New York Times. He now spends a lot of time speaking about DataKind, the nonprofit organization he founded, which attempts to help the world.
  5. David Smith David is a blogger for Revolution Analytics. He is also a big fan of R. David is working very hard to help others learn about data science by speaking at conferences and hosting webinars.

This post goes along with Top 5 Data Science Gals.

Top 5 Data Science Gals

  1. Hilary Mason Hilary is the Chief Scientist at Bitly. She is a frequent speaker at conferences. She is commonly cited, interviewed and referenced in data science news/blogs/articles.
  2. Cathy O’Neil Cathy is better known to the internet world as mathbabe. She is a blogger (although not strictly about data science), conference speaker, and soon-to-be book author.
  3. Carla Gentry Founder of Analytical-Solution.com, Carla is one of the most frequent #datascience tweeters on Twitter. She is known to the Twitter world as @data_nerd.
  4. Monica Rogati Monica is a Senior Data Scientist at LinkedIn. She speaks at conferences, publishes academic papers, tweets, and creates great data products at LinkedIn. She likes data so much, she uses data for parenting.
  5. Rachel Schutt Rachel recently finished teaching and blogging the Introduction to Data Science course at Columbia University. She is also a Senior Statistician at Google Research. Along with Cathy, she will be a book author.

This post goes along with Top 5 Data Science Guys.