2015 Summer of Data Science Learning

The Twitter hashtag #SoDS is being used in 2015 to help people track and share what they are learning. The hashtag originated on the Becoming Data Scientist blog.

I recently wrote a post for Sense, Start Learning with the Summer of Data Science, about a number of freely available learning opportunities this summer. The post covers:

  • MOOCs starting soon
  • Large list of open-access journals

If you are interested, go check out the post and start your #SoDS. Hurry, because many of the opportunities start very soon.

Learn Apache Spark this Summer with edX

edX has just announced a new series of Big Data courses. The series consists of 2 courses focused on Apache Spark. If you are not familiar with Spark, it is a very fast engine for large-scale data processing. The project claims it can run up to 100 times faster than Hadoop. Here are the 2 courses:

  1. Introduction to Big Data with Apache Spark
  2. Scalable Machine Learning

The first course starts June 1, 2015, and lasts four weeks. The second course starts in late June and lasts five weeks.

The courses are free, but verified certificates can be purchased for $50 per course.

If you have been hoping to learn Spark, this might be just the opportunity you were waiting for.

New Coursera Specializations for Data Scientists

Coursera just launched 18 new specializations. Not all of them are relevant to data science, but here are 3 of the specializations that pertain to data science.

All of these specializations will provide great content. They are quite specific, but if your goals match the topics, Coursera is hard to beat.

Are you excited about any of the new specializations?

Process Mining Course via Coursera

Process mining is a bridge between data mining and business process modeling. It can be used to study event logs and other log files to extract meaning.

The Coursera course, Process Mining: Data science in Action, starts November 12, 2014.
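
To make the idea of extracting meaning from an event log a little more concrete, here is a minimal Python sketch. It is not taken from the course. It assumes a tiny, made-up log of (case, activity, timestamp) rows and counts how often one activity directly follows another within the same case, which is a common first step in process mining. The log contents and the function names build_traces and directly_follows are hypothetical and used only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (case_id, activity, timestamp) rows.
# Real logs would come from a file or database; this data is made up.
EVENT_LOG = [
    ("order-1", "receive order", 1),
    ("order-1", "check stock", 2),
    ("order-1", "ship", 3),
    ("order-2", "receive order", 1),
    ("order-2", "check stock", 2),
    ("order-2", "reject", 3),
    ("order-3", "receive order", 1),
    ("order-3", "check stock", 2),
    ("order-3", "ship", 3),
]


def build_traces(event_log):
    """Group events by case and order each case's activities by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, timestamp in event_log:
        by_case[case_id].append((timestamp, activity))
    return {
        case_id: [activity for _, activity in sorted(events)]
        for case_id, events in by_case.items()
    }


def directly_follows(traces):
    """Count how often activity A is immediately followed by activity B."""
    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its successor in the same case.
        counts.update(zip(activities, activities[1:]))
    return counts


traces = build_traces(EVENT_LOG)
dfg = directly_follows(traces)

for (source, target), count in sorted(dfg.items()):
    print(f"{source} -> {target}: {count}")
```

The resulting counts can be read as a simple process map: every order is received and then checked against stock, after which most orders ship and some are rejected.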

Foundations of Data Analysis on edX

edX will be offering Foundations of Data Analysis via the University of Texas at Austin. The course starts November 4, 2014. Here is a list of topics:

  • Tutorials on using R
  • Descriptive Statistics
  • Statistical Models (Regression)
  • Inferential Statistics

Free Big Data MOOC

Coursera is offering the course Mining of Massive Datasets from Stanford University. This is a popular course at Stanford and goes along with the book of the same name. The FREE course starts September 29, 2014, and runs for 7 weeks. The prerequisites are some knowledge of SQL, algorithms, and data structures.

Thanks to David Trower for the tip on this course.

Caltech Machine Learning Course Now on edX

The widely popular Caltech course, Learning from Data, will be offered on edX this fall. The course starts September 25, 2014, and it will run for 10 weeks. Here is an abbreviated list of the course topics:

  • Linear Models
  • Bias/Variance
  • Neural Networks
  • Cross Validation
  • and much more

edX offers a number of other data science-related courses. See all of them on the Statistics and Data Analysis course list.

Coursera Data Science Specialization – Started This Week

Coursera now offers a Data Science Specialization. The courses are taught by Johns Hopkins University. If you would like to earn a Specialization Certificate, each course will cost you $49; otherwise, you can take the courses for free without earning the certificate.

Best of all, the first course started this week (April 7, 2014).

The Specialization consists of the following courses. You must complete all of the courses to earn the certificate. If you are not interested in the certificate, you can take any or all of the courses.

  1. The Data Scientist’s Toolbox
  2. R Programming
  3. Getting and Cleaning Data
  4. Exploratory Data Analysis
  5. Reproducible Research
  6. Statistical Inference
  7. Regression Models
  8. Practical Machine Learning
  9. Developing Data Products
  10. Capstone Project

Google Data Science MOOC

Google recently announced the launch of its own Massive Open Online Course (MOOC). The course is titled Making Sense of Data, and it begins tomorrow, March 18, 2014.

The prerequisites are quite simple. All you need is a Google account, a web browser, and a basic knowledge of spreadsheets.

The course will focus on Fusion Tables, a new experimental product from Google. Fusion Tables is a web application for gathering, visualizing, and sharing data. I am not familiar with Fusion Tables, but from the description it sounds very useful.

Here is the promotional video.

Data Mining with Weka MOOC

Professor Ian Witten of The University of Waikato has just begun the second iteration of his online course, Data Mining With Weka. The course started March 3, so hurry, but there is still time to register and complete it. The course lasts 5 weeks and covers how to analyze your own data with Weka.

Weka is an open-source tool for machine learning and data mining.