R is a hugely popular language among data scientists and statisticians. One of the difficulties with open-source R is the memory constraint: all the data needs to be loaded into memory as a data.frame. Microsoft addresses this problem with the RevoScaleR package of Microsoft R Server. Just launched this week is an edX course on Analyzing Big Data with Microsoft R Server.
According to the syllabus:
Upon completion, you will know how to use R for big-data problems.
Full Disclosure: I work at Microsoft, and the course instructor, Seth Mottaghinejad, is one of my colleagues.
edX has just announced a new series of Big Data courses. The series consists of 2 courses focused on Apache Spark. If you are not familiar with Spark, it is a very fast engine for large-scale data processing. It is claimed to run up to 100 times faster than Hadoop. Here are the 2 courses:
- Introduction to Big Data with Apache Spark
- Scalable Machine Learning
The first course starts June 1, 2015, and lasts four weeks. The second course starts in late June and lasts five weeks.
The courses are free, but verified certificates can be purchased for $50 per course.
If you have been hoping to learn Spark, this might be just the opportunity you were waiting for.
EdX will be offering Foundations of Data Analysis via the University of Texas at Austin. The course starts November 4, 2014. Here is a list of topics:
- Tutorials on using R
- Descriptive Statistics
- Statistical Models (Regression)
- Inferential Stats
The widely popular Caltech course, Learning from Data, will be offered on EdX this fall. The course starts September 25, 2014, and it will run for 10 weeks. Here is an abbreviated list of the course topics.
- Linear Models
- Neural Networks
- Cross Validation
- and much more
EdX offers a number of other Data Science related courses. See all of them on the Statistics and Data Analysis course list.
EdX, a MOOC site, is offering Learning From Data, a machine learning course from Caltech. The course started yesterday, so there is still time to get started. The course has 2 tracks: audit and certificate. It looks great. Good luck.
Just yesterday, MIT and Harvard University announced a new partnership, edX, to offer online education. The goal is to improve learning for on-campus students and for others around the globe. Both schools plan to study the results of edX to better understand how students learn and how technology affects learning.
See the official announcement here.
EdX Video Announcement
How will this affect Data Science Learning?
It is too early to know exactly what courses will be offered, but given MIT’s strength in engineering, engineering courses would seem a reasonable place to start. I am guessing (and hoping) that many courses pertinent to data science will be offered by edX. Also, the announcement is most likely a move by MIT and Harvard to compete with Coursera, a company started by 2 Stanford University faculty members. Obviously, the elite schools do not want to be outdone by each other. In any case, I see these new and different methods of education as a good thing. Happy Learning!