In Spring 2012, Alex Smola taught a course at Berkeley on Scalable Machine Learning. Alex is an Adjunct Professor at the University of California at Berkeley and a Visiting Scientist at Google.
Alex was kind enough to put all the course materials on the internet, including papers, slides, links, and video lectures. As the title suggests, the course focuses on large-scale machine learning. Below is one of the lectures from the Statistics portion of the course.
Recently, I read an article titled "Why Online Education Won’t Replace College–Yet." The article is most likely a response to the recent success of Massive Open Online Courses (MOOCs) such as those offered by Udacity and Coursera. The author, David Youngberg, an Assistant Professor of Economics, presents 5 reasons why online education won’t replace college. I disagree with his reasons, so I thought I should share more details. I will go through each of his 5 reasons.
- It’s too easy to cheat. Cheating has always been an issue in education, and I think it always will be. Students even manage to cheat in ivory tower institutions. Online colleges such as the University of Phoenix have been very successful, and cheating can easily exist in that scenario. I do think online classes make cheating easy, but I don’t really see that stopping the success of online education.
- Star students can’t shine. This is simply not true. The star students are the ones answering questions in the forums and getting the assignments done first. This is very similar to the star students in a regular college setting. The brightest students have their work done first and are frequently found helping their peers. Udacity has even hired one of the former star students.
- Employers avoid weird people. Just because a person takes an online course does not make him/her weird. Taking an online course means a person is willing to find cheaper and easier ways to solve old problems. It also means the person has the initiative to go out and complete something. All of those traits are attractive to companies. The problem here is credentials. MOOCs have not yet solved the credential problem. MOOCs don’t offer degrees or widely acknowledged certifications yet. Companies want to hire people with degrees, not people with a piece of paper stating “I completed an online course.” I think MOOCs will quickly figure out this problem. Also, many of the Coursera and Udacity students are college graduates. Why are they now weird for taking an online course?
- Computers can’t grade everything. Not so fast. Earlier this year, Kaggle and the Hewlett Foundation sponsored a competition to see if technology could be created to automatically grade standardized test essays. The competition was a big success. See the full press release. The competition results will probably not generalize to all essays, but the technology to automatically grade papers is not that far away. Also, Coursera is experimenting with crowd-sourced grading of papers. One student grades the papers of 4 unknown classmates, then a final score is calculated by a computer. See the Peer Assessments section on the Coursera website. This technique may even be more effective than grading by a single highly trained person.
- Money can substitute for ability. The author argued that students will pay for tutors, buy dishwashers, or do anything else to help get better grades. I do not think banks are going to start handing out loans for dishwashers so students can have more time for homework. I think MOOCs will allow students to learn without building massive amounts of debt.
Now, I cannot say with certainty whether or not MOOCs will replace traditional colleges. I just do not believe the above reasons are what will determine the outcome.
On a side note, this blog is focused on material about learning to become a data scientist. I think MOOCs are going to be hugely helpful for people wishing to obtain data science skills.
Thanks to Ed for leaving the comment yesterday. I have reposted the comment here because I thought it was so good.
Looks like Coursera added a new data science course entitled “Web Intelligence and Big Data” while nobody was looking! Plus, it starts at the end of the month, for those who can’t wait until the UW Intro to Data Science course to be scheduled.
Here is a link to the Coursera Web Intelligence and Big Data Course. The course is looking to focus on map-reduce and parallel programming applied to data problems.
Daphne Koller of Coursera gives a very intriguing TED Talk. She even throws out some good data. The machine learning class would have to be taught on-campus for 250 years to reach the same number of students enrolled in the online class last fall. Here are some other impressive numbers.
- 640,000 students
- 190 countries
- 1.5 million enrollments
- 6 million quizzes
- 14 million videos viewed
Daphne also offered the following quote.
In many of our [online] courses, the median response time for a question on the question and answer forum was 22 minutes — which is not a level of service I have ever offered to my Stanford students.
Watch the video below. What Coursera is doing is completely fascinating. In the second half of the video, she shows some examples of how Coursera is using data to improve education.
With the large increases in college tuition and the ever-increasing amount of information available on the internet, it is no wonder many people are trying to learn new skills on their own. Data science is one of those disciplines for which many people are turning to the internet to acquire the necessary skills. The problem is knowing exactly where to find the best material.
If you have the necessary background in math, statistics, and computer science, then it is a good time to learn some skills specific to data science. Coursera recently launched a course specifically devoted to data science, titled Introduction to Data Science. The course is being taught by Bill Howe of the University of Washington’s eScience Institute. I believe this course is an excellent place to start, and I am very excited about it.
Other Data Science Learning Resources
Here is a listing of other materials that could be helpful for learning data science.
In case you missed the announcement yesterday, Coursera added 12 new universities and over 100 new courses. The exciting part for people learning data science is a new category of courses: Statistics, Data Analysis, and Scientific Computing. None of the courses have started yet. Most are scheduled for this fall or early 2013. The courses look very good.
Are you excited about these new courses?
My favorite part of the infographic is the demographics portion. Notice the gender, age, income, and education of the users.
Last week, Udacity started a course called Introduction to Statistics: Making Decisions Based on Data. This is a beginner-level course on statistics, so it should be accessible to everyone. The course consists of seven units, which are intended to last about one week each. Udacity does not enforce any time limits, though. Homework problems are also a part of the course, so you will get a chance to practice what you learn.
Udacity is a learning environment similar to Coursera. I would say the presentation is more focused on the web and the experience is a bit more enjoyable. Courses at both sites are taught by professors from top universities and other leading experts in the field. Both sites offer lots of knowledge for free, so I say try them both. Then let your own personal preference decide which you like better.
What do you think about Udacity? Have you tried it?