Tag Archives: coursera

New Coursera Specializations for Data Scientists

Coursera just launched 18 new specializations. Not all of them are relevant to data science, but here are 3 of the specializations that pertain to data science.

All of these specializations should provide great content. They are quite specific, but if your goals match the topics, it is hard to beat Coursera.

Are you excited about any of the new specializations?

Process Mining Course via Coursera

Process mining is a bridge between data mining and business process modeling. It can be used to study event and log files to extract meaning.

The Coursera course, Process Mining: Data science in Action, starts November 12, 2014.

Free Big Data MOOC

Coursera is offering the course Mining of Massive Datasets from Stanford University. This is a popular course at Stanford and goes along with the book by the same name. The FREE course starts September 29, 2014, and runs for 7 weeks. The prerequisites are some SQL, algorithms, and data structures knowledge.

Thanks to David Trower for the tip on this course.

Coursera's Intro to Data Science Returns

The Coursera platform will once again be hosting the amazing course Introduction to Data Science from the University of Washington. The course will start June 30, 2014. It is free, and it will last 8 weeks.

Having taken the course the previous time it was offered, I highly recommend it to anyone interested in digging into data science.

Coursera Data Science Specialization – Started This Week

Coursera now offers a Data Science Specialization. The courses are taught by Johns Hopkins University. If you would like to earn a Specialization Certificate, each course will cost you $49. Otherwise, you can take the courses for free without earning the certificate.

Best of all, the first course started this week (April 7, 2014).

The Specialization consists of the following courses. You must complete all of them to earn the certificate. If you are not interested in the certificate, you can take any or all of the courses.

  1. The Data Scientist’s Toolbox
  2. R Programming
  3. Getting and Cleaning Data
  4. Exploratory Data Analysis
  5. Reproducible Research
  6. Statistical Inference
  7. Regression Models
  8. Practical Machine Learning
  9. Developing Data Products
  10. Capstone Project

Coursera Machine Learning Starts (Again) Today

The excellent and popular Machine Learning class from Coursera and Andrew Ng starts today. This is the 3rd or 4th run of the course.

Coursera Class on Recommender Systems

In about 1 month, the course, Introduction to Recommender Systems, will begin on Coursera. The course is being offered by the Computer Science and Engineering Department from the University of Minnesota.

The course is 14 weeks long and has 2 tracks:

  1. Programming Track – 6 different recommender systems will be programmed
  2. Concept Track – great for people who want to know about recommender systems, but don’t want to program

Recommender systems are an important part of data science, and this course looks to provide an excellent in-depth overview of the topic.

What is Machine Learning?

Machine Learning is a term that can mean different things to different people. Andrew Ng, cofounder of Coursera and Professor at Stanford, provides two definitions in his popular Machine Learning Course. The first definition comes from Arthur Samuel around 1959.

Field of study that gives computers the ability to learn without being explicitly programmed.

The second definition comes from Tom Mitchell’s 1997 Machine Learning textbook. This definition is a bit more formal and rigorous. This book defines a well-posed learning problem as:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
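
To make the roles of E, T, and P concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration rather than material from the course: the toy rule, the hand-picked examples, and the function names are all assumptions. The task T is labeling a number, the performance measure P is accuracy on a fixed test set, and the experience E is a growing list of labeled examples.

```python
# Hypothetical toy rule, unknown to the learner: the label is 1 when x >= 3.
def true_label(x):
    return 1 if x >= 3 else 0

# Experience E: hand-picked labeled examples, revealed to the learner in order.
experience = [(0.0, 0), (10.0, 1), (4.0, 1), (2.8, 0)]

# Fixed test inputs used to compute the performance measure P.
test_inputs = [0.5 + i for i in range(10)]

def predict(training_examples, x):
    """Task T: label x by copying the label of the nearest training example."""
    _, nearest_label = min(training_examples, key=lambda ex: abs(ex[0] - x))
    return nearest_label

def accuracy(training_examples):
    """Performance P: fraction of test inputs that are labeled correctly."""
    correct = sum(predict(training_examples, x) == true_label(x) for x in test_inputs)
    return correct / len(test_inputs)

# Measure P after the learner has seen 2, 3, and then 4 examples.
sizes = range(2, len(experience) + 1)
accuracies = [accuracy(experience[:n]) for n in sizes]
for n, acc in zip(sizes, accuracies):
    print(f"examples seen: {n}, accuracy: {acc}")
```

On this hand-picked data the accuracy rises from 0.8 to 0.9 to 1.0 as more examples are seen, which is the sense in which the program "learns". More experience does not guarantee better performance on every dataset.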

Machine Learning Categories

Machine learning can be broken down into a few categories. The two most popular are supervised and unsupervised learning. A couple of other categories are recommender systems and reinforcement learning.

Supervised Learning

Probably the most common category of machine learning, supervised learning is concerned with fitting a model to labeled data. Labeled data is data that has the correct answer supplied. Regression and Classification are the most common types of problems in supervised learning.
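
As a small illustration of a regression problem, the sketch below fits a straight line to labeled data with ordinary least squares, using only plain Python. The data points and the names `fit_line` and `predict` are made up for this example; a real project would more likely use a library.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(slope, intercept, x):
    """Use the fitted line to predict a value for a new input."""
    return slope * x + intercept

# Labeled data: each input x comes with its correct answer y (hypothetical values).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = predict(slope, intercept, 5.0)
print(slope, intercept, prediction)
```

The labels `ys` are what make this supervised: the model is fit to known correct answers and then used to predict the answer for an input it has not seen.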

Unsupervised Learning

Unsupervised learning deals with unlabeled data. Therefore, the goal of unsupervised learning is to find structure in the data. Clustering is probably the most common technique.
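
One well-known clustering technique is k-means. The sketch below is a minimal one-dimensional version under simplifying assumptions: the data are made up, the starting centers are passed in by hand, and the function name `k_means_1d` is hypothetical. Note that no labels are supplied anywhere.

```python
def k_means_1d(points, initial_centers, max_iters=100):
    """Group 1-D points around centers by alternating assignment and update steps."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iters):
        # Assignment step: put each point in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        new_centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Centers stopped moving, so the clusters are stable.
        centers = new_centers
    return centers, clusters

# Unlabeled data (hypothetical values) with two visible groups.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centers, clusters = k_means_1d(points, initial_centers=[min(points), max(points)])
print(centers)
print(clusters)
```

The algorithm finds the two groups from the positions of the points alone. In practice the result depends on the starting centers and on the number of clusters chosen.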

Recommender systems deal with making recommendations based upon previously collected data. Reinforcement learning is concerned with maximizing the reward of a given agent (person, business, etc.).

Learn More

Most of the above information comes from the Coursera Machine Learning Course. There is still time to sign up since the first assignments are not due until the end of the week.

Coursera Machine Learning Starts Today

Andrew Ng’s wonderful Coursera course on machine learning starts today. It is not too late to sign up.

Levels of Data Analysis

The list below is ordered by level of difficulty, from easiest to hardest.

  • Descriptive: describes the data; common for census-type data
  • Exploratory: finds relationships that were not clear beforehand; useful for defining future studies; remember that correlation does not imply causation
  • Inferential: uses a small dataset to say something about a larger population; the most common goal of statistical analysis
  • Predictive: uses data from some objects to predict values for another object; it is important to measure the right values and to use as much data as possible
  • Causal: asks what happens to one variable when you force another variable to change; usually requires a randomized study; this is the gold standard of data analysis
  • Mechanistic: seeks to understand the exact changes in variables that lead to changes in other variables for individual objects; typical of engineering and the physical sciences; data analysis can be used to infer the parameters if the equations are known
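
The difference between the first and third levels can be sketched in a few lines of Python. This is an illustration under stated assumptions, not material from the class: the sample values are made up, and the interval uses the normal approximation (the 1.96 multiplier), which is crude for a sample this small. A t-based interval would be the more careful choice.

```python
import math
import statistics

# Hypothetical sample of measurements drawn from some larger population.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]

# Descriptive: summarize the data that was actually collected.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)  # sample standard deviation

# Inferential: use the sample to say something about the population mean.
# Approximate 95% interval, assuming the normal approximation is acceptable.
standard_error = sample_sd / math.sqrt(len(sample))
ci_low = sample_mean - 1.96 * standard_error
ci_high = sample_mean + 1.96 * standard_error

print(f"sample mean: {sample_mean:.2f}, sample sd: {sample_sd:.2f}")
print(f"approximate 95% interval for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The descriptive numbers only restate the sample. The interval is an inferential claim about data that was never observed, which is why it comes with extra assumptions.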

This list comes from information presented in the first week of the Coursera Data Analysis class.