
What is Machine Learning

Machine Learning is a term that can mean different things to different people. Andrew Ng, cofounder of Coursera and Professor at Stanford, provides two definitions in his popular Machine Learning Course. The first definition comes from Arthur Samuel around 1959.

Field of study that gives computers the ability to learn without being explicitly programmed.

The second definition comes from Tom Mitchell’s 1997 Machine Learning textbook and is a bit more formal and rigorous. The book defines a well-posed learning problem as:

A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Machine Learning Categories

Machine learning can be broken down into a few categories. The two most popular are supervised and unsupervised learning. Two other categories are recommender systems and reinforcement learning.

Supervised Learning

Probably the most common category of machine learning, supervised learning is concerned with fitting a model to labeled data. Labeled data is data that has the correct answer supplied for each example. Regression (predicting a number) and classification (predicting a category) are the most common types of problems in supervised learning.
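
As a rough illustration of fitting a model to labeled data, here is a minimal regression sketch in plain Python. The data values and the names `fit_line` and `predict` are made up for this example and do not come from the course; the fit uses the standard least-squares formulas for a single input variable.

```python
# Labeled data: each input x comes with its known correct answer y.
# Illustrative values only; here y happens to be exactly 2*x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """Regression: predict a numeric value for an input not seen in training."""
    return slope * x + intercept
```

Because the sample labels lie exactly on a line, the fit recovers a slope of 2 and an intercept of 1, so `predict(6.0)` gives 13.0. Real labeled data is noisy, and the fitted line only approximates it.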

Unsupervised Learning

Unsupervised learning deals with unlabeled data. With no correct answers to fit, the goal of unsupervised learning is to find structure in the data itself. Clustering, which groups similar examples together, is probably the most common technique.

Others

Recommender systems make recommendations based on previously collected data. Reinforcement learning is concerned with maximizing the reward an agent (a person, a business, etc.) receives for its actions.

Learn More

Most of the above information comes from the Coursera Machine Learning Course. There is still time to sign up since the first assignments are not due until the end of the week.

Stanford Machine Learning Class – What is covered

A few days ago, I mentioned that the Stanford Machine Learning class will be starting soon.  I thought I should quickly mention some of the topics covered.  The list also serves as a great outline for machine learning.

Supervised Learning

In supervised learning, one has a set of data with features and labels.

  • Linear Regression – one/multiple variables
  • Gradient Descent – a general algorithm for minimizing a function
  • Logistic Regression – Useful for predicting a class rather than a number.  The simplest case is a yes-or-no result.  Does the patient have cancer?  Will the customer buy my new product?  It can also be extended to more than two outcomes.  What color will a person choose (red, blue, green, silver)?
  • Neural Networks – A learning algorithm loosely modeled after the brain, built from layers of connected units called neurons.
  • Support Vector Machines
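
Two of the items above, gradient descent and logistic regression, fit together naturally, so here is a small sketch that trains a yes-or-no classifier with batch gradient descent. The data set (hours studied versus passing an exam), the function names, the learning rate, and the step count are all assumptions chosen for illustration, not material from the class.

```python
import math

# Hypothetical labeled data: hours studied -> passed the exam (1) or not (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def cost(w, b):
    """Average cross-entropy (log loss) over the training set."""
    total = 0.0
    for x, y in zip(hours, passed):
        p = sigmoid(w * x + b)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(hours)


def train(learning_rate=0.1, steps=3000):
    """Batch gradient descent: repeatedly step against the cost gradient."""
    w, b = 0.0, 0.0
    n = len(hours)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(hours, passed)]
        grad_w = sum(e * x for e, x in zip(errors, hours)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()


def probability_of_passing(x):
    """Predicted probability of the 'yes' class for x hours of study."""
    return sigmoid(w * x + b)
```

A prediction above 0.5 is read as "yes" and below 0.5 as "no". The same gradient descent loop works for linear regression if the sigmoid is dropped and the cost is squared error.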

Unsupervised Learning

In unsupervised learning, one has a set of data with features but no labels.  Can some structure be found for the data?

  • Clustering – The most popular technique is K-means.
  • PCA (Principal Components Analysis) – reduces the number of features, which can speed up a learning algorithm
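
To make the clustering idea concrete, here is a bare-bones K-means sketch in plain Python. The sample points, the function names, and the fixed iteration count are assumptions for illustration; a production implementation would normally add a convergence check and smarter initialization.

```python
import random

# Hypothetical unlabeled data: two loose groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_point(cluster):
    n = len(cluster)
    return tuple(sum(coords) / n for coords in zip(*cluster))


def nearest_index(p, centroids):
    return min(range(len(centroids)),
               key=lambda i: squared_distance(p, centroids[i]))


def k_means(points, k, iterations=20, seed=0):
    rng = random.Random(seed)
    # Start from k distinct data points picked at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its points.
        # An empty cluster keeps its previous centroid.
        centroids = [mean_point(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


centroids, labels = k_means(points, k=2)
```

No labels are supplied anywhere: the algorithm discovers the two groups purely from the distances between points. Which group ends up as cluster 0 or cluster 1 depends on the random starting centroids.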

Anomaly Detection

This section covers methods to determine if data is bad, meaning that it differs markedly from the rest of the data set.  Such bad data is considered an anomaly.
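
One common way to do this, sketched below, is to fit a Gaussian (normal) distribution to data assumed to be mostly good and flag any value whose probability density falls below a threshold. The sensor readings, the function names, and the threshold `epsilon=0.01` are assumptions made for this example; in practice the threshold is tuned on examples known to be good or bad.

```python
import math
import statistics

# Hypothetical sensor readings assumed to be mostly normal.
readings = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 9.7, 10.0]


def fit_gaussian(values):
    """Estimate the mean and (population) standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return mu, sigma


def density(x, mu, sigma):
    """Probability density of x under a normal distribution N(mu, sigma^2)."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coefficient * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def is_anomaly(x, mu, sigma, epsilon=0.01):
    """Flag x when it is very unlikely under the fitted distribution."""
    return density(x, mu, sigma) < epsilon


mu, sigma = fit_gaussian(readings)
```

With these readings, a value such as 10.1 sits near the mean and is not flagged, while 12.0 lies many standard deviations away and is. With several features, the same idea can be applied per feature and the densities multiplied.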

Recommender Systems

Like the name says, recommender systems are used to make recommendations.  Companies like Netflix use recommender systems to recommend new movies to customers.  LinkedIn also recommends people to connect with.  This is a fairly hot topic in the tech world right now.

  • Content-Based (Features)
    • Modified Linear Regression
  • Non-Content-Based (No Features)
    • Collaborative Filtering
    • Matrix Factorization
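
As a sketch of the non-content-based approach, the code below factorizes a tiny rating table into user and item factors using stochastic gradient descent, then uses the factors to estimate ratings. The ratings, the function names, and all hyperparameters (two factors, learning rate, regularization, step count) are assumptions for illustration, and the exact formulation taught in the class may differ.

```python
import random

# Hypothetical known ratings: (user index, item index) -> rating from 1 to 5.
ratings = {
    (0, 0): 5, (0, 1): 4,
    (1, 0): 4, (1, 2): 1,
    (2, 1): 5, (2, 2): 2,
    (3, 0): 1, (3, 2): 5,
}


def predict(user_f, item_f, u, i):
    """Estimated rating: dot product of user and item factor vectors."""
    return sum(pu * qi for pu, qi in zip(user_f[u], item_f[i]))


def rmse(ratings, user_f, item_f):
    """Root mean squared error over the known ratings."""
    squared = sum((r - predict(user_f, item_f, u, i)) ** 2
                  for (u, i), r in ratings.items())
    return (squared / len(ratings)) ** 0.5


def factorize(ratings, n_users, n_items, n_factors=2, steps=2000,
              learning_rate=0.01, regularization=0.02, seed=0):
    rng = random.Random(seed)
    user_f = [[rng.uniform(0.0, 1.0) for _ in range(n_factors)]
              for _ in range(n_users)]
    item_f = [[rng.uniform(0.0, 1.0) for _ in range(n_factors)]
              for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():
            err = r - predict(user_f, item_f, u, i)
            for k in range(n_factors):
                pu, qi = user_f[u][k], item_f[i][k]
                # Nudge both factors to shrink the error on this rating,
                # with a small penalty that keeps the factors from growing.
                user_f[u][k] += learning_rate * (err * qi - regularization * pu)
                item_f[i][k] += learning_rate * (err * pu - regularization * qi)
    return user_f, item_f


user_f, item_f = factorize(ratings, n_users=4, n_items=3)
```

No item features are needed: the factors are learned only from who rated what. A call such as `predict(user_f, item_f, 0, 2)` then estimates a rating that user 0 never supplied, which is the basis for a recommendation.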

If any of these topics sound interesting to you, sign up for the Stanford Machine Learning class.  Professor Andrew Ng will do an excellent job explaining the details.