Tag Archives: apache

Learn Apache Spark this Summer with edX

edX has just announced a new series of Big Data courses. The series consists of two courses focused on Apache Spark. If you are not familiar with Spark, it is a fast engine for large-scale data processing. The project claims it can run up to 100 times faster than Hadoop MapReduce. Here are the two courses:

  1. Introduction to Big Data with Apache Spark
  2. Scalable Machine Learning

The first course starts June 1, 2015, and lasts four weeks. The second course starts in late June and lasts five weeks.

The courses are free, but verified certificates can be purchased for $50 per course.

If you have been hoping to learn Spark, this might be just the opportunity you were waiting for.

Introduction to Apache Mahout Slides

Although Apache Mahout is not an absolute beginner's topic in data science, this slide deck provides a nice overview of machine learning and includes some excellent links at the end. In case you are wondering, Mahout is a scalable machine learning library for very large data sets.

The slides were prepared by Varad Meru.