Tag Archives: hadoop

Learn Apache Spark this Summer with edX

edX has just announced a new series of Big Data courses. The series consists of two courses focused on Apache Spark. If you are not familiar with Spark, it is a very fast engine for large-scale data processing. The project claims it can run programs up to 100 times faster than Hadoop MapReduce when working in memory. Here are the two courses:

  1. Introduction to Big Data with Apache Spark
  2. Scalable Machine Learning

The first course starts June 1, 2015, and lasts four weeks. The second course starts in late June and lasts five weeks.

The courses are free, but verified certificates can be purchased for $50 per course.

If you have been hoping to learn Spark, this might be just the opportunity you were waiting for.

Strata/Hadoop World 2014 Live Stream Starting Soon

Strata + Hadoop World 2014 is going on in New York City this week. Some of the keynotes will be live streamed this morning, starting at 8:45 Eastern Time. Keynotes will also be live streamed tomorrow (Oct. 17, 2014).

The keynotes are always great, and the line-up this year includes speakers from Cloudera, Pinterest, Platfora, Intel and others. So, if you did not make it to NYC, the live stream is the next best thing.

Enjoy the keynotes!

What is a “Data Lake”?

I have frequently been hearing the term data lake. Being the curious person that I am, I decided to go in search of a definition.

Currently, the company Pivotal is responsible for marketing the term. However, I believe the term was originally coined by Dan Woods of CITO Research back in 2011. Anyhow, here is a basic description of a data lake.

A data lake is an information system with the following two characteristics:

  1. A parallel system able to store big data
  2. A system able to perform computations on the data without moving the data

Currently, Hadoop is the most common technology used to implement a data lake, but it might not be that way forever. Thus it is important to distinguish between Hadoop and a data lake. A data lake is a concept, and Hadoop is a technology to implement the concept.
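To make the second characteristic a bit more concrete, here is a minimal sketch in plain Python of computing on data where it is stored. It is only an illustration of the idea, not how Hadoop or any Pivotal product works: the `StorageNode` class, the node names, and the sample numbers are all made up for this example. Each node summarizes its own records, and only the small summaries travel to the caller.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class StorageNode:
    """Hypothetical node that both stores a partition and computes on it."""
    name: str
    records: List[int]

    def local_sum_and_count(self) -> Tuple[int, int]:
        # Runs where the data lives; only two numbers leave the node.
        return sum(self.records), len(self.records)


def distributed_mean(nodes: List[StorageNode]) -> float:
    """Combine small per-node summaries instead of shipping raw records."""
    partials = [node.local_sum_and_count() for node in nodes]
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Made-up sample data spread across three hypothetical nodes.
nodes = [
    StorageNode("node-a", [10, 20, 30]),
    StorageNode("node-b", [40, 50]),
    StorageNode("node-c", [60]),
]
print(distributed_mean(nodes))  # 35.0
```

The raw records never leave their node here. In a real system the per-node step would run on separate machines, but the shape of the computation is the same.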

The following is a recent Strata Talk by Kaushik Das of Pivotal. He discusses how a data lake can be used to create the digital brain.

International School of Engineering Programs Beginning Soon

I recently received the following information.

International School of Engineering is announcing their 3rd batch of live e-Learning certificate programs starting 4-Sep-2013 in “Engineering Big Data with R and Hadoop Ecosystem” and “Essentials of Applied Predictive Analytics” (http://goo.gl/kHckP).

These programs have helped engineers and managers transform into Hadoop developers and data scientists, get industry certifications, revolutionize their workplaces and establish exciting careers.

Highlights:

• Taught by experts who are Carnegie Mellon, Johns Hopkins and Stanford University alumni with Fortune 50 experience
• Applied and interactive classes
• Classes ranked among the top 1% and 5% of all classes in the world in Piazza
• 1/3rd the cost of other similar programs
• 95% Success with Cloudera and EMC2

For details visit http://goo.gl/bPJEF

For any queries mail us at elearning@insofe.edu.in or call us at +91 9502334561/2/3