Data Mining MOOC

The University of Waikato in New Zealand will be offering a free online course titled Data Mining with Weka.

Weka is a widely used toolkit for data mining and machine learning, developed at the University of Waikato.

Don’t wait too long to sign up; the course starts September 9, 2013.

Here is a video of the instructor of the course providing a brief overview.

7 Important Data Science Papers

It is back-to-school time, and here are some papers to keep you busy this school year. All the papers are free. This list is far from exhaustive, but these are some important papers in data science and big data.

Google Search

  • PageRank – This is the paper that explains the algorithm behind Google search.
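For readers who want a feel for the idea before opening the paper, here is a minimal sketch of PageRank computed by repeated iteration. It is only an illustration: the function name `pagerank`, the tiny example graph, the damping value of 0.85, and the choice to normalize ranks so they sum to 1 are all assumptions of this sketch, not details taken from the post.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    Ranks are normalized so that they sum to 1 (an assumption of this sketch).
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by both "b" and "d".
example_links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
```

In this toy graph, the page with the most incoming links ends up with the highest rank, and the page nobody links to ends up with the lowest.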

Hadoop

  • MapReduce – This paper explains a programming model for processing large datasets. In particular, it is the programming model used in Hadoop.
  • Google File System – Part of Hadoop is HDFS, an open-source distributed file system modeled on the one described in this paper.
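The MapReduce programming model is easier to picture with a small example. The sketch below counts words in a single Python process using separate map, shuffle, and reduce steps. It does not use Hadoop or any distributed framework, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "big ideas"])
```

In a real cluster the map and reduce steps would run on many machines at once, with the framework handling the shuffle between them; here everything runs in order on one machine to keep the shape of the model visible.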

NoSQL

These are 2 of the papers that started the NoSQL debate. Each paper describes a different type of storage system intended to be massively scalable.

Machine Learning

Bonus Paper

  • Random Forests – One of the most popular machine learning techniques. It is heavily used in Kaggle competitions, even by the winners.

Are there any other papers you feel should be on the list?

Wireless Communication Without a Battery

The University of Washington is developing wireless devices that can operate without a battery. The devices operate by reflecting radio waves in the air. Although I can think of many uses for these devices, the article points out one in particular.

For example, sensors placed in a bridge could monitor the health of the concrete and steel, then send an alert if one of the sensors picks up a hairline crack.

After reading that, I was struck by the amount of data that could be collected. Just think of all the bridges in your city/state/country. All this data is going to need analysis. Which alerts require immediate action? This sounds like a big data problem to me.

How can you imagine these devices being used?

International School of Engineering Programs Beginning Soon

I recently received the following information.

International School of Engineering is announcing their 3rd batch of live e-Learning certificate programs starting 4-Sep-2013 in “Engineering Big Data with R and Hadoop Ecosystem” and “Essentials of Applied Predictive Analytics” (http://goo.gl/kHckP).

These programs helped Engineers and Managers transform into Hadoop Developers/Data Scientists, get industry certifications, revolutionize their workspace and establish exciting careers.

Highlights:

  • Taught by experts who are Carnegie Mellon, Johns Hopkins and Stanford University’s alumni with Fortune 50 experience
  • Applied and interactive classes
  • Classes ranked among the top 1% and 5% of all classes in the world in piazza
  • 1/3rd the cost of other similar programs
  • 95% Success with Cloudera and EMC2

For details visit http://goo.gl/bPJEF

For any queries mail us at elearning@insofe.edu.in or call us at +91 9502334561/2/3

Undergraduate Programs in Data Science

While most of the degrees on the list of Colleges with Data Science Degrees are master’s degrees, there are a few schools offering data science as an undergraduate program.

3 Top Data Scientists Change Jobs

Three top data scientists have recently changed jobs.

Name          | Former Company    | New Company                                    | Announcement
Hilary Mason  | Bit.ly            | Data Scientist in Residence @ Accel Partners   | Techcrunch
DJ Patil      | Greylock Partners | VP of Product @ RelateIQ                       | Techcrunch
Monica Rogati | LinkedIn          | VP of Data @ Jawbone                           | Techcrunch