Learn to Analyze Big Data with R – Free Course

R is a hugely popular language among data scientists and statisticians. One of the difficulties with open-source R is its memory constraint: all the data needs to be loaded into memory as a data.frame. Microsoft addresses this problem with the RevoScaleR package of Microsoft R Server. Just launched this week is an edX course on Analyzing …

Free Stats book for Computer Scientists

Professor Norm Matloff of the University of California, Davis, has published From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science, an open textbook. It approaches statistics from a computer science perspective. Dr. Matloff has been a professor of both statistics and computer science, so he is well suited to write such …

Introduction to Microsoft R Open (Webinar)

Tomorrow, January 28, 2016, David Smith will present a webinar titled Introduction to Microsoft R Open. David is the R Community Lead at Microsoft. The webinar will discuss: Introduction to R; History of R; Enhancements of Microsoft R Open (Microsoft’s enhanced distribution of open-source R); CRAN Time Machine; Reproducible Data Analysis. If you are looking …

Learn Data Science Online with DataCamp

If you are looking to get started in the field of data science in 2014, then DataCamp just might be the site for you. DataCamp, formerly DataMind, provides interactive data analysis tutorials in the browser. The data analysis is taught using R. The DataCamp platform provides: Courses to learn data science; A Platform …

International School of Engineering Programs Beginning Soon

I recently received the following information. International School of Engineering is announcing their 3rd batch of live e-Learning certificate programs starting 4-Sep-2013 in “Engineering Big Data with R and Hadoop Ecosystem” and “Essentials of Applied Predictive Analytics” (http://goo.gl/kHckP). These programs helped Engineers and Managers transform into Hadoop Developers/Data Scientists, get industry certifications, revolutionize their workspace …

R Commands for Cleaning Data

This post contains notes from the Coursera Data Analysis course. Here are some R commands that might be helpful for cleaning data. String Replacement: sub() replaces the first occurrence; gsub() replaces all occurrences. Quantitative Variables in Ranges: cut(data$col, seq(0,100, by=10)) breaks the data up by the range it falls into, in this example: whether the …
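
The commands named in this excerpt can be tried on a small made-up data frame. The sketch below is only an illustration: the data frame and its column names (name, col) and the result variables are assumptions, not taken from the original post.

```r
# Hypothetical example data; column names and values are illustrative only.
data <- data.frame(
  name = c("a_b_c", "x_y", "plain"),
  col  = c(5, 10, 95),
  stringsAsFactors = FALSE
)

# sub() replaces only the first match in each string.
first_only <- sub("_", " ", data$name)

# gsub() replaces every match in each string.
all_matches <- gsub("_", " ", data$name)

# cut() assigns each value to an interval defined by the breaks.
# By default the intervals are open on the left and closed on the right,
# so a value of exactly 0 would become NA unless include.lowest = TRUE.
ranges <- cut(data$col, seq(0, 100, by = 10))

print(first_only)
print(all_matches)
print(table(ranges))
```

The result of cut() is a factor, so table() gives a quick count of how many values fall into each range.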