R is a hugely popular language among data scientists and statisticians. One of the difficulties with open-source R is the memory constraint. All the data needs to be loaded into a data.frame. Microsoft solves this problem with the RevoScaleR package of the Microsoft R Server. Just launched this week is an EdX course on Analyzing Big Data with Microsoft R Server.
According the syllabus:
Upon completion, you will know how to use R for big-data problems.
Full Disclosure: I work at Microsoft, and the course instructor, Seth Mottaghinejad, is one of my colleagues.
Professor Norm Matloff from the University of California, Davis has published From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science which is an open textbook. It approaches statistics from a computer science perspective. Dr. Matloff has been both a professor of statistics and computer science so he is well suited to write such a textbook. This would a good choice of a textbook for a statistics course targeted at primarily computer scientists. It uses the R programming language. The book starts by building the foundations of probability before entering statistics.
If you are looking to get started in the field of data science in 2014, then DataCamp just might be the site for you. DataCamp, formerly DataMind provides a tutorial for interactive data analysis in the browser. The data analysis is taught using R.
The DataCamp platform provides:
Courses to learn data science
A Platform to create new courses
If you are familiar with Codecademy, DataCamp follows very much the same model except for data analysis instead of programming. This is definitely a site to watch in 2014.
Here is my opinion. I prefer R to Python when performing exploratory data analysis. R has so many packages for every possible statistical technique. The plots, although not beautiful by default, are quick and easy to create. However, I prefer Python when I need to pull data from an API or build a software system or website. Python is more than just a statistical analysis tool; it is a complete programming language. I might even end up using Java for a project in the near future.
There does not have to be a clear winner or one single language to use. Use the best tool for the job and get on with your data science. In the end, the world cares more what you produced not whether you used R or Python or something else.