Tag Archives: textbook

Elements of Statistical Learning Textbook (Free)

The Elements of Statistical Learning textbook is available for free. It is a classic, widely used textbook for statistics and machine learning. Here is a far-from-complete list of the topics it covers:

  • Supervised Learning
  • Linear/Logistic Regression
  • Regularization
  • Model Selection
  • Trees
  • Neural Networks
  • Support Vector Machines
  • Random Forests
  • Unsupervised Learning
  • Clustering

As you can see, the book is quite extensive.


Note: This book has been available for quite a while, but I realized I had not added a link to it on my blog.

Free Textbook: Mining of Massive Datasets

A few professors from Stanford University have released version 1.1 of their textbook, Mining of Massive Datasets. The book was created from materials used for a couple of Stanford computer science classes, including large-scale data mining and web mining. The book looks excellent and really focuses on the analysis of data at a large scale. Some people would call this big data. Below is a list of some of the topics covered in the textbook.

  • data mining
  • map-reduce
  • clustering
  • recommender systems
  • and more

The book is free for download, or available from Cambridge University Press.

Large Scale Text Processing with MapReduce: A Free Textbook

Data-Intensive Text Processing with MapReduce is a free online (PDF) textbook about text processing on large amounts of data. The 1st edition has been available for a couple of years, and a 2nd edition is in the works. Here is a quick overview of some of the topics.

  • MapReduce
  • Graph Algorithms
  • Text Processing

Happy Reading (and Text Processing)!

Think Stats – An Online Statistics Book For Programmers

Previously I mentioned that online statistics learning resources are not abundant.

Well, here is a new online book for learning statistics. It is geared towards programmers, and it looks to be a great fit for people wanting to learn data science. Here is a small excerpt from the Preface:

It emphasizes the use of statistics to explore large datasets.

I have only had time to quickly browse the book, but it looks quite good.

Think Stats: Probability and Statistics for Programmers

(The book has a Creative Commons license, so it is free and OK to download.)