This is just a short list of a few books that I have recently discovered online.
- Model-Based Machine Learning – Chapters of this book become available as they are being written. It introduces machine learning via case studies instead of just focusing on the algorithms.
- Foundations of Data Science – This is a much more academic-focused book which could be used at the undergraduate or graduate level. It covers many of the topics one would expect: machine learning, streaming, clustering and more.
- Deep Learning Book – This book was previously available only in HTML form and was incomplete. Now it is free and downloadable.
Andrew Ng [Co-Founder of Coursera, Stanford Professor, Chief Scientist at Baidu, and All-Around Machine Learning Expert] is writing a book during the summer of 2016. The book is titled Machine Learning Yearning. If you visit the site and sign up quickly, you can get draft copies of the chapters as they become available.
Andrew is an excellent teacher. His MOOCs are wildly successful, and I expect his book to be excellent as well.
Professor Norm Matloff from the University of California, Davis has published From Algorithms to Z-Scores: Probabilistic and Statistical Modeling in Computer Science, which is an open textbook. It approaches statistics from a computer science perspective. Dr. Matloff has been a professor of both statistics and computer science, so he is well suited to write such a textbook. This would be a good choice of textbook for a statistics course targeted primarily at computer scientists. It uses the R programming language. The book starts by building the foundations of probability before entering statistics.
Understanding Machine Learning: From Theory to Algorithms is written by Shai Shalev-Shwartz, Associate Professor at the School of Computer Science and Engineering at The Hebrew University, Israel, and Shai Ben-David, Professor in the School of Computer Science at the University of Waterloo, Canada. The book looks very thorough. Below is just a sampling of the topics covered.
- Bias-Complexity Tradeoff
- Model Selection
- Support Vector Machines
- Decision Trees
- Neural Networks
- Dimensionality Reduction
- Feature Selection and Generation
- Advanced Theory
- And LOTS LOTS more….
A great read for people without an extensive math, statistics or computer science background, and still an interesting read for people who do have that background.
The book includes tons of non-technical descriptions for data science terms.
You can download a copy of the book on SlideShare, or you can purchase a paperback copy via Lulu.
A new edition of Mining Massive Datasets is now available. It is used for a number of data mining courses at colleges across the US (and the globe). Here are just a few of the topics from the book.
- Recommendation Systems
- Dimensionality Reduction
- Social Network Analysis
Yoshua Bengio, Ian Goodfellow and Aaron Courville are writing a deep learning book for MIT Press. The book is not yet complete, but the drafts of the chapters are all available online. The authors are also collecting comments about the chapters before the book goes to press.
The book is broken into 3 sections:
- Math and Machine Learning Fundamentals
- Modern Deep Neural Networks
- Current Research in Deep Learning
The book is very technical and probably suitable for a graduate-level course. However, if you have the time and interest, resources such as this are highly valuable.
O’Reilly just published a free ebook profiling 15 influential women in data science, Women in Data. The book is written by Cornelia Levy-Bencheton.
The following women are profiled in the book:
- Michele Chambers, COO of RapidMiner
- Camille Fournier, CTO of Rent the Runway
- Carla Gentry, CEO of Analytical Solution
- Kelly Hoey, Speaker and Early-stage Investor
- Cindi Howson, VP of Research at Gartner
- Neha Narkhede, Co-founder of Confluent
- Claudia Perlich, Chief Scientist at Dstillery
- Kira Radinsky, Co-founder of SalesPredict
- Gwen Shapira, Software Engineer at Cloudera
- Laurie Skelly, Data Scientist at Datascope
- Kathleen Ting, Technical Account Manager at Cloudera
- Renetta Garrison Tull, Associate Vice Provost of UMBC
- Hanna Wallach, Researcher at Microsoft
- Alice Zheng, Director of Data Science at Dato
- Margit Zwemer, Founder of LiquidLandscape
Markets for Good, an organization focused on performing data science for the social sector, recently released an ebook highlighting their 17 most influential blog posts. The ebook is titled Markets for Good Selected Readings: Making Sense of Data and Information in the Social Sector.
Here is just a small sampling of the topics you can read about:
- 3 Reasons Why Open Data Will Change the World
- Let Our Data Define Us
- Put Your Data Where Your Mouth Is
If you are interested in how data can be used to help the world, this ebook is a good place to start.