Mohammed J. Zaki, Computer Science Professor at RPI, and Wagner Meira Jr., Computer Science Professor at Universidade Federal de Minas Gerais, have written the textbook Data Mining and Analysis: Fundamental Concepts and Algorithms. The book is currently available as a PDF download.
Based upon the chapters, the book looks very good. It contains large sections on data analysis, clustering, and classification. The final book will be published sometime in 2014.
Alteryx is offering the book, Big Data Analytics For Dummies, for free. If you are new to the term big data, this book provides a brief (about 40 pages) overview of the topic and what big data should be able to do for your company.
You have to register, but it is worth it for the free book.
This is an online HTML version of the book Natural Language Processing with Python. The book is a companion to NLTK, a free, open source toolkit written in Python for Natural Language Processing (NLP).
David Easley and Jon Kleinberg, both of Cornell University, have placed the contents of their social networking textbook online. All 24 chapters of Networks, Crowds, and Markets: Reasoning About A Highly Connected World are available for download. This could serve as a wonderful learning resource or an excellent reference tool. The material covered is quite extensive, and it provides many real applications of social network analysis. Not all the examples are online social networks.
In case you missed the announcement yesterday, Coursera added 12 new universities and over 100 new courses. The exciting part for people learning data science is a new category of courses: Statistics, Data Analysis, and Scientific Computing. None of the courses have started yet. Most are scheduled for this fall or early 2013. The courses look very good.
Are you excited about these new courses?
Springer has just released a new data science journal named EPJ Data Science. The journal is open access, which means that articles are freely available online. The catch is that people who submit articles must pay a fee for publication. Sometimes the fee will be covered by the author’s university or company. Anyhow, if you are interested in data science research, this journal is probably worth following.
Are you interested in academic journals?
Does this excite you?
The May 2012 Strata conference will take place online this year. The cost is zero, and the event is tomorrow (May 16, 2012). The only catch is that you must register first. The entire conference is scheduled to take place in the morning, so the format looks quick. Judging from other Strata videos I have seen, I would guess this will be an event of high quality.
Thanks to DataGeeks-MSP for alerting me to the conference.
OpenIntro is an organisation that was started to create a free and open source introductory statistics textbook. The book is available as a free PDF download, or it can be purchased in paperback from Amazon for less than $10. If you want to learn statistics or need a little refresher, check it out.
Having trouble keeping track of what schools offer what courses for free online? Problem solved!
Class Central maintains an updated list of courses from Coursera (Stanford), Udacity, MITx, and others as they become available. Not all of the courses are related to data science, but I still thought it was valuable to share the link.
Check it out and start learning.
Also this spring, Stanford will be offering two more courses that might benefit a person learning data science.
If you feel these two classes might be a bit too advanced at this point, then here are a couple more fundamental computer science classes. If you are new to computer science and programming, CS 101 would be a good choice. If you are not as new to computer science or might be a bit rusty on your core algorithms knowledge, then Design and Analysis of Algorithms 1 might be appropriate.
Actually, the courses are no longer being offered by just Stanford. A few other schools have been added. The courses are now being offered through Coursera. Plus, all the courses are free.