A few professors from Stanford University have released version 1.1 of their textbook, Mining of Massive Datasets. The book was created from materials used in a couple of Stanford computer science classes, including large-scale data mining and web mining. The book looks excellent and really focuses on the analysis of data at a large scale. Some people would call this big data. Below is a list of some of the topics covered in the textbook.
- data mining
- recommender systems
- and more
The book is free for download, or available from Cambridge University Press.
As the Olympics are coming to a close, here is one more infographic. There are a lot of nice numbers here. The athlete caloric intake section is fun. Michael Phelps must be eating all the time. There is also a section about Acer computers. Acer installed 11,000 computers and 900 servers. Other than competition results, what other data was being collected? I would love to hear more about that. Do you have any idea what other data is collected at the Olympics?
O’Reilly just announced the creation of a new conference focusing on data science and healthcare. It is named StrataRx 2012. The conference will take place in San Francisco on October 16-17, 2012.
Thanks to Ed for leaving the comment yesterday. I have reposted the comment here because I thought it was so good.
Looks like Coursera added a new data science course entitled “Web Intelligence and Big Data” while nobody was looking! Plus, it starts at the end of the month, for those who can’t wait for the UW Intro to Data Science course to be scheduled.
Here is a link to the Coursera Web Intelligence and Big Data Course. The course appears to focus on map-reduce and parallel programming applied to data problems.
Neo Technology, the company behind the graph database Neo4j, is hosting a webinar on Thursday. Pablo Pareja from the Bio4j project will provide an overview of bioinformatics and Neo4j, as well as some applications.
Bioinformatics can be viewed as data science for biology. Bioinformatics was cool before data science was even a term.
If you are interested in learning more about bioinformatics and graph databases, then register for this webinar and start learning.