Amazon appears to believe that in about 20 years, nearly all enterprises will run their computing systems in the cloud. I would have to agree with them. This article is worth a look, especially the paragraph about Pinterest running completely on AWS.
I recently ran across the following articles about data visualization.
- iPad for Visualizations – MicroStrategy has created an iPad app that can be used for the analysis of bigdata.
- Big Picture of BigData – How visualization can be applied to the 3 Vs of bigdata (Volume, Velocity, Variety)
- The Explanatory Power of Data Points – Data points are important, but so is a story.
Good visualizations are an important part of storytelling in data science.
- What types of media store the most data?
- Where are the world’s 10 largest data centers?
Check out this infographic for the answers.
See the source at Mozy
To make things even more awesome, GitHub is also hosting a Data Challenge. The challenge is to play around with data and create the best visualization possible. You had better start now, because the competition ends May 21st. I am not familiar with Google BigQuery, so this might be a good time to learn.
Data-Intensive Text Processing with MapReduce is a free online (PDF) textbook about text processing on large amounts of data. The 1st edition has been available for a couple of years, and a 2nd edition is in the works. Here is a quick overview of some of the topics.
- Graph Algorithms
- Text Processing
Happy Reading (and Text Processing)!
The above link goes to a great story about using data science. What makes the story great is the company. It is not a science company or a tech startup. It is a truck management company. Data science is truly reaching all industries.
This report by McKinsey & Company is frequently referenced, so I thought I should post a link to it. It includes the following quote about the lack of talent to fill Big Data positions.
By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
This quote is why now is a great time to be learning to become a data scientist.