The US and Europe store a lot of data!
Seriously, if you have a startup that uses Hadoop, you should get in touch with Mike Olson of Cloudera.
From Simon Rogers, “What is a Data Scientist?”:
“Someone who can bridge the raw data and the analysis – and make it accessible. It’s a democratising role; by bringing the data to the people, you make the world just a little bit better.”–Simon Rogers
“A data scientist is that unique blend of skills that can both unlock the insights of data and tell a fantastic story via the data.”–DJ Patil
“A data scientist is someone who blends math, algorithms, and an understanding of human behavior with the ability to hack systems together to get answers to interesting human questions from data.”–Hilary Mason
“A data scientist is a rare hybrid, a computer scientist with the programming abilities to build software to scrape, combine, and manage data from a variety of sources and a statistician who knows how to derive insights from the information within. S/he combines the…
A nice, short, 2-minute video from edCetra Training with some good facts about big data and data analysis.
- The digital universe is 10 times the size it was in 2006
- Greater literacy and cloud computing are helping fuel big data
- 80% of companies' data is unstructured, which makes it difficult to analyze
- Employees spend 2 hours per day searching for the right information
The Coursera Probabilistic Graphical Models course officially starts today. Sign up and start learning.
Google with Incorrect Spelling
Just the other day, I was googling for “strata conference” information. I noticed that I had mistakenly typed “starta” instead of “strata”. I proceeded to backspace the incorrect letters and fix the spelling. Later in the day, I also noticed that I frequently mistype the letters “ar” and “ra”. That got me thinking.
Does Google Know How Poorly I Spell?
Since Google Instant was released in 2010, Google has been able to track every keystroke I type into the Google Search box. That means Google knows when I hit the backspace key. Using some data analysis, Google should be able to answer the following questions:
- What words are most commonly misspelled in Google searches? I would guess the answer would be a good indicator of the most commonly misspelled words in general.
- What words do I misspell the most often?
- How many letters get typed after the misspelling?
- What percentage of Google searches are completed without a backspace?
- Do people in certain parts of the world/country have better spelling?
- How often do people backspace a correctly spelled word, just to then spell it incorrectly? This could be amusing. I would also like to know what words.
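To make a couple of these questions concrete, here is a minimal sketch of how backspace counts could be pulled out of keystroke data. It assumes a made-up log format (each search is just a list of keys, with "\b" standing in for backspace); the function names and sample sessions are hypothetical and say nothing about how Google actually stores or analyzes keystrokes.

```python
BACKSPACE = "\b"  # assumption: backspace is logged as this single character


def replay(keys):
    """Replay one search's keystrokes; return (final_text, backspace_count)."""
    buffer = []
    backspaces = 0
    for key in keys:
        if key == BACKSPACE:
            backspaces += 1
            if buffer:
                buffer.pop()  # remove the last typed character
        else:
            buffer.append(key)
    return "".join(buffer), backspaces


def share_without_backspace(sessions):
    """Fraction of searches that were typed without any backspace."""
    if not sessions:
        return 0.0
    clean = sum(1 for keys in sessions if BACKSPACE not in keys)
    return clean / len(sessions)


# Hypothetical sessions: the first one mimics typing "starta" and fixing it.
sessions = [
    list("starta") + [BACKSPACE] * 4 + list("rata conference"),
    list("hadoop"),
]

for keys in sessions:
    text, count = replay(keys)
    print(f"{text!r}: {count} backspace(s)")
print("share without backspace:", share_without_backspace(sessions))
```

The same replay idea could be extended to record which word was being typed when the first backspace happened, which would get at the "most commonly misspelled words" question.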
Misspellings Don’t Matter
As it turns out, the misspellings don’t really matter that much. Google is smart enough to fix many spelling errors.
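I don't know how Google actually corrects spelling, but one textbook way to catch a slip like “starta” is an edit distance that counts a swap of two adjacent letters as a single mistake (the "optimal string alignment" variant). The sketch below is only an illustration of that idea; the vocabulary list and function names are made up for the example.

```python
def osa_distance(a, b):
    """Edit distance allowing insert, delete, substitute, and adjacent swap."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "ar" typed instead of "ra".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def suggest(word, vocabulary):
    """Return the vocabulary entry closest to word under osa_distance."""
    return min(vocabulary, key=lambda candidate: osa_distance(word, candidate))


vocabulary = ["hadoop", "status", "strata"]  # hypothetical word list
print(osa_distance("starta", "strata"))
print(suggest("starta", vocabulary))
```

A real spelling corrector would also weigh how common each candidate word is, which this sketch ignores.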
What other spelling questions could Google answer?
This post provides a nice, quick overview of 6 machine learning algorithms.
- Decision Trees
- Linear Regression
- Neural Networks
- Bayesian Networks
- Support Vector Machines (SVMs)
- Nearest Neighbor
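To give a feel for the simplest item on the list, here is a minimal sketch of a nearest neighbor classifier: to label a new point, find the closest labeled point and copy its label. The training data and labels below are invented for illustration and do not come from the linked post.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training point closest to query (1-NN)."""
    # Each training item is a (point, label) pair; distance is Euclidean.
    _, closest_label = min(train, key=lambda pair: math.dist(pair[0], query))
    return closest_label


# Hypothetical 2-D training points with made-up labels.
train = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 9.0), "large"),
    ((9.0, 8.5), "large"),
]

print(nearest_neighbor(train, (2.0, 1.0)))
print(nearest_neighbor(train, (8.5, 8.0)))
```

In practice people usually look at several neighbors (k-nearest neighbors) and take a vote, which is less sensitive to a single odd training point.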