The above link goes to a great story about using data science. What makes the story great is the company. It is not a science company or a tech startup. It is a truck management company. Data science is truly reaching all industries.
The non-profit organization Data Without Borders has renamed itself DataKind.
DataKind is an organisation that matches data from non-profit and government organisations with data scientists. DataKind hosts weekend DataDives and is planning to build a DataCorps. See a previous post, Use Data Science to Help The World, to find out more about what DataKind is all about.
This report by McKinsey & Company is frequently referenced, so I thought I should post a link to it. It includes the following quote about the lack of talent to fill Big Data positions.
By 2018, the United States alone could face a shortage of 140,000 to 190,000 people with deep analytical skills as well as 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.
This quote is why now is a great time to learn to be a data scientist.
Hans Rosling, co-founder of GapMinder Foundation, gives a good TED Talk about HIV in the world. He does an excellent job of using data to highlight the countries (not continents) that have the most serious problems. He also gives some reasons why HIV/AIDS is not dropping off as quickly in some rich countries.
Here is a second TED Talk by Hans Rosling. This one is a bit more entertaining, but it still makes excellent use of data. Hint: He shows why the washing machine is so important.
Since recently announcing $16M in funding, Coursera has been making quite a bit of noise. Last fall, Stanford University decided to freely offer a couple of computer science classes online. The response was huge, and that led to the creation of Coursera.
The courses are no longer limited to computer science, and Stanford is no longer the only school involved. Here is a list of academic areas being offered and another list with the schools involved.
- Healthcare, Medicine, and Biology
- Economics, Finance, and Business
- Humanities and Social Sciences
- Mathematics and Statistics
- Computer Science
- Society, Networks, and Information
Although not all of the courses will be directly related to data science, many of them are very close. Naturally, the Math, Statistics, and Computer Science areas relate directly to data science. Some of the other areas, such as Networks, Biology, and Economics, are among the most popular application areas for data science. This is very exciting. My only concern is that the courses are a bit too much like traditional university courses, with specific start/end dates and homework due dates. It will be interesting to see if the course structures change over time.
Anyhow, the following courses are starting today. Sign up and start learning.
- Machine Learning – A major focus area of data science
- Computer Science 101 – probably a good starting point if you don’t know how to program
- Compilers – good for understanding how programming languages work
- Automata – hard to explain in one line, but it covers some fundamental principles of computer science
- Intro to Logic – learn to reason systematically
- Computer Vision – not sure of the relation to data science, but I am sure there is one. If you know, please leave a comment
Are you going to enroll in any of these courses?
Events are happening across the globe.
The above article provides a nice brief overview of 5 clustering algorithms.
- Hierarchical Clustering
- Fuzzy C-Means
- Multi-Gaussian with Expectation-Maximization
- Density-based Clustering
This goes well with a previous post about 6 Machine Learning Algorithms.
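For readers who want a feel for one of the algorithms listed above, here is a minimal sketch of hierarchical (agglomerative) clustering. It is not taken from the linked article. The function name `single_linkage`, the choice of single-linkage distance, and the sample points are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def single_linkage(points, k):
    """Hypothetical helper: merge the two closest clusters until k remain.

    "Closest" here means single linkage: the distance between two clusters
    is the smallest distance between any pair of their points.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Start with every point in its own cluster.
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (distance, index_i, index_j) of the closest pair so far
        for i, j in combinations(range(len(clusters)), 2):
            d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        # Merge cluster j into cluster i; j > i, so index i stays valid.
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


# Made-up sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
groups = single_linkage(points, 3)
for group in groups:
    print(group)
```

This brute-force version recomputes every pairwise distance on each merge, so it only suits small data sets. A library implementation would be the better choice for real work.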
- Why Data Scientists are Tech’s Rock Stars?
- What does a Data Scientist do?
- How to become a Data Scientist?
This is a very entertaining TED Talk about what books can tell us over time. Just watch the video.