Yesterday’s announcement included a few main highlights:
- NIH will make 200TB of human genetic variation data freely available on Amazon Web Services
- NSF will provide $2M in support of undergraduate education for studying graphical and visualization techniques of bigdata
- DoD will announce some prize competitions in the coming months
- Numerous other projects to increase data analysis across the US Government
Here is one note I did find interesting (or funny). One of the speakers mentioned the need for workers with bigdata skills. He said the workforce needs 159,000 workers with data skills. Then a couple of minutes later he mentioned a need for data-savvy managers. He said the workforce will need 1,500,000 managers with data knowledge. I just thought those two numbers did not match up well.
Here are a few more links to other articles summarizing the Research Initiative:
Later today (2-3:45 pm ET), the White House will announce a $200M BigData research Initiative. Appropriately, it is being named the “Big Data Research and Development Initiative.”
The announcement will be broadcast live on Science360.
See this PDF for a listing of bigdata projects within the US Government.
I am excited to see how this will affect the education and training of data scientists.
What are your thoughts? Is this a good idea?
Having trouble keeping track of what schools offer what courses for free online? Problem solved!
Class Central maintains an updated list of courses from Coursera (Stanford), Udacity, MITx, and others as they become available. Not all of the courses are related to data science, but I still thought it was valuable to share the link.
Check it out and start learning.
Data Without Borders
Jake Porway started Data Without Borders because he attended a hack-a-thon and the groups came up with apps that didn’t really better the world very much. I believe he used the word “unfulfilling” to describe the apps. He decided to create a way to provide organizations (government or non-profit) with access to data scientists. His thinking goes like this. There are lots of data scientists who love to work with data. There are great organizations with lots of data. If the two can be matched together, what amazing things can be done? Data Without Borders hopes to find out.
Data Without Borders organizes a bunch of DataDives, which are weekend hack-a-thons that match up a group of data scientists and developers with data.
Jake concluded with some wonderful remarks:
What if we started using data not just to make better decisions about what kind of movies we wanted to see? What if we started using data to make better decisions about what kind of a world we wanted to see?
What is Data Without Borders looking for?
Jake’s Presentation at PopTech
Previously I mentioned that online statistics learning resources are not abundant.
Well, here is a new online book for learning statistics. It is geared towards programmers, and it looks to be a great fit for people wanting to learn data science. Here is a small excerpt from the Preface:
It emphasizes the use of statistics to explore large datasets.
I have only had time to quickly browse the book, but it looks quite good.
Think Stats: Probability and Statistics for Programmers
(The book has a Creative Commons license, so it is free and OK to download)
By 2075, digital data will surpass all the data in human brains.
The US and Europe store a lot of data!
Seriously, if you have a startup that uses Hadoop, you should get in touch with Mike Olson of Cloudera.
A nice, short, 2-minute video from edCetra Training with some good facts about big data and data analysis:
- The digital universe is 10 times the size it was in 2006
- Greater literacy and cloud computing are helping fuel big data
- 80% of companies’ data is unstructured – difficult to analyze
- Employees spend 2 hours per day searching for the right information