Microsoft Research Open Data is a search engine for free datasets available from Microsoft Research. The datasets are primarily aimed at Natural Language Processing (NLP) and computer vision. Take a look if you are in need of a dataset for your next project.
Looking for datasets for your next project? You are in luck because Google just launched Dataset Search. The name is self-explanatory. Go try it out.
The title says it all: some datasets for teaching data science.
So, you have identified a fascinating new problem to solve with data. You correctly started with a problem and not the data. It seems both beneficial and interesting. Now where do you get the data? Here are 4 steps (in order) for how to find data. 1. Existing Data The best place to start is … Continue reading 4 Steps to Finding Your Data
March 4, 2017 is Open Data Day. Open Data Day is an annual celebration across the globe. Over 300 groups around the world schedule activities to use open data for their communities. See if there is a gathering in your area. Also, the focus this year is on: Open research data Tracking public money flows … Continue reading It is Open Data Day!
Our World in Data is a data visualization site for exploring the history of civilization. The site was created by Max Roser. Our World in Data contains tons of information about many aspects of people's lives. It also includes numerous visuals (like the one below) which can be easily shared or embedded on other sites. https://ourworldindata.org/grapher/life-expectancy … Continue reading Our World In Data
Recently, a number of resources for publicly available datasets have been announced. Kaggle becomes the place for Open Data - I think this is big news! Kaggle just announced Kaggle Datasets which aims to be a repository for publicly available datasets. This is great for organizations that want to release data, but do not necessarily … Continue reading Recent Resources for Open Data
DataUSA.io is a huge collection of visualizations displaying U.S. public data. It is fun to browse the visualizations, and there is also an API.
Somewhat lost in the hype of Google's Cloud Machine Learning announcement (which is itself neat) was the release of Google's Public Data Sets. I think this was already happening, but now Google has an official location for these public data sets, stored in BigQuery. You can: Access and use the data in your applications … Continue reading Google Announces Public Data Sets
Yahoo just released a 1.5 TB dataset of "anonymized user interactions on the news feeds". If you have been looking for a new dataset to analyze, this just might be it. It contains approximately 110 billion rows of data regarding user-news interactions. Happy data exploring!