Tag Archives: data

5 Great Data Strategy Resources

I am putting together some of my own resources on Data Strategy. Here are a few of the top resources I have found helpful so far.

  1. What is a Data Strategy? – various definitions of a data strategy
  2. The 5 essential Components of a Data Strategy – a detailed whitepaper (PDF) from SAS
  3. How to Create a Successful Data Strategy – a detailed report from MIT
  4. How Do You Develop a Data Strategy (including 6 steps) – by Bernard Marr. He has created more data strategies than anyone, so his advice is rock-solid. The rest of his site also contains helpful information.
  5. Building the AI-Powered Organization – while not specific to data strategy, it fits the topic

Keep watching the blog for more of my thoughts on Data Strategy.

Recent Resources for Open Data

Recently, a number of resources for publicly available datasets have been announced.

  • Kaggle becomes the place for Open Data – I think this is big news! Kaggle just announced Kaggle Datasets, which aims to be a repository for publicly available datasets. This is great for organizations that want to release data but do not want the overhead of running an open data portal. Hopefully it will gain some traction and become an exceptional resource for open data.
  • NASA Opens Research – NASA just announced that all research papers it funds will be publicly available. It appears that the research articles will be available at PubMed Central and the data at NASA’s Data Portal.
  • Google Robotics Data – Google continues to do interesting things, and this is definitely one of them. It is a dataset about how robots grasp objects (Google Brain Robot Data). I am not overly familiar with this topic, so if you want to know more, see their blog post, Deep Learning for Robots.

For more open data options, see Data Sources for Cool Data Science Projects Part 1 and Part 2.

Are you aware of any other resources that have been recently announced? If so, please leave a comment.

Dat – Version Controlled Data

Dat is an open source project focusing on data storage. In particular, the project aims to version control data. What is version control? In short, it tracks the history of changes to something (typically source code files or documents). Dat takes the idea a bit further: the data is versioned at the row level rather than the file level. Plus, it is built for collaboration among teams.
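To make the difference between row-level and file-level versioning more concrete, here is a small Python sketch. It is not Dat's API or implementation; the class name RowVersionedTable and its methods are hypothetical, made up purely for illustration. The idea it shows is that each row keeps its own history, so changing one row creates a new version of that row only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RowVersionedTable:
    """Toy in-memory table (hypothetical, not Dat) with per-row history."""

    # Maps a row key to every value that row has held, oldest first.
    _history: Dict[str, List[Dict[str, Any]]] = field(default_factory=dict)

    def put(self, key: str, value: Dict[str, Any]) -> int:
        """Store a new version of one row and return its 1-based version number."""
        versions = self._history.setdefault(key, [])
        versions.append(dict(value))  # copy so later edits by the caller do not alter history
        return len(versions)

    def get(self, key: str, version: Optional[int] = None) -> Dict[str, Any]:
        """Return the latest version of a row, or a specific earlier version."""
        versions = self._history[key]
        if version is None:
            return dict(versions[-1])
        if not 1 <= version <= len(versions):
            raise KeyError(f"row {key!r} has no version {version}")
        return dict(versions[version - 1])

    def version_count(self, key: str) -> int:
        """Return how many versions of a row have been stored."""
        return len(self._history.get(key, []))


table = RowVersionedTable()
table.put("city:1", {"name": "Springfield", "population": 30000})
table.put("city:2", {"name": "Shelbyville", "population": 25000})
# Updating one row adds a version for that row only; city:2 is untouched.
table.put("city:1", {"name": "Springfield", "population": 30500})
```

With file-level versioning, that last update would produce a new version of the whole file. Here, table.get("city:1", 1) still returns the original Springfield row, while city:2 has only ever had one version.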

Use the online tutorial to learn more.

Dat is currently in beta. This is going to be a very interesting project to watch. I can see many great use cases.