Microsoft Research Open Data is a search engine for free datasets available from Microsoft Research. The datasets are primarily aimed at Natural Language Processing (NLP) and computer vision. Take a look if you are in need of a dataset for your next project.
A few weeks ago, I attended my first SQLSaturday event. I brought along my camera and was lucky enough to record a couple of interviews. This is one of those interviews.
I sat down with John Byrnes and we discussed:
- Where has SQL taken his career?
- What does the SQLSaturday community mean to him?
- Why should someone attend a SQLSaturday event?
I am putting together some of my own resources on Data Strategy. Here are a few of the top resources I have found helpful so far.
- What is a Data Strategy? – various definitions of a data strategy
- The 5 Essential Components of a Data Strategy – a detailed whitepaper (PDF) from SAS
- How to Create a Successful Data Strategy – a detailed report from MIT
- How Do You Develop a Data Strategy (including 6 steps) – by Bernard Marr. He has created more data strategies than anyone, so his advice is rock-solid. The rest of his site also contains helpful information.
- Building the AI-Powered Organization – while not specific to data strategy, it fits the topic
Keep watching the blog for more of my thoughts on Data Strategy.
A number of new impactful open source projects have been released lately.
Open Source Data Science Projects
- Pythia – from Facebook for deep learning with vision and language, “such as answering questions related to visual data and automatically generating image captions”
- InterpretML – from Microsoft, a “package for training interpretable models and explaining blackbox systems”
- ML framework for Julia – from the Alan Turing Institute; MLJ is a machine learning toolbox for Julia
- Plato – a conversational AI platform from Uber
Is the list missing a project released in 2019? If so, please leave a comment.
Just released this week, Nuts about Data is a fun introductory book about the data science process. Meor Amer tells a witty story about squirrels, mining for nuts, teamwork, and survival. It brings together the entire data science lifecycle, from asking questions to final storytelling.
It is a quick read and really fun. I highly recommend it and hope you enjoy it.
The competition site Kaggle has recently released some micro-courses aimed at helping people quickly learn the skills of data science. The collection is called Kaggle Learn: Faster Data Science Education. It includes courses on:
- Deep Learning
- and more
Check them out to quickly get up to speed. Happy Learning.