Category Archives: Learn Data Science

This is a category for all things related to learning data science.

Microsoft Weekly Data Science News for March 02, 2018

Here are the latest articles from Microsoft regarding cloud data science products and updates.

Conversations with future data scientists (YouTube Playlist)

Last week I spent some time chatting with future data scientists. I set up a camera to record some of the answers. Below are a few of the questions addressed.

  • How did I transition to data science?
  • Why start a data science project?
  • Should a new person focus on machine learning or deep learning?
  • What is an example data science project?
  • Why is real-time important?

Hopefully the videos and answers are helpful to others. I kept most of the videos fairly short. Enjoy! If you like the videos, please subscribe to the YouTube channel, Learn Data Science. Also, if you have a question you would like answered, please leave a comment below.

Data Science Live Book

Pablo Casas has published a book freely available online, Data Science Live Book. To quote from the book,

It is a book about data preparation, data analysis and machine learning.

The book is open source, and the code examples are written in R.

Learn Data Science YouTube Channel

Transitioning to a career in data science can be full of unanswered questions. I am here to help you get answers to those questions.

Today, I am launching a new YouTube channel, Learn Data Science. For each video, I will select a question and provide an answer. I will answer some of the questions myself, and I may have guests answer others.

If you have any questions about becoming a data scientist, please leave a comment.

Columbia University Applied Machine Learning Online

Columbia University’s course Applied Machine Learning Spring 2018 by Andreas C. Müller has all the lecture notes, slides, homework, and videos posted online.

Andreas is also the author of the book Introduction to Machine Learning with Python.

DataCamp Community News Site

DataCamp recently launched a new community site, Data Science News, for sharing and discovering data science news. It is similar to Hacker News if you are familiar with that site.

#Datathon2018: An online Data Hackathon

Data Science Society is organizing the first ONLINE #Datathon2018, a 48-hour challenge for everyone who is passionate about data, willing to experiment with new types of data, and eager to expand their network of connections in the field globally.

The Online #Datathon2018

The Datathon is one of the initiatives of Data Science Society, happening for the third time, this time fully digital!

The participants will have the chance to work on real-world cases from top companies such as Telenor, Receipt Bank, Ontotext, Kaufland, VMware, ZenCodeo, and A Data Pro, while working and communicating on an internal platform supported by the services of major cloud providers: IBM, Microsoft and Amazon.

NLP, Computer Vision and AI

Many data enthusiasts from a variety of backgrounds and interests are expected at #Datathon2018. Academics and practitioners will have the chance to put their knowledge into action in three categories of cases: NLP, Artificial Intelligence and Computer Vision. Step beyond theory and see the data from a different perspective while collaborating in a team of like-minded people and learning to deal with the unexpected issues that real-world data brings.

The Mentors

All data scientists, mathematicians, data analytics experts, software engineers and data enthusiasts will have the chance to dive deep into the data and be mentored by internationally renowned experts.

The #Datathon2018 takes place from the 9th to the 11th of February, and registration is open.

Join the Challenge now!

Google Colaboratory

Google has recently released a Jupyter Notebook platform called Google Colaboratory. You can run Python code in a browser, share results, and save your code for later. It currently does not support R code.

4 Steps to Finding Your Data

So, you have identified a fascinating new problem to solve with data. You correctly started with a problem and not the data. It seems both beneficial and interesting. Now where do you get the data? Here are 4 steps (in order) for how to find data.

1. Existing Data

The best place to start is the data you currently have. What data does your organization currently collect? How can you get access to that? Start there.

2. Open Data

Then look for industry-specific open data (data that is freely available). Many industries publish data monthly or yearly. Data is also frequently available from government-funded research. If industry-specific data is not available, what other related data is openly available? It is often beneficial to augment your existing data with open data. Here are some lists of open data: Open Data, Part 1 and Open Data, Part 2. There are also many others available.
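As a rough illustration of augmenting existing data with open data, here is a minimal Python sketch using only the standard library. The two small tables, their column names, and their values are invented for this example; in practice the internal table would come from your organization and the open table from a published dataset.

```python
import csv
import io

# Hypothetical internal data (columns and values invented for illustration).
INTERNAL_CSV = """region,sales
North,120
South,95
West,60
"""

# Hypothetical open data, e.g. a published population table (values invented).
OPEN_CSV = """region,population
North,1000
South,500
East,750
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def augment(internal_rows, open_rows, key):
    """Keep every internal row and add the open-data columns where the key matches.

    Internal rows with no match in the open data get None for the added columns.
    """
    lookup = {row[key]: row for row in open_rows}
    extra_columns = [c for c in open_rows[0] if c != key] if open_rows else []
    merged = []
    for row in internal_rows:
        match = lookup.get(row[key], {})
        combined = dict(row)
        for column in extra_columns:
            combined[column] = match.get(column)
        merged.append(combined)
    return merged


augmented = augment(read_rows(INTERNAL_CSV), read_rows(OPEN_CSV), key="region")
for row in augmented:
    print(row)
```

The join keeps all of your own rows and simply leaves gaps where the open data has no match, which makes missing coverage easy to spot before you rely on the combined table.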

3. API

Next, explore the opportunity of using an API to access data. Many applications have existing API access. An API (Application Programming Interface) allows a person to write some computer code to pull machine-readable data from an existing system. Some are freely available; others have associated costs. Many make the data available in near real-time. There are numerous APIs available where you can pull in data. Check with some of your existing applications. APIs are available for weather, stocks, news, social media, web analytics, and much more.
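To make the idea of pulling machine-readable data concrete, here is a minimal Python sketch. The endpoint URL, the query parameters, and the JSON field names ("readings", "time", "temp_c") are all placeholders made up for this example; a real API will document its own URL, authentication, and response format.

```python
import json
import urllib.parse
import urllib.request

# Placeholder endpoint, not a real service. Use the URL from your API's documentation.
BASE_URL = "https://api.example.com/v1/weather"


def build_url(base_url, params):
    """Append query parameters to the endpoint URL."""
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_readings(payload_text):
    """Turn a JSON response body into a list of (time, temperature) tuples.

    The field names used here are assumptions for this sketch.
    """
    payload = json.loads(payload_text)
    return [(r["time"], r["temp_c"]) for r in payload["readings"]]


def fetch_readings(city, api_key):
    """Call the API over HTTPS and parse the result (needs a real endpoint)."""
    url = build_url(BASE_URL, {"city": city, "key": api_key})
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_readings(response.read().decode("utf-8"))


# Offline demonstration using a canned response instead of a network call.
SAMPLE = '{"readings": [{"time": "09:00", "temp_c": 11.5}, {"time": "10:00", "temp_c": 13.0}]}'
print(build_url(BASE_URL, {"city": "Oslo", "key": "demo"}))
print(parse_readings(SAMPLE))
```

Separating the URL building and the parsing from the network call makes it possible to check the logic against a saved response before spending any paid API requests.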

4. Create The Data

The last resort is to begin the creation of data. An obvious choice is to create a survey. Be careful because surveys can be trickier than initially thought. You often do not get good representation and the result is biased data. Another way to collect data is to change your application to begin collecting the desired information. You may even have to build a new application. Sometimes an entire process needs to be created or modified to include methods to collect the data. This last step usually takes the longest and costs the most money.
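To see why poor representation in a survey leads to biased data, here is a small simulated Python sketch. The population, the 1 to 5 satisfaction scores, and the rule that happier people respond more often are all assumptions chosen for illustration, not measurements from any real survey.

```python
import random

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated population: each person has a satisfaction score from 1 to 5.
population = [rng.randint(1, 5) for _ in range(10_000)]

# Assumed response behaviour: the chance of answering the survey grows
# with the score (10% for a score of 1, up to 50% for a score of 5).
respondents = [score for score in population if rng.random() < score / 10]

population_mean = sum(population) / len(population)
respondent_mean = sum(respondents) / len(respondents)

print(f"Population size: {len(population)}, respondents: {len(respondents)}")
print(f"True average score:     {population_mean:.2f}")
print(f"Surveyed average score: {respondent_mean:.2f}")
```

Because unhappy people rarely respond in this simulation, the surveyed average comes out noticeably higher than the true average, even though thousands of answers were collected.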

Tips for Data Science Students – FB Live

A while ago, I recorded a Facebook Live on Tips for Data Science Students. It goes along with the following post: Getting the most from your Data Science Masters Program.