Category Archives: Learn Data Science

This is a category for all things related to learning data science.

Free Reinforcement Learning Textbook

Reinforcement Learning: An Introduction by Rich Sutton and Andrew Barto was recently released on October 15, 2018. The authors were kind enough to put a late draft version of the book online as a PDF. If you are hoping to learn about Reinforcement Learning, this is a great place to start.

Full text is available on a Google Drive at Reinforcement Learning. Take a look.

Good Luck and Happy Learning.

Advertisements

Data Science in 30 Minutes with Jake Porway of DataKind

This month, The Data Incubator is hosting another free webinar on data science. This time it features an interview with Jake Porway, founder of DataKind. Jake and DataKind are doing amazing things, so I hope you can check out the webinar.

Below are the webinar details:

Data Science in 30 Minutes: Using Data Science in the Service of Humanity

Abstract: With his non-profit, DataKind, Jake Porway connects data scientists with social causes. Picture Doctors Without Borders for data nerds—with machine learning and AI engineers parachuting in to help the UN with humanitarian tasks, like tracking virus outbreaks using mobile data. “Data is like a bucket of crude oil,” Porway says. “Potentially great, but only if someone knows how to refine it (data scientists) and someone else has vehicles that will run on it (the social sector).” In this talk, Porway discusses the strategies of DataKind, its projects and the future of big data to service humanity.

The free online webinar is:
Thu, November 15, 2018 @ 5:30 PM – 6:30 PM EST
You can register at Data Science in 30 minutes.

Azure Functions for Data Science

Data Scientists do more than build fancy AI and machine learning models. They often times need to get involved with the data acquisition process. It is common for data to be pulled from other databases or even an API. Plus, the models need to be deployed. These tasks fall to the data scientist to solve (unless there is a data engineer willing to help). Recently, I have discovered Azure Functions to be an extremely useful tool for solving these types of tasks.

What are Azure Functions?

Simply stated, Azure Functions are pieces of code that run. More formally stated,

Azure Functions are serverless computing which allows code to run on-demand without the need to manage servers or hardware infrastructure.

This is exactly what a data scientist needs to solve the tasks mentioned above. I, for one, do not enjoy managing servers (hardware or virtual). I have done it before, but I find it time-consuming and tedious. It is just not my thing. Thus, I happily welcome the serverless capabilities of Azure Functions. I just focus on the code and get the task completed.

Because the code does not always need to be running, Azure Functions invoke the code based upon specified triggers. Once the trigger is activated, the code will begin to run. The following list provides some examples of the triggers available.

Triggers for Azure Functions:
  1. Timer – Set a timer to run the Azure Function as often as you like. Timing is specified with a cron expression.
  2. HTTP Rest call – Have some other code fire off an HTTP request to start the Azure Function.
  3. Blob storage – Run the Azure Function whenever a new file is added to a Blob storage account.
  4. Event Hubs – Event Hubs are often used for collecting real-time data, and this integration offers Azure Functions the ability to run when a real-time event occurs.
  5. Others – Cosmos DB, Service Bus, IoT Hub, GitHub are other events which can trigger an Azure Function.

What can Azure Functions Do?

Once you begin to understand the concept, you can quickly see some of the possibilities. Without having to configure servers or virtual machines, the following tasks become much simpler:

  • Reading and writing data from a database
  • Processing images
  • Interacting with an HTTP endpoint
  • Automating decisions in real-time
  • Computing descriptive statistics
  • Creating your own endpoint for other data scientists to call
  • Automatically analyzing code after commits
Programming Languages for Azure Functions

As of August 2018, full support is provided for C#, Javascript, and F#. Experimental support is provided for Batch, PowerShell, Python, and TypeScript. Python can be used to create an HTTP endpoint. This would allow someone to quickly create an endpoint for running machine learning models via scikit-learn or another python module. Unfortunately, R is not yet available, but Microsoft has a lot invested in R, so I am expecting this eventually.

Simplify Tasks for Data Science

Next time you have a data science task which requires a little coding, consider using an Azure Function to run the code. It will most likely save you some deployment and configuration time. Then you can quickly get back to optimizing those fancy AI models.

See the video below for a quick demonstration of how to create an Azure function via a web browser (no IDE needed).