This is a great overview of bias and variance.

Follow the Data

The online machine learning course given by Andrew Ng in 2011 (available here among many other places, including YouTube) is highly recommended in its entirety, but I just wanted to highlight a specific part of it, namely the “Practical advice” part, which touches on things that are not always included in machine learning and data mining courses, like “Deciding what to do next” (the title of this lecture) or “debugging a learning algorithm” (the title of the first slide in that talk).

His advice here focuses on the concepts of bias and variance in statistical learning. I had been vaguely aware of the concepts of “bias and variance tradeoff” and “bias/variance decomposition” for a long time, but I had always viewed those as theoretical concepts that were mostly helpful for thinking about the properties of learning algorithms; I hadn’t thought that much about connecting them to the…

Network World Article: Could data scientist be your next job?

Sandra Gittlen wrote a very nice article for Network World titled Could data scientist be your next job? She did an excellent job explaining why it is so hard to define exactly what a data scientist is. She also interviewed a couple of people leading the push to get universities to provide data science training. According to her article and many others on the internet, it is a great time to be learning data science skills, because companies cannot find enough data scientists. Plus, Sandra was kind enough to interview me for the article.

Bitmarks: Bitly’s Data Science One URL At A Time

Earlier this week, Bitly launched a new bookmarking service. They call links/URLs bitmarks instead of bookmarks. It has a nice Chrome Extension and Bitmarklet. So far, I very much like the service.

So, Why Should You Care?

Well, at its core, Bitly is a data science company. This is just another way for Bitly to collect more URLs. I think that is a good thing. Bitly has huge amounts of data created by collecting lots and lots of small things.

What do you think Bitly is doing with all those URLs? I am not completely sure, but I would bet some of it is really neat. Bitly can already track breaking news in near real-time. I will be curious to see whether Bitly can predict the winner of the November presidential election before the news organizations can.

By the way

I have created a Data Science 101 Bitmark Bundle. You are welcome to follow along, although I do not know if there is a way to follow a bundle.