Tag Archives: data science

16 Companies Hiring Data Scientists Right Now

Data Scientist is the hot new job for 2012.  Does this job really exist?  Who hires these people? Are companies currently hiring? The answers are: yes, lots of companies, and yes. I decided to spend last night looking for companies that are currently hiring data scientists.  It did not take long to compile a pretty good list.

Data Scientist Job Openings

Company | Location | Link
Microsoft | Redmond, WA | Microsoft Sr. Data Scientist
Netflix | Los Gatos, CA | Netflix Senior Data Scientist
Kaggle | San Francisco, CA | Kaggle Data Scientist
Greenplum | San Mateo, CA | Greenplum Data Scientist
Last.fm | London | Last.fm Data Scientist
Rackspace | San Antonio, TX | Rackspace Data Scientist
Amazon | Seattle, WA | Amazon Data Scientist/System Architect
Facebook | Menlo Park, CA | Facebook Data Scientist
Twitter | San Francisco, CA | Twitter Data Scientist
LinkedIn | Mountain View, CA | LinkedIn Data Scientist
Cobalt/ADP | Cambridge, MA | Cobalt Data Scientist
eBay/PayPal | San Jose, CA | PayPal Data Scientist
Bunchball | San Jose, CA | Bunchball Data Scientist
A9 | Palo Alto, CA | Principal Engineer/Data Scientist
Acxiom | Little Rock, AR | Acxiom Data Scientist
Trulia | San Francisco, CA | Trulia Data Scientist – Data Science Lab

Do you know of any other companies hiring Data Scientists right now?

Learning Statistics for Data Science

Statistics – This is a topic that could use some more attention from the online community.
I would love to see Stanford (or Coursera) offer a free online statistics course, much like the other free courses already available online.

I did find a series of YouTube videos by Daniel Judge, a Professor in the East Los Angeles College Mathematics Department. The videos start at the very beginning of statistics. I have watched a couple of the videos, and they seem quite good. Daniel does a nice job of explaining the information. Here is the first video in the series.

Stay tuned to the blog in case other stats options appear online. Also, please leave a comment if you know of some good online statistics resources.

What Makes a Good Data Scientist?

Jeremy Howard is the Chief Scientist at Kaggle. At the end of this interview, from the Strata Conference 2012, he identified 4 simple traits that a data scientist needs.

  1. Creativity
  2. Open-mindedness
  3. Tenacity
  4. A Good Skillset

Jeremy Howard of Kaggle at Strata 2012

In this brief interview he covers a range of other data science topics:

  • Big Data is an engineering problem
  • Analytics generate value/insight from data
  • Predictive Modeling is about answering a question – build a model to do that
  • Is Data Science about tools or people? – watch the video for Jeremy’s answer
  • And others…

See this previous post for more videos from Strata 2012.

Link to Data Science Infographic

This infographic does a great job of displaying what a data scientist does and what skills are needed.  Just click and check it out for yourself.

Heroku Thinks Sharing Data is Important

Last week, Heroku announced a new feature for its PostgreSQL database service. The new feature is called Data Clip, and it allows users to share the results of an SQL query. A clip can either store the exact data from when the query was originally run, or be refreshed to return the current data. I can definitely see this being useful for debugging code and troubleshooting, which may have been Heroku’s original intent.

I can also see the Data Clip being very useful for data science and quick sharing of relevant data. I doubt the Data Clip can handle huge result sets, but huge data is not always necessary. Sometimes, being able to quickly share data results is just as important. Plus, the Data Clip allows the results to be downloaded in Excel, CSV, JSON, or YAML formats, so the data can be easily manipulated from there.

See an example in action.
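As a rough illustration of what "manipulated from there" might look like, here is a minimal Python sketch that parses a CSV export and computes each row's share of the total. The sample data, the column names (`plan`, `signups`), and the helper functions are all made up for this example; a real Data Clip export would have whatever columns the SQL query returned.

```python
import csv
import io
import json

# Hypothetical CSV export; the columns and values are invented for illustration.
SAMPLE_CSV = """plan,signups
free,120
basic,45
pro,15
"""


def load_rows(csv_text):
    """Parse exported CSV text into a list of dicts, one per result row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def total_signups(rows):
    """Sum the (assumed) integer 'signups' column across all rows."""
    return sum(int(row["signups"]) for row in rows)


rows = load_rows(SAMPLE_CSV)
total = total_signups(rows)

# Share of all signups contributed by each plan.
shares = {row["plan"]: int(row["signups"]) / total for row in rows}

# Re-emit the derived numbers as JSON, one of the formats mentioned above.
print(json.dumps(shares, indent=2))
```

In practice the CSV text would come from a downloaded file rather than a string, but the parsing step would look much the same.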

What is a data scientist?

If I am going to create a blog about becoming a data scientist, I must at least provide some type of definition. One of the best definitions I have read is by Hilary Mason, Chief Scientist at Bit.ly:

A data scientist is someone who can obtain, scrub, explore, model and interpret data, blending hacking, statistics, and machine learning.

This definition is short and simple, but there are many more definitions out there.  In fact CITO Research, a site for CIOs and CTOs, set out to define what a data scientist is.  They interviewed six leaders in the data science community, and posted all of the interviews online.  The interviews produced varied results, but focused on some main themes of what a data scientist should know.
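To make the five verbs in Hilary Mason's definition (obtain, scrub, explore, model, interpret) a little more concrete, here is a toy Python sketch that walks through each step on a tiny dataset. The data (hours studied versus exam score) is invented for illustration, and the "model" is just a hand-computed least-squares line; this is only one possible reading of the definition, not anything taken from the interviews.

```python
import statistics

# Obtain: made-up (hours studied, exam score) records; None marks a missing value.
raw = [(1, 52), (2, 55), (3, None), (4, 61), (5, 64)]

# Scrub: drop any record with a missing value.
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Explore: look at simple summary statistics.
xs = [x for x, _ in clean]
ys = [y for _, y in clean]
mean_x = statistics.mean(xs)
mean_y = statistics.mean(ys)

# Model: fit an ordinary least-squares line, y = intercept + slope * x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in clean)
s_xx = sum((x - mean_x) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Interpret: turn the fitted numbers into a plain-language statement.
print(f"Each extra hour of study is associated with about {slope:.1f} more points.")
```

Real projects involve far messier data and richer models, but the overall shape of the work tends to follow these same steps.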

After reading Hilary’s definition, the CITO Research interviews, a great post at Quora, and numerous other articles, I created a list of data science skills:

  • Machine Learning
  • Statistics
  • Story Telling (Communication)
  • Big Data
  • Algorithms
  • Curiosity

I am sure this list will change and evolve over time, but that is where I am going to focus for now.  If you have anything to add to the list, please leave a comment.  If you are interested in gaining some data science skills, please follow along and let’s learn together.

Why did I create Data Science 101?

Obviously the world does not need another blog. However, blogs are a great way to share information, and I am creating a new one anyway.

The analysis of data is becoming more important every day. Data Science is quickly becoming a hot topic of interest, and I have a desire to become a data scientist. Thus, this blog will contain information I find useful during my data science journey. I hope others find the blog useful too.

If you are interested in becoming a data scientist, please follow along and let’s start learning together.