Karl Schmitt, Director of Data Sciences at Valparaiso University, has started a blog to share his experiences with building an undergraduate data science program. The blog is titled From the Director’s Desk. Karl regularly posts about textbooks, curriculum, visualizations, and learning objectives from the perspective of an educator. Tons of great resources!
Today, we are lucky to have Daniel Levine of RJMetrics provide a guest post. RJMetrics created an extensive report detailing The State of Data Science. I asked Daniel to provide some results as they relate to the current education of data scientists.
Recently, RJMetrics released a benchmark report that looked to answer many of the questions people have about today’s data scientists, such as how many data scientists there are, what degrees they have, and what skills they possess.
From LinkedIn data on the 11,400 data scientists working now, we can get a much better sense of what types of data scientists companies are hiring, and how senior data scientists differ from their junior counterparts.
While it was typical to see data scientists report multiple degrees, when we looked at the percentages of all distinct bachelor’s, master’s, and doctorate degrees, we found that 42% finished their education with a master’s.
The high number of data scientists who receive graduate degrees (79%) is indicative of the increasing demand for specialists and a desire among data scientists for advanced training.
Additionally, these numbers may indicate that data science is simply attracting highly educated individuals because of its sexy and lucrative career path.
So what does this distribution look like as you climb the corporate ladder? You may assume that the higher the position, the more PhDs; but in fact, across Junior, Senior, and Chief Data Scientists, we saw the highest ratio of PhDs to Master’s at the Senior level.
We speculate that the drop from 43% at the Senior level to 35% at the Chief level actually reflects how long those individuals have been in the field. In a study titled “Understanding Today’s Chief Data Scientist,” Heidrick & Struggles found that Chief Data Scientists “average nearly 15 years of post-degree commercial (PDC) experience.” What we’re likely seeing in this data is the “first crop” of Chief Data Scientists who earned this title in the field, not in the classroom.
When we looked at what data scientists studied during their education, we found that besides Business Administration/Management, they were mostly STEM-focused.
We believe that Computer Science is so popular because a data scientist without CS skills is at an extreme disadvantage: they won’t be able to extract the data well enough to properly analyze it. DJ Patil and Hilary Mason, in their book Creating a Data Culture, went as far as to say, “a data scientist who lacks the tools to get data from a database into an analysis package and back out again will become a second-class citizen in the technical organization.”
In analyzing 254,600 records of skills, we found the most popular skills to be more generic than we’d expect. Popular buzz terms like “big data” and “hadoop” didn’t crack the top 10, while programming languages like “r” and “python” are extremely popular among data scientists.
When the data was sliced by seniority, we saw a major difference between Junior, Senior, and Chief levels. To make these differences easier to digest, we compared each level to the same common denominator: the average data scientist.
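As a rough illustration of that kind of comparison, here is a minimal sketch in Python. It is not the report's actual method, which isn't described here: it assumes each profile is simply a list of skills, and it divides the share of profiles at one level that list a skill by the share across all profiles. The function names and the tiny sample data are hypothetical and made up purely for illustration.

```python
from collections import Counter


def skill_shares(profiles):
    """Return the fraction of profiles that list each skill."""
    # set(skills) counts a skill at most once per profile
    counts = Counter(skill for skills in profiles for skill in set(skills))
    return {skill: n / len(profiles) for skill, n in counts.items()}


def relative_to_average(level_profiles, all_profiles):
    """Compare one seniority level against the average data scientist.

    A value above 1.0 means the skill is listed more often at this level
    than across all profiles; below 1.0 means less often.
    """
    level = skill_shares(level_profiles)
    overall = skill_shares(all_profiles)
    return {skill: level.get(skill, 0.0) / share
            for skill, share in overall.items()}


# Made-up sample profiles, NOT data from the report
junior = [["python", "r"], ["python", "sql"]]
chief = [["leadership", "strategy"], ["leadership", "python"]]
everyone = junior + chief

ratios = relative_to_average(chief, everyone)
for skill, ratio in sorted(ratios.items()):
    print(f"{skill}: {ratio:.2f}")
```

In this toy data, "leadership" scores above 1.0 for the chief level and "python" scores below 1.0, which is the shape of the pattern described for the real data.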
Again, the data on chief data scientists is of particular interest. These C-suite professionals are more likely than both junior and senior data scientists to list skills like “business intelligence,” “analytics,” “leadership,” “strategy,” and “management,” but less likely to list skills on the more technical side, like “python” and “r”.
While it’s true that chief data scientists may be simply emphasizing skills that are more relevant to their position within the company, we also speculate that many chief data scientists assumed these roles by virtue of being in the field longer or having additional qualifications, such as a business degree. Therefore, it is also possible that some chief data scientists never actually learned many of the skills listed by more junior people.
If you’d like more analysis of this data and a more detailed explanation of our methods, you can check out the full State of Data Science.
The Data ScienceTech Institute (DSTI) in France is starting 2 new master’s degree programs in data science. Both programs are highly innovative and offer a strong industry focus. Classes begin in October 2015, and each program is limited to 30 students. Therefore, if you are interested, it is important to apply as soon as possible.
The other day, the faculty at DSTI were announced. I am honored to say I was selected as one of the faculty. Thus, I will serve as a visiting faculty member for portions of the program.
DSTI offers 2 master’s degree programs:
Data Scientist Designer – Located in Paris, this 2-year program is part-time and focused on working professionals looking to transition or enhance skills in the data science field. The course will rotate between 2 and 3 days a week.
Executive Big Data Analyst – Located in Nice along the French Riviera, this program is a more traditional intensive 16-month program targeting full-time students.
If you are in France or Europe or interested in studying in France, the programs from DSTI are definitely worth a look.
Sound appealing? Probably not! Unfortunately, this is the sad reality for many children in Sub-Saharan Africa. Even worse, this sad reality is only for those children lucky enough to even attend school. In the world today, there are 58 million out-of-school children, and 43% of those children will never start attending school.
DataQuest is a recently launched online data science learning platform for Python. The site consists of a gamified series of missions that increase in difficulty as your skills progress. Here are a few other features of the site.
The site is still under development and the founder, Vik Paruchuri, is looking for help developing more content and missions for the site. If that is something of interest to you, get in touch with Vik via the DataQuest website.
The program is entirely online and runs 1 course at a time starting in late February 2015. The 5th and final course should complete sometime in late 2015. Thus, the entire program lasts less than a year. Below are some more details on the program.
What do you need to get into the program?
Bachelor’s Degree in math, CS, stats, business, science, or other related field
A couple of years’ work experience
What type of people should attend this program?
Someone interested in transitioning to a data science career
Managers who wish to better understand data-driven decisions
Programmers and Statisticians looking to become data scientists
Analysts looking to move beyond Excel
Full Disclosure: I am a member of the Advisory Board for this program.
The list of Data Science Bootcamps is now live at http://datascience.community/bootcamps
The list currently contains 11 programs. The programs range from full-time 12 week programs to part-time online training.
Data Science is one field that has definitely adopted the newer, innovative forms of learning. MOOCs are full of data science related courses and the List of Data Science Bootcamps definitely shows the variety of new techniques being used. For example, Zipfian Academy uses a 12-week immersive program to train students and work on projects together. Insight, Persontyle, and The Data Incubator focus on filling in the gaps of recent PhDs, and other programs such as Statistics.com and Leada are focusing on online programs. Leada will be an interesting program to watch in the coming months and years. The program is definitely different and could be a game-changer if it continues to grow.
Last week, I got the opportunity to spend some time with the team from Insight Data Engineering. They offer a free program that trains people to be data engineers. Then they help those people connect with a job at an impressive company. The program runs a few times a year and consists of 6 intense weeks learning about and working on a data engineering project.
Although the program is free, it does have a highly selective application process. Once accepted, you can expect the following:
A beautiful office space in sunny Palo Alto, CA
Mentoring from experts in the field
Meet and Greets with some of the biggest names in data science
Introductions to some of the leading data engineering companies
Access to a growing network of program alumni
A bright future as a data engineer
Insight Data Engineering is the same company that has run Insight Data Science, a similar type of program but for scientists instead of engineers, for the past 2 years. That program has 100% placement so far, and I don’t see that number ever changing. The program has an excellent advisory board that is actively involved in the program.