School without Water, Electricity or Toilets

Sound appealing? Probably not! Unfortunately, this is the sad reality for many children in Sub-Saharan Africa. Even worse, this reality applies only to those children lucky enough to attend school at all. In the world today, there are 58 million out-of-school children, and 43% of those children will never start attending school.


FFunction, a Montreal-based data visualization studio, and the UNESCO Institute for Statistics (UIS) recently launched two interactive data visualizations. Both are creative and innovative ways to present information.

  • Out of School Children – Explore how gender, income, and location affect a child’s education
  • Left Behind – View how and why African girls struggle to obtain an education

DataQuest – Free Browser-based Learning for Data Science

DataQuest is a recently launched online data science learning platform for Python. The site consists of a gamified series of missions that increase in difficulty as your skills progress. Here are a few other features of the site.

  • Sample Code
  • Live, Interactive Browser-based Coding Environment
  • Step by Step Instructions
  • Instant Feedback
  • Helpful Forums for Q&A

The site is still under development and the founder, Vik Paruchuri, is looking for help developing more content and missions for the site. If that is something of interest to you, get in touch with Vik via the DataQuest website.

New Data Science Certificate Program – Cal State Fullerton

If you are interested in earning a data science certificate, California State University Fullerton has just announced a new data science certificate program.

The program is entirely online and runs one course at a time, starting in late February 2015. The fifth and final course should finish sometime in late 2015. Thus, the entire program lasts less than a year. Below are some more details on the program.

What do you need to get into the program?

  • Bachelor’s Degree in math, CS, stats, business, science, or other related field
  • A couple of years’ work experience

What type of people should attend this program?

  • Someone interested in transitioning to a data science career
  • Managers who wish to better understand data-driven decisions
  • Programmers and Statisticians looking to become data scientists
  • Analysts looking to move beyond Excel

Full Disclosure: I am a member of the Advisory Board for this program.

Data Science Bootcamps

The list of Data Science Bootcamps is now live at http://datascience.community/bootcamps

The list currently contains 11 programs. The programs range from full-time 12 week programs to part-time online training.

Data Science is one field that has adopted the newer, innovative forms of learning. MOOCs are full of data science-related courses, and the List of Data Science Bootcamps shows the variety of new techniques being used. For example, Zipfian Academy uses a 12-week immersive program in which students train and work on projects together. Insight, Persontyle, and The Data Incubator focus on filling in the gaps of recent PhDs, and other programs such as Statistics.com and Leada are focusing on online programs. Leada will be an interesting program to watch in the coming months and years. The program is different and could be a game-changer if it continues to grow.

Wanna Be a Data Engineer? – Insight Data Engineering Can Help

Last week, I got the opportunity to spend some time with the team from Insight Data Engineering. They offer a free program that trains people to be data engineers. Then they help those people connect with a job at an impressive company. The program runs a few times a year and consists of 6 intense weeks learning about and working on a data engineering project.

Although the program is free, it does have a highly selective application process. Once accepted, you can expect the following:

  • A beautiful office space in sunny Palo Alto, CA
  • Mentoring from experts in the field
  • Meet and Greets with some of the biggest names in data science
  • Introductions to some of the leading data engineering companies
  • Access to a growing network of program alumni
  • A bright future as a data engineer

Insight Data Engineering is the same company that has run Insight Data Science, a similar type of program but for scientists instead of engineers, for the past 2 years. That program has 100% placement so far, and I don’t see that number ever changing. The program has an excellent advisory board that is actively involved in the program.

The Data Engineering program is actively accepting applications for the next session scheduled to start in September. Hurry, the deadline for applications is July 7, 2014.

School of Data Science Launched by Persontyle

Recently, Persontyle launched their School of Data Science. The goal is to produce data science training and education for professionals. Here is a brief list of the type of programs being offered.

The offerings are not free, but they look very good and are taught in cities around the globe.

They are different from Coursera and Udacity because the training is more specific and individualized. Plus, it is targeted at businesses and working professionals. A number of other companies offer data science training, but Persontyle appears to be the only one offering data science training without trying to push its own products. If you or your organization is looking for training in data science, I would highly recommend The School of Data Science from Persontyle.

Another Data Science Program in NYC (also online)

Recently, both NYU and Columbia launched academic programs in data science. Well, another school in New York City is entering the mix. The City University of New York (CUNY) is now offering an online master’s degree in data analytics. If you would like more information, there will be an online information session on May 22.

In 2013, Learn Data Science via Coursera (a curriculum)

Coursera has some excellent courses coming up in 2013. Here are some potential curriculum paths for someone looking to learn data science.


Either sequence requires or recommends some basic programming experience. If you are unfamiliar with programming, you still have a couple of weeks to get familiar with some basic programming concepts. Some good places to start would be either Coursera’s Computer Science 101 or Codecademy’s Python tutorial.

Data Science Curriculum #1

If you are new to programming, this would be the recommended sequence. The first course focuses on programming.

Course                       | Start Date    | Completion Date
Computing for Data Analysis  | Jan. 2, 2013  | Jan. 25, 2013
Data Analysis                | Jan. 22, 2013 | Mar. 15, 2013
Introduction to Data Science | April 2013    | June 2013

Data Science Curriculum #2

Course                                  | Start Date   | Completion Date
Computational Methods for Data Analysis | Jan. 7, 2013 | Mar. 15, 2013
Introduction to Data Science            | April 2013   | June 2013

Additional Courses

Neither of the Coursera machine learning courses (Stanford or U of Washington) is scheduled for 2013, but either of them would be a great (maybe necessary) follow-up course. Hopefully, one of those courses will be starting in July or shortly thereafter.

After completing one of the above sequences combined with a machine learning course, a person should be skilled enough to begin doing useful data science work. (Note: A new job as a data scientist is not guaranteed, but the courses won’t hurt your chances.) Plus, Coursera offers numerous other classes that could be taken at a later time to increase depth in certain areas of data science (Natural Language Processing, Image Processing, and more).

Happy Learning in 2013!

Learn R for Free at Code School

Code School is offering a course titled Try R. The course is completely free and can be completed online with the interactive tutorial. You will learn by doing. If you have been looking to learn R or need a quick refresher, this is probably a very good option.

Big Data Education

I recently read “Big Data Education: 3 Steps Universities must take.”

Here are the 3 steps listed:

  1. Data Science cannot be an undergraduate degree
  2. A graduate degree should contain math, stats and computer science
  3. Research

Step 2 seems obvious. Math, stats, and computer science are some of the key areas for data science. I would add communication and presentation skills to the list because people with just math, stats, and CS skills are not known to be naturally good communicators. I agree with step 3. More research needs to be done, but most of the research will need to be interdisciplinary. Universities need to put more effort into interdisciplinary research.

Step 1 confused me a bit. The argument was that data science has too many necessary skills and an applied focus area. Of course, a person cannot learn everything about data science in an undergraduate degree. Earning a computer science degree does not mean you will know everything about computer science, either. It just means you know the fundamentals of algorithms, architecture, and operating systems. You know enough about computer science to understand the field and learn more as you go. I think 4 years should be enough time to do the same for data science.

What are your thoughts?