Online Ads Are More Important Than Social Networking For Consumers? Huh?

Marketers should target customers who share a lot on social networks. Or not! I found it interesting that Twitter and Facebook were in the bottom five influences for the social consumer. I would have expected otherwise. Search results, brand websites and online ads are all more important. This is one example of data running contrary to my own belief.

Global Data Hackathon

What is being hyped as the First Global Data Hackathon is scheduled for April 28-29, 2012. The event is being organized by Data Science London.

BoostTurku in Finland is organising a group for the Hackathon event. See the website at Datathon.

A Data Science Curriculum

This is not intended to be mapped to a set of college courses. It is intended to be a listing of necessary skills for a data scientist. For a definition of data scientist, see this previous post.


  • Calculus – not directly important to data science, but the knowledge is important to understand the statistics and machine learning
  • Matrix Operations


  • Regression – Linear and Logistic
  • Bayesian Statistics


  • Hadoop
  • R – stats
  • Octave – machine learning


  • Basic Programming – Java, C/C++, and Python seem to be good language choices
  • Machine Learning
  • Database Knowledge – not limited to just relational databases


  • Data Visualization – how to make data look good: maps, graphs, etc
  • Presentation – story telling, be comfortable explaining data to others
  • Writing

Do you have anything to add/remove from the list?

Tell Someone About Data Science

Please spread the word about why data science is important. If you are excited, others will be too. If you are not sure what to say, here is a list of possible topics.

What can you tell people about data science?

What are some other things you could tell people about data science?

STEM Graduates Quit Because The Material Is Difficult

STEM stands for Science, Technology, Engineering and Mathematics. Due to the difficulty of STEM degrees, it appears many students abandon the degrees in college. While this fact is not surprising, it is still concerning. Our country and world need more good people with STEM skills.

A STEM degree is not essential to becoming a data scientist, but many data scientists have STEM backgrounds. Thus, I thought this information fit well with the Data Science Education Week theme.

How do we convince students to not abandon the STEM degrees?

One solution is to put less emphasis on grades. Grades in STEM courses are typically the lowest on campus, and this causes some students to switch degree programs in order to get better grades. Another is to tell young people about some of the cool STEM projects available. Lots of people in Science and Math work on really interesting projects. If you can, tell the world about your projects.

What are some other ways to keep students in STEM programs?

Below is a nice infographic with various numbers about STEM students.

Thanks to Online Engineering Degree for the infographic.

Learn To Code

Coding (a.k.a. computer programming) is not the primary function of a data scientist, but some coding skills are necessary. Modifying machine learning algorithms and scaling or altering data are both good examples of when writing a few lines of code could be very beneficial. If you have a desire to learn to code, there is no time better than the present. A handful of companies have recently launched products that will help with just that task.

  1. Udemy – not specific to coding, but there are many computer programming classes available
  2. Code School – The courses here are focused on web development. If you want to learn the Ruby programming language and eventually Rails, this may be a good place to start. Plus, you can currently get access to all courses for $25 per month.
  3. Code Lesson – Courses are not free, but the range of courses is nice.  Also, the courses are structured to fit the evening/weekend schedule. Update: CodeLesson does offer free courses, see here.
  4. Codecademy – Probably the most interesting site on the list.  If I did not know how to code, I would probably start here.
  5. Coursera – Soon they will be offering CS 101.  I have not seen a syllabus, but it may serve as a good resource for learning to code.
  6. Of course, there is always the option to go to college.  Nearly every college or university offers at least a class or two about programming.  This is probably the most expensive route, but if you thrive in a classroom setting, then this is a good option.

With all the options available (and there are others too), 2012 might be the best year ever for learning to code.
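To make the "few lines of code" idea concrete, here is a minimal sketch of scaling data, one of the tasks mentioned above. It is only an illustration: the function name `min_max_scale` and the sample `ages` values are made up for this example, and min-max scaling is just one of several common ways to rescale data.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0 and the largest becomes 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale; return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample data, invented for this example.
ages = [18, 30, 42, 66]
print(min_max_scale(ages))  # prints [0.0, 0.25, 0.5, 1.0]
```

Even a beginner course from the list above should cover everything used here: a function, a list, and a loop.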

Are you aware of other sites devoted to helping people learn how to program?

Open Source Online Statistics Book (OpenIntro)

OpenIntro is an organisation that was started to create a free and open source introductory statistics textbook.  The book is available as a free PDF download, or it can be purchased in paperback from Amazon for less than $10.  If you want to learn statistics or need a little refresher, check it out.

Data Science Courses

This is a nice collection of data science courses offered at various colleges and universities. It is on a wiki page, so you are free to add links.

College Graduates Not Ready For Big Data

This infographic displays the need for colleges and universities to start preparing more data science graduates.

Do You Need College To Be A Data Scientist?

With some of the top tech entrepreneurs in the U.S. either dropping out of college or not attending, there is some debate about whether college is the right choice or not. This post will focus on college for data science. However, for college in general, if you know what you want to study, then college or graduate school is a great option. If you are going to college because you do not know what else to do, I would say college is too expensive for that.


Most would agree that an undergraduate degree in some highly analytical field (math, CS, economics, physics) is definitely beneficial. Plus, college has a strict set of guidelines and a specific order for the learning. A formal degree program often provides the necessary motivation for a person to continue learning. The U.S. college education system is not perfect, but if it keeps a person from quitting, it will help that person reach the goal of becoming a data scientist.

All this leads to a second point. Only a few colleges offer undergraduate degree programs for data science. Thus, graduate school or more learning will still be required. College should provide the necessary prerequisites and many employers will pay for the continued learning.

No College?

A highly motivated person could probably learn most, if not all, of the data science skills on the internet for free or at very low cost. The key is being a highly motivated person. That person must have the drive to not quit when the learning becomes difficult. Also, there are no classmates or professors to help with difficult concepts. Sure, the internet can help there, but it requires a bit more work to find the help. Plus, knowing what topics to learn and in what order can be challenging. This blog already has a lot of helpful content, but it is not organized as a sequence of learning. Not attending college presents some obstacles that only the most highly motivated students will overcome. As more and more learning resources appear online, the no-college option may become more popular.

What is the Answer?

Strictly speaking, I would say the answer is NO. However, many people will not succeed without the rigor of school, and some companies will not hire a person without a degree. So, college is not 100% essential to being a data scientist, but for many it is probably the best option.