Tag Archives: college

You Don't Need a PhD to do Data Science

Many of the top data scientists you will read about or hear speak have PhD degrees. Therefore, many people think a PhD is a requirement for becoming a data scientist. That is simply not true. There is a lot of work in the data science field that does not require a PhD. In fact, very little data science work does require one.

What is a PhD and why would a person get one? A PhD is a research degree that usually takes between two and five years of study beyond a master’s degree. The majority of the program is focused on researching and expanding upon a very specific topic. A PhD student pushes the edge of human knowledge.

In daily tasks, most data scientists do not go that far and do not need a PhD. Most of the necessary skills can be obtained at the bachelor’s or master’s level. Combine that education with the amazing tools available and some experience, and becoming a data scientist is definitely achievable.

Many data scientists have PhD degrees because of their curiosity and love of learning. Those are essential traits of both data scientists and PhD students. However, you can be curious and love learning without attending enough school to obtain a PhD.

None of this is to say that earning a PhD is bad. If you really love learning, thrive in the academic environment, and have the desire, then definitely go for the PhD. However, do not let the lack of a PhD stop you from doing data science.

Choosing a Data Science Graduate Program

Due to the large list of Colleges with Data Science Degrees, I receive a number of email inquiries with questions about choosing a program. I have not attended any of the programs, and I am not sure how qualified I am to provide guidance. Still, I will do my best to share what information I do have.

Originally, the list started out with 5 schools. Now the list is well over 100 schools, so I have not been able to keep up with all the intricate details of every program. There are not very many undergraduate options, and the list only contains a few PhD programs, so the information here will be focused on pursuing a master’s degree.

Start by asking 2 questions:

  1. What are my current data science skills?
  2. What are my future data science goals?

Those 2 questions can provide a lot of guidance. Understand that data science consists of a number of different topic areas:

  1. Mathematical Foundation (Calculus/Matrix Operations)
  2. Computing (DB, programming, machine learning, NoSQL)
  3. Communication (visualization, presentation, writing)
  4. Statistics (regression, trees, classification, diagnostics)
  5. Business (domain specific knowledge)

With the above lists in hand, this is where things get cloudy. Everyone brings a different set of existing skills, and everyone has different future goals. Here are a few scenarios that might clear things up.

Data Scientist

The most common approach is to attempt to build knowledge in all 5 topic areas. If this is your goal, find the topic areas where you are weakest and target a graduate program that will help you bolster those weak skills. In the end, you will come out with a broad range of highly desirable skills.

Specialist

A different approach is to select one topic area and get really, really good at it. For example, maybe you want to be an expert on machine learning. If that is your goal, then maybe a traditional computer science graduate program is the best fit. In the end, you will be well-suited to be an effective member of a data science team or to pursue a PhD.

Data Manager

A third and also common approach is taken by people who want to help fill the expected void of 1.5 million data-savvy managers. These people do not necessarily want to know the deep details of the algorithms, but they would like an understanding of what the algorithms can do and when to use which algorithm. In this case, a graduate program from a business school (MBA) might be a good choice. Just make sure the program also covers the non-business topics of data science.

Example

I think NYU is the best example of a school that can help a person achieve just about any data science goal. The NYU program is a university-wide initiative, so the program is integrated with many departments (math, CS, stats, business, and others). Therefore, a student could possibly tailor a program to reach a variety of future goals. Plus, New York has a lot of companies solving interesting data science problems.

Conclusion

There you have it. It does not narrow the choices down, but it should help to provide some guidance. Other factors to consider are length of a program and/or location.

Good luck with your decision, and feel free to leave a comment if you have any good/bad experiences with any of the particular graduate programs.

NYU Launches New Center For Data Science

New York University has just launched some Data Science programs via the new Center for Data Science.

… to establish the country’s leading data science training and research facilities at NYU.

Part of the announcement is an M.S. in Data Science. Applications for the initial class, starting Fall 2013, are now being accepted. The Center for Data Science also plans to offer Ph.D. degrees via the Mathematics, Statistics, and Computer Science departments. I am not sure if an official Ph.D. degree in Data Science is being planned.

This is great news!

Big Data Education

I recently read "Big Data Education: 3 Steps Universities must take".

Here are the 3 steps listed:

  1. Data Science cannot be an undergraduate degree
  2. A graduate degree should contain math, stats and computer science
  3. Research

Step 2 seems obvious. Math, stats, and computer science are some of the key areas for data science. I would add communication and presentation skills to the list because people with just math, stats, and CS skills are not known to be naturally good communicators. I agree with step 3. More research needs to be done, but most of the research will need to be interdisciplinary. Universities need to put more effort into interdisciplinary research.

Step 1 confused me a bit. The argument was that data science requires too many skills and has an applied focus. Of course a person cannot learn everything about data science in an undergraduate degree. Earning a computer science degree does not mean you will know everything about computer science. It just means you know the fundamentals of algorithms, architecture, and operating systems. You know enough about computer science to understand the field and learn more as you go. I think 4 years should be enough time to do the same for data science.

What are your thoughts?

Coursera Announces College Credit

Yesterday, Coursera announced that students will soon be able to earn college credits for some of the courses. See the blog post with the college credit announcement.

BigData and College

This infographic is somewhat related to the previous infographic I posted. The first paragraph under point #1 is worth noting. Schools now have the ability to collect massive amounts of data; they just need to analyze it to acquire useful information.

How Big Data is Changing the College Experience
Presented By: OnlineDegrees.org

How To Learn Data Science?

Based upon the popularity of a previous post about a certificate program from the University of Washington, it appears that many people are interested in learning the skills necessary to become a data scientist. Thus, I decided to compile a list of some of the possible learning strategies.

Traditional College Education

The most obvious path would be to study at a traditional college or university. Colleges and universities are starting to notice the demand for data science skills, and many colleges are currently offering programs to prepare someone as a data scientist. This path is safe and predictable. Do the homework, complete the courses, and get the degree or certificate. Most people are familiar with the process, and it offers few surprises. The problems here are the costs, lack of flexibility, and time involved.

Corporate Training

Companies are now starting to offer training programs for data science. EMC is leading the way in this category with their data science training program. Cloudera also offers lots of training related to Hadoop and big data. Wolfram offers data science training with Mathematica. One of the problems with this category is the cost. Another problem is that companies tend to teach and promote their own products. This may leave the student with numerous gaps in the full data science spectrum.

Your Thoughts?

What are your thoughts about the above approaches? What are the positives and negatives? Also, later this week I will be posting some less-traditional approaches to learning data science.

Data Science Courses

This is a nice collection of data science courses offered at various colleges and universities. It is on a wiki page, so you are free to add links.

College Graduates Not Ready For Big Data

This infographic displays the need for colleges and universities to start preparing more data science graduates.

Do You Need College To Be A Data Scientist?

With some of the top tech entrepreneurs in the U.S. either dropping out of college or not attending, there is some debate about whether college is the right choice or not. This post will focus on college for data science. However, for college in general, if you know what you want to study, then college or graduate school is a great option. If you are going to college because you do not know what else to do, I would say college is too expensive for that.

College?

Most would agree that an undergraduate degree in some highly analytical field (math, CS, economics, physics) is definitely beneficial. Plus, college has a strict set of guidelines and a specific order for the learning. A formal degree program often provides the necessary motivation for a person to continue learning. The U.S. college education system is not perfect, but if it keeps a person from quitting, it will help that person reach the goal of becoming a data scientist.

All this leads to a second point. Only a few colleges offer undergraduate degree programs for data science. Thus, graduate school or more learning will still be required. College should provide the necessary prerequisites and many employers will pay for the continued learning.

No College?

A highly motivated person could probably learn most if not all of the data science skills on the internet for free or at very low cost. The key is being a highly motivated person. That person must have the drive to not quit when the learning becomes difficult. Also, there are no classmates or professors to help with difficult concepts. Sure, the internet can help there, but it requires a bit more work to find the help. Plus, knowing what topics to learn and in what order can be challenging. This blog already has a lot of helpful content, but it is not organized as a sequence of learning. Not attending college presents some obstacles that only the most highly motivated students will overcome. As more and more learning resources appear online, the no-college option may become more popular.

What is the Answer?

Strictly speaking, I would say the answer is NO. However, many people will not succeed without the rigor of school, and some companies will not hire a person without a degree. So, college is not 100% essential to being a data scientist, but for many it is probably the best option.