Tag Archives: math

Choosing a Data Science Graduate Program

Due to the large list of Colleges with Data Science Degrees, I receive a number of email inquiries with questions about choosing a program. I have not attended any of the programs, and I am not sure how qualified I am to provide guidance. Anyhow, I will do my best to share what information I do have.

Originally, the list started out with 5 schools. Now the list is well over 100 schools, so I have not been able to keep up with all the intricate details of every program. There are not very many undergraduate options, and the list only contains a few PhD programs, so the information here will be focused on pursuing a master's degree.

Start by asking 2 questions:

  1. What are my current data science skills?
  2. What are my future data science goals?

Those 2 questions can provide a lot of guidance. Understand that data science consists of a number of different topic areas:

  1. Mathematical Foundation (Calculus/Matrix Operations)
  2. Computing (DB, programming, machine learning, NoSQL)
  3. Communication (visualization, presentation, writing)
  4. Statistics (regression, trees, classification, diagnostics)
  5. Business (domain specific knowledge)

After seeing the above lists, this is where things get cloudy. Everyone brings a different set of existing skills, and everyone has different future goals. Here are a few scenarios that might clear things up.

Data Scientist

The most common approach is to attempt to build knowledge in all 5 topic areas. If this is your goal, find the topic areas where you are weakest and target a graduate program to help you bolster those weak skills. In the end, you will come out with a broad range of highly desirable skills.

Specialist

A different approach is to select one topic area and get really, really good. For example, maybe you want to be an expert on machine learning. If that is your goal, then maybe a traditional computer science graduate program is what is best. In the end, you will be well-suited to be an effective member of a data science team or pursue a PhD.

Data Manager

A third and also common approach is taken by people who want to help fill the expected void of 1.5 million data-savvy managers. These people do not necessarily want to know the deep details of the algorithms, but they would like an understanding of what the algorithms can do and when to use which algorithm. In this case, a graduate program from a business school (MBA) might be a good choice. Just make sure the program also includes coverage of the non-business topics of data science.

Example

I think NYU is the best example of a school that can help a person achieve just about any data science goal. The NYU program is a university-wide initiative, so the program is integrated with many departments (math, CS, Stats, Business, and others). Therefore, a student could possibly tailor a program to reach a variety of future goals. Plus, New York has a lot of companies solving interesting data science problems.

Conclusion

There you have it. It does not narrow the choices down, but it should help to provide some guidance. Other factors to consider are length of a program and/or location.

Good luck with your decision, and feel free to leave a comment if you have any good/bad experiences with any of the particular graduate programs.

LaTeX Documents Online

Although not specific to data science, if you write a lot of documents with mathematical notation, you are probably familiar with LaTeX. LaTeX is a typesetting system commonly used for mathematics.

There are now some nice resources to help you write and produce LaTeX documents collaboratively on the web.

Another site, LaTeX Templates, is just what you would guess. It contains a bunch of sample templates to make creating documents a bit quicker and easier.

3 Secrets for Aspiring Data Scientists | Software Advice

Michael Koploy wrote 3 Secrets for Aspiring Data Scientists about what it takes to enter a career as a data scientist. He lays out 3 steps:

  1. Sharpen Your Scientific Saw – Hone your math and science skills
  2. Learn the Language of Business – Data Scientists need to explain the data in business terms
  3. Keep Adding to Your Technical Toolbelt – Learn all the tools you can (NoSQL, Excel, Hadoop,…)

The article is a nice read. http://blog.softwareadvice.com/articles/bi/3-career-secrets-for-data-scientists-1101712/

Coursera Adds 17 New Universities

Just announced: Coursera adds 17 new universities. Those universities include Columbia and Brown, as well as a few international universities.

A few notable courses for data science are: a new machine learning course from the University of Washington, Linear Algebra from Brown, and Natural Language Processing by Michael Collins from Columbia.

See the following pages to see what other courses are now available.

Learn Math for Data Science

Math is one of the key building blocks of data science. While you cannot do a lot of data science with just calculus and linear algebra, both topics are essential for more advanced topics in data science such as machine learning, algorithms, and advanced statistics. Here are some freely available resources for learning both topics.

Calculus

Matrix Operations/Linear Algebra

Other Math Options

The following 2 courses from Coursera may be good for a person learning to think mathematically.

Take and Learn Statistics For Free

Last week, Udacity started a course on Introduction to Statistics, Making Decisions Based on Data. This is a beginner-level course on statistics, so it should be accessible to everyone. The course consists of seven units, which are intended to last about one week each. Udacity does not enforce any time limits though. Homework problems are also a part of the course, so you will get a chance to practice what you learn.

Udacity is a learning environment similar to Coursera. I would say the presentation is more focused on the web and the experience is a bit more enjoyable. Courses at both sites are taught by professors from top universities and other leading experts in the field. Both sites offer lots of knowledge for free, and I say try them both. Then let your own personal preference decide which you like better.

What do you think about Udacity? Have you tried it?

A Data Science Curriculum

This is not intended to be mapped to a set of college courses. It is intended to be a listing of necessary skills for a data scientist. For a definition of data scientist, see this previous post.

Mathematics

  • Calculus – not directly important to data science, but the knowledge is important to understand the statistics and machine learning
  • Matrix Operations
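
As a small illustration of why these two topics matter, here is a minimal sketch (not taken from any particular course) that fits a straight line to a handful of points using gradient descent. The derivative of the squared error comes from calculus, and the sums of products are the dot products of linear algebra. The data, the learning rate, the step count, and the function names are all made up for the example.

```python
# Fit y ≈ w*x + b by gradient descent on the mean squared error.
# Pure standard-library Python. The data, learning rate, and step count
# below are illustrative assumptions, not recommended settings.

def dot(a, b):
    """Dot product of two equal-length lists (a basic matrix operation)."""
    return sum(x * y for x, y in zip(a, b))

def gradients(xs, ys, w, b):
    """Partial derivatives of the mean squared error with respect to w and b."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2.0 / n * dot(errors, xs)  # d(MSE)/dw
    grad_b = 2.0 / n * sum(errors)      # d(MSE)/db
    return grad_w, grad_b

def fit(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step w and b against the gradient, starting from zero."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w, grad_b = gradients(xs, ys, w, b)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit(xs, ys)
print(w, b)  # should land very close to 2 and 1
```

Tools such as R and Octave hide these steps behind a single function call, but knowing what happens underneath makes the statistics and machine learning topics below easier to follow.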

Statistics

  • Regression – Linear and Logistic
  • Bayesian Statistics

Tools

  • Hadoop
  • R – stats
  • Octave – machine learning

Computing

  • Basic Programming – Java, C/C++, and Python seem to be good language choices
  • Machine Learning
  • Database Knowledge – not limited to just relational databases

Communication

  • Data Visualization – how to make data look good: maps, graphs, etc
  • Presentation – story telling, be comfortable explaining data to others
  • Writing

Do you have anything to add/remove from the list?

STEM Graduates Quit Because The Material Is Difficult

STEM stands for Science, Technology, Engineering and Mathematics. Due to the difficulty of STEM degrees, it appears many students abandon the degrees in college. While this fact is not surprising, it is still concerning. Our country and world need more good people with STEM skills.

A STEM degree is not essential to becoming a data scientist, but many data scientists have STEM backgrounds. Thus, I thought this information fit well with the Data Science Education Week theme.

How do we convince students to not abandon the STEM degrees?

One solution is to put less emphasis on grades. Grades in STEM courses are typically the lowest on campus, and this causes some students to switch degree programs in order to get better grades. A second solution is to tell young people about some of the cool STEM projects available. Lots of people in Science and Math work on really interesting projects. If you can, tell the world about your projects.

What are some other ways to keep students in STEM programs?

Below is a nice infographic with various numbers about STEM students.

Thanks to Online Engineering Degree for the infographic.