I have recently been experimenting with creating YouTube Live videos. Visit the Data Science 101 Facebook page to learn more.
Enrolling in a master’s degree program in data science or business analytics is no small feat. It takes a lot of time, determination, and money. It can all be worth it, as a more fulfilling and higher-paying job might be in your future. However, just earning the degree does not guarantee a job. Here are a few tips to maximize your master’s degree experience and enhance your chances of landing that great job.
Create a Project
This one is big because it helps with all the other tips. Pick a project that is unique to you. It should be interesting and fun. There are tons of open datasets available. The project can be any topic from something big like world education to something smaller like your own coffee consumption (for some of you that might not be small). All that matters is that it involves some data and you work on it. The project will help you learn new things and determine what is enjoyable. It will even give you a good discussion topic for future job interviews.
Determine the portion of data science you enjoy
Is it visualization, programming, modeling, or something else (see Getting Started with Data Science Specialties for a list of specialties)? Then tailor as much of your program around that as you can. You will excel more at things you enjoy, and data science needs teams, not individuals who think they can do everything.
Attend local meetups or conferences
Depending upon where you attend school, this might be easy or difficult. If your local area does not have a data science group, start one.
If you are ever offered the chance to speak to a group, take it. Whether it is a class, local club, church group, or a backyard barbecue, take advantage of the opportunity. Many people are not good at public speaking, and practice will only make you better. University settings are also great places to practice. They are safe environments, and the worst that is going to happen is a less-than-perfect grade. Don’t wait until the stakes are high to begin your practice.
Make yourself visible to the data science world.
Share the slides from your presentations. Better yet, share the video if available. Make sure that when a prospective employer searches for you online (and they will), they can easily see a trail of artifacts that demonstrate your interest in data science. You should probably have a presence on some of the following (you do not need them all): LinkedIn, Twitter, Instagram, Quora, Stack Overflow, GitHub, YouTube, SlideShare, Speaker Deck.
Find some local data science people in your area and connect. Offer to join them for coffee or lunch. Attend their presentations and get to know them. This can be others learning data science as well as more seasoned experts.
What other tips do you have for those currently enrolled in a data science master’s degree program?
Renowned data scientist Kirk Borne will take viewers on a journey through his career in science and technology, explaining how the industry and he himself have evolved over the last four decades. From skipping lunches in high school to a systematic Twitter obsession, Kirk will shed light on his road to success in the data science industry.
Kirk is universally considered one of the most (if not the most) influential voices in data science. If you are interested in a career in data science, this is a webinar you will not want to miss.
The webinar is at 5:30 Eastern Time on August 29, 2017, and registrations are currently being accepted. It is free.
Businesses everywhere are racing to extract meaningful insight from their data. Many organizations are spinning up data science teams and attacking problems (some more successfully than others). However, one of the challenges is determining the current stage of data science within the organization. The next challenge is determining the desired stage of data science.
Below are 3 stages of a truly mature data science organization.
1. Dashboards
The beginning stage of data science is dashboards. It is all about answering “How much?” and “What happened?” by looking at reports of historical data. If done well, it might even help an organization answer “Why?”. Many organizations will refer to this phase as Business Intelligence.
The dashboard stage can be very expensive for an organization, in terms of people-hours and money. It usually involves investments in:
- Data Warehouse or some other storage environment, for storing the data in a single location for easy reporting
- ETL (Extract Transform Load) Tools for manipulating, combining, and moving data to the data warehouse
- Reporting Tools for displaying the results and allowing users to “explore” the data
Here are some common questions that can be answered via traditional dashboards:
- How many customers live in each region?
- What were the sales on Black Friday?
- How many patients visited the hospital last month?
As you can see, a great deal of value can be gained from this phase alone. It is critical for a business to clearly understand past performance. Unfortunately, this phase is where many businesses stop.
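To make the dashboard stage concrete, here is a minimal Python sketch of the kind of aggregation that sits behind such reports. The records, field names, and helper functions are all hypothetical and invented for illustration; a real dashboard would typically query a data warehouse rather than in-memory lists.

```python
from collections import Counter
from datetime import date

# Hypothetical records, invented purely for illustration.
customers = [
    {"id": 1, "region": "Midwest"},
    {"id": 2, "region": "South"},
    {"id": 3, "region": "Midwest"},
]
sales = [
    {"day": date(2016, 11, 25), "amount": 120.0},
    {"day": date(2016, 11, 25), "amount": 80.5},
    {"day": date(2016, 11, 26), "amount": 45.0},
]

# Black Friday 2016, used here only as an example date.
BLACK_FRIDAY = date(2016, 11, 25)


def customers_per_region(rows):
    """Answer "How many customers live in each region?"."""
    return Counter(row["region"] for row in rows)


def sales_on(rows, day):
    """Answer "What were the sales on a given day?"."""
    return sum(row["amount"] for row in rows if row["day"] == day)


print(customers_per_region(customers))
print(sales_on(sales, BLACK_FRIDAY))
```

Both functions only summarize what has already happened, which is exactly the scope of this first stage.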
2. Machine Learning
The real “science” of data science does not begin until the second stage, which is machine learning. It focuses on estimating quantities that cannot be directly observed. This could be what movies a customer will like, the price of a company’s stock tomorrow, or the causal effect of a particular advertising campaign. Machine learning uses the data from the first phase and applies statistical or other methods to gain additional insights.
Think of machine learning as answering the following:
- When a customer moves, will he/she spend money at a hardware store?
- When a credit card purchase is made, what is the probability the charge was fraudulent?
- What is the expected lifetime value of a new customer?
- If a hurricane is coming, what will people buy? (Pop-Tarts? It is true.)
Notice the connection between an event and some outcome. The value of machine learning comes from estimating the causal outcome of potential events. This phase is filled with terms such as machine learning, data mining, and statistical modeling. The machine learning stage is all about looking into the future!
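As a rough illustration of estimating something that cannot be observed directly, the Python sketch below estimates the probability that a charge is fraudulent from past labeled charges. It is a deliberately simple frequency estimate with add-one smoothing, not a production fraud model, and the data, the "channel" feature, and the function name are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical labeled history: (purchase_channel, was_fraud). Invented data.
history = [
    ("online", True), ("online", False), ("online", False), ("online", True),
    ("in_store", False), ("in_store", False), ("in_store", False), ("in_store", True),
]


def fit_fraud_rates(rows, smoothing=1.0):
    """Estimate P(fraud | channel) from labeled history.

    Smoothing adds pseudo-counts so that a channel with very few
    observations does not get an estimate of exactly 0 or 1.
    """
    counts = defaultdict(lambda: [0, 0])  # channel -> [fraud count, total count]
    for channel, was_fraud in rows:
        counts[channel][0] += int(was_fraud)
        counts[channel][1] += 1
    return {
        channel: (fraud + smoothing) / (total + 2 * smoothing)
        for channel, (fraud, total) in counts.items()
    }


rates = fit_fraud_rates(history)
for channel, probability in sorted(rates.items()):
    print(f"{channel}: estimated fraud probability {probability:.2f}")
```

A real model would use many more features and a proper learning algorithm, but the idea is the same: use historical data to estimate the likelihood of an outcome for a new event.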
3. Actions
Determining the actions to perform is the third and final phase. It tries to capitalize on the results of machine learning in order to take appropriate actions. The following actions might be suitable for the events identified in the machine learning section above.
- When a customer moves, send a “welcome to the neighborhood” packet with coupons to nearby hardware stores.
- Decline the fraudulent charge or deactivate the credit card.
- If the new customer has very high expected lifetime value, provide some special treatment or offers to ensure the customer becomes a customer for life.
- When a hurricane is approaching, place Pop-Tarts near the front of the store.
As you can see, good machine learning from the second phase can lead to clear actions.
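One simple way to picture this stage is a rule that maps a model's estimate to an action. The Python sketch below does this for the fraud example; the thresholds, the intermediate "flag for review" action, and the function name are hypothetical choices for illustration, and a real system would set them based on business costs.

```python
def choose_action(fraud_probability, decline_at=0.9, review_at=0.5):
    """Map an estimated fraud probability to an action.

    The thresholds are illustrative assumptions, not recommended values.
    """
    if not 0.0 <= fraud_probability <= 1.0:
        raise ValueError("fraud_probability must be between 0 and 1")
    if fraud_probability >= decline_at:
        return "decline charge"
    if fraud_probability >= review_at:
        return "flag for review"
    return "approve"


for p in (0.05, 0.6, 0.95):
    print(p, "->", choose_action(p))
```

The point is that the prediction itself has no value until it changes what the organization does.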
Claiming success in Data Science is all about conquering all three stages. Each stage builds upon the previous stage. If you have put in the effort to complete the first stage, why not continue to the second and third stages?
People love stories. People can connect with stories. People remember great stories. Make your data tell a story. If you can make stories come alive with data, people will pay attention.
There is no magic formula for a great story, data or otherwise. Here are some guidelines for telling a great data science story.
- Clearly state the problem
- Explain the data
- Share the struggles of doing the analysis
- Do not focus on the algorithms
- Show how the analysis progressed, take your listeners on a journey
- Finish with something remarkable
The late Hans Rosling could tell a story with data as well as anyone. Do a quick internet search for his name, and you can easily find his TED talks or other videos. He provides an excellent model for telling a story with data. It is worth your time to watch some of his videos.
The entire goal of telling a story with data is to get people engaged in the problem.
Leave a comment if you have other tips for telling an effective data science story.
If you work at a university and are considering starting an undergraduate program in data science, then today’s post is for you.
- A Guide to Teaching Data Science – Focuses on increasing 3 skills (create, connect, compute) within Statistics departments to develop data science
- Teaching the Foundations of Data Science: An Interdisciplinary Approach – A study and analysis of teaching an introductory course on data science as a collaboration between MIS and CS
- A Data Science Course for Undergraduates: Thinking with Data – An overview of an undergraduate course in a liberal arts environment that provides students with the tools necessary to apply data science. It is very detailed and covers many great topics, plus R and SQL
- Embracing Data Science – Statistics needs to learn from data science to make courses more relevant
If you know of any other papers, please leave a comment below.
Karl Schmitt, Director of Data Sciences at Valparaiso University, has started a blog to share his experiences with building an undergraduate data science program. The blog is titled From the Director’s Desk. Karl regularly posts about textbooks, curriculum, visualizations, and learning objectives from the perspective of an educator. Tons of great resources!
Recently, I had the honor of speaking with Dr. Karl Schmitt from Valparaiso University. He is the director of the Data Science undergraduate program at Valparaiso University. We had a very nice discussion, and I thought I would pass along my summary.
What are the Details of the Valparaiso Undergraduate Program?
The program is housed in the Mathematics department and it is designed to be fairly interdisciplinary. It consists of four parts.
- Computer Science
- A Separate Focus Area
The separate focus area can be from nearly any other department and is targeted at building some domain expertise. Although not required, a double major is encouraged.
One of the most distinctive and exciting aspects of the program begins during the first year. Students take Introduction to Data Science, which has few prerequisites and serves as motivation for the remainder of the program. Valparaiso partners with non-profits and government agencies to provide the first-year students with hands-on experience solving problems for social good. Examples include Meals on Wheels, mapping with the United States Geological Survey, and a child welfare non-profit. Then, the junior and senior students are involved with a capstone project that can be a continuation of the first-year project or some other social good project, or students can serve in a consulting capacity to other departments on campus.
What Skills Do You Expect Valparaiso Data Science Graduates to Have?
There are a few basic skills that make sense for data science: coding, database skills, statistics, and general math. In addition, Valparaiso grads should also know how to talk, write, and create videos about mathematical concepts. Finally, ethics is an essential portion of the program. According to Dr. Karl Schmitt,
I want my students to graduate with ethics related to data science.
To reinforce that statement, ethics case studies are required of all students, ethics is a key learning objective of the projects, and it is integrated into all the classes so students understand its importance. Students need to be able to do the hard data science, communicate the results, and care about the consequences.
Why Choose Data Science as an Undergraduate?
It is a utility degree that is in strong demand in nearly every field. As companies continue to understand the usage of data, having data skills is going to become increasingly crucial. Data scientists are going to be (and currently are) in demand for human resources, supply, sales, technology, and many other awesome jobs.
Why Valparaiso for Data Science?
There are a number of reasons:
- Good University Size – It is easy to double major and engage with things outside the major, plus disciplines are very connected which allows for collaboration.
- Writing/Communication is Integrated Throughout – Many people can crunch numbers, but Valparaiso graduates can express discoveries. The students get that from the very beginning.
- Projects – All students will have experience and examples of projects to demonstrate.
- Finally, students have an opportunity to turn their homework into something that matters!
Thank you to Dr. Karl Schmitt for the interview and to Valparaiso University for sponsoring Data Science 101.
The Data Incubator, a data science fellowship program, is currently running a Data Science in 30 minutes webinar series. Next week features a free webinar with Dr. Becky Tucker of Netflix. Dr. Tucker is a Senior Data Scientist at Netflix, where she specializes in predictive modeling for content demand (think: what do people want to watch?). The full abstract of the webinar is below. The webinar is free; all you need to do is register.
Predicting Content Demand with Machine Learning
Abstract: Netflix is well-known for its data-driven recommendations that seek to customize the user experience for every subscriber. But data science at Netflix extends far beyond that – from optimizing streaming and content caching to informing decisions about the TV shows and films available on the service. The talk will cover work done by Becky and the Content Data Science team at Netflix, which seeks to evaluate where Netflix should spend their next content dollar using machine learning and predictive models.
Update – Below is the Recorded Webinar
While there are a growing number of universities that offer undergraduate data science degrees, for one reason or another those programs may not be perfect for everyone interested in data science. So, what do you do if you attend a school that does not offer a data science degree? This is a question frequently asked of me, so I thought I would elaborate on my typical response.
You Cannot Know It All
First off, you will never know all there is to know about data science. The field is vast and contains many sub-fields. Thus, as an undergraduate, a good plan is to learn the fundamentals. Then expand your knowledge/expertise as your education and career continue. Data Science is evolving rapidly and it requires continual learning. Hopefully, this is one of the reasons you are interested in the field.
My Recommended Approach
A good plan is to major in computer science or statistics and minor in the other. If your school doesn’t have either of those majors, then take as many of those classes as you can. Next, choose a domain-specific area such as business, chemistry, or psychology, and gear your elective classes toward that domain area. This approach will give you a solid base understanding of the statistical and computational underpinnings of data science. You should also be well-prepared to find a job or continue your studies in graduate school.
Also, somewhat related, taking an art class or two might not be a bad idea. Visualization is very important to data science. Understanding color palettes and usage of space on a canvas are concepts that will serve you well. Plus, many people strong in computer science and statistical algorithms are lacking in artistic skills.
Some Enhancements to Your Education
If your location allows, consider attending local meetups. Finally, get involved with whatever projects you can (Kaggle, internships, open source, …).
Do you have any advice for undergraduates looking to study data science? If so, please leave a comment.
Are you an undergraduate with questions? Please ask in the comments below.