8 Easy Steps to Becoming a Data Scientist

OK, the steps are not that easy. They are all doable, and most of the steps are free or very low-cost. They will just take some time.

Thanks to the fine folks at DataCamp, creator of online data science courses, for the infographic.

[Infographic: 8 easy steps to becoming a data scientist]

About Ryan Swanstrom

Creator of Data Science 101


53 Comments on “8 Easy Steps to Becoming a Data Scientist”

    1. I would say Python. It is a good overall programming language, which is one of the reasons it is popular at colleges. http://cacm.acm.org/blogs/blog-cacm/176450-python-is-now-the-most-popular-introductory-teaching-language-at-top-us-universities/fulltext

      I am also a fan of R because of all the libraries. So I would suggest learning a little Python first, then a little R, and then choosing for yourself. As a bonus, you will know 2 languages.
      As for learning tools, I am a huge fan of Coursera, and DataCamp is also really nice because of the interactive programming console. Codecademy is great for Python: http://www.codecademy.com/tracks/python

      I hope that helps,

      1. Thanks, Ryan. I’ve learned Python in school and have learned some R as well in my stats classes. I’m also currently taking a class on Coursera that uses Octave, so I feel like I should focus my efforts on one thing at a time! Thank you for your advice, though – will keep working at it.

  1. Hello, Ryan. I like your post; it gave me a good perspective on how to become a data scientist. I have been working as a data scientist at an e-commerce company for 5 months (I had no prior experience), but my background is in statistics. So far, my work has been extracting data from databases, mostly MySQL with a little MongoDB, and then analyzing and visualizing it in Excel, because I still struggle with R. Can you give me a clue, advice, a tutorial, or anything else about how to learn R effectively in a short time? Thank you, Ryan.

      1. Hi Ryan,

        Thanks for your response! I don’t have any prior programming experience, but I’m taking online training.

  2. Hi Ryan,
    I am going to be applying to colleges really soon and I’m extremely interested in data science. I’ve been learning some Python and statistics. I was wondering what major I should apply for, because there is no data science degree for undergraduates.

    1. Tina,
      Thanks for reaching out. Actually, there are a few undergraduate data science programs. See the list here, http://101.datascience.community/2012/04/09/colleges-with-data-science-degrees/

      However, if you don’t choose one of the data science undergraduate programs, here is my recommended approach. Pick a major, either Statistics or Computer Science, and minor in the other. Then try to pick up some business classes along the way. If your location allows, consider attending local meetups. Finally, get involved with whatever projects you can (Kaggle, internships, open source, …).

      Thanks, I hope that helps,

      1. Hi Ryan,

        Thank you for replying! This is really helpful. I know that data science is mainly a mixture of statistics and computer science. However, in a job what do most data scientists use? Is there something that every data scientist needs to know?


      2. Tina,
        Unfortunately, every data scientist is different, but I would say R and Python are very popular right now, along with Hadoop/Spark.
        Anyhow, I think this would make a good “back to school” blog post for sometime in August/September.


      3. Tina,
        Sorry, I forgot to reply to this comment. Every data scientist is different, so there is not an exact checklist of things to know. That said, I think R and Python would be useful skills for many data scientists. I think this topic would make an excellent blog post. Stay tuned: later this month, I am going to put up some “back to school for data scientists” blog posts.

  3. Hello Ryan,

    I have been interested in a data career. I do not have any background in computer science; my degree is in mechanical engineering (I currently work as a mechanical designer). About a year ago I read about this field, and I would like to change careers, but I do not have the time to go back to college. I have been looking for online material and found a lot (Udacity nanodegrees, for example), as well as this chart. I do not know whether the MOOC material is enough to get a job after a few years of studying. I would appreciate advice on how to start. Also, is it possible to land a job without a CS degree?

    Thank you!

    1. Alejandro,
      I do think it is possible to get a job without a CS degree. However, you would need to demonstrate you can do “data science” stuff. You can demonstrate this with interesting projects or volunteer work you have done for others. There are more than enough MOOCs to teach you all the data science skills you need to get a job; you just have to have the determination to continue doing the MOOCs. You might consider starting with an introductory programming MOOC or an intro stats MOOC.


      1. Hi Ryan,

        Thank you for your feedback.

        What do you think about the data nanodegree from Udacity? Is it a good place to start learning? Or maybe a specialization on Coursera?

        Let me know your thoughts about what would be better.

        Thank you



      3. Alejandro,
        I think both Udacity and Coursera are excellent. I say try them both out and see which learning environment you like best. Either would be a good place to start.


  4. Hi Ryan,
    Very nice posts; I appreciate them.

    I have been a software programmer working with Mainframe, Teradata, and DB2 technologies for almost 10 years. I need some guidance on how data science could change my career path. Apart from R and Python, do I need skills like Java or big data?


  5. Hi Ryan,

    Thank you so much for such an informative post. I have been looking for some step-by-step information to get back into this career, and your post has it all. I have an M.E. in bioengineering but never got any hands-on experience where I could have used my theoretical knowledge. I am interested in using my skills on some volunteer projects or internships, as you mentioned in your earlier reply, to gain some confidence. Could you please suggest how to find such projects?

    Thank you,

    1. Richa,
      DataKind runs hackathons throughout the world. If you are near one of them, that is a good choice. Plus, many larger cities have hackathons, and many of those need data people. Kaggle is also an excellent place to practice. DataLook also has many projects to help with. I hope that helps.


  6. Do you believe that the program down at Sophia Antipolis goes a long way toward putting someone on the right road? It sure does appear to have more course credit hours than other programs here in California, along with the potential for internships.

  7. Just wanted to chime in with a few courses of my own created specially for those who are interested in becoming data scientists.

    I created Linear Regression in Python for those interested in learning about general data science concepts like training and test error, bias and variance, objective functions, and optimization.


    I created Logistic Regression in Python as a follow-up and intro to deep learning, to introduce concepts like gradient descent, regularization, sigmoids, and cross-entropy.


    I created Deep Learning in Python (part 1) as a follow-up to Logistic Regression and a deeper look into neural networks, to introduce concepts like backpropagation, softmax, and a very light introduction to TensorFlow.


    Future courses will focus on Restricted Boltzmann Machines, dropout, convolutional neural networks, image classification, word2vec, and LSTMs.
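
As a rough illustration of the "training and test error" idea mentioned in the comment above, here is a minimal sketch in plain Python. It is not taken from the commenter's courses. The synthetic data (a line y = 2x + 1 plus noise), the 75/25 split, and the helper names `fit_line` and `mean_squared_error` are all assumptions made for this example.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared difference between predictions and observed values."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic data (an assumption for illustration): y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [2 * x + 1 + rng.gauss(0, 1) for x in xs]

# Hold out the last 25% as a test set; the points were generated in random order.
split = int(0.75 * len(xs))
train_x, test_x = xs[:split], xs[split:]
train_y, test_y = ys[:split], ys[split:]

# Fit on the training portion only, then measure error on both portions.
slope, intercept = fit_line(train_x, train_y)
train_mse = mean_squared_error(train_x, train_y, slope, intercept)
test_mse = mean_squared_error(test_x, test_y, slope, intercept)

print(f"slope={slope:.2f} intercept={intercept:.2f}")
print(f"train MSE={train_mse:.2f} test MSE={test_mse:.2f}")
```

The model only ever sees the training points, so the test error is an estimate of how well the fitted line would do on new data. A test error much larger than the training error is the usual sign of overfitting.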

  8. You mentioned reporting tools in the infographic. How can one learn these, and from which course? Are they part of a data science course?

    1. Reporting tools are sometimes called Business Intelligence tools (BI). They are used by businesses to share dashboards, reports, and other insights. They are one way of sharing your data science results. Some university programs cover reporting tools, others do not.


  9. Hi Ryan!
    I have completed a Master’s in economics and am now keen to become a data scientist. Am I eligible for that? Any suggestions on where I can get started?

    1. Hey,
      A good place to start is finding some data sets online and analyzing them. See if you enjoy that. Then see if you can get involved with any data science projects. Maybe contribute to an open source project, volunteer to help a nonprofit, find an internship, or volunteer for some extra projects at your current job. An economics background is good for data science. You should be familiar with analytical techniques and statistics. Just make sure you have some programming skills.


  10. Hi Ryan!

    I’m an engineering graduate with a specialization in Information Technology. I’m currently pursuing an MBA in Digital Marketing Management. It’s been 2 years since I lost track of my IT skills.
    I’m also very interested in becoming a data scientist.
    Will that work for me? I think learning digital marketing and data science skills will help in my future. What do you say? Should I go for it?

    1. You can do data science and marketing, but the fields require different training. Do you really want to be a data scientist or a data-driven marketer?


  11. Hey Ryan,

    I’m just curious: do you see the fields of data science and marketing ever merging into one? Obviously there is some crossover around big data and how to use it, so I’d be interested to hear your take on it.

  12. Hi Ryan,

    How are you? Really great info. Can you please let me know your take on the Udacity Nanodegree Plus program, where they guarantee you a job within six months or your money back? Do you know of other similar programs that come with a job guarantee? Thanks


    1. Udacity is the only organization (which I know of) that provides a guarantee. Udacity offers good courses, so I do not think you can go wrong with the Nanodegree.

      Thanks for asking,

  13. Dear Ryan,
    First of all, I would like to thank you and your staff for creating such a nice website for teaching R and Python and for reading, understanding, and producing information and knowledge from data with those languages. I am from Ethiopia. I have taught various statistics courses for undergraduates using R, and my field of study is biostatistics. I currently work as a programmer/analyst at the GEOHealth project Hub. I want to become a data scientist, but I can’t take online courses because there is no online payment method in our country. In what ways can you help me?
    Thank you,
    Amsayaw Tefera

    1. Amsayaw,
      Many of the courses on the MOOCs (EdX, Coursera, Udacity) are free. I do not believe you have to enter any payment information. You might not be able to obtain a “verified” certificate without payment but you can still gain the knowledge. In addition to that, many free books are available online (many are linked in past blog posts).
      I hope that helps,

  14. Thank you for sharing this; it is a really great guide for getting started with learning data science. As the article says, this won’t be an easy process, but if you are genuinely interested in data science, you can follow this guide and you will definitely see results.

  15. Hey, thank you so much for the starter guide to learn Data Science.
    As said, there is no easy process. I am a graduate of Great Learning Institute, India.
    The Data Science Engineering program at Great Learning is designed very well from all the basics to advanced Data Science skills…

  16. Hey there, thank you for this great knowledge; this starter guide is very well explained. I’ve learned Python and R as well during my program at Great Learning. The program offered by Great Learning has been beneficial for me in pursuing data science as a career.
