Data Scientist vs Data Engineer

As the field of data science continues to grow and mature, it is nice to begin seeing some distinction among the roles that fall under the data scientist title. One job title gaining popularity is the data engineer. In this post, I lay out some of the distinctions between the two roles.

Data Scientist vs Data Engineer Venn Diagram

Data Scientist

A data scientist is responsible for pulling insights from data. It is the data scientist's job to pull data, create models, create data products, and tell a story. A data scientist should typically interact with customers and/or executives. A data scientist should love scrubbing a dataset for more and more understanding.

The main goal of a data scientist is to produce data products and tell the stories of the data. A data scientist would typically have stronger statistics and presentation skills than a data engineer.

Data Engineer

Data engineering is more focused on the systems that store and retrieve data. A data engineer is responsible for building and deploying storage systems that can adequately handle an organization's data needs. Sometimes that means fast, real-time incoming data streams. Other times it means massive numbers of large video files. Still other times it means a very high volume of reads. In other words, a data engineer needs to build systems that can handle the 3 Vs of big data (volume, velocity, and variety).

The main goal of a data engineer is to make sure the data is properly stored and available to the data scientist and others who need access. A data engineer would typically have stronger software engineering and programming skills than a data scientist.

Conclusion

It is too early to tell whether these two roles will ever have a clear division of responsibilities, but it is nice to see a little separation emerging from the mythical all-in-one data scientist. Both roles are important to a properly functioning data science team.

Do you see other distinctions between the roles?



Comments

28 responses to “Data Scientist vs Data Engineer”

  1. Sidney Minassian (@SidneyMinassian)

    Sean, you’re right, there is a difference and organisations pursuing their Big Data ambitions should have this clarity so they can hire, retain and nurture the right people. We see many organisations chase the unicorn (the all-encompassing, magical person that can do everything) which clearly does not exist. Data Science and Data Engineering pull on different skill-sets and personality types. Like with many things in business, you first need a clear Big Data strategy and mandate followed by a combination of skills, platforms and tools.

    1. Ryan Swanstrom

      Thanks for commenting. Yes the all-in-one unicorn does not exist (or at least not very many exist). I agree, these 2 roles can definitely have some separation.

  6. John Harrop

    Well thought out diagram. A couple of thoughts coming out of my experience with what is now called data science in specific areas of industry.

    I think you could add a role and a dimension to the diagram. The dimension is time/size (which for companies with potential to grow significantly in staff size is the same, but some businesses do not grow as much in staff size as they mature). Later in the time/size of an organization there is generally increasing separation of roles. Early on in the time/size there is what I call a Data Entrepreneur. This is not the same as the “unicorn” since they don’t need to take the depth further than a proof of concept that generates conviction in investors.

    Another view for your Venn diagram would be Data Scientist versus Domain Specialist. That is another interesting split. I find it a little troubling how little attention that gets compared to Scientist/Engineer. Back in “the day” people in high performance computing were more likely to have come in from a physics (or other domain specialist) background than computer science.

    And yes, there definitely were unicorns. I’m not sure if they are becoming extinct (because it is harder to be one now) or just harder to find with so many data scientists (invasive species?) around! If you find one though, hand around and work with them!

    1. John Harrop

      Ug, that was “hang around”, not “hand around”!

      1. Ryan Swanstrom

        Thanks for the comments John. It is always a trade-off between details and simplicity. Feel free to modify my diagram. If you do add the time/size dimension, let me know. I just might share it on the blog.

        Thanks,
        Ryan

  7. […] data scientist is in charge of deriving and delivering insights from the data. The data scientist collects the data, develops models based on the data, and builds a story out of […]

  8. Kapila

    Yes, I can definitely see a data engineer becoming a data scientist, but not the other way around!

  9. […] Data Scientist vs Data Engineer […]

  10. Data Science Training In Hyderbad

    Nice post. Thanks for sharing about Data Scientist vs Data Engineer. Your blog looks great. Keep on sharing this type of information.

  11. Data Science Corporate Training

    Thanks for sharing information about data science

  12. […] the models need to be deployed. These tasks fall to the data scientist to solve (unless there is a data engineer willing to help). Recently, I have discovered Azure Functions to be an extremely useful tool for […]

  13. […] human team dynamics, it is very important to hire the right workforce to manage the data. Image via Data Science 101 Clickers and Coders: Ages ago, more people were coding like Ms Dos etc. whereas people now […]

  14. Mohamed

    /*Coming from Coursera*/
    Thank you for this interesting distinction.

  15. […] A data engineer is typically more interested in systems than just the machine learning. Data engineers are typically strong with computer science fundamentals. They love to build things that themselves and others can use. A good data engineer can also spend a lot of time cleaning data as well. […]

  16. jonjones12

    Hi,
    Nice informational article on the difference between a data scientist and a data engineer. Very clear information. Learned many things through this site.
