Tag Archives: data science

Data Science News for April 29, 2019

Here is the latest data science news for the week of April 29, 2019.

From Data Science 101

General Data Science

What do you think? Did I miss any big news in the data science world?

GoLang for Data Science

While it is not one of the popular programming languages for data science, the Go programming language (aka Golang) has surfaced for me a few times in the past few years as an option for data science. I decided to do some searching and draw some conclusions about whether Golang is a good choice for data science.

Popularity of Go and Data Science

As the following figure from Google Trends demonstrates, Golang and data science became trendy topics at about the same time and grew at a similar rate.

The overlapping trends may have created the desire to combine the two technologies.

Golang Projects for Data Science

Some internet searching will reveal a number of interesting Golang/Data Science projects on GitHub. Unfortunately, many of the projects had good initial traction but have dwindled in activity over the last couple of years. Below is a listing of some of the data science related projects for Golang.

  • Gopher Data – Gophers doing data analysis, no scheduled events, last blog post was in 2017
  • Gopher Notes – Golang in Jupyter Notebooks
  • Lgo – Interactive programming with Jupyter for Golang
  • Gota – Data frames for Go, “The API is still in flux so use at your own risk.”
  • qframe – Immutable data frames for Go, better speed than Gota but not as well documented
  • GoLearn – Machine Learning for Go
  • Gorgonia – Library for machine learning in Go
  • Go Sklearn – Port of scikit-learn from Python, still active but only a couple of committers, early but promising
  • Gonum – Numerical library for Go, very promising and active

Golang Data Science Books

There have even been a couple of books written about the topic.

Thoughts from the Community

The “Go for Data Science” question has been debated numerous times over the past few years. Below is a listing of some of those discussions and the key takeaways.

Reasons to use Golang for Data Science

  • Performance
  • Concurrency
  • Strong Developer Ecosystem
  • Basic Data Science packages are available
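
To make the concurrency point a little more concrete, here is a minimal sketch using only the Go standard library. It computes the mean of a slice by summing chunks in separate goroutines. The function name parallelMean and the sample data are made up for illustration, and this is not taken from any of the projects listed above.

```go
package main

import (
	"fmt"
	"sync"
)

// parallelMean is a hypothetical helper: it splits values into at most
// `workers` chunks, sums each chunk in its own goroutine, and combines the
// partial sums into a mean. It returns 0 for an empty slice.
func parallelMean(values []float64, workers int) float64 {
	if len(values) == 0 {
		return 0
	}
	if workers < 1 {
		workers = 1
	}

	// Ceiling division so every element lands in some chunk.
	chunk := (len(values) + workers - 1) / workers

	// Each goroutine writes only to its own slot, so no mutex is needed.
	partial := make([]float64, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		start := w * chunk
		if start >= len(values) {
			break // more workers than chunks
		}
		end := start + chunk
		if end > len(values) {
			end = len(values)
		}
		wg.Add(1)
		go func(slot int, part []float64) {
			defer wg.Done()
			for _, v := range part {
				partial[slot] += v
			}
		}(w, values[start:end])
	}
	wg.Wait()

	total := 0.0
	for _, p := range partial {
		total += p
	}
	return total / float64(len(values))
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6, 7, 8}
	fmt.Println(parallelMean(data, 3)) // prints 4.5
}
```

For a slice this small the goroutines add overhead rather than speed. The pattern only starts to pay off on large inputs or on more expensive per-element work.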

Reasons Not to use Golang for Data Science

  • Limited support from the data science community for Golang
  • Significantly increased time for exploratory analysis
  • Less flexibility to try other optimization and ML techniques
  • The data science community has not really adopted Golang

Summary

In short, Golang is not widely used for exploratory data science, but rewriting your algorithms in Golang might be a good idea.

Finding Azure Updates

Microsoft Azure has an abundance of data science capabilities (and non-data science capabilities). It can be challenging to keep up with the latest updates/releases. Luckily, Azure has a page to let you know exactly what has changed. You just need to know where to find it, and the following video will help you find that page.

Also, if you are still interested in earning a Microsoft Data Science Certification, join the Study Group.

Top Companies to work for if you are a data scientist

LinkedIn’s 2017 report ranked data scientist as the second fastest-growing profession, and it is number one on 2019’s list of most promising jobs. According to research, there are three main reasons why data science has been rated as a top job. First, the number of available job openings is increasing rapidly and is the highest in comparison to other jobs. Second, data science has an extremely high job satisfaction rating. Third, the median annual base salary is attractive.

Given these ratings and the current demand, data science is a strong career path, and statistics suggest that the rapid growth in demand for data scientists around the globe will not slow down.

Check out the top 5 companies to work for if you are a data scientist, based on employee reviews, job satisfaction ratings, and CEO approval.

#1 Dataiku

Dataiku is a top-rated computer software company that was founded in 2013 and is headquartered in New York. The company develops collaborative data science software. According to Glassdoor reviews, 99% of Dataiku employees would recommend working at the company and 100% approve of the CEO. This suggests that the vast majority of employees are satisfied with the company, which is also a top choice for data science and machine learning positions based on annual pay packages.

Check out: Dataiku Careers

#2 StreamSets

StreamSets was founded in 2014 and is headquartered in San Francisco, California. The company develops a DataOps platform that allows businesses to manage streaming data flows. An impressive 98% of employees would recommend the company to their friends, and 100% approve of the CEO. StreamSets is a top option for data management and integration.

Check out: StreamSets Careers

#3 1010 Data

1010 Data is headquartered in New York and has over 15 years of experience in data analytics, with over 850 clients across various industries. Ranked third on this list, 1010 Data is also a great option for data scientists: 96% of employees recommend the company and 99% approve of the CEO.

Check out: 1010 Data Careers

#4 Reltio

Reltio was founded in 2011 and is based in Redwood Shores, California. This top-rated company is recommended by 96% of its employees and is a top choice for data management and integration. Even though it is fourth on the list, it is still a fantastic company in which to expand your experience as a data scientist.

Check out: Reltio Careers

#5 Looker

Looker was founded in 2012 and is headquartered in Santa Cruz, California. The company is recommended by 95% of its employees, and 93% of them approve of the CEO. Looker is a great choice for business analytics.

Check out: Looker Careers

How can you get a job as a data scientist?

A degree in data science, computer science, mathematics, statistics, social science, or engineering, combined with knowledge of Python, R, and Hadoop, increases your chances of landing an entry-level position. Plenty of universities offer specialized data science programs, both online and on campus. Online master’s degrees in data science have become more common in recent years because of the convenience they offer to professionals, especially those looking to switch careers.

Build a portfolio using real data to complete projects that can showcase your abilities as a data scientist. You could also opt for an internship to further develop your skills and knowledge as a data scientist.

3 Tips for Data Science Interviews

1. Be Honest

Try not to exaggerate your skills. If the job sounds more engineering focused than you want, be honest and say that. Data science is getting very broad, and you don’t want to end up in a position that is a bad fit.

You often sound worse when you try to explain something you do not understand. Just be honest and say, “I have not needed to use that yet. Can you tell me about a time when you have used it?”

Find the job you are looking for, not just any position someone is trying to fill.

2. Tell Stories of Your Work

Talk about things you have built. If you built something as part of a team or project, explain why it was built and what your involvement was.

If you have a side project, talk about that. This is why side projects are so important. They help you learn a lot, and they give you something exciting to talk about that only you can talk about.

3. Practice

Interviews are intimidating. There is really no way to avoid that. The best you can do is be prepared for the interview. Pramp provides a platform for practicing data science interviews. Practice makes perfect.