Today’s ML teams are highly specialized and multidisciplinary. They consist of people from all walks of life, with varied educational backgrounds and skill sets.
Although typical ML teams need a wide range of roles, we are going to focus on the two that are most asked about: the machine learning researcher, also known as the data scientist, and the machine learning engineer. These roles are very different from one another despite often being conflated.
In this article, we are going to look at the differences between these two roles to give readers more clarity about what each one does. Before we go any further, however, we want to note that we are also including other common terms under the ‘machine learning researcher’ and ‘machine learning engineer’ umbrella:
Machine learning researchers and data scientists are the people responsible for working with data and building machine learning models. They can be seen as the brains behind an ML operation, as they are the ones who clean and interpret data before building models using a combination of data and algorithms.
Nowadays, machine learning researchers tend to come from the academic field and have backgrounds in university research roles. Due to academic breakthroughs in the AI and ML space, much more of their research is being applied more quickly to real-world applications, which is leading to the quick development of more complex ML solutions that, not too long ago, were thought of as impossible.
You might be quite surprised to learn just how large a technological gap exists between research and real-world production systems, too. In theoretical research settings, things that are critical in production ML and require large numbers of data scientists to keep on top of, such as repeatability, record-keeping, and collaboration, play a comparatively tiny role.
This means that machine learning researchers tend to work alone or in much smaller teams than those of you who work in real-world ML teams are used to. It’s often the case that a single researcher will work on a fixed dataset that they use to train a model over several years. Once the researcher is satisfied with the results, they’ll write up their paper, publish it, and probably won’t ever touch the code again or deploy their model for use in the real world (that’s the job of machine learning engineers, which we will cover shortly).
As you may well know, real-world machine learning is fundamentally different from the research ML environment in many ways, including:
These differences create a whole range of challenges that require intricate software infrastructure to overcome. This is, of course, not work for a researcher; it’s DevOps (or, nowadays, MLOps) work that involves building data pipelines, automated testing, monitoring, and more, and it requires a whole different type of role: the machine learning engineer.
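To make the points about repeatability, record-keeping, and automated testing a little more concrete, here is a minimal Python sketch of what they can look like in practice. It is an illustration only and does not describe any particular team's tooling: the function names `run_experiment` and `passes_gate`, the config keys, and the toy "training" step are all hypothetical stand-ins.

```python
import hashlib
import json
import random


def run_experiment(config: dict) -> dict:
    """Run a toy 'training' job and return a record of what was run."""
    # Seed a local RNG from the config so the same config gives the same result.
    rng = random.Random(config["seed"])

    # Stand-in for real training: the mean of noisy samples around 0.8.
    samples = [0.8 + rng.uniform(-0.05, 0.05) for _ in range(config["n_samples"])]
    metric = sum(samples) / len(samples)

    # Fingerprint the config so the run can be matched to its settings later.
    config_json = json.dumps(config, sort_keys=True)
    config_hash = hashlib.sha256(config_json.encode("utf-8")).hexdigest()[:12]

    return {"config": config, "config_hash": config_hash, "metric": metric}


def passes_gate(record: dict, min_metric: float) -> bool:
    """Automated check: allow deployment only at or above a minimum metric."""
    return record["metric"] >= min_metric


# Hypothetical settings, chosen only for illustration.
config = {"seed": 42, "n_samples": 100}
first = run_experiment(config)
second = run_experiment(config)

# Repeatability: the same config reproduces the same record.
print(first == second)
# Automated testing: a simple gate a pipeline could run before deployment.
print(passes_gate(first, min_metric=0.5))
```

In a real system, the record would typically be written to an experiment tracker or database rather than kept in memory, and the gate would run as part of an automated pipeline.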
Machine learning engineers are the tinkerers, the builders, the hands-on people. They support the work of machine learning researchers and data scientists by taking their theory and applying it to the real world: building machine learning systems around their models and deploying them into production.
The typical machine learning engineer might do anything from setting up and managing data lakes to building easy-to-use computational clusters for training purposes. They will also be tasked with ensuring that there’s a high availability deployment of the models that they work with.
Owing to the practical hands-on nature of this role, ML engineers tend to come from software development and DevOps backgrounds, meaning that they are familiar with concepts such as containers, tools like Docker, running clusters in various compute clouds, and building robust ML deployment pipelines that can help to enable a model to flourish in the real world.
Now that we’ve covered the basics, let’s look at some of the key differences between ML researchers and ML engineers in a little more depth.
We’ve already covered this, so to summarise:
Machine learning researchers are often educated to a Ph.D. level. They tend to be people who have very strong academic and research-focused backgrounds, and they tend to hold advanced degrees in computer science or related subjects as a result.
Meanwhile, although machine learning engineers are often educated to master’s level, ML engineers who hold advanced degrees like PhDs are very much in the minority.
Salary is probably the key difference that many readers will understandably be interested in, particularly those who are still in school and are researching their career options.
We want to preface this by saying that ML practitioners, whether researchers or engineers, are in very high demand right now and likely will be for the foreseeable future. The remuneration on offer for high-level ML roles reflects this.
The numbers support it, too. According to Built In, the online network for tech companies and start-ups, the average base salary in the U.S. for a machine learning engineer is US$145,296. According to salary.com, the average machine learning researcher’s salary is around US$133,871.
Now, keep in mind that these are two numbers reported by two different sources. A quick Google search will throw up all kinds of numbers, so it’s hard to say for certain which are right. There are also outliers and the top 1% who will earn substantially larger sums.
The takeaway? Regardless of whether you’re a researcher or an engineer, you’re going to be handsomely compensated. With that in mind, follow your passions… not the money.
We briefly touched on this earlier.
For a machine learning researcher, the final product or deliverable tends to be a well-written piece of research that has been completed over the course of several years. It will be a highly detail-orientated exploration of a newfound discovery and include information pertaining to experiments and investigative studies carried out to achieve a specific advancement in their niche area of machine learning.
Meanwhile, a machine learning engineer’s deliverable is typically an engineered solution that uses a model created by a machine learning researcher. This could be a piece of automated software that has functionalities powered by a particular method of machine learning.
Software engineering is a discipline that requires practitioners to have an intricate understanding of the components that are associated with a product, pipeline, or process. During a project, an ML engineer may therefore be tasked with:
Meanwhile, an ML researcher’s scope of work is much narrower and more defined. For example, a typical ML researcher doesn’t need to be concerned about how their model or algorithms perform in production environments. Instead, their work focuses on finding new methods of addressing a problem and leaving it to the ML engineers to figure out how to make it work in the real world.
Although the two roles are very different, they do overlap. ML researchers can and do engineer, while ML engineers can and do research.
The reality is that ML is a multidisciplinary field and being a modern-day ML practitioner means being well-versed in both research and engineering aspects. Anyone who is looking to enter the field needs to be mindful of this and accept that, in reality, whatever role they end up in won’t involve 100 percent engineering or 100 percent research; there will always be some overlap.
In addition, regardless of which side of the coin an ML role falls on, it will require a tremendous amount of time and effort—but the rewards on offer are unlike those in any other field.
If you would like to learn more about modern machine learning practices, concepts, tools, and more, check out the Qwak blog where we regularly publish new and exciting content that covers everything from the basics such as ML data catalogs to advanced concepts like training-serving skew and deep Q-learning.