Company · March 8, 2022

Monitoring for Machine Learning: A New Kind of Software Service

“Data-driven” has become something of a buzzword in many industries. However, not every company that claims to be data-driven actually makes decisions based on its data, and many of those that do risk basing those decisions on unreliable data. This is the root of the problem that Elena Samuylova is trying to solve with her company, Evidently AI.

Evidently AI builds tools to analyze and monitor machine learning (ML) models. These tools are all open source and designed to be used by experts as well as casual users. For Samuylova, the casual users pose “a very interesting design problem” and one that has enormous potential in the market. In order to solve it, Evidently AI has joined the startup accelerator Y Combinator.

Samuylova is a strong proponent of open source. On the Open Source Data podcast, she noted that one of the things she likes the most about open source is the way the community helps solve big problems that affect all of us.

How ML monitoring differs from other types of software monitoring services

With a typical software-as-a-service (SaaS) solution, monitoring is a key part of ensuring that you meet your service-level objectives, such as availability. Monitoring ML processes, however, adds an extra layer of complexity.

It is still a software service. But there’s this extra layer on top related to the ML system. Every time you talk about monitoring, you talk about things that can go wrong. And with ML systems, there are particular types of failures that might happen beyond deliverability. – Elena Samuylova

Samuylova noted that ML errors can be a lot less obvious because they don’t necessarily show up as missing input or data. “You can still return a request, but this can be something that you should not trust because maybe your input data was wrong,” she explained, adding that “recently with the pandemic you would have people that were shopping with a completely different pattern. So your demand forecasting models probably wouldn’t pick that up.”

Types of errors in machine learning

Drift (also known as model decay) is one of the problems that Evidently AI works to solve. Drift refers to the degradation in a model’s predictive power. The two types of drift the Evidently AI team encounters most are:   

  • Data drift – This type of drift occurs when the input data the model sees in production has changed compared with the data it was trained on. If the distribution of the variables is meaningfully different, the trained model will no longer be relevant for the new data.
  • Concept drift – When the patterns that the model was trained on – the relationships between the model inputs and outputs – have changed, the meaning of the model changes. As a result, the model’s predictions may not provide the answers the model was designed to give.
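To make data drift a little more concrete, here is a minimal sketch of one common way to put a number on it: the Population Stability Index (PSI), which compares how a single feature is distributed in a reference (training) sample and in a current (production) sample. This is an illustration only, not a description of how Evidently AI computes drift. The function name, the number of bins, and the 0.25 threshold are assumptions; 0.25 is a widely used rule of thumb rather than a standard.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI of one numeric feature: 0 means identical binned distributions."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if constant.
    width = (hi - lo) / bins or 1.0

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor (eps) keeps the logarithm defined for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Hypothetical feature: values seen during training vs. a shifted production sample.
training_values = [float(i) for i in range(100)]
production_values = [v + 50 for v in training_values]

psi = population_stability_index(training_values, production_values)
DRIFT_THRESHOLD = 0.25  # assumed rule-of-thumb cut-off for "significant" drift
print("drift detected" if psi > DRIFT_THRESHOLD else "no significant drift")
```

A check like this only looks at inputs, so it can flag data drift but not concept drift. Spotting a changed relationship between inputs and outputs generally requires comparing predictions with the actual outcomes once those become available.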

Some models might be robust to drift, particularly if they are trained with some sort of domain expertise. Most machine learning models, however, are trained purely on statistical relationships, so in practice you’d need to add some kind of guidelines. Samuylova calls these guardrails and puts them in place to help the AI stick to what it knows.

And this was where our discussion became more philosophical.

As Samuylova correctly points out, AI doesn’t know what it doesn’t know. This is also the reason you don’t get an error message from your machine learning model that tells you drift has occurred. Instead, you have to monitor the inputs and outputs to test both the model and the ML process.

So how do you monitor and test the inputs and outputs of your model? At Evidently AI they work with two parts of this process:

  1. Clearly define what you want to build and what can go wrong, so you can prepare for some of the expected errors
  2. Make sure that you can be alerted when things go wrong
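The two steps above can be sketched in a few lines of code: first write down the expected failure modes as named checks with thresholds, then run them on each batch of logged predictions and raise an alert when one is exceeded. This is a hypothetical illustration, not Evidently AI's API. The `Check` class, the metric functions, the column names, and the thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

Row = dict  # one logged prediction: inputs, "prediction" and, once known, "actual"


@dataclass
class Check:
    name: str                             # what can go wrong
    metric: Callable[[List[Row]], float]  # how to measure it on a batch
    threshold: float                      # alert when the metric exceeds this


def missing_share(column: str) -> Callable[[List[Row]], float]:
    """Build a metric: share of rows where `column` is absent or None."""
    def metric(batch: List[Row]) -> float:
        # Assumes a non-empty batch.
        return sum(row.get(column) is None for row in batch) / len(batch)
    return metric


def mean_absolute_error(batch: List[Row]) -> float:
    """Average |prediction - actual| over rows whose true value is known."""
    labeled = [r for r in batch if r.get("actual") is not None]
    if not labeled:
        return 0.0  # nothing to evaluate yet
    return sum(abs(r["prediction"] - r["actual"]) for r in labeled) / len(labeled)


def run_checks(batch: List[Row], checks: List[Check], alert=print) -> List[str]:
    """Evaluate every check and call `alert` for each one that is exceeded."""
    fired = []
    for check in checks:
        value = check.metric(batch)
        if value > check.threshold:
            fired.append(check.name)
            alert(f"ALERT {check.name}: {value:.3f} exceeds {check.threshold}")
    return fired


# Step 1: define what can go wrong (names and thresholds are made up).
checks = [
    Check("missing_price", missing_share("price"), threshold=0.2),
    Check("forecast_error", mean_absolute_error, threshold=5.0),
]

# Step 2: run the checks on a batch of logged predictions and get alerted.
batch = [
    {"price": 10.0, "prediction": 100.0, "actual": 98.0},
    {"price": None, "prediction": 90.0, "actual": 93.0},
    {"price": None, "prediction": 80.0, "actual": None},
    {"price": 12.0, "prediction": 70.0, "actual": 71.0},
]
fired = run_checks(batch, checks)
```

In a real setup the `alert` callable would more likely post to a chat channel or paging system than print to the console. Passing it in as a parameter keeps the checks themselves independent of how the alert is delivered.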

Right now there’s no standard practice to monitor models in production. Samuylova hopes that this will change, so that in the near future, it’ll be less of a struggle to put ML models into production and make sure their output is reliable.

If you look at the DevOps domain, there are companies like New Relic and Datadog that helped shape practices for software systems. I hope, five years from now, we’ll have it a bit more streamlined for data products. – Elena Samuylova

The future of machine learning with model visibility

Samuylova expects that in just a few years, dashboards for ML processes will be the standard. She thinks ML will go through the same maturation that software development has over the last decade.

So, just like it’s standard practice to attach Google Analytics to a website and an application like Mixpanel to a data product, some kind of system will be in place for ML.

This might seem like Samuylova is stating the obvious since she’s developing tools for precisely this purpose. However, she doesn’t expect to be the only provider in this space. She also believes that being built on open source technologies, with the many advantages an open source approach provides, will earn Evidently AI a solid position in the market.

When you’re sharing what you’re doing, people are so kind in giving feedback. It’s incredibly helpful when you want to prioritize needed features. – Elena Samuylova

If you want to keep an eye on Evidently AI and their continuing development of ML monitoring solutions, check out their blog, where you can also find a retrospective of an exciting year for machine learning monitoring in 2021.

Enjoy the conversation? Subscribe to the Open||Source||Data podcast so you never miss an episode. 

About Elena Samuylova

Elena Samuylova spent some years in market research, business development and supporting startups before founding her first AI startup targeting the manufacturing industry in 2018. She’s helped turn tons of sensor data into actionable insights. In February of 2020 she co-founded and became the CEO of Evidently AI with the goal of building the first open source tool to analyze, monitor and debug machine learning models in production.

Resources

  1. Open Source Data Podcast
  2. Podcast and Transcript
  3. Evidently AI
