Machine Learning Models: Journey to Production
Updated: May 23, 2022
Machine learning (ML) models for real-life applications are built after long iterations of exploratory analysis, data processing, feature engineering, hyperparameter tuning, and validation testing. But the cycle doesn't end there. After reaching satisfactory model performance, it's an uphill battle to get this model not only up and running in a production environment, but also maintained and seamlessly integrated into business operations.
This blog will highlight the key components that are needed to deploy a machine learning model to production, as well as the added functionalities needed to maintain it.
Typically, the Data Science/Machine Learning team spends time in an experimentation environment, testing out different model configurations before delivering the final model to be used in production. Then it's up to several other teams to take it from there.
The backend team builds an API for the model to make it servable, i.e., able to respond to prediction requests. Setting up the correct environment for the model to work properly requires coordination between the data scientists and the backend team: they must agree on the API's required inputs and expected outputs, and identify all of the model's dependencies.
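To make that input/output agreement concrete, here is a minimal sketch of what such a contract might look like on the serving side. The feature names, types, and scoring logic below are hypothetical placeholders, not a real model or a specific framework's API:

```python
# Hypothetical input contract agreed between the data scientists and the
# backend team: feature name -> expected type. Names are illustrative only.
REQUIRED_FEATURES = {"age": float, "income": float}

def validate_request(payload: dict) -> dict:
    """Reject requests missing agreed-upon inputs, then coerce types."""
    missing = [name for name in REQUIRED_FEATURES if name not in payload]
    if missing:
        raise ValueError(f"missing required inputs: {missing}")
    return {name: cast(payload[name]) for name, cast in REQUIRED_FEATURES.items()}

def predict_handler(payload: dict) -> dict:
    """Entry point the API layer would call for each prediction request."""
    features = validate_request(payload)
    # Stand-in for the real model: a trivial linear score, purely illustrative.
    score = 0.001 * features["age"] + 0.00001 * features["income"]
    return {"score": score}
```

Pinning the contract down in code like this (regardless of which web framework ultimately wraps it) gives both teams a single place to check when the model's dependencies or feature set change.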
The models need to be deployed on a server, and depending on the type, size, and requirements of the model, this server needs to meet certain specifications. For example, some deep learning models require GPUs for processing. To serve its purpose, the model might also require access to a database for extracting or storing features, storage to access files, and other infrastructure resources. Usually, it’s up to the DevOps team to provision the necessary resources.
Sometimes visualization is required to display the model output in a user interpretable way, like visualizing a sales forecast for the upcoming month or displaying the acceptance and rejection rate of a credit scoring model. This is sometimes the responsibility of a frontend engineer.
After all those components are successfully built and integrated, we can finally have our model deployed in a production environment and embedded within the business processes, right? Well, not quite. All the above components are necessary for the deployment process, but what about maintainability? Surely we don't want models going stale and becoming irrelevant. The key component that helps mitigate this is systematic monitoring and complete observability over the deployed model.
Monitoring is a crucial component in the life cycle of a model in production. It gives visibility into the current state of the model in terms of its performance and API health. Monitoring might mean different things to different users. For example, a data scientist might expect monitoring to focus on model performance, showing relevant error metrics, data drifts, and concept drifts, all of which provide useful indications of when a model might be going stale and requires retraining. A software engineer might expect monitoring to show the deployment's request logs, CPU/RAM usage, latency, and API-related metrics. Business users would expect to see metrics that enable them to make business-related decisions.
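As one concrete example of drift detection, the Population Stability Index (PSI) is a common way to compare a feature's live distribution against the one seen at training time. Below is a minimal pure-Python sketch; the bin count and the smoothing constant are arbitrary choices, and bin edges are taken from the training data:

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between a training (expected) and live (actual) feature sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bucket_fractions(values):
        counts = [0] * bins
        for v in values:
            i = min(max(int((v - lo) / width), 0), bins - 1)  # clamp outliers
            counts[i] += 1
        total = len(values)
        # smooth empty buckets to avoid log(0)
        return [max(c / total, 1e-6) for c in counts]

    e, a = bucket_fractions(expected), bucket_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

A common rule of thumb treats a PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as significant drift worth investigating, though the thresholds that matter depend on the model and the business context.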
This calls for additional requirements in the ML system, like retraining scripts, drift detection functionality, API and infrastructure health monitoring, business-tailored dashboards, and much more.
All of this is a great deal of work on top of the already tedious task of creating a successful deployment, which is why there is no escape from automating and streamlining common components, especially when multiple ML deployments are needed. The increase in the adoption of AI and ML over the past few years has given rise to MLOps, a set of practices that focuses on streamlining the deployment and maintainability of ML systems in production.
KONAN: ML Deployments Made Easy
There is no shortage of tooling when it comes to MLOps, from major open-source projects like Kubeflow to end-to-end solutions like DataRobot, each with its own perks and complexities. As a Data Science and AI company, we found the need to integrate MLOps into our own ML lifecycle. We created Konan, an MLOps platform that makes deploying, monitoring, and maintaining ML models in production a breeze. Konan provides out-of-the-box APIs, monitoring dashboards, infrastructure auto-scaling, performance monitoring, and much more, addressing the pain points of ML models in production.
Head over to Konan's page on our website to learn more or sign up for a free trial.