Our experience of working on AI and ML projects means that we understand the importance of establishing best practices when using MLOps to test, deploy, manage and monitor ML models in production.
Considering that 87% of data science projects never make it into production, it’s vitally important that AI projects have access to the right data and skills to solve the right problems, using the right processes.
Below, we outline six fundamental principles of MLOps that should be at the heart of your AI strategy.
1 – Build solid data foundations
Your data scientists will need access to a store of good quality, ground-truth (labelled) historical data. ML models are fundamentally dependent on the data that’s used to train them, and data scientists will rely on this data for monitoring and training.
It’s common to create data warehouses, data lakes or lake houses with associated data pipelines to capture this data and make it available to automated processes and data teams. Our data pipeline playbook covers our approach to providing this data. Make sure to focus on data quality, security, and availability.
2 – Provide an environment that allows data scientists to create
Developing ML models is a creative, experimental process. Data scientists need a set of tools to explore data, create models and evaluate their performance. Ideally, this environment should:
- Provide access to required historical data
- Provide tools to view and process the data
- Allow data scientists to add additional data in various formats
- Support collaboration with other scientists via shared storage or feature stores
- Be able to surface models for early feedback before full productionisation
3 – ML services are products
ML services should be treated as products, meaning you should apply the same behaviours and standards used when developing any other software product.
For example, when building ML services you should identify and profile the users of a service. Engaging with users early in the development process means you can identify requirements that can be built into development, while later on, users can help to submit bugs and unexpected results to inform improvements in models over time.
Developers can support users by maintaining a clear roadmap of features and improvements with supporting documentation, helping users to migrate to new versions and clearly explaining how versions will be supported, maintained, monitored and (eventually) retired.
4 – Apply continuous delivery of complex ML solutions
ML models must be able to adapt when the data environment, IT infrastructure or business needs change. As with any working software application, ML developers must adopt continuous delivery practices to allow for regular updates of models in production.
We advise that teams should use techniques such as Continuous Integration and Deployment (CI/CD), utilise Infrastructure as Code and work in small batches to have fast, reasonable feedback.
5 – Evaluate and monitor algorithms throughout their lifecycle
It’s essential to understand whether algorithms are performing as expected, so you need to measure the accuracy of algorithms and models. This will add an extra layer of metrics on top of your infrastructure resource measurements such as CPU and RAM per Kubernetes pod. Data scientists are usually best placed to identify the best measure of accuracy in a given scenario, but this must be tracked and evaluated throughout the lifecycle, including during development, at the point of release, and in production.
6 – MLOps is a team effort
What are the key roles within an MLOps team? From our experience we have identified four key roles that must be incorporated into a cross-functional team:
- Platform/ML engineers to provide the hosting environment
- Data engineers to create production data pipelines
- Data scientists to create and amend the model
- Software engineers to integrate the model into business systems
Remember that each part of the team has a different strength – data scientists are typically strong at maths and statistics, while they may not have software development skills. Engineers are often highly-skilled in testing, logging and configuration, while data scientists are focused on algorithm performance and accuracy.
At the outset of your project consider how your team roles can work together using clear, defined processes. What are the responsibilities of each team member, and does everyone recognise the standards and models that are expected?
To learn more about MLOps principles and driving better, more consistent best practices in your MLOps team, download our Operationalising Machine Learning Playbook for free.