Current ML model evaluation techniques are falling short. Evaluating a model using only global metrics like accuracy or F1 score paints a low-resolution picture of performance and fails to describe how the model behaves across different types of cases, attributes, and scenarios.
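To make this concrete, here is a minimal sketch (with fabricated labels and an illustrative "daytime"/"nighttime" slicing, not real data) showing how a model with a healthy-looking global accuracy can fail completely on one slice of the data:

```python
from collections import defaultdict

def accuracy(pairs):
    """Fraction of (true label, predicted label) pairs that match."""
    return sum(y == p for y, p in pairs) / len(pairs)

# Fabricated records for illustration: (slice, true label, predicted label).
# The model is right on every daytime case and wrong on every nighttime case.
records = [("daytime", 1, 1)] * 90 + [("nighttime", 1, 0)] * 10

by_slice = defaultdict(list)
for slc, y, p in records:
    by_slice[slc].append((y, p))

print(f"overall: {accuracy([(y, p) for _, y, p in records]):.0%}")  # 90%
for slc, pairs in sorted(by_slice.items()):
    print(f"{slc}: {accuracy(pairs):.0%}")  # daytime: 100%, nighttime: 0%
```

A single 90% accuracy number hides the fact that the model is useless at night; only the per-slice breakdown surfaces it.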
It is rapidly becoming vital for production ML teams to understand exactly when and how their models fail, and to track these behaviors across model versions to catch regressions.
We’ve seen great results from teams applying unit and functional testing techniques to their models. In this talk from PyData's March 2022 Meetup, Kolena's Head of Product, Gordon Hart, will cover why systematic unit testing is important and how to effectively test ML system behavior.
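As a rough illustration of what a behavioral unit test for a model can look like, here is a hedged sketch: the toy keyword classifier, the test cases, and the test itself are all hypothetical stand-ins, not Kolena's API. The idea is to assert on a specific, named behavior rather than a single aggregate score:

```python
def classify_sentiment(text: str) -> str:
    """Stand-in model: a naive keyword classifier used only for illustration."""
    return "negative" if "terrible" in text or "awful" in text else "positive"

# A behavioral unit test targets one scenario the model must handle --
# here, obviously negative reviews -- instead of the whole test set at once.
negative_cases = [
    ("the food was terrible", "negative"),
    ("what an awful experience", "negative"),
]

def test_negative_sentiment_behavior():
    failures = [(text, classify_sentiment(text))
                for text, want in negative_cases
                if classify_sentiment(text) != want]
    assert not failures, f"failed negative-sentiment cases: {failures}"

test_negative_sentiment_behavior()  # passes for this toy model
```

Running a suite of such tests against every new model version makes regressions on specific behaviors visible immediately, even when the global metric has not moved.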
Kolena is the MLOps platform for robust, end-to-end ML model testing. Built for all internal teams, Kolena helps you ship your ML models with confidence.
Want to learn more? Head to our Schedule a Demo page to see how we can help.