Behavioral Testing of ML Models (Unit tests for machine learning)


How can we empower machine learning models with powerful software engineering techniques like unit testing?

Evaluating ML models using a single metric (like accuracy or F1-score) produces a low-resolution picture of model performance. Behavioral tests give a much higher-resolution evaluation of a model's capabilities. By creating tests (small, targeted test sets), we can better compare models or observe how performance changes after re-training or fine-tuning a model. We discuss the paper 'Beyond Accuracy: Behavioral Testing of NLP Models with CheckList', which was selected as the ACL 2020 Best Paper.
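To make the idea of comparing models by capability concrete, here is a minimal sketch in plain Python. It does not use the CheckList library; the two toy models, the capability names, and the test sentences are all hypothetical and exist only to show how per-capability failure rates can reveal differences that a single aggregate score hides.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)

# Hypothetical targeted test sets, one per capability.
CAPABILITY_TESTS: Dict[str, List[Example]] = {
    "vocabulary": [
        ("I love this airline", "positive"),
        ("I hate this airline", "negative"),
    ],
    "negation": [
        ("I do not love this airline", "negative"),
        ("We do not love the food", "negative"),
    ],
}


def keyword_model(text: str) -> str:
    """Toy model (assumption): looks only at sentiment keywords."""
    words = text.lower().split()
    if "love" in words:
        return "positive"
    if "hate" in words:
        return "negative"
    return "neutral"


def negation_aware_model(text: str) -> str:
    """Toy model (assumption): flips the keyword label when 'not' appears."""
    label = keyword_model(text)
    if "not" in text.lower().split():
        return {"positive": "negative", "negative": "positive"}.get(label, label)
    return label


def failure_rates(
    model: Callable[[str], str], tests: Dict[str, List[Example]]
) -> Dict[str, float]:
    """Return the fraction of failed examples for each capability."""
    return {
        capability: sum(model(text) != expected for text, expected in examples)
        / len(examples)
        for capability, examples in tests.items()
    }


if __name__ == "__main__":
    for name, model in [("keyword", keyword_model), ("negation-aware", negation_aware_model)]:
        print(name, failure_rates(model, CAPABILITY_TESTS))
```

Both toy models pass every vocabulary example, so a test set made only of such sentences would rate them as equal. The negation test set is what separates them.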

Introduction (0:00)
Comparing models using capabilities (0:33)
Behavioral test of NLP models (3:06)
Test Type 1: Minimum Functionality Tests (4:22)
Test Type 2: Invariance Tests (7:04)
Test Type 3: Directional Expectation Tests (7:32)
Summary and Conclusion (10:00)
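The three test types named in the chapters above can be sketched roughly as follows. This is an illustrative plain-Python sketch, not the CheckList library's API: the scoring function, the helper names (`run_mft`, `run_inv`, `run_dir`), and the example sentences are assumptions made up for this example.

```python
from typing import Callable, List, Tuple

POSITIVE_WORDS = {"great", "good", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}


def toy_score(text: str) -> int:
    """Hypothetical model: positive word count minus negative word count."""
    words = text.lower().replace(".", "").split()
    return sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)


def to_label(score: int) -> str:
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def run_mft(model: Callable[[str], int], cases: List[Tuple[str, str]]) -> List[str]:
    """Minimum Functionality Test: simple inputs with a known expected label."""
    return [text for text, expected in cases if to_label(model(text)) != expected]


def run_inv(model: Callable[[str], int], pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Invariance test: a label-preserving edit should not change the prediction."""
    return [(a, b) for a, b in pairs if to_label(model(a)) != to_label(model(b))]


def run_dir(model: Callable[[str], int], pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Directional Expectation test: adding a negative phrase should not raise the score."""
    return [(a, b) for a, b in pairs if model(b) > model(a)]


mft_failures = run_mft(toy_score, [
    ("I love this flight", "positive"),
    ("The service was awful", "negative"),
    ("This is not good", "negative"),  # negation: the toy model ignores "not"
])
inv_failures = run_inv(toy_score, [
    ("I love flying to Chicago", "I love flying to Dallas"),  # swap the city name
])
dir_failures = run_dir(toy_score, [
    ("The flight was good", "The flight was good. You are awful."),
])

print("MFT failures:", mft_failures)
print("INV failures:", inv_failures)
print("DIR failures:", dir_failures)
```

In this toy setup only the negation case fails, which is the kind of targeted finding a single accuracy number would not surface.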

------

Paper: Beyond Accuracy: Behavioral Testing of NLP Models with CheckList

Code:

------

More videos by Jay:
Language Processing with BERT: The 3 Minute Intro (Deep learning for NLP)

Explainable AI Cheat Sheet - Five Key Categories

The Narrated Transformer Language Model

Jay's Visual Intro to AI

How GPT-3 Works - Easily Explained with Animations

Jay Alammar

Comments

This is a great topic! Thanks for presenting it so nicely! Well spoken and visualized! 💪

AICoffeeBreak

Nicely explained Jay💯. I look forward to more of these

katnoria

Your videos, contents and explanations are really good. Thanks for making quality content.

It would be even nicer if you kept the same pitch from the start to the end of each sentence, because the words at the end of your sentences are low in volume.

Thanks again for the great videos

SreeramAjay

Great video, but using a small test set for QA should be done carefully, since over time a model can overfit to those datasets.

manavmadan

What a great presentation. Can I say that behavioral testing is somewhat similar to metamorphic testing in ML-based systems?

haftamuhailu

This is a very interesting approach that can be extended to vision models as well!

ramandutt

Thank you so much for the nice explanation.

abrar-tech

Really cool video Jay. Have you come across any equivalent approaches for tabular data?

jsnctl

Can the same concepts be applied to supervised models, like regression or classification models?

saurabhatwipro

Jay, are you aware of any other code examples of these tests?

AZ

What should I get from this? That AI in Natural Language Processing is still in its infancy?
Have you heard about Duolingo? I need to know whether AI can be successfully implemented in language learning. It seems to me Duolingo corrects homework based on the order of the tiles.

It gives you [boy] [am] [I] [a] (which it reads as [4] [2] [1] [3]) and just expects the correct order.

[I] [am] [a] [boy] >> [1] [2] [3] [4]

That explains why (when it asks you to type) " I am a boy " is wrong but " I am a boy. " is correct! Just because of a dot! Sometimes it penalises you FOR writing a dot. I'm guessing it checks the database for the exact sentence rather than doing any language recognition.

oosmanbeekawoo
