Humans and ML collaboration in data labeling
Data-Driven AI Meetup 7: Humans and ML collaboration in data labeling
Evgenii Sorokin, Senior ML Engineer, Toloka
This talk presents a scalable crowdsourcing approach to producing high-quality data. In modern AI pipelines, data quality is just as important as computing power (such as GPUs and TPUs) and the algorithms that transform the data (which are essentially machine learning methods). However, maintaining consistently high labeling quality is complex and hard to scale. Evgenii shares insights on how to achieve quality without sacrificing scalability: combine knowledge from current state-of-the-art models to aggregate and construct correct data labels, compute prior knowledge about the data dynamically, and apply human verdicts to improve and speed up the delivery of correctly annotated data.
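As a rough illustration of the three ideas named above (aggregating model outputs, using a prior, and adding human verdicts), here is a minimal sketch. It is not the method from the talk: it assumes a simple naive-Bayes aggregation in which every source votes independently and has a known accuracy, and it assumes at least two possible labels. The names `aggregate`, `label_item`, `ask_human`, the `"human"` key, and the `threshold` parameter are all hypothetical.

```python
def aggregate(prior, votes, accuracies):
    """Posterior over labels given independent votes (naive-Bayes assumption).

    prior:      {label: probability}, assumed to be computed elsewhere
    votes:      {source_name: label_voted_for}
    accuracies: {source_name: assumed probability that the source is right}
    """
    labels = list(prior)
    k = len(labels)  # assumes k >= 2
    scores = {}
    for label in labels:
        likelihood = 1.0
        for source, vote in votes.items():
            acc = accuracies[source]
            # A wrong vote is assumed to be spread evenly over the other labels.
            likelihood *= acc if vote == label else (1 - acc) / (k - 1)
        scores[label] = prior[label] * likelihood
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def label_item(prior, model_votes, accuracies, ask_human, threshold=0.9):
    """Return (label, confidence, human_was_asked).

    Models are consulted first; a human verdict is requested only when the
    aggregated model confidence falls below the (hypothetical) threshold.
    """
    posterior = aggregate(prior, model_votes, accuracies)
    best = max(posterior, key=posterior.get)
    if posterior[best] >= threshold:
        return best, posterior[best], False

    # Models are not confident enough: add one human verdict and re-aggregate.
    votes = dict(model_votes)
    votes["human"] = ask_human()
    posterior = aggregate(prior, votes, accuracies)
    best = max(posterior, key=posterior.get)
    return best, posterior[best], True


if __name__ == "__main__":
    prior = {"cat": 0.5, "dog": 0.5}
    accuracies = {"m1": 0.9, "m2": 0.8, "human": 0.95}
    print(label_item(prior, {"m1": "cat", "m2": "cat"}, accuracies, lambda: "cat"))
    print(label_item(prior, {"m1": "cat", "m2": "dog"}, accuracies, lambda: "dog"))
```

In this sketch, items on which the models agree are labeled automatically, and human effort goes only to the ambiguous ones, which is one way to trade cost against quality at scale.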
Get in touch with us
Follow us on social media so you don't miss any updates.