How to Use DVC for Applications in ML Drug Discovery Pipelines | Estefania Barreto-Ojeda | PyData NYC

Community member Estefania Barreto-Ojeda shares how they use DVC at Cyclica for applications in ML drug discovery pipelines. This talk was originally given at PyData NYC (@PyDataTV) in the fall of 2022.

Developing machine learning (ML) pipelines for drug discovery poses challenges that differ from those of traditional software development. Beyond the unique challenges of the data engineering stage, drug discovery pipelines need not only standard Git tracking for source code but also versioning of data and ML models. In this talk, we discuss some of the main challenges of working with biological data and how Data Version Control (DVC) tools facilitate data and model tracking during the development of ML drug discovery pipelines.

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced-level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

00:00 Welcome!
01:18 Overview.
02:03 Part I: Biological data.
03:13 Complexity of biological data. What makes biological data different?
12:49 Overview of ML drug discovery pipelines.
15:20 Challenges in ML drug discovery pipelines.
17:12 Part II: Implementing data version control in drug discovery pipelines.
17:20 Introduction to DVC.
19:13 Installing and initializing DVC.
21:24 Set DVC remote.
22:36 Versioning files with DVC. What does dvc add do?
25:21 Implementing DVC in Drug Discovery pipelines - Demo.
27:47 Data versioning.
28:17 Build a DVC ML pipeline.
28:30 Build a DVC ML pipeline - Featurization stage.
32:28 Initial Directed Acyclic Graph (DAG).
32:50 Build a DVC ML pipeline - Processing stage.
34:12 Running ML pipelines with dvc repro.
35:48 Build a DVC ML pipeline - Training+Metrics stage.
38:42 Final DAG.
40:07 Highlights.
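
The pipeline chapters above build a Directed Acyclic Graph (DAG) out of stages, and DVC connects two stages whenever an output of one is a dependency of the other. The sketch below is a toy illustration of that idea only, not DVC's implementation and not the code shown in the talk. The stage names follow the chapter titles, but every script name, file path, and helper function (`stage_graph`, `run_order`) is a hypothetical placeholder, and the assumption that each stage feeds the next is ours.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stages mirroring the chapter titles. All commands and paths
# are illustrative placeholders, not taken from the talk.
STAGES = {
    "featurize": {
        "cmd": "python featurize.py",
        "deps": ["data/raw.csv", "featurize.py"],
        "outs": ["data/features.csv"],
    },
    "process": {
        "cmd": "python process.py",
        "deps": ["data/features.csv", "process.py"],
        "outs": ["data/processed.csv"],
    },
    "train": {
        "cmd": "python train.py",
        "deps": ["data/processed.csv", "train.py"],
        "outs": ["model.pkl"],
        "metrics": ["metrics.json"],
    },
}


def stage_graph(stages):
    """Map each stage to the set of stages that produce one of its deps."""
    producer = {}
    for name, spec in stages.items():
        for path in spec.get("outs", []) + spec.get("metrics", []):
            producer[path] = name
    return {
        name: {producer[dep] for dep in spec.get("deps", []) if dep in producer}
        for name, spec in stages.items()
    }


def run_order(stages):
    """Return stage names in an order where producers run before consumers."""
    return list(TopologicalSorter(stage_graph(stages)).static_order())


if __name__ == "__main__":
    print(" -> ".join(run_order(STAGES)))
```

In an actual DVC project, stages like these are declared in `dvc.yaml` with `cmd`, `deps`, `outs`, and `metrics` fields; `dvc dag` displays the resulting graph and `dvc repro` reruns only the stages whose dependencies have changed.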

To learn more about Iterative's open-source and SaaS tools please visit:

#dvc #machinelearning #datascience
