Agile Data Science - John Sandall | PyData Global 2021

Agile Data Science: How To Implement Agile Workflows For Analytics & Machine Learning
Speaker: John Sandall

Summary
"Agile doesn't work for data science." Or does it?

In this talk we provide a gentle introduction to implementing an agile workflow for a data science team. We will demystify the terminology, tools and processes, and provide practical tips from our experience moving all of our client teams and projects to agile workflows in 2021.

Description
Sprints, Scrum, Kanban, Stories, Epics, Retrospectives, Extreme Programming, Velocity... Agile's opaque terminology and practices, plus the zeal of its advocates, can be off-putting to newcomers. Can it even be applied to data science, analytics and machine learning projects?

In this talk we provide a gentle introduction to implementing an agile workflow for a data science team. We will demystify the terminology, tools and processes, and provide practical tips from our experience moving all of our client teams and projects to agile workflows in 2021.

We've seen an increase in measurable output, better communication and a higher value-per-effort on work delivered. We've found it works especially well for managing research projects with a high level of uncertainty, such as developing machine learning models.

Agile's focus on measurable results aligns well with other goal-setting paradigms such as OKRs, and when applied to data science projects it encourages best practices such as setting clear expectations on how a team validates its work.

This light-hearted talk is beginner-friendly with no prior knowledge required. Whilst it may be especially relevant for leaders of data science teams, moving to an agile workflow requires the whole team to understand and buy into the concept. We hope this talk proves a useful resource in this endeavour.

John Sandall's bio
John Sandall is the CEO and Principal Data Scientist at Coefficient.

His experience in data science and software engineering spans multiple industries and applications, and his passion for the power of data extends far beyond his work for Coefficient’s clients. In April 2017 he created SixFifty in order to predict the UK General Election using open data and advanced modelling techniques. Previous experience includes Lead Data Scientist at YPlan, business analytics at Apple, genomics research at Imperial College London, building an ed-tech startup at Knodium, developing strategy & technological infrastructure for international non-profit startup STIR Education, and losing sleep to many hackathons along the way.

John is also a co-organiser of PyData Bristol, co-founded Humble Data in 2019 to promote diversity in data science through a programme of free bootcamps, and in 2020 was a Committee Chair for the PyData Global Conference. He is currently a Fellow of Newspeak House with interests in open data, AI ethics and promoting diversity in tech.

PyData Global 2021

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
