Maturing Machine Learning in Enterprise // Kyle Gallatin // MLOps Coffee Sessions #43

Coffee Sessions #43 with Kyle Gallatin of Etsy, Maturing Machine Learning in Enterprise.

// Abstract
The definition of Data Science in production has evolved dramatically in recent years. Despite increasing investments in MLOps, many organizations still struggle to deliver ML quickly and effectively. They often fail to recognize an ML project as a massively cross-functional initiative and confuse deployment with production. Kyle will talk about both the functional and non-functional requirements of production ML, and the organizational challenges that can inhibit companies from delivering value with ML.

// Bio
Kyle Gallatin is currently a Software Engineer for Machine Learning Infrastructure at Etsy. He primarily focuses on operationalizing the training, deployment, and management of machine learning models at scale. Prior to Etsy, Kyle delivered ML microservices and led the development of MLOps workflows at the pharmaceutical company Pfizer. In his spare time, Kyle mentors data scientists and writes ML blog posts for Towards Data Science.

--------------- ✌️Connect With Us ✌️ -------------
Follow us on Twitter: @mlopscommunity

// Takeaways
Data science is still poorly defined, and there is a large variance in organizational maturity.
Basically, everything we need for mature ML in modern organizations exists technically, except for the strategy, mentality, organization, and governance.
Organizations that poorly define data science often overburden their data scientists, but there are expectations that data scientists know some engineering.
Operationalizing data science is not that different from software engineering, and software engineering can be one of the most valuable skillsets for a data scientist.

// Q&A with Kyle as a data science mentor:

Timestamps:
[00:00] Introduction to Kyle Gallatin
[01:00] Kyle's path into tech
[02:04] Horror stories from data analyst to full-fledged engineer
[03:45] "I'm very happy with the path I took, wish I had taken more classes, had more of that knowledge, paid attention in my CS class."
[04:04] SaaS app with heavy ML services
[05:13] "Python has so much support for so many different things, especially within the machine learning space."
[06:43] Tips for working with YAML
[07:10] "Find a technology that you're comfortable with. Second, no need to go light on the plug-ins to your IDE."
[08:43] "Please take the easy way. Don't try and do everything in Vim. You have these tools available for you. There are no extra points for doing it that way even if it might save you once or twice in some random scenarios."
[09:00] Other plug-ins Kyle likes.
[09:07] Visual Studio Code, Remote SSH plug-ins
[09:44] Future of ML in the current climate
[11:12] "MLOps is, again, ubiquitous as a buzzword. And it's awesome to see it exploding so much, from open-source support to the enterprise platform offerings."
[12:08] Heuristics or Further to go
[15:19] "We don't fully know what the best way to go about that is once you get to that level, it is kind of unknown territory."
[15:33] Monitoring and Observability
[16:21] Specialized and Customized
[17:43] "I think there are a lot of commonalities, but with the subtle differences, if you try and build for everything, then you're going to fail."
[17:54] Integrations
[20:00] "I think a common KPI for a lot of teams is time to production for a new data scientist or something like that."
[20:22] Where data scientists fit in
[21:34] "One size doesn't fit all. It really depends on the data scientists."
[22:40] How you build depends on the people working
[23:40] "Data science still being so poorly defined, and so in flux, in terms of a definition of a domain and as a role and everything."
[24:00] Platform engineering realm
[25:00] "How do we continuously optimize model serving to the point where it's feasible and actually generates a value to serve these models at scale?"
[25:21] Model serving platform
[27:13] Standardization
[29:00] Judgement
[29:57] "The better that you can break down that work into smaller pieces, the more accurate you're going to be."
[30:30] Data access regulations
[33:32] Technical standpoint
[34:37] Clear definition of use case
[36:04] Next big thing in MLOps
[36:25] Governance
[37:50] Modern way of scaling
[38:46] Nontechnical companies to step up
[41:18] Defining the problems
[42:38] Considering the value of needs