Data Engineering Vs Machine Learning Pipelines - What Is The Difference

Data engineering and machine learning pipelines are quite different, yet they can feel oddly similar. Many ML engineers I have talked to rely on tools like Airflow to deploy their batch models.

So I wanted to discuss the difference between data engineering vs machine learning pipelines.

To answer this question, I pulled in Sarah Floris, who is both an experienced data engineer and the author behind the newsletter The Dutch Engineer.

So let’s dive in.

If you enjoyed this video, check out some of my other top videos.

Top Courses To Become A Data Engineer

What Is The Modern Data Stack - Intro To Data Infrastructure Part 1

If you would like to learn more about data engineering, then check out Google's GCP certificate.

If you'd like to read up on my updates about the data field, then you can sign up for our newsletter here.

Or check out my blog

And if you want to support the channel, then you can become a paid member of my newsletter

Tags: Data engineering projects, Data engineer project ideas, data project sources, data analytics project sources, data project portfolio

_____________________________________________________________
About me:
I have spent my career focused on all forms of data. I have developed algorithms to detect fraud, reduce patient readmission, and redesign insurance provider policy to help reduce the overall cost of healthcare. I have also helped develop analytics for marketing and IT operations in order to optimize limited resources such as employees and budget. I privately consult on data science and engineering problems, both solo and with a company called Acheron Analytics. I have experience both working hands-on with technical problems and helping leadership teams develop strategies to maximize the value of their data.

*I do participate in affiliate programs. If a link has an "*" by it, then I may receive a small portion of the proceeds at no extra cost to you.
Comments

It is messy to directly compare feature engineering (FE) to the transform step in ETL; they exist on different levels of abstraction. A "traditional" ETL pipeline looks more like ETTTTL in practice because data is piped between multiple tables before it ends up in something like a dashboard or ML model. In the non-ML use case, we design that last "T" by consulting analysts and stakeholders to understand what subset of the data they need at which cadence for reporting/tracking/etc. In the ML use case, we design the last "T" using feature engineering to match the data to the requirements of the algorithm / ML model we are using.

firefoxmetzger

I found this video a bit abstract. Does anyone know of a good comparison between Airflow and Kubeflow (or TFX)? That might help provide concrete examples of data engineering vs ML pipelines.

tonghongchen

I can tell that all of the definitions you put up in the video were generated by ChatGPT 😅

mahmudhasan

Is it possible to become both a backend developer and a data engineer at the same time?

Onuorahh

What is your opinion on the statement the "Godfather of AI" made after resigning from Google?

Nick-duss