AWS Tutorials - AWS Glue Pipeline to Ingest Multiple SQL Tables


There are scenarios where data has to be ingested from multiple SQL tables into a data lake. This raises the question of whether to use a separate Glue job and pipeline for each table or a single Glue job and pipeline for all of them. This tutorial discusses the trade-offs in detail and demonstrates the single-pipeline, single-job approach.
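To make the single-job idea concrete, here is a minimal sketch of how one parameterized Glue job might be invoked once per table. The job name, table names, S3 prefix, and argument names (`--source_table`, `--target_path`) are all hypothetical, and this is not necessarily how the video's demo is built. The sketch only constructs the request payloads; the boto3 `start_job_run` call is shown as a comment so the block runs without AWS credentials.

```python
from typing import Dict, List


def build_job_runs(job_name: str, tables: List[str], target_prefix: str) -> List[Dict]:
    """Build one start_job_run request per table for a single shared Glue job."""
    prefix = target_prefix.rstrip("/")
    runs = []
    for table in tables:
        runs.append({
            "JobName": job_name,
            # Glue job arguments are passed as "--name": "value" string pairs.
            "Arguments": {
                "--source_table": table,               # hypothetical argument name
                "--target_path": f"{prefix}/{table}/",  # hypothetical argument name
            },
        })
    return runs


if __name__ == "__main__":
    # Hypothetical job name, tables, and bucket, for illustration only.
    requests = build_job_runs(
        "ingest-sql-table",
        ["customers", "orderdata"],
        "s3://example-data-lake/raw",
    )
    for request in requests:
        print(request)
        # With boto3 installed and credentials configured, each request
        # could be submitted as:
        #   boto3.client("glue").start_job_run(**request)
```

Inside the job script, the same values could be read with `getResolvedOptions(sys.argv, ["source_table", "target_path"])` from `awsglue.utils`. The alternative design, one job per table, removes the need for parameters but multiplies the number of jobs to maintain.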
Comments
Author

Thanks for the great content. The videos are relatable in terms of real-world problems, which is great. Looking forward to more like this, and if possible, please put all these steps on your website as well so they are easy to compare during practice sessions. 😀

terrcan
Author

Hello sir.
Do you have any content about how to ingest from an external DB in a Glue ingestion job using a VPC (for example, using a connection to a MySQL or SQL Server data source instead of an AWS Redshift source)?

cassianocalimansantos
Author

How do you create a parameterized AWS Glue job with CDC ingestion? In that case the job would run continuously, every 5 minutes, to update data (or perform an upsert). Is there a way to do upserts in a generic (or parameterized) way?

deveshv
Author

Very nice explanation and implementation... thank you so much!

durgarasane-kolapkar
Author

Hi, if orderdata fails to write to the destination, will the others fail too, or does the flow keep running?

nagarjunau
Author

Can you please do a concurrent run on a workflow as well?

veerachegu
Author

Is it more cost-effective to have a single job running multiple times, or multiple jobs each running once?

debaratiaich
Author

Great content from new sub. Please do more big data stuff!

khandoor
Author

Nice content. A video on CDC would be great!

sonynavi
Author

I created a step function just like yours, but my step function is running forever.

IsmaelRDeMelo