Big Data Processing Using Distributed Maps and AWS Step Functions (S3 + Lambda)

AWS Step Functions is a powerful orchestration (workflow) service. Distributed Map is a feature that helps you run a series of tasks in parallel. It offers much higher scale than the normal (inline) Step Functions Map state and has useful features such as S3 CSV / JSON readers for your event source. In this video, I show you how to use the Distributed Map feature by processing records in a CSV file located in S3 using a Lambda function.
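As a rough illustration of the setup described above, here is a minimal sketch of a state machine definition with one Distributed Map state that reads a CSV from S3 and hands batches of rows to a Lambda function. It is not the code from the video: the bucket, key, and function names are hypothetical placeholders, and the batch size, concurrency, failure tolerance, and the `processed` flag added by the handler are assumptions chosen only for the example.

```python
import json

# Hypothetical placeholders, not taken from the video.
BUCKET = "my-input-bucket"
KEY = "records.csv"
FUNCTION_NAME = "process-csv-batch"


def build_definition(bucket: str, key: str, function_name: str) -> dict:
    """Return an Amazon States Language definition with one Distributed Map state."""
    return {
        "StartAt": "ProcessCsv",
        "States": {
            "ProcessCsv": {
                "Type": "Map",
                # Read the items directly from a CSV object in S3.
                "ItemReader": {
                    "Resource": "arn:aws:states:::s3:getObject",
                    "ReaderConfig": {"InputType": "CSV", "CSVHeaderLocation": "FIRST_ROW"},
                    "Parameters": {"Bucket": bucket, "Key": key},
                },
                # Assumed values: group rows so each invocation handles several records.
                "ItemBatcher": {"MaxItemsPerBatch": 100},
                "MaxConcurrency": 50,
                "ToleratedFailurePercentage": 5,
                "ItemProcessor": {
                    # Mode (INLINE vs DISTRIBUTED) and ExecutionType
                    # (STANDARD vs EXPRESS) are two separate settings.
                    "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "EXPRESS"},
                    "StartAt": "ProcessBatch",
                    "States": {
                        "ProcessBatch": {
                            "Type": "Task",
                            "Resource": "arn:aws:states:::lambda:invoke",
                            "Parameters": {"FunctionName": function_name, "Payload.$": "$"},
                            "End": True,
                        }
                    },
                },
                "End": True,
            }
        },
    }


def handler(event, context=None):
    """Lambda entry point: with ItemBatcher, the rows arrive under the "Items" key."""
    rows = event.get("Items", [])
    # Placeholder processing step: mark every row as processed.
    processed = [{**row, "processed": True} for row in rows]
    return {"count": len(processed), "rows": processed}


if __name__ == "__main__":
    print(json.dumps(build_definition(BUCKET, KEY, FUNCTION_NAME), indent=2))
```

Each CSV row reaches the handler as a dictionary keyed by the header names, with all values as strings, so any numeric fields need converting inside the function.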


#stepfunctions
#aws
#serverless
Comments
Author

Thanks pal, I've been loving your videos for years and this one helped me to quickly solve a current task at my job!

Langstonrocks
Author

Can't wait for your Step Functions course, Daniel. Thanks a bunch for this video.

andyweeks
Author

I was just researching this topic for a project and saw you uploaded your newest video about it 4 hours ago :D

haiderh
Author

From 20:15 onwards you mix up inline vs distributed with standard vs express. Great vid tho!

messibarca
Author

It is so complicated to use Step Functions for relatively simple tasks that can be done in other ways.

dianad
Author

Is there a way to preserve order of execution here? Suppose I need to aggregate results from the CSV and I need to maintain the original order of items from the input CSV.

WiredMartian
Author

Is there a way to reduce the overhead that the map run adds to the overall state machine execution? The execution time seems to be around 8 s in your video, but the individual Lambda executions seem to finish in around 200-300 ms. Can you recommend an alternative solution if latency is critical (around 5 s)?

tamaskiss
Author

How do I store the data in a CSV file after modification?

vinodreddy
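One way to approach the question above, sketched under assumptions rather than taken from the video: collect the modified rows as dictionaries and serialize them back to CSV text with Python's standard `csv` module. The function name `rows_to_csv` and the sample fields are hypothetical.

```python
import csv
import io


def rows_to_csv(rows: list[dict]) -> str:
    """Serialize a list of row dictionaries to CSV text, header first."""
    if not rows:
        return ""
    buffer = io.StringIO()
    # Column order follows the keys of the first row.
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    csv_text = rows_to_csv([{"id": "1", "status": "done"}, {"id": "2", "status": "done"}])
    print(csv_text)
    # The text could then be uploaded with boto3 (bucket and key are hypothetical):
    # boto3.client("s3").put_object(Bucket="my-output-bucket", Key="out.csv", Body=csv_text)
```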
Author

How could we show this on GitHub? I know the first step would be to create a design doc about the architecture, but I would like to know if you have any examples. I want to put together a portfolio to showcase my work, and I would like to explain it effectively on my GitHub.

tello
Author

A great channel. Thank you! I have tried this code on a large CSV file with half a million records. Unfortunately, it takes forever. I am not sure what is wrong. I hope someone can provide some help.

MrAbdel
Author

Hi. I tried to reproduce this and got stuck on the error "the specified tolerated failure threshold was exceeded". The CSV file was the issue: I had initially saved the Excel file with test data as "CSV UTF-8"; after the error I saved it as plain "CSV" and the execution succeeded.

artbart
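A possible explanation for the difference reported above, offered as an assumption and not confirmed in the comment: Excel's "CSV UTF-8" option writes a byte order mark (BOM) at the start of the file, which can end up attached to the first header name. The sketch below only shows that effect in plain Python; the sample data is made up.

```python
import csv
import io

# Made-up sample: a UTF-8 CSV that starts with a byte order mark (BOM).
raw = "\ufeffid,name\n1,Ada\n".encode("utf-8")


def header_names(data: bytes, encoding: str) -> list[str]:
    """Decode the bytes with the given encoding and return the CSV header names."""
    reader = csv.DictReader(io.StringIO(data.decode(encoding)))
    return list(reader.fieldnames or [])


print(header_names(raw, "utf-8"))      # the first header keeps the BOM character
print(header_names(raw, "utf-8-sig"))  # the BOM is stripped during decoding
```

If the processing code looks up a column by its plain name, a lookup such as `row["id"]` would fail for every row of the BOM-prefixed file, which would be consistent with a failure threshold being exceeded.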