End to End Streaming Data Pipeline Using AWS MSK & AWS Serverless Services

In this video, we build an end-to-end data engineering project that captures external application responses in real time into an S3 data lake using AWS MSK and AWS serverless services.
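
Going by the viewer comments, the producer side of this pipeline appears to be an SQS-triggered Lambda that publishes to MSK. Below is a minimal Python sketch of that step, not the code from the video. It assumes the kafka-python client is available to the function (for example through a Lambda layer), a plaintext listener on the cluster, and two hypothetical environment variables, BOOTSTRAP_SERVERS and TOPIC_NAME.

```python
import json
import os


def sqs_bodies_to_kafka_messages(event):
    """Convert an SQS-triggered Lambda event into UTF-8 Kafka payloads."""
    messages = []
    for record in event.get("Records", []):
        body = record["body"]
        try:
            # Re-serialize JSON bodies compactly.
            body = json.dumps(json.loads(body), separators=(",", ":"))
        except json.JSONDecodeError:
            # Non-JSON bodies are passed through unchanged.
            pass
        messages.append(body.encode("utf-8"))
    return messages


def lambda_handler(event, context):
    # Imported lazily so the helper above can be tested without the library.
    # Assumption: kafka-python is installed in the Lambda environment.
    from kafka import KafkaProducer

    # Assumption: BOOTSTRAP_SERVERS is a comma-separated broker list and the
    # cluster accepts plaintext connections (TLS/IAM would need extra settings).
    producer = KafkaProducer(
        bootstrap_servers=os.environ["BOOTSTRAP_SERVERS"].split(",")
    )
    payloads = sqs_bodies_to_kafka_messages(event)
    for payload in payloads:
        producer.send(os.environ["TOPIC_NAME"], value=payload)
    # Block until buffered messages are sent before the invocation ends.
    producer.flush()
    return {"published": len(payloads)}
```

Keeping the event parsing in a separate function makes it possible to test the transformation locally without a running cluster.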

Prerequisite:
-----------------------------
Learn AWS VPC, Subnet, Route Table, Internet Gateway with Hands-on Demo
VPC with Public and Private Subnets in-depth intuition with Hands On
AWS VPC NAT Gateway In-depth intuition
AWS VPC ENDPOINT in depth intuition & hands on
Using Amazon VPC Endpoints to Access DynamoDB in-depth
Getting Started with AWS Managed Streaming for Kafka with in-depth service setup
Creating a Serverless Apache Kafka(MSK) publisher using AWS Lambda
Using Amazon MSK as an event source for AWS Lambda
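
For orientation on the last prerequisite: when MSK is the event source, Lambda delivers record values base64-encoded and grouped by topic-partition. The sketch below is an illustration in Python, not the code from the video. It decodes such an event and forwards the payloads to Kinesis Data Firehose, which the comments suggest is the consumer side of this pipeline. The DELIVERY_STREAM environment variable is a hypothetical name, and retrying failed records is left out.

```python
import base64
import os

# PutRecordBatch accepts at most 500 records per call.
FIREHOSE_MAX_BATCH = 500


def msk_event_to_firehose_records(event):
    """Decode an MSK-triggered Lambda event into Firehose record dicts."""
    records = []
    # event["records"] maps "topic-partition" keys to lists of messages.
    for batch in event.get("records", {}).values():
        for message in batch:
            payload = base64.b64decode(message["value"])
            # Newline-delimit payloads so objects in S3 stay line-separated.
            if not payload.endswith(b"\n"):
                payload += b"\n"
            records.append({"Data": payload})
    return records


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def lambda_handler(event, context):
    # Imported lazily so the helpers above can be tested without boto3.
    import boto3

    firehose = boto3.client("firehose")
    records = msk_event_to_firehose_records(event)
    for chunk in chunked(records, FIREHOSE_MAX_BATCH):
        # Assumption: DELIVERY_STREAM names an existing delivery stream.
        # A fuller version would inspect FailedPutCount and retry.
        firehose.put_record_batch(
            DeliveryStreamName=os.environ["DELIVERY_STREAM"],
            Records=chunk,
        )
    return {"forwarded": len(records)}
```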

Steps for the Lab:
-------------------------------

Check this playlist for more Data Engineering related videos:

Apache Kafka from scratch

Snowflake Complete Course from scratch with End-to-End Project with in-depth explanation

Comments

I have covered most aspects of Data Engineering, Airflow, Databricks, Mlflow, etc.
Kafka is one tough nut. This channel helped me overcome that too :)

adityanjsg

Thanks a lot! We need more data engineering content on youtube!

viditjain

Better explanation than AWS documentation 🔥🔥🔥⚡⚡⚡

carlpei

Great job. And excellent presentation!

naveenkumarmurugan

Good work, thanks for your hard work. It is remarkable.

sunnydrall

Are there any alternatives to SQS? Since SQS can itself be used as an alternative to Kafka, why use an MSK cluster when we could send the messages directly to the consumer from the Lambda function?

nainaarabha

Hi, you are doing a great job. 👍

I have a question: why didn't you use serverless Kafka on AWS?

sunainakhanna

Can you please explain why we need to have SQS? Why not just have a lambda function that triggers for each event passed to the API endpoint in AWS API Gateway that writes to Kafka?

Conner

Is there a way, as of today, to eliminate those two (Producer Lambda-SQS and Consumer Lambda) and integrate APIGW directly with MSK, followed by MSK directly to Kinesis Firehose?

praneethvvs

Where did you get the ARN of the Kafka layer that you configured as a Lambda layer?

adityagaikwad

Please allow me to ask: this is just a micro-batch processing architecture, not a near-real-time one, right? We could replace Lambda + Firehose with an MSK connector to make it near real-time at the consumer end, so what could we do to make the architecture near real-time at the producer end? Could you kindly shed some light on this? Thank you!

PMDinh

Is it possible to replace (Lambda consumer + Kinesis Firehose) with Kafka Connect to sink data to S3 (and maybe do the same for the source)?

hoangminhninh

Hey man, thanks a lot for this ❤️
Can you please help me with one thing: how can we get the Lambda code for CSV and Parquet?

nishant

Hi, can you please help me understand how to do the same setup using a Java-based application? I'm facing a lot of issues connecting to the AWS MSK cluster from Java.

tanusreechatterjee

Can we implement this project on the AWS free tier for practice? Please reply 🙏

krishnakumarkumar

Thank you for the great video ❤ But I am running into an issue: when I add the SQS trigger to my Lambda, the message goes in flight and is not delivered; when I remove the trigger, the message is sent successfully. Please help me out with this.

abhishekdubey-pn

The connection times out at the step "Enter the public subnet and, from there, enter the private subnet".
I can't connect to my private instance. Please help me.

ducvietnguyen