AWS Tutorials - Using Glue Job ETL from REST API Source to Amazon S3 Bucket Destination


In many scenarios, you need to build an AWS Glue job that calls a REST API to fetch data for ETL. Such jobs can be configured to run either on a schedule or in response to an event. The REST API could be deployed within the AWS account or outside it. In this workshop, you create an AWS Glue job that calls a REST API hosted outside the AWS account. With small changes, you can create a job that calls APIs hosted within the AWS account as well.
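The flow described above (call a REST API, then land the response in S3) can be sketched in plain Python as follows. This is a minimal illustration, not the workshop's actual script: the function names, the `raw` prefix, and the date-partitioned key layout are all assumptions, and the API is assumed to return a JSON body on a simple GET.

```python
import json
import urllib.request
from datetime import datetime, timezone


def fetch_json(url, timeout=30):
    """GET the URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def build_object_key(prefix, fetched_at):
    """Build a date-partitioned key (hypothetical layout, not from the workshop)."""
    return f"{prefix}/{fetched_at:%Y/%m/%d}/data-{fetched_at:%H%M%S}.json"


def save_to_s3(s3_client, bucket, key, payload):
    """Serialize the payload as JSON and upload it as a single S3 object."""
    body = json.dumps(payload).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return len(body)


def run(api_url, bucket, prefix="raw"):
    """Fetch once from the API and write the result to the bucket."""
    import boto3  # imported lazily; assumed to be present in the Glue runtime

    payload = fetch_json(api_url)
    key = build_object_key(prefix, datetime.now(timezone.utc))
    save_to_s3(boto3.client("s3"), bucket, key, payload)
    return key
```

Passing the S3 client in as an argument keeps the upload step easy to exercise locally with a stand-in object before running it inside a Glue job.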
Comments

Nice work! Your channel is unique. You deserve more subscribers.
I have tested this with Lambda and it works great.

ganeshmaximus

Thank you, this is a very nice tutorial and easy to understand.

muqeempashamohammed

Is it possible to pull a JSON body from S3 and use it to call an external API (POST the JSON data to it), so that the data is loaded into my API via AWS Glue?

Could you share a sample or reference?

savirawat
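One way to approach the question above is to read the object with the S3 client and forward its contents as the body of an HTTP POST. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the object is assumed to hold a single JSON document, and the target API is assumed to accept `application/json` without authentication.

```python
import json
import urllib.request


def load_json_from_s3(s3_client, bucket, key):
    """Read one S3 object and parse its body as JSON."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read().decode("utf-8"))


def build_post_request(url, payload):
    """Wrap the payload in an HTTP POST request with a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_s3_object(s3_client, bucket, key, url, timeout=30):
    """Load the JSON object from S3 and POST it; returns the HTTP status code."""
    payload = load_json_from_s3(s3_client, bucket, key)
    request = build_post_request(url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Inside a Glue job, `s3_client` would typically be `boto3.client("s3")`, and the job's IAM role would need read access to the bucket.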

How did you use the public subnet in this demo?

devopsdigital

Hello there. I'm new to AWS. We have to create something like this at my new job: an ETL job that extracts data from an API on the internet and saves the data to S3. My question is about pricing. Since we will only run the ETL once a month, what do you suggest we do about the VPC, the allocated IP, and the NAT gateway, which are billed per hour? Sorry if I'm misunderstanding some things. These are all pretty new to me. Your help will be greatly appreciated. Thanks.

drew

Is the same process applicable when pulling on-prem Airtable data into Redshift via an API?

veerachegu

Do you have any videos about integrating Apigee with AWS?

josemanuelgutierrez

Thank you for the tutorial. I am new to Glue and have a question. Is it possible to insert the data directly into a database in the same script, instead of storing it in S3?

gowthamavinash

Thank you for the tutorial. I have a question: which connection type should we use if we want to connect to an external Kafka cluster such as Confluent Cloud Kafka?

hsz

Hello, after completing the workshop, I tried to build it with CloudFormation. However, I keep getting "Validation for connection properties failed (Service: AWSGlue; Status Code: 400; Error Code: InvalidInputException". See my Glue connection resource code below. Can you advise on how to correct this?

MyGlueConnection:
Type: AWS::Glue::Connection
Properties:
CatalogId: !Ref AWS::AccountId
ConnectionInput:
ConnectionProperties:
ConnectionType: JDBC

AvailabilityZone: us-east-1b
SubnetId: !Ref GlueJobPrivateSubnet

sodiqafolayan
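The template in the comment above is truncated, so the cause of the validation error cannot be confirmed from it. For comparison, in the Glue connection schema `ConnectionType` sits directly under `ConnectionInput`, while the subnet, security groups, and availability zone belong under `PhysicalConnectionRequirements`; a JDBC connection is also expected to carry its JDBC settings in `ConnectionProperties`. The sketch below shows the same shape through boto3, using a `NETWORK` connection on the assumption that the job only needs VPC placement and no real database. All names and IDs are hypothetical placeholders.

```python
def build_network_connection_input(name, subnet_id, security_group_ids, availability_zone):
    """Build a ConnectionInput dict for a VPC-only (NETWORK) Glue connection."""
    return {
        "Name": name,
        # ConnectionType is a sibling of ConnectionProperties, not nested inside it.
        "ConnectionType": "NETWORK",
        # No JDBC settings are supplied for a NETWORK connection in this sketch.
        "ConnectionProperties": {},
        # Network placement lives in its own sub-structure.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def create_connection(connection_input):
    """Create the connection in the caller's default Glue Data Catalog."""
    import boto3  # imported lazily so the builder above runs without AWS access

    boto3.client("glue").create_connection(ConnectionInput=connection_input)
```

The same nesting applies to the `AWS::Glue::Connection` resource in CloudFormation, where these keys appear under `Properties.ConnectionInput`.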

Hello, I am getting the error "Error 110 time out". Can you help me?
Thanks for the video.

luisg
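On Linux, errno 110 is normally `ETIMEDOUT`, which suggests the job could not open a connection to the API host at all (for example, no working route to the internet from the job's subnet). That diagnosis is an assumption, since the comment gives no further detail. A small check like the following, run at the start of the job script, can help separate a network problem from an API problem; the function name and defaults are hypothetical.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

If `can_reach("api.example.com")` (a placeholder host) returns False inside the job but True from a machine with internet access, the job's subnet routing or security group is the more likely culprit.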

Hi,
Can you give a code example that calls a POST API instead of GET?
We have a requirement to make a couple of external REST POST calls:
1. OAuth2 API
2. Service API

toshitmavle
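The two calls listed above could be chained roughly as shown here: first request a token, then send it as a bearer token to the service. This sketch assumes the OAuth2 client-credentials grant with the credentials sent in the form body, and a token response containing `access_token`; real providers differ (some expect HTTP Basic authentication or extra fields such as `scope`), so treat the function names and parameters as hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret):
    """POST request for an OAuth2 client-credentials token (form-encoded body)."""
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    return urllib.request.Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_service_request(service_url, access_token, payload):
    """POST request to the service API, authenticated with a bearer token."""
    return urllib.request.Request(
        service_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call_service(token_url, client_id, client_secret, service_url, payload, timeout=30):
    """Fetch a token, then call the service API and return its JSON response."""
    token_request = build_token_request(token_url, client_id, client_secret)
    with urllib.request.urlopen(token_request, timeout=timeout) as response:
        access_token = json.loads(response.read())["access_token"]
    service_request = build_service_request(service_url, access_token, payload)
    with urllib.request.urlopen(service_request, timeout=timeout) as response:
        return json.loads(response.read())
```

In a Glue job, the client secret would be better read from a secrets store than hard-coded in the script.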

Hi, thank you for this cool workshop. However, I recreated this on my own using a custom VPC, but the job never ran successfully. Below are the steps I followed; I would be glad if you could point out what else I need to do.

1. I created an EIP (to be used by the NAT gateway).
2. I created a VPC.
3. I created a public and a private subnet.
4. I created an IGW and attached it to the VPC I created in step 2.
5. I created a route table, added a route to the IGW, and associated it with the public subnet.
6. I created a NAT gateway in the public subnet and associated the EIP created in step 1 with it.
7. I created another route table, added a 0.0.0.0/0 route with the NAT gateway as the target, and associated the route table with the private subnet.
8. I created an IAM role for Glue to access S3.
9. I created an S3 bucket.
10. I created a dummy JDBC connection and placed it in the VPC and private subnet that I created above.
11. I created the Glue job accordingly, but it failed when I ran it. Unfortunately, it did not give me any logs, so I can't tell why it failed.

Note that I edited the Python script to use the S3 bucket I created and changed the Region to the one I was working in, but it still did not work.

Obviously there is something I am not doing right, but I couldn't figure it out.

I would appreciate your feedback.

sodiqafolayan
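For a setup like the one listed above, one thing worth verifying is that the private subnet's route table really has a 0.0.0.0/0 route pointing at the NAT gateway (step 7). The sketch below checks that against the response shape of the EC2 `describe_route_tables` call; it is a diagnostic aid under stated assumptions, not a confirmed explanation of the failure, and the function names are hypothetical.

```python
def has_default_route_via_nat(route_tables):
    """Return True if any route table sends 0.0.0.0/0 to a NAT gateway."""
    for table in route_tables:
        for route in table.get("Routes", []):
            is_default = route.get("DestinationCidrBlock") == "0.0.0.0/0"
            if is_default and route.get("NatGatewayId"):
                return True
    return False


def private_subnet_uses_nat(subnet_id):
    """Look up the route tables explicitly associated with the subnet and check them."""
    import boto3  # imported lazily so the pure check above runs without AWS access

    response = boto3.client("ec2").describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return has_default_route_via_nat(response["RouteTables"])
```

A subnet with no explicit association falls back to the VPC's main route table, in which case the filter returns nothing and this check reports False.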

I am getting an error at the last step when running the Glue job, although I performed all the steps in the us-west-2 region. I am not able to see the error logs either, because the console says the log group is not available in that region.

pragtyagi