AWS: How to use AWS Glue ETL to convert CSV to Parquet - Tutorial

Video: AWS Glue is a managed ETL platform that can store your data schemas and run ETL tasks written in Python or Scala. A common ETL use case is converting CSV files to the much more efficient Parquet format. Glue makes this easy and can handle the conversion automatically for objects stored in S3.
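
As a rough illustration of the conversion described above, here is a minimal sketch of a Glue ETL script in Python. It is not the script from the video. The helper names (`csv_source_options`, `convert_csv_to_parquet`) and the S3 paths are hypothetical, and the CSV files are assumed to be comma-separated with a header row. The `awsglue` module is only available inside the Glue job runtime, so it is imported inside the function.

```python
def csv_source_options(source_path):
    """Connection options for reading every CSV object under an S3 prefix.

    `source_path` is a placeholder such as "s3://example-bucket/csv/".
    """
    # "recurse" makes Glue pick up all files below the prefix, not just one object.
    return {"paths": [source_path], "recurse": True}


def convert_csv_to_parquet(source_path, target_path):
    """Read CSV from `source_path` and write Parquet to `target_path`.

    Intended to run inside an AWS Glue job; both paths are hypothetical S3 URIs.
    """
    # These imports only resolve in the Glue runtime (assumption: a Spark-type Glue job).
    from pyspark.context import SparkContext
    from awsglue.context import GlueContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Load the CSV files into a DynamicFrame, treating the first row as a header.
    frame = glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options=csv_source_options(source_path),
        format="csv",
        format_options={"withHeader": True, "separator": ","},
    )

    # Write the same records back to S3 in Parquet format.
    glue_context.write_dynamic_frame.from_options(
        frame=frame,
        connection_type="s3",
        connection_options={"path": target_path},
        format="parquet",
    )
```

Inside a Glue job, the script would end with a call such as `convert_csv_to_parquet("s3://example-bucket/csv/", "s3://example-bucket/parquet/")`, with the bucket and prefixes replaced by your own.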

Learning Objectives:
- Updating IAM policies to allow access to new prefixes in S3
- Creating an AWS Glue ETL job
- Configuring an AWS Glue ETL job to convert to Parquet format
- Querying Parquet files using Amazon Athena
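
For the first objective, the following is a minimal sketch of what an IAM policy granting access to new S3 prefixes might look like, built as a Python dictionary so it can be serialized to JSON. The function name `s3_prefix_policy`, the bucket name, and the prefixes are hypothetical, and the exact set of actions your Glue role needs may differ from the ones assumed here.

```python
import json


def s3_prefix_policy(bucket, prefixes):
    """Build an IAM policy document allowing access to the given S3 prefixes.

    `bucket` and `prefixes` are placeholders; the action list is an assumption.
    """
    cleaned = [p.strip("/") for p in prefixes]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level access, limited to keys under the listed prefixes.
                # DeleteObject is included because Spark-based writers often
                # clean up temporary files; drop it if your job does not need it.
                "Sid": "ObjectAccessUnderPrefixes",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{p}/*" for p in cleaned],
            },
            {
                # Listing is a bucket-level action, so it is scoped with a
                # condition on the requested prefix instead of the resource ARN.
                "Sid": "ListBucketUnderPrefixes",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {
                    "StringLike": {"s3:prefix": [f"{p}/*" for p in cleaned]}
                },
            },
        ],
    }


def policy_json(bucket, prefixes):
    """Return the policy as a JSON string ready to paste into the IAM console."""
    return json.dumps(s3_prefix_policy(bucket, prefixes), indent=2)
```

Calling `policy_json("example-bucket", ["csv", "parquet"])` would produce a document covering both the CSV input prefix and the new Parquet output prefix.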

***
Full AWS Playlist:

Find out more about Firemind:

#AWS
Comments

Very nice. Earlier I kept getting an error, but after watching your video it's resolved. Thanks :)

puneetsharma

Thanks, but I want to convert multiple CSV files from the same folder to Parquet, with the source in S3 and the output in S3. Please help me out.

AbhishekDubey-tdks

Good video, short and to the point. Quick question though: I noticed your timestamp fields were set to string datatypes. Have you had any success converting them to timestamp? Thank you.

mattphb

👌 Excellent! Sorry, did you pre-create the crawler? Thanks!

MegaLobo