AWS Glue 101 | Lesson 1: The Glue Data Catalog And Crawlers

00:00 - Intro
00:24 - What is the AWS Glue Data Catalog?
00:36 - What is a metadata repository?
00:53 - What is metadata information?
01:18 - How do we collate the metadata?
01:43 - AWS Crawler
02:01 - When do we use the Data Catalog?
03:32 - Interacting With The Glue Data Catalog
04:12 - What the tutorial will cover
04:34 - Hands on Tutorial
04:52 - S3 configuration
08:16 - Creating a database
08:56 - Setting up a crawler
12:28 - Recap
12:59 - Bonus: Athena
In this series of videos we take a look at AWS Glue. We mix the theory with the practical as we build a functioning ETL application using the Glue Data Catalog, Crawlers, Glue ETL, Triggers, Workflows and Dev Endpoints.
In this video we configure our S3 bucket to act as our data repository, ingest data, register that data using a crawler with the Glue Data Catalog and finally use Athena to query the newly ingested data.
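The video does these steps in the AWS console. As a rough sketch only, the same flow could be scripted with boto3 as below. The function names, the polling interval and every resource name are assumptions for illustration, not taken from the video. The clients are passed in so the block itself does not need AWS credentials to load.

```python
import time


def crawler_request(name, role_arn, database, s3_path):
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,  # IAM role the crawler assumes to read the bucket
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def athena_request(sql, database, output_location):
    """Build the keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_crawler(glue, request, poll_seconds=10):
    """Create the database and crawler, start the crawler, wait until it is READY.

    `glue` is expected to behave like boto3.client("glue"). This sketch assumes
    the database and crawler do not exist yet; a real script would need to
    handle the "already exists" errors.
    """
    glue.create_database(DatabaseInput={"Name": request["DatabaseName"]})
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])
    # The crawler runs asynchronously, so poll its state before querying.
    while glue.get_crawler(Name=request["Name"])["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)


def query_with_athena(athena, request):
    """Submit the query and return its execution id (`athena` is like boto3.client("athena"))."""
    return athena.start_query_execution(**request)["QueryExecutionId"]
```

With real clients this might be called as `run_crawler(boto3.client("glue"), crawler_request(...))` followed by `query_with_athena(boto3.client("athena"), athena_request(...))`, with the bucket path, role ARN and database name replaced by your own.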
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all of the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.
Data integration is the process of preparing and combining data for analytics, machine learning, and application development. It involves multiple tasks, such as discovering and extracting data from various sources; enriching, cleaning, normalizing, and combining data; and loading and organizing data in databases, data warehouses, and data lakes. These tasks are often handled by different types of users that each use different products.
AWS Glue provides both visual and code-based interfaces to make data integration easier. Users can easily find and access data using the AWS Glue Data Catalog. Data engineers and ETL (extract, transform, and load) developers can visually create, run, and monitor ETL workflows with a few clicks in AWS Glue Studio. Data analysts and data scientists can use AWS Glue DataBrew to visually enrich, clean, and normalize data without writing code. With AWS Glue Elastic Views, application developers can use familiar Structured Query Language (SQL) to combine and replicate data across different data stores.
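To make "find and access data using the Data Catalog" a little more concrete, here is a minimal sketch that pulls column names and types out of a response shaped like the one from boto3's Glue `get_tables` call. The helper name `table_schemas` and the sample table are hypothetical.

```python
def table_schemas(get_tables_response):
    """Map each table name to a list of (column name, column type) pairs."""
    schemas = {}
    for table in get_tables_response.get("TableList", []):
        # Column metadata recorded by the crawler lives under StorageDescriptor.
        columns = table.get("StorageDescriptor", {}).get("Columns", [])
        schemas[table["Name"]] = [(c["Name"], c.get("Type", "")) for c in columns]
    return schemas


# Hypothetical response fragment, trimmed to the fields used above.
sample_response = {
    "TableList": [
        {
            "Name": "customers",
            "StorageDescriptor": {
                "Columns": [
                    {"Name": "customerid", "Type": "bigint"},
                    {"Name": "fullname", "Type": "string"},
                ]
            },
        }
    ]
}

for name, columns in table_schemas(sample_response).items():
    print(name, columns)
```

Against a real catalog the input would come from something like `boto3.client("glue").get_tables(DatabaseName="...")`, which may need pagination when a database holds many tables.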
😎 About me
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest companies. My journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Master's degree in Software Development, alongside many a professional certification in AWS and MS SQL Server.