Using Glue Schema Registry for Apache Kafka with Python

This video explains a Python streaming data pipeline that leverages schemas for data validation, using Kafka with Avro and the AWS Glue Schema Registry.

Documentation Links:
-----------------------------------

Prerequisites:
---------------------------
Introduction to Schema Registry in Kafka | Part 1
Introduction to Schema Registry in Kafka | Part 2

Avro Schema Used:
------------------------------------
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "Age", "type": "int"}
  ]
}
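
Local validation sketch (optional):
------------------------------------
The point of the registry is schema-based validation; as a minimal illustration, a record can also be checked against this schema locally with fastavro. This snippet is an assumption on top of the video's code (fastavro is an extra dependency and is not shown in the video):

# pip3 install fastavro
from fastavro import parse_schema
from fastavro.validation import validate

# The same User schema as above, expressed as a Python dict
user_schema = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "Age", "type": "int"}
    ]
}
parsed = parse_schema(user_schema)

# A record that matches the schema validates cleanly
validate({"name": "Hello", "Age": 45}, parsed)

# A record that does not match (e.g. {"Partiiton_no": 2}) raises ValidationError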

Python Code:
--------------------------
# pip3 install boto3 -t .
# pip3 install aws-glue-schema-registry --upgrade --use-pep517 -t .
# pip3 install kafka-python -t .
import boto3
from kafka import KafkaProducer
from aws_schema_registry import DataAndSchema, SchemaRegistryClient
from aws_schema_registry.avro import AvroSchema
from aws_schema_registry.adapter.kafka import KafkaSerializer

# Create a boto3 session and a Glue client (fill in your own credentials; region is an example)
session = boto3.Session(aws_access_key_id='{}', aws_secret_access_key='{}')
glue_client = session.client('glue', region_name='us-east-1')

# Create the schema registry client, which is a façade around the boto3 Glue client
client = SchemaRegistryClient(glue_client, registry_name='my-registry')

# Create the serializer
serializer = KafkaSerializer(client)

# Create the producer
producer = KafkaProducer(bootstrap_servers=['127.0.0.1:9092'],
                         value_serializer=serializer)

# Our producer needs a schema to send along with the data.
# In this example we're using Avro, so we load the .avsc file containing the schema above.
with open('user.avsc', 'r') as schema_file:  # filename assumed; use the path to your .avsc file
    schema = AvroSchema(schema_file.read())

# Send message data along with its schema; the serializer registers the schema
# in the registry (if needed) and encodes the record.
data = {
    'name': 'Hello',
    'Age': 45
}
# data = {'Partiiton_no': 2}  # a record that does not match the schema
producer.send('my-topic', value=(data, schema))  # topic name is an example
producer.flush()
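
Consumer side (sketch):
--------------------------
A minimal consumer-side sketch based on the aws-glue-schema-registry package's documented API; the topic name, bootstrap server, region, and registry name are the same assumptions as in the producer code above, and this exact snippet is not shown in the video:

import boto3
from kafka import KafkaConsumer
from aws_schema_registry import DataAndSchema, SchemaRegistryClient
from aws_schema_registry.adapter.kafka import KafkaDeserializer

# Build the registry client the same way as on the producer side (example region/registry)
glue_client = boto3.Session().client('glue', region_name='us-east-1')
client = SchemaRegistryClient(glue_client, registry_name='my-registry')

# The deserializer looks up the writer schema from the registry for each message
deserializer = KafkaDeserializer(client)

consumer = KafkaConsumer('my-topic',
                         bootstrap_servers=['127.0.0.1:9092'],
                         value_deserializer=deserializer)

for message in consumer:
    value: DataAndSchema = message.value  # decoded data plus the schema it was written with
    print(value.data, value.schema)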

Check these playlists for more Data Engineering-related videos:

Apache Kafka from scratch

Snowflake Complete Course from scratch, with an end-to-end project and in-depth explanation

🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO
3 THINGS to support my channel
LIKE
SHARE
&
SUBSCRIBE
TO MY YOUTUBE CHANNEL
Comments

Great explanation through the code walkthrough as well as the console demo. The best part is that the explanation is detailed enough for anyone to understand.

harvestingdata

Thanks a lot. Are you using Conduktor on your local system?

SpiritOfIndiaaa

Thanks a lot sir, because of you I learned the Glue Schema Registry... keep going!

mranaljadhav

How can I set environment variables in Ubuntu for AWS credentials when consuming messages in Conduktor?

SreshthBhatt

Weird, I get an error: *SchemaRegistryException: Exception occured while fetching or registering schema definition*

MrMadmaggot

Can we just validate the keys of the message, regardless of their values?

luckyratnawat

Can you explain how the schema ID is generated and how it is used by the producer and consumer?

dibyangsumajumdar

Does the schema registry introduce any performance issues, given that the producer will always perform schema validation before sending data to the Kafka broker?

Also, thanks for your videos. These are really helpful!

roshankumargupta

How can we use Confluent Kafka for the same?

ayushmandloi

Hi, I am a Korean student. First of all, thank you for providing a great-quality video!!

One thing I'm curious about: what is the blue-icon UI application shown in the video during the consuming step?
Your reply will be of great help to my work. :)

kkw_on

Thanks, but I have a problem when installing aws-glue-schema-registry. The error message is below:

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for orjson
Failed to build orjson
ERROR: Could not build wheels for orjson, which is required to install pyproject.toml-based projects

EmersonSousa-sjwy