Using Glue schema registry for Apache Kafka with Python

This video explains a Python streaming data pipeline that leverages schemas for data validation, using Kafka with Avro and the AWS Glue Schema Registry.
Documentation Links:
-----------------------------------
Prerequisites:
---------------------------
Introduction to Schema Registry in Kafka | Part 1
Introduction to Schema Registry in Kafka | Part 2
Avro Schema Used:
------------------------------------
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "Age", "type": "int"}
  ]
}
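Note: the registry client used in the code below can register schema versions for you, but if you prefer to create the registry and schema in Glue ahead of time, here is a minimal boto3 sketch. The registry name 'my-registry' matches the code below; the schema name 'User' and the file name 'user.avsc' are example values.
Create Registry and Schema (optional sketch):
--------------------------
import boto3

glue = boto3.client('glue')

# Create the registry (skip if it already exists)
glue.create_registry(RegistryName='my-registry')

# Register the Avro schema definition shown above
with open('user.avsc') as schema_file:
    glue.create_schema(
        RegistryId={'RegistryName': 'my-registry'},
        SchemaName='User',
        DataFormat='AVRO',
        Compatibility='BACKWARD',
        SchemaDefinition=schema_file.read()
    )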
Python Code:
--------------------------
# pip3 install boto3 -t .
# pip3 install aws-glue-schema-registry --upgrade --use-pep517 -t .
# pip3 install kafka-python -t .
import boto3
from kafka import KafkaProducer
from aws_schema_registry import DataAndSchema, SchemaRegistryClient
from aws_schema_registry.adapter.kafka import KafkaSerializer
from aws_schema_registry.avro import AvroSchema

# Create a boto3 session and the Glue client that the registry client wraps
session = boto3.Session(aws_access_key_id='{}', aws_secret_access_key='{}')
glue_client = session.client('glue')

# Create the schema registry client, which is a façade around the boto3 glue client
client = SchemaRegistryClient(glue_client, registry_name='my-registry')

# Create the serializer
serializer = KafkaSerializer(client)

# Create the producer
producer = KafkaProducer(bootstrap_servers=['127.0.0.1:9092'], value_serializer=serializer)

# Our producer needs a schema to send along with the data.
# In this example we're using Avro, so we'll load the .avsc file shown above
# (saved here as 'user.avsc', an example file name).
with open('user.avsc', 'r') as schema_file:
    schema = AvroSchema(schema_file.read())

# Send message data along with the schema; with KafkaSerializer the value
# must be a (data, schema) tuple. The topic name 'users' is an example.
data = {
    'name': 'Hello',
    'Age': 45
}
producer.send('users', value=(data, schema))
producer.flush()
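The code above only covers the producer side. The same aws-glue-schema-registry package also ships a KafkaDeserializer, so a minimal consumer sketch could look like the following; it reuses the glue_client from above, and the topic name 'users' and broker address are the same example values.
Consumer Code (sketch):
--------------------------
from kafka import KafkaConsumer
from aws_schema_registry import DataAndSchema, SchemaRegistryClient
from aws_schema_registry.adapter.kafka import KafkaDeserializer

# Reuse the same Glue-backed schema registry client
client = SchemaRegistryClient(glue_client, registry_name='my-registry')

# Create the deserializer and plug it into a normal kafka-python consumer
deserializer = KafkaDeserializer(client)
consumer = KafkaConsumer('users',
                         bootstrap_servers=['127.0.0.1:9092'],
                         value_deserializer=deserializer)

# Each message value comes back as a DataAndSchema named tuple
for message in consumer:
    data, schema = message.value
    print(data['name'], data['Age'])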
Check this playlist for more Data Engineering-related videos:
Apache Kafka from scratch
Snowflake Complete Course from scratch, with an end-to-end project and in-depth explanations
🙏🙏🙏🙏🙏🙏🙏🙏
YOU JUST NEED TO DO
3 THINGS to support my channel
LIKE
SHARE
&
SUBSCRIBE
TO MY YOUTUBE CHANNEL