Introduction to Datastream for BigQuery

Datastream is a serverless change data capture (CDC) and replication service that makes it easy to replicate data from operational databases into BigQuery reliably and with minimal latency. In this video, Gabe Weiss, Developer Advocate at Google, discusses setting up real-time replication from Cloud SQL to BigQuery. Watch along and learn how to get started with Datastream for BigQuery!

Chapters:
0:00 - Intro
0:10 - What is Datastream?
0:59 - Demo: getting started with Datastream
4:47 - Wrap up

#GoogleCloudTech
Comments

I managed to play around with Datastream with BigQuery as the destination. The problem with this approach is that the tables it creates are not partitioned. Those of us who do incremental loads from our BigQuery replica into our reports will always have to scan the whole table, which comes at a cost compared to scanning and querying only the new data in the BigQuery replica.

yunusarif
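
To put rough numbers on the cost concern in the comment above, here is a back-of-the-envelope sketch. All figures (daily volume, retention, the `bytes_scanned` helper itself) are made-up assumptions for illustration. It assumes cost scales with bytes scanned and that a date-partitioned table lets a query read only the days it needs.

```python
GIB = 1024 ** 3

def bytes_scanned(bytes_per_day: int, days_in_table: int,
                  days_needed: int, partitioned: bool) -> int:
    """Estimate bytes read by an incremental load (hypothetical model)."""
    # Without partitioning, every query reads the whole table.
    # With daily partitions, only the requested days are read.
    days_read = min(days_needed, days_in_table) if partitioned else days_in_table
    return bytes_per_day * days_read

# Assumed workload: 2 GiB of new rows per day, one year retained,
# and a report that only needs the latest day.
full = bytes_scanned(2 * GIB, 365, 1, partitioned=False)
pruned = bytes_scanned(2 * GIB, 365, 1, partitioned=True)
print(full // GIB, pruned // GIB)  # prints "730 2"
```

Under these assumptions each incremental run reads 365 times more data on the unpartitioned replica, which is the gap the commenter is describing.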

Hey! At 2:31, do you have an example of how you created or set up that connection? Thanks.

dexterkarinyo

If it works as described, thanks a lot. @Gabe Weiss, some questions: 1. Are there any limitations for Datastream for BigQuery? 2. I am using Cloud SQL, so it would be great to have a tutorial for this combination. 3. This looks like an AlloyDB competitor, doesn't it? What are the core differences? (I am considering AlloyDB for the new version of our project to avoid streaming analytics data to BigQuery.)

vitamin

When will Datastream for Cloud SQL for PostgreSQL be available?

ShahnewazKhan

It's already in preview for Postgres??? 😮😮

felipe.veloso

Is there a way to let a Cloud SQL IAM user (or a service account user) be accepted as a way to connect to the Cloud SQL database?

FuyangLiu

What if records are updated or deleted in the source system (MySQL)? Does Datastream also perform the update or delete in BigQuery, or does it work in append-only mode?

darshanparmar
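
The two behaviours the question contrasts can be sketched in a few lines. This is a conceptual illustration only, not Datastream's implementation: the event shape (`op`, `row`), the `id` key, and both function names are hypothetical. Whether a given stream merges or appends depends on its destination configuration, so check the current Datastream documentation.

```python
from typing import Any

Event = dict[str, Any]

def apply_append_only(events: list[Event]) -> list[Event]:
    """Append-only: every change event is kept as its own row."""
    return [dict(e) for e in events]

def apply_merge(events: list[Event], key: str = "id") -> dict[Any, dict]:
    """Merge: replay events so the result mirrors the source table."""
    table: dict[Any, dict] = {}
    for e in events:
        k = e["row"][key]
        if e["op"] == "DELETE":
            table.pop(k, None)      # deleted at the source, removed here
        else:                       # INSERT or UPDATE acts as an upsert
            table[k] = e["row"]
    return table

events = [
    {"op": "INSERT", "row": {"id": 1, "status": "new"}},
    {"op": "UPDATE", "row": {"id": 1, "status": "paid"}},
    {"op": "INSERT", "row": {"id": 2, "status": "new"}},
    {"op": "DELETE", "row": {"id": 2}},
]

appended = apply_append_only(events)
merged = apply_merge(events)
print(len(appended), merged)  # prints "4 {1: {'id': 1, 'status': 'paid'}}"
```

In the append-only view all four events survive as history, while the merged view holds one row, the current state of `id` 1.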

Can I stream a subset of columns from my source? The CLI help (gcloud datastream streams create --help) suggests yes, but when I specify mysql_columns in the suggested format, gcloud errors out with: ERROR: gcloud crashed (ValidationError): Expected type <class 'str'> for field column, found {} (type <class 'dict'>)

danielcarter
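
The error message says the `column` field received a mapping where a string was expected, so one thing to check is that each column entry carries its name as a plain string. The sketch below builds such a config file in Python. The field names (`includeObjects`, `mysqlDatabases`, `mysqlTables`, `mysqlColumns`, `column`) are an assumption based on the Datastream API's MySQL source config and have not been verified against the commenter's gcloud version; the database, table, and column names are hypothetical.

```python
import json

def mysql_source_config(database: str, table: str, columns: list[str]) -> dict:
    """Build an include list that selects only the given columns (assumed schema)."""
    return {
        "includeObjects": {
            "mysqlDatabases": [
                {
                    "database": database,
                    "mysqlTables": [
                        {
                            "table": table,
                            # Each entry maps "column" to a string name,
                            # not to a nested or empty mapping.
                            "mysqlColumns": [{"column": c} for c in columns],
                        }
                    ],
                }
            ]
        }
    }

# Hypothetical names, for illustration only.
config = mysql_source_config("shop", "orders", ["id", "status", "updated_at"])
config_json = json.dumps(config, indent=2)
print(config_json)
```

The printed JSON could be saved to a file and supplied as the stream's MySQL source config. If gcloud still rejects it, comparing against the output of `gcloud datastream streams describe` for a stream created in the console is a reasonable next step.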

Can MariaDB be used instead of MySQL as a source to stream data to BigQuery?

analyticshub

What happens if I accidentally delete the destination table in BigQuery? How can I restore the table?

HoaTran-rpkf

Does it support customer-managed encryption keys?

apvyas

Through the GUI, when selecting the source objects to replicate, I can use wildcards such as "*.mytable". How do I do this with the CLI? When I describe a stream created through the GUI (gcloud datastream streams describe), the database field is simply missing, but when I try to create a new stream using the same format, gcloud bombs out with "ERROR: gcloud crashed (ParseError): Cannot parse YAML: missing key "database"."

danielcarter

Can this feature be used to load data from BigQuery into Cloud SQL (PostgreSQL) with real-time streaming for operational purposes?
@googlecloudtech

rameshyadav