GUI Parquet file reader with SQL execution using DBeaver and DuckDB

Open and view Parquet and CSV files using the GUI tool DBeaver (leveraging DuckDB features) and run SQL queries on them.

This video answers:
- How to connect to DuckDB using DBeaver?
- How to open Parquet files?
- How to run SQL queries on Parquet files?
- How to see the Parquet schema, metadata and statistics?
- How to read a Parquet file on Windows?

Docs:
Comments
Author

A few important notes on DuckDB:
1) DuckDB is an embedded database that always runs in-process. The keyword :memory: tells DuckDB to keep its data in memory (ephemeral) rather than on disk. (This was not correctly described in the video.)
2) DuckDB is similar to SQLite. Both are in-process databases that support in-memory and on-disk storage of data. DuckDB is built to run fast OLAP-style queries, whereas SQLite is for OLTP use cases.
3) It supports direct queries on Parquet files, CSV files, AWS S3, etc. Data can also be loaded from these files into a DuckDB SQL table, and queries can be run on the table. (This should give better performance, especially for CSV files.)
4) DuckDB gives great performance. It can handle larger-than-memory data and can use all of your CPU cores for parallel processing.

devcentral
Author

Exactly what I needed. Thank you for putting this together.

JhonPereda
Author

Thanks so much. I get many requests to read .parquet files--now I can!

marioanzaldua
Author

Thanks a lot. I was struggling to make a DuckDB connection to read a Parquet file. This resolved it!

ankitamathur
Author

Thanks a lot. Precisely what I was looking for.

ranganatteri
Author

Just what I needed, thank you very much!

d.s.
Author

Thank you! It's super helpful! :)

morelo
Author

Very good information. But it looks like some sensitive/personal customer data is exposed in the video. Not sure if it is dummy data. Please mask it.

prashubangera
Author

Thank you. I really want to get this working. What version of DBeaver do you use? I have DBeaver Lite and it crashes when running "Select From parquet file". Any ideas why?

tzelichonok
Author

Is this information being sent to somebody, or is it safe?

FA-srlx
Author

Can I have the parquet files and DuckDB on one server and run DBeaver on a different server?

william-s
Author

I used your video the first time I needed to use DuckDB in memory with an IDE, and it is a great resource, thank you!
I just made a similar tutorial for a use case where you have Parquet files on Azure and would like to access the data where it rests, lakehouse style, using DBeaver as shown here.

Pretty much, you build views in DuckDB files and then embed your credentials in the connection.

Gavguy