Use dbt Seeds to work with static CSV data

Seeds are CSV files in your dbt project (typically in your seeds directory, or the data directory in older dbt versions) that dbt can load into your data warehouse using the dbt seed command.

Seeds can be referenced in downstream models the same way models are: by using the ref function.
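
As a minimal sketch of that reference, a downstream model could select from a seed as shown here. The names come from the demo discussed in the comments (a seed file example_teams.csv and a model example_team_details); the file path is an assumption, and because the seed's columns are not listed in this description, the sketch simply selects every column.

```sql
-- models/example_team_details.sql (path is an assumption, not confirmed by the source)
-- Assumes seeds/example_teams.csv exists and has already been loaded with `dbt seed`.
select *
from {{ ref('example_teams') }}
```

When dbt compiles this model, the ref call is replaced with the name of the table that dbt seed created in the warehouse, so the query reads from the loaded table rather than from the CSV file itself.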

Because these CSV files are located in your dbt repository, they are version controlled and code reviewable. Seeds are best suited to static data which changes infrequently.

Timestamps:
0:00 - Intro
0:21 - How dbt Identifies Seeds
1:08 - Background on Demo Example
1:27 - Add a Seed to the Project
1:53 - Deploy With dbt seed Command
2:37 - Reference Seed Model
4:00 - Review Compiled Code

Title & Tags:
How to Use DBT Seeds to Hold Static CSV Data | Data Build Tool (dbt) Tutorial
#kahandatasolutions #dbt #dataengineering
Comments

When I downloaded the dbt_project sample project, I didn't see the example_teams.sql that you referenced at 3:36. Please clarify.

kondalajjarapu

Hey, thanks for your great guides. Just an update: the CSV file needs to be added under the "seeds" directory.

Ben-ZionMegido

Hi Kahan, I can't find where you created "example_teams.sql". Do you mind sharing the query so I can get my dbt running? It's flagging an error message:

"Compilation Error in model example_team_details
Model depends on a node named 'example_teams' which was not found"

dataanalysiswithmuhammad

Does {{ref('example_teams')}} refer to the original csv file or to the imported table?

juanete

How do I specify a delimiter other than a comma? I have a CSV file that uses a semicolon as the field delimiter.

guangfanxu

Hi, some columns in my .csv file are blank, so the seed gives errors. How do I accept null or blank values in a seed .csv?

amrutakothari

Looks like the data-paths config has been deprecated in favor of seed-paths. You might consider updating for a future video.

Do you know how to configure the target location for the files in the data folder? I have played around in the dbt_project.yml file, but so far I cannot seem to apply the same location configurations that I have for the sources, staging, and mart folders. In my case, the country_code_mapping file I'm trying to load is getting deposited into a different database and schema than the rest of my dbt run output tables.

jeremiahworkman

Is it possible to build a seed dataset/table from multiple files?
Say in this scenario I have files like team_locations_1.csv and team_locations_2.csv, but it should create only one table, team_locations, which should contain data from both CSV files.

pravinsingh

Can you explain how to materialize model results as Parquet files in AWS S3 with dbt?

sararodriguez

Hi Kahan, how can I change the default schema for the seed file to another schema instead of PUBLIC in dbt?

stevelo