Add raw 'sources' to your dbt project

Get a FREE checklist to build simple, reliable data architectures (without the mess)

Learn how to add sources to your dbt project and reference them in your model scripts using Jinja templates.

Sources make it possible to name and describe the data loaded into your warehouse by your Extract and Load tools. Think of them as references to the foundational "raw" tables on which everything else in your project is built.
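
For orientation, here is a minimal sketch of the two pieces involved (all project, database, and table names are hypothetical). A source is declared in a YAML file under models/, then referenced from model SQL with the source() Jinja function:

# models/staging/sources.yml
version: 2

sources:
  - name: northwind            # the name models use to refer to this source
    database: raw_db           # where your Extract/Load tool lands the data
    schema: northwind
    tables:
      - name: customers
      - name: orders

-- models/staging/stg_customers.sql
select * from {{ source('northwind', 'customers') }}

At compile time the source() call resolves to raw_db.northwind.customers, so if the raw data ever moves, only the YAML needs to change.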

Timestamps:
0:00 - Intro
0:18 - What are sources in dbt?
0:54 - Create a new source
6:09 - Use in a SQL model

Title & Tags:
How to Add Sources (YML files) to a dbt project | Data Build Tool (dbt) Tutorial
#databuildtool #dataengineering #kahandatasolutions
Comments

Get a FREE checklist to build simple, reliable data architectures (without the mess)

KahanDataSolutions

Really well structured and explained; this is what dbt's own tutorials should be like.

HansHelmutKohls

Thanks for the video. I took over a dbt project, and it seems whoever I took over from watched this video too. Lucky for me!

billalnulnul

Great, I was searching for a dbt tutorial and came across your video. It's really informative 👍👍

MrChalak

I got the below error, could you please advise what it is:
"depends on a source named which was not found"

krishnachinta
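
This error usually means the arguments passed to source() don't match a name declared in any sources YAML file (the quoted name is blank here, so the reference itself may be malformed). A minimal illustration with hypothetical names:

-- YAML declares:  sources: - name: northwind ... tables: - name: customers
select * from {{ source('northwind', 'customers') }}   -- resolves
select * from {{ source('northwnd', 'customers') }}    -- fails: no source named 'northwnd'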

This is great!! Keep doing more dbt with Snowflake. Ty!!

_saaskey

Hello David, after following your process bit by bit, this error popped up after dbt run:

11:04:27 Running with dbt=1.5.0
11:04:28 Unable to do partial parsing because saved manifest not found. Starting full parse.
11:04:28 Encountered an error:
Parsing Error
Error reading adetola: staging/source.yml - Runtime Error
Syntax error near line 4

1 | version: 2
2 | sources:
3 | -name: northwind
4 | database: analyticproject1
5 | schema: stg_northwind
6 | tables:
7 | -name: customer

Raw Error:

mapping values are not allowed in this context
in "<unicode string>", line 4, column 12

Please kindly advise how to solve this problem.

adegbiteogunbode
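
For anyone hitting the same parsing error: this is a YAML formatting issue rather than a dbt issue. A list item needs a space after the dash, and the keys belonging to that item must be indented beneath its name. A corrected sketch using the names from the error output above:

version: 2

sources:
  - name: northwind
    database: analyticproject1
    schema: stg_northwind
    tables:
      - name: customer

With -name: written without the space, the parser reads -name as a plain key instead of the start of a list item, so the database: mapping on the next line lands in a context where mappings are not allowed, which is likely the exact message shown.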

Thanks for the clean explanations. How can we use dbt-spark to read from and write to a PostgreSQL database schema instead of the Hive warehouse? Is it possible to leverage Spark's distributed processing instead of native PostgreSQL compute? Reason being, all the transformations may not be feasible within the Postgres servers and will need to scale.

practicalhead

Hi, .... When I run dbt after creating snowflake_sample_data_store_sales, I get the below error:
10:26:17 Running with dbt=1.1.0
10:26:18 Found 3 models, 4 tests, 0 snapshots, 0 analyses, 181 macros, 0 operations, 0 seed files, 1 source, 0 exposures, 0 metrics
10:26:18
10:26:22 Encountered an error:
Runtime Error
Database error while listing schemas in database "DEMO_DB"
Database Error
002043 (02000): SQL compilation error:
Object does not exist, or operation cannot be performed.

Please help

sunny
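
A hedged guess on this one: dbt lists schemas in whatever database is named in profiles.yml and in the source YAML, so DEMO_DB must exist and be visible to the role dbt connects with. A quick check in Snowflake (the role name is hypothetical):

show databases like 'DEMO_DB';

-- if the database exists but the role can't see it:
grant usage on database DEMO_DB to role transformer;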

Hey, great explanation! Will the new model inherit all the constraints from source data?

prudhvik

Hi, is there any way to reference user-defined functions, just like we add tables in the source files?

abhinavjain
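
Sources can only declare tables and views, so UDFs can't be listed in a sources file. A common workaround (a sketch, all names hypothetical) is to keep the fully qualified function name in a small macro so it is defined in one place:

-- macros/udfs.sql
{% macro clean_phone() %}analyticproject1.udfs.clean_phone{% endmacro %}

-- models/staging/stg_customer.sql
select {{ clean_phone() }}(phone) as phone_clean
from {{ source('northwind', 'customer') }}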

Hi, don't we need to put the model details along with the sources in the schema.yml file?

surjeetkarmakar

Unfortunately, I could not get through this one. I followed the instructions as-is, but I get an error on execution of the SQL file:

Object does not exist or not authorized.
compiled SQL at

Any pointers on what could be going on?

TubeDirektor

Thanks for the video, it's really helpful! Maybe one little comment: it makes no sense to configure the source database again in the source YAML file, since we can use only one DB in a project and it is already defined in the profiles YAML file.

emrea

This video series is very helpful.
Thanks for the video.

I have a question: using templates, can we loop over multiple tables under the same schema? If yes, please suggest how we can do that.
Thank you.

megabyte
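
Yes, Jinja can loop inside a model file. A minimal sketch that unions row counts from several tables of one source (source and table names hypothetical):

{% set tables = ['customer', 'orders', 'products'] %}

{% for t in tables %}
select '{{ t }}' as src_table, count(*) as row_count
from {{ source('northwind', t) }}
{% if not loop.last %}union all{% endif %}
{% endfor %}

Each source() call still resolves at compile time, so every table in the loop must be declared in the sources YAML.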

From previous videos, I had renamed the models\example folder to \staging. I then created a new folder under models like you did in this video. However, when I do a dbt run, it says I have 3 models but it only runs "my_first_dbt_model" and "my_second_dbt_model"; it doesn't run my new one under snowflake_sample_data. I have the correct permissions. I thought it might be because I had been trying out the materialized settings in dbt_project.yml like you did in your previous videos, but I've tried every combination I can and even commented it out:

models:
  Course_Instance:
    # Config indicated by + and applies to all files under models/staging/
    staging:
      +materialized: table
      +schema: staging
    snowflake_sample_data:
      +materialized: view
      +schema: staging

leiaduva

Thanks for the great video. One quick question: for the target database (dev or prod), you have specified the connection info in profiles.yml. But for the source database, where do we specify the connection information? Thanks!

imohammd
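
For what it's worth: dbt does not open a separate connection for sources. It assumes your Extract and Load tool has already landed the raw data somewhere reachable through the single connection in profiles.yml; the source YAML only supplies the database/schema/table names to read within that connection. A sketch with hypothetical values:

# profiles.yml
my_project:
  target: dev
  outputs:
    dev:
      type: snowflake
      account: xy12345
      user: dbt_user
      password: "{{ env_var('DBT_PASSWORD') }}"
      role: transformer
      database: analytics
      warehouse: transforming
      schema: dbt_dev
      threads: 4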