What tools should you know as a Data Engineer?

Want to build a reliable, modern data architecture without the mess?

The modern data engineering stack is overwhelming...

Every day there seems to be a new tool, and it can be hard to know what to learn.

While it’d be impossible to share every single possible option...

in this video I want to share some of the most commonly used ones across the modern stack.

Use the tools mentioned in this video to guide your learning.

Figure out what you want to learn next and overall become more aware of what's out there.

**Important Note**
We'll be going through a lot of tools in this video. I want you to know that you don't need to learn how to use every single platform mentioned to be a great data engineer.

I personally have not used all of them in depth and I've never met anybody who is an expert at them all either.

Ultimately the goal is to give you a perspective of the overall data landscape and help guide your journey to becoming a great engineer. Pick out one tool you don't know today and start learning it. Once you master that, move on to another.

Timestamps:
0:00 - Intro
0:37 - Databases
2:32 - ELT Components
4:52 - Version Control & CICD
6:31 - Infrastructure
8:00 - BI & Analytics

Title & Tags:
What tools should you know as a Data Engineer?
#kahandatasolutions #dataengineering #analytics
Comments
Author

Want to build a reliable, modern data architecture without the mess?

KahanDataSolutions
Author

All the names of the tools talked about in the video:
*Cloud-based db
Amazon Redshift
Google BigQuery
Snowflake
Azure Synapse

*Traditional row-based db
SQL Server
MySQL
PostgreSQL

*NoSQL db
MongoDB
Elastic
Cassandra
Cosmos DB
Amazon DynamoDB

*Extract & Load
Batch
Fivetran
Stitch
Airbyte
Azure Data Factory
AWS Glue

*Streaming
Apache Kafka
Amazon Kinesis

*Transform
dbt - data build tool

*Reverse ETL
Census
Hightouch
RudderStack

*Version Control & automation
GitHub
GitLab
CI/CD

*Task Orchestration & Scheduling
Apache Airflow
Jenkins
Luigi

*Infrastructure
Management
Terraform
Ansible

*Containers
Docker

*Container Orchestration
Kubernetes

*BI & Analytics
Reporting
Power BI
Tableau
Looker

*Open Source
Metabase

*Spreadsheets

mrviper
Author

You have a new subscriber! I love the way you explain data engineering. You and Seattle Data Guy are my faves when it comes to Data Engineering Content Creators.

TNTsGOboom
Author

This is really helpful, Bro. Thanks a lot.

ZawmyoHtet-lgjn
Author

Yesterday I said on your post that it's overwhelming with so many tools, and today we got a video :D

hamsansari
Author

Very good video. I think we can also add the cloud functions to this list.

mohammedaminelachhabe
Author

You’ve got a new subscriber. Thank you

AlexKashie
Author

What an absolutely powerful video. Please keep such good content coming!

robertoferro
Author

thanks for an overview of the landscape!

kevon
Author

I'm really interested in this field and currently learning Python. I must say this list is great, but I'm really overwhelmed by the amount of stuff one has to learn to transition into this field! I'm gonna stick with it and hopefully come out the other end 😁

adamo
Author

Thank you so much for this video! Really helpful!

ligiaimusic
Author

Good stuff bro. I'd add Prefect to orchestration/task flow.

cyclonus
Author

Some other alternatives for scheduling and orchestration are:

Dagster
Prefect
Oozie

Or whatever your cloud offering might have, I know Google Cloud has Cloud Scheduler.

If you suggest Jenkins as a job scheduling tool in this day and age, I will hunt you down...

DjBaxter
Author

This was a very informative video - very useful to "get the lay of the land" so to speak.

Rex_
Author

Apache Superset is one of the promising BI tools, in my opinion. Can you share your opinion on it, if possible?

adityalakkad
Author

Could you please make a complete series on Apache Airflow ❤

__shaikmalikbasha__
Author

Phenomenal video. What tool(s) do you recommend for documentation and/or data dictionaries?

nickriebe
Author

Hi, thank you for your video. I know that this is old now but I wish you would put the names of each tool you listed under the tool. If you aren't familiar with the specific tool it can be hard to know how to spell it. I know I can Google but I was taking notes as I was following along. Thank you.

Swelouise
Author

Hi, can you tell me where exactly Apache Spark fits in this picture?

himanshuagrawal
Author

I really need this so bad. Do you have a data engineering course? Or any recommendations?

poizentv