Process Excel files in Azure with Data Factory and Databricks | Tutorial

Excel files are one of the most commonly used file formats on the market. The popularity of the tool itself among business users, business analysts and data engineers is driven by its flexibility, ease of use, powerful integration features and low price.

This is why every data engineer out there should understand the advantages and disadvantages of this format, the variety of internal formats such as XLS, XLSX, XLSB and XLSM, and which tools to use to process those files effectively in the cloud.
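As a rough illustration of why the internal formats matter: XLS is an OLE2 compound file, while XLSX, XLSM and XLSB are ZIP containers, so a pipeline can cheaply check that a file's extension matches its actual container before handing it to a reader. The sketch below is not from the video; the function names `sniff_container` and `check_excel_file` are hypothetical, and it ignores edge cases such as encrypted workbooks.

```python
from pathlib import Path

# Well-known magic bytes for the two container types.
ZIP_MAGIC = b"PK\x03\x04"                          # XLSX, XLSM, XLSB
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy XLS

# Container each extension is expected to use.
CONTAINER_BY_EXTENSION = {
    ".xls": "ole2",
    ".xlsx": "zip",
    ".xlsm": "zip",
    ".xlsb": "zip",
}


def sniff_container(header: bytes) -> str:
    """Classify the first bytes of a file as 'zip', 'ole2' or 'unknown'."""
    if header.startswith(ZIP_MAGIC):
        return "zip"
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"


def check_excel_file(name: str, header: bytes) -> bool:
    """Return True when the extension is a known Excel one and matches the container."""
    expected = CONTAINER_BY_EXTENSION.get(Path(name).suffix.lower())
    return expected is not None and sniff_container(header) == expected
```

In practice the `header` argument would be the first eight bytes read from the file in storage; a mismatch usually means the file was renamed or exported by a tool that mislabeled it.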

Today I bring to you a quick introduction to the process of building ETL solutions with Excel files in Azure using Data Factory and Databricks services.

Agenda
00:00 Introduction
00:25 Excel Business Justification
01:22 Excel Challenges
02:20 Supported Services
04:30 Data Factory Introduction
05:35 Demo Setup
07:13 Demo using Data Factory
13:36 Databricks Introduction
14:44 Databricks Setup
18:14 Databricks Demo - Reading Excels
20:55 Databricks Demo - Reading Excels using References
25:56 Databricks Demo - Workbook Metadata
28:05 Databricks Demo - Defining Schema
30:03 Databricks Demo - Defining Schema
32:53 Additional Options

Next steps for you after watching the video
1. Excel format in Data Factory
2. Spark Excel by Crealytics documentation

### Want to connect?
Comments

By force of habit, I keep saying Crealytics library, but in fact this library is called Spark-Excel and was developed by the Crealytics company. 😊

AdamMarczakYT

Very clear explanation and well organized tutorial. Thank you so much for sharing. Keep up the great work!

lonaosmani

This video is amazingly informative and helpful!
I really appreciate the production value you put into this!

HierImNorden

This really helped me a lot. We had to deal with lots of Excel sheets in different formats. Thank you so much Adam for such a wonderful video. You are a star.

deepjyotimitra

As usual, simple & clear. I really like your videos Adam. The way you explain is so natural.

raviv

Awesome content Adam. Especially the demos are pretty helpful. Please make more videos covering other use cases using ADF.

big-bang-movies

This fits my business case. Thank you so much for this to the point tutorial!

manishdasgupta

Hi Adam, as always this is a great presentation ! Thanks for posting these videos !

ericjanssens

Thank you Adam for all your videos and contribution. It helped me a lot.

mersihaceranic

One of the awesome tutorials on ADF and Azure Databricks. Thanks for sharing.

jatinderarora

The most in-demand solution, long requested by the business. Thanks for sharing :)

shahid

Thanks for all your videos. They have been very helpful!

jahnavimurthy

Your ADF playlist is AWESOME 🙂 Please make videos on real-time scenarios. Thank you...

balajibp

Nice, and I was able to learn the concepts!! Thanks Adam

zipzapzoom

That's awesome. Thanks for posting

amjds

Very Excellent Video, nice step by step tutorial.

pdsqsql

Overall, your videos are very good, but man... this video is really amazing! I really liked the way you explained everything from the introduction putting the current problem into context to the possible solutions.

I hope you make more videos of this "real problems" style and how to solve them with the different tools that Azure provides us (and if it is related to data engineering better :p )

I congratulate you for the video, very very good.

carlosalonsocapilla

awesome tutorial Adam... Thanks for sharing..

scsourav

As always another awesome video. Thanks a lot for this video... Wondering how you were able to demo most of the azure services with pretty cool clarity and to the point !!!

balanm

Great tutorial Adam. Spark-Excel installed on an interactive cluster and used in the development environment is working fine. When moving up to higher environments, the linked services are created with job clusters. How does the Spark-Excel library get installed on job clusters?

mohamedriyazdeen