Master Databricks & Apache Spark Step by Step: Lesson 5 - Using The Data Science Process

In this video, you will learn about the data science process. This series of steps takes you from project inception to completion. You'll need this to do your own projects and we'll be following this structure for the rest of this series.

Master Azure Databricks Step By Step

Video Slides

Great Blog on the Data Science Process
Comments

Lesson 5 completed. Interesting thought regarding data scientists. I had assumed it was just a matter of learning some concepts in a course or two. Thanks for explaining so nicely. I would still love to explore courses for the data scientist role because I actually come from a core data engineering background.

Continuing my journey... On to lesson 6.

mansah

I would love to hear your insights about analysis and running interviews!
Great videos! I'm learning a lot from you.
The way you teach is perfect

lusk

Fantastic videos Bryan. You are doing an amazing job in this series.

Let me quickly add: where I am from, data scientists don't just build and deploy models at the ML phase you described. It is common knowledge that 80% of what we do in data science is data wrangling and cleaning.

You make it sound like data scientists just wait for the business analyst or data engineer to do all the dirty work of data collection, cleaning, and exploration. As a matter of fact, I am a data scientist with no master's or PhD (I know a lot of folks won't like this), but I do all of these tasks when I embark on a project: from stakeholder management across the board, to data collection and exploration, down to building the models themselves and evaluating their performance using different metrics.

saheedajayi

Great stuff Bryan!
Your coverage of the Data Science Process was informative and chock full of great real-world observations.
I agree with your comment that the title "Data Scientist" should require credentials like a master's degree or PhD in statistics.

mwmckeetube

I'm glad I have reached this far in the video series. I like what I'm learning on this channel. It's very practical.

paulntalo

@Bryan, you brought up many good distinctions between data engineers and data scientists/machine learning engineers. You're absolutely right about the Data Scientist title being hijacked by those who don't appreciate the depth of domain expertise and the rigor that goes with that title. A large proportion of job descriptions that I come across aren't very clear about these roles and responsibilities. Many employers who are hoping to leverage data science don't seem to know where the boundaries end. They expect both these roles to know and be experienced in a ton of these tasks, which I personally think is unfair to the candidates and demonstrates the lack of maturity and clarity in understanding on the employers' part.

anthonygonsalvis

Thanks Bryan, such a wealth of information and experience.

MathewBurford

You're the man Bryan. Thank you for this!

I liked your mindset about taking notes.

Have you developed a particular structure based on your experience in the field, or are they simply unstructured notes that capture the essence?

morten

Thank you! 100% agree with who is a data scientist!

jaredcornell-mwnf

very well done!!! Such a wealth of insights!

thierrydesjardins

I so agree with nailing things down before moving on, but due to overzealous project managers and scrum masters this rarely happens in my experience, grrr! They push to move things along due to time and cost, but as you rightly pointed out, it costs so much more to have to change things later on. I could moan for hours on this subject!

ChrisUK

It would be great to know how the business analyst or project manager uses Databricks. A video on the analysis and how to come up with the question would be great, please.

destinyokwuosa

great videos! I'd love to hear your insights about data analysis

tomerperetz

This is a helpful breakdown of the data science process.

fotomakr

@Bryan, I'm enjoying this video series so far. Thank you for breaking down the process in simple English for the newbies. Where would the Model Selection step fit into your list of steps? I'd say it comes right after Feature Engineering. In my opinion, for business problems such as classification, a Data Scientist/ML Engineer may start with several different classification models (e.g. logistic regression, decision tree, random forest, support vector machine, naive Bayes, K-nearest neighbor, etc.) and conduct multiple test runs with each before narrowing down the choice to just one. Unfortunately, even some of the best YouTubers don't explain very well, if at all, what other models they considered and evaluated before selecting a certain ML model for the problem at hand. If you know of any good YouTube channels that clearly explain such things, please share them in your response. Thx.

anthonygonsalvis

Do you have a series of steps to cover "Identifying Use Case"?

JD-xdxp

26:00 Interestingly enough, Agile/SAFe advocates deferring commitment where possible by building in flexibility. I guess the balance is registering as early as possible where there is an issue and making an informed choice about whether it is possible to kick the can down the road, rather than ignoring it.

azursmile

This series is great, and your perspective and breakdown are really helpful for learning the big picture concepts involved in the data science process.

I am actually currently creating a university class about data technologies for data scientists, and I am using your content as a guide for learning the basics myself (I'm trained as an academic/applied statistician, but not as an industry data scientist). Are your videos largely based on the book you wrote (i.e., should I cite your book if I use a lot of the concepts you talk about)? Can I reach out to you for more advice or insights?

gnaistvlogs

Amazing stuff! Thanks so much for providing these high-quality educational videos. My question is about ingesting syslog data for analysis. Does it require converting the file into CSV?

yosiasz

Hi Bryan, what do you think about the impact of advanced AI on the data science process? Is it a complete game changer?

izeofsan