Full Python Portfolio Project! Create a smart program to download & transcribe top podcasts.


In this video we create a Python program that can automatically scrape the RSS feeds of your favorite podcasters, pulling out the episodes you’ll find most interesting, and downloading + transcribing them.
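As a rough illustration of the feed-scraping and episode-selection idea, here is a minimal sketch. It is not the code from the video: it swaps in the standard library's xml.etree.ElementTree and urllib so that it runs with no extra installs, whereas the video uses BeautifulSoup and requests. The sample feed, the function names, and the example.com URLs are all made up for illustration.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical, hand-written RSS feed used only to demonstrate the parsing step.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <item>
      <title>Ep. 1: Intro to Python</title>
      <description>Getting started with the language.</description>
      <enclosure url="https://example.com/ep1.mp3" type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>Ep. 2: Gardening tips</title>
      <description>Tomatoes and soil.</description>
      <enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="1"/>
    </item>
    <item>
      <title>Ep. 3: Data wrangling</title>
      <description>Cleaning tables with pandas.</description>
      <enclosure url="https://example.com/ep3.mp3" type="audio/mpeg" length="1"/>
    </item>
  </channel>
</rss>
"""


def parse_episodes(feed_xml):
    """Return one dict per <item> that carries an audio <enclosure>."""
    root = ET.fromstring(feed_xml)
    episodes = []
    for item in root.iter("item"):
        enclosure = item.find("enclosure")
        if enclosure is None:
            continue  # no downloadable audio attached to this entry
        episodes.append({
            "title": item.findtext("title", default="").strip(),
            "description": item.findtext("description", default="").strip(),
            "mp3_url": enclosure.get("url"),
        })
    return episodes


def search_episodes(episodes, pattern):
    """Keep episodes whose title or description matches a regex (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        ep for ep in episodes
        if regex.search(ep["title"]) or regex.search(ep["description"])
    ]


def download_episode(episode, dest_path):
    """Save the episode's MP3 to disk (performs a real network request)."""
    with urllib.request.urlopen(episode["mp3_url"]) as response:
        with open(dest_path, "wb") as out:
            out.write(response.read())


episodes = parse_episodes(SAMPLE_FEED)
matches = search_episodes(episodes, r"python|pandas")
```

With a real podcast, the feed text would come from the show's RSS URL instead of SAMPLE_FEED, and each entry in matches could be passed to download_episode.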

This project leverages a wide range of Python skills, making it a good portfolio project. In it you’ll use the BeautifulSoup + requests libraries to first web scrape & download MP3 podcast files. You can use the regex library (re) and other NLP libraries to smart search for specific episodes that you'll enjoy. Next you'll see how you can use AssemblyAI's speech-to-text API to transcribe all of the episodes that you download. This code will be used to create a text corpus for language analysis in upcoming tutorials.
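The transcription step might look roughly like the sketch below: submit an audio URL to AssemblyAI's v2 transcript endpoint, then poll until the job finishes. This is an assumption-laden outline, not the video's code. The helper names (http_json, transcribe), the polling interval, and the choice of urllib instead of requests are all illustrative, and the sketch sends a public audio URL rather than uploading a locally downloaded file.

```python
import json
import time
import urllib.request

API_BASE = "https://api.assemblyai.com/v2"


def http_json(method, url, api_key, payload=None):
    """Send one JSON request with the API key in the authorization header."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"authorization": api_key}
    if data is not None:
        headers["content-type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def transcribe(audio_url, api_key, http=http_json, poll_seconds=5.0, sleep=time.sleep):
    """Start a transcription job and poll until it completes or fails.

    The http and sleep parameters are injectable (a choice made for this
    sketch) so the flow can be exercised without touching the network.
    """
    job = http("POST", f"{API_BASE}/transcript", api_key, {"audio_url": audio_url})
    while True:
        result = http("GET", f"{API_BASE}/transcript/{job['id']}", api_key)
        if result["status"] == "completed":
            return result["text"]
        if result["status"] == "error":
            raise RuntimeError(result.get("error", "transcription failed"))
        sleep(poll_seconds)  # job is still queued or processing
```

A call such as transcribe(episode_mp3_url, api_key) would need a real AssemblyAI API key; a file that exists only on disk would first have to be uploaded, which this sketch leaves out.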

If you have any questions, let me know in the comments!

Make sure to smash like + subscribe if you enjoyed this video :)

-------------------------
Follow me on social media!

-------------------------

Practice your Python Pandas data science skills with problems on StrataScratch!

Join the Python Army to get access to perks!

*I use affiliate links on the products that I recommend. I may earn a purchase commission or a referral bonus from the usage of these links.

-------------------------
Video timeline!
0:00 - Video Introduction
1:19 - How podcasts work (RSS feeds overview)
5:11 - How can we utilize the XML webpages? (breakdown of RSS feed information & how we’ll use it to create a smart program)
7:47 - Accessing this project on GitHub
9:22 - Writing Python code to download podcasts locally (requests & beautifulsoup libraries)
18:10 - Modify our script to be able to download many podcasts
22:51 - Building in smart search capabilities to grab podcasts we’ll find most interesting!
31:00 - Using the AssemblyAI API to transcribe the podcasts we’ve downloaded
1:06:08 - Cleaning our code with functions & classes and putting everything into Python scripts.
1:18:09 - Portfolio project extension ideas! (Spotify API, NLP semantic search)
1:19:56 - Smash like & subscribe pretty please :)
Comments

Awesome video idea! Can't wait to watch on lunch break

professuh

I recommend Darknet Diaries if you like computer crime podcasts. Also, thanks for this video!

nolimit

I absolutely loved your videos! Please make more data science projects!

andyn

Hello! Thank you for your invaluable video! I find it extremely useful for beginners! I would like to ask about one thing regarding data. I learnt Pandas in terms of Data Wrangling and Transformation. So how about Pandas for Data Engineers? Is it a useful tool for ETL/ELT transformations? Obviously, the next step will be PySpark, but I would like to start by learning Pandas. It seems like a good path toward the next one. What do you think about it? I would appreciate it if you could share your views.

ukaszdugozima

Really cool! Could you do more on model deployment?

drakeweissman

Whoa. This is rad. I'm so pumped to build this!
thanks dog!

chillydoog

are you working for a company right now?

mirshodoripov

could you explain how this can be done locally using openai's whisper?

tokyofamily

awesome! i listen to podcasts a lot as well! it's so exciting to learn this topic with python!! also, is it possible to analyze an individual's podcast preferences?

fiefiego

Do you have a list of the podcasts that you like to listen to?

bennguyen