Extracting EMAILS and PHONES from website contacts | Python Requests BeautifulSoup Regex

Hey, what's up guys. Recently I got an email from one of my subscribers asking me to help him write the crawling logic for his contacts extraction web scraping project. This video is inspired by the idea that many beginners start exactly with extracting contact data from websites, probably because it's one of the simplest real-world freelance projects to encounter. In this video I'm covering the very basics of crawling a list of URLs, extracting the contact page if available, and composing regular expressions to match email/phone patterns in the content. I'm not providing the source code, because it happens that people don't watch the video at all and just try to use the source code for their own purposes; then they run into issues due to a lack of understanding of what's going on under the hood, and eventually they start asking questions that have already been covered in the video itself. So please consider this tutorial a guideline and not a production-ready solution. This gives you the freedom to use it as a basis for your own projects.
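For readers who want a feel for the steps described above, here is a minimal sketch. It is not the code from the video: it swaps Requests and BeautifulSoup for the standard library so it runs without extra installs, it skips the actual page fetching, and the names `find_contact_url`, `extract_contacts`, and both regex patterns are assumptions made up for illustration. The phone pattern in particular only assumes a North American style 3-3-4 digit shape.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Deliberately simple patterns (assumptions); real emails and phones vary a lot.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional +country code, then 3-3-4 digits with optional separators.
PHONE_RE = re.compile(r"(?:\+\d{1,3}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


class ContactLinkFinder(HTMLParser):
    """Collects href values of <a> tags whose URL mentions 'contact'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href and "contact" in href.lower():
                self.links.append(href)


def find_contact_url(base_url, html):
    """Return the absolute URL of the first contact-like link, or None."""
    finder = ContactLinkFinder()
    finder.feed(html)
    return urljoin(base_url, finder.links[0]) if finder.links else None


def extract_contacts(text):
    """Return unique emails and phones in first-seen order."""
    emails = list(dict.fromkeys(EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}


# Hypothetical sample data standing in for a downloaded page.
sample_html = '<a href="/about">About</a> <a href="/contact-us">Contact</a>'
sample_text = "Write to info@example.com or sales@example.com, call +1 (555) 123-4567."

contact_url = find_contact_url("https://example.com", sample_html)
contacts = extract_contacts(sample_text)
print(contact_url)
print(contacts)
```

In a real crawl you would download each URL first, then run `extract_contacts` on both the home page and the contact page if one is found.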
Comments

Great content as usual, please keep posting such interesting practical videos

developerdomain

52:30



I found what I was looking for - "list(dict.fromkeys(my_dict))"


so:

fieldnames = list(dict.fromkeys(contacts))
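A small sketch of why this idiom is handy for CSV headers: `dict.fromkeys` keeps only the first occurrence of each key and preserves insertion order, so it de-duplicates without shuffling columns. The shape of `contacts` below (a list of dicts with differing keys) is an assumption for illustration, not necessarily what the video uses.

```python
import csv
import io

# Hypothetical scraped records; keys differ between rows.
contacts = [
    {"url": "https://example.com", "email": "info@example.com"},
    {"url": "https://example.org", "phone": "+1 555 123 4567", "email": "hi@example.org"},
]

# Flatten all keys, then de-duplicate while keeping first-seen order.
all_keys = [key for row in contacts for key in row]
fieldnames = list(dict.fromkeys(all_keys))

# Write to an in-memory buffer; missing fields are filled with "".
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(contacts)
csv_text = buffer.getvalue()
print(csv_text)
```

A plain `set` would also remove duplicates, but it would not guarantee the column order.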

python

Can you show us how to set a timer for Scrapy? I mean, how can we pass a time limit from the command line to control how long the spider runs?
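One option that may fit this, assuming Scrapy's built-in CloseSpider extension is enabled (it is by default), is the `CLOSESPIDER_TIMEOUT` setting, which closes the spider after the given number of seconds. A settings fragment could look like this; the 600-second value is just an example.

```python
# settings.py (fragment)
# Close the spider automatically after it has been running this many seconds.
# 0 (the default) disables the time limit.
CLOSESPIDER_TIMEOUT = 600
```

The same setting can be passed per run from the command line with the `-s` option, for example `scrapy crawl contacts -s CLOSESPIDER_TIMEOUT=600`, where `contacts` is a hypothetical spider name.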

ammarahmed

Now looking at a way to screenshot your video, run OCR on it, and extract the code from it... 😉


OpenCV, Python, and Tesseract?

python

I have a mystery: you need to do pip install lxml ... yet in the code you don't need to import lxml

chizzlemo