Requests-HTML: A Python Library For Scraping The Web

This video introduces the Requests-HTML library, which combines Requests features with HTML parsing tools for easy website scraping.

Comments

Hey, Anthony! Good to see new content :)

Here's an idea for the next video: it would be great to show how to use Requests-HTML with asyncio (aiohttp, for example) in a real-life web app!

vic_shine

Great video! One project suggestion I can think of is a price alert run from a cron job, so you know when an e-commerce shop has a discount on your product of choice.

timvogt

Great tutorials! Seems like a nice library that makes things a lot easier :)

thedumbfounds

The page I call session.get(url) on includes a JavaScript script, and one thing this script does is send a request of its own. How can I capture the response to that request?

chashmal

Hi A! Very cool! Can you make a part 2 where you add this to a crontab on a Heroku server to automate it daily and write the results to a MySQL db? Maybe an e-commerce example would make sense, like Amazon Python books, geekbooks.me, etc.

dirkb

It says "Full JavaScript support!" and "XPath Selectors". I'm going to check that out.

sinancetinkaya

Great job explaining everything!

I just have one problem when finding a class. Let's say:
title = f.html.find('.title', first=True)

When I print(title.text), I should get the text within the H1 tag (for example), but I get the clean text (without HTML formatting) of the whole site.

print(title) will show <Element 'h1' class=('title')>

No problem there, but I cannot print the text within the <h1> tag.
Am I doing anything wrong?

Thanks for your help!

ButterySAM

Hi, when I try your example of finding the headline, it doesn't work because it returns different HTML content to me. How can I fix it?

Stefan_Dragancev

Hello, is there any way to pass a callback function to the session.get(url) call? I want to have an upload progress bar.

el

Thanks a lot. You got my sub. I'd like to see advanced scraping with pagination. Could you do that? Happy Easter!

DanielWeikert

How do I scrape articles from a specific date range with requests-html?

FindingDoyin

What about JavaScript?
This library can render JavaScript.

MrTASGER

Thanks for the video, Anthony. Very clear! Just one question: how do we scrape a site that prompts for a username and password from Pipenv? I have done it from a text editor but would like to know how to do it from Pipenv.

pradeepbhat

Hi Anthony and thanks for this tutorial. Would this library work with sites built with React or Angular, i.e. sites generated dynamically on the client side? Thanks again.

LesCarbonaro

Hello, do you know whether this package can be installed in Anaconda? If yes, can you please provide some links to instructions on how to install it?

Thank you

mazkaibil

Thanks Anthony!
For a real example, you might want to try scraping boardgamegeeks.com
I had trouble working with their API and I'm wondering if this could be easier.

sylvainrobillard

Hi! How can I get the value of the 'href' attribute in your lib, for example?
Thanks

oboistore

Great tutorial, very clear! Does this work for loading pages like Pinterest? If yes, do we no longer need Selenium?

yesweet

Scrape nbareference.com. Have it go to the "schedule & results" page, then the "boxscores" link, and finally get the table data of both teams' results.

xxjaydogxx

Can I use it to post data to a form? I've tried, but it always failed.

adiyatmubarak