Requests-HTML - Checking out a new HTML parsing library for Python

Checking out a new HTML-parsing library by the author of Requests:

Comments

I like this type of video. You should do a monthly video on a new module so people are aware of them. It would be very useful for people learning Python.

thehungman

man, i'm mixing that HTML parsing sauce with my beautiful soup right now

rumidom

I think the // for retry in range(100): // part is what is allowing the script to continue after raising the error. From their doc: "The simplest use case is retrying a flaky function whenever an Exception occurs until a value is returned." So this would allow the exception to be printed, yet the script to continue I believe. Great content man, thanks for all of the awesome videos :).

cooperlimond
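
A minimal sketch of the retry pattern described in the comment above, assuming a plain loop that prints each exception and tries again. This is not the code shown in the video or taken from any library; `call_with_retries` and `flaky` are hypothetical names used only for illustration.

```python
def call_with_retries(func, attempts=100):
    """Call func until it returns without raising, at most `attempts` times.

    Returns (value, errors), where errors lists the exceptions swallowed
    along the way.
    """
    errors = []
    for retry in range(attempts):
        try:
            return func(), errors
        except Exception as exc:
            # Report the failure, then keep looping instead of re-raising.
            print(f"attempt {retry + 1} failed: {exc!r}")
            errors.append(exc)
    raise RuntimeError(f"gave up after {attempts} attempts")


# Hypothetical flaky function: fails twice, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ValueError("not ready yet")
    return "ok"


value, errors = call_with_retries(flaky)
print(value, len(errors))  # ok 2
```

Because the exception is caught inside the loop, its message is still printed, but the script carries on with the next attempt.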

So basically they just have a list of symbols like 'more', 'next' or 'older' and look for their hrefs.

So on HN page 2, the title from the CNBC story has the word 'more' in it. Haha.

However, there are statistical methods for working out how a page uses pagination, but I guess that's a bigger nut to crack for such a young library :)

yokoono
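
A simplified sketch of the heuristic described in the comment above, assuming links arrive as (text, href) pairs. It is not the library's actual implementation, which may apply further checks; `guess_next_link`, `guess_next_link_strict`, and the sample links are hypothetical.

```python
NEXT_SYMBOLS = ["next", "more", "older"]  # the labels named in the comment


def guess_next_link(links, symbols=NEXT_SYMBOLS):
    """Return the href of the first link whose text contains a symbol."""
    for text, href in links:
        lowered = text.lower()
        if any(symbol in lowered for symbol in symbols):
            return href
    return None


def guess_next_link_strict(links, symbols=NEXT_SYMBOLS):
    """Stricter variant: the whole link text must equal a symbol."""
    for text, href in links:
        if text.strip().lower() in symbols:
            return href
    return None


# Made-up page: a story title that happens to contain "more",
# followed by the real pagination link.
page = [
    ("Hypothetical headline: more layoffs ahead", "https://example.com/story"),
    ("More", "news?p=3"),
]

print(guess_next_link(page))         # https://example.com/story
print(guess_next_link_strict(page))  # news?p=3
```

The substring match picks up the story title first, which is the kind of false positive the comment describes; matching the whole link text avoids it in this case, at the cost of missing labels such as "Next page".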

Multiple classes in HTML are separated by spaces.
In a CSS selector, they are joined with a '.' (dot) instead.
For example:
<div class="foo bar"> will be referenced as about.find("div.foo.bar")
Also, '(' and ')' are not valid in a CSS class selector as-is, so you have to escape them.

mshirazab
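
A small sketch of the rule above: turn a tag name and its class attribute into a CSS selector, backslash-escaping characters such as '(' and ')'. The helper names are hypothetical, and the escaping is deliberately simplified: it does not handle class names that start with a digit, which need a different escape form in CSS.

```python
import re


def css_escape_simple(name):
    """Backslash-escape every character except ASCII letters, digits, '-' and '_'."""
    return re.sub(r"([^a-zA-Z0-9_-])", r"\\\1", name)


def selector_for(tag, class_attr):
    """Build 'tag.class1.class2' from a space-separated class attribute."""
    classes = class_attr.split()
    return tag + "".join("." + css_escape_simple(c) for c in classes)


print(selector_for("div", "foo bar"))     # div.foo.bar
print(selector_for("td", "col(1) wide"))  # td.col\(1\).wide
```

The resulting string is what would be passed to a selector-based lookup such as the about.find(...) call mentioned above.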

Best way to remove error
-> comment out raise statement. 😁😂

rohnchatterjee

This is cool! And it gave me the idea of a series of videos about how to create a python package

Lucas-wlpy

Hello!
Recently I have been trying to parse some webpages asynchronously with Requests-HTML. In theory this can be done with AsyncHTMLSession. However, most of the time I am unable to get a result from it (I also use arender; the attempts to parse the webpages fail for different reasons, most probably timeouts). Maybe it's just a poor internet connection, but I'd be really grateful if you uploaded a video on this or helped me with it.

anyad

Thank u sentdex you are leading me to the real world from africa

LolLol-wyfp

Thank you so much! I got the answer based on your guide. *nice helmet ⛑ you got back there*

shazkingdom

The function couldn't clean up user data because these files were locked by chromium process.

MohamedMagdyHammad

18:52 I ‘think’ replacing the spaces with periods should fix the period error, no idea about the parentheses, though. Backslash or HTML escape?

Hans-jcju

Sentdex! Can you show scraping from a page with a "show more" button that loads more of the page via JavaScript?

SimOn-bzxy

Make another video building a crawler using it. Nice video!

SkySesshomaru

Strange, I have installed requests_html but when I import it in a Python script in Python 2.x or 3.7, I get: ModuleNotFoundError: No module named 'requests_html'

Hegelian
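
For import errors like the one above, a quick diagnostic is to check which interpreter is running the script and whether it can see the package at all. The sketch below uses only the standard library; `is_importable` is a hypothetical helper name. As far as I know, Requests-HTML supports only Python 3.6 and above, so it cannot be imported from Python 2.x.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if this interpreter can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# The package must be installed for this exact interpreter.
print("interpreter:", sys.executable)
print("version:", sys.version_info[:3])
print("requests_html importable:", is_importable("requests_html"))
```

If the last line prints False, installing with the same interpreter that runs the script (for example `python -m pip install requests-html`, using that interpreter's `python`) is usually the fix. An editor that launches a different Python than the one the package was installed into will show the same error.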

What's the extension that prints the result down below?

developerarchitect

To find tds or other elements that have multiple classes, you would just have had to put dots between the class names. Read up on CSS selectors; jQuery also uses them, and they are pretty standard nowadays and less of a headache than XPaths ;)

WhiterockFTP

Can I get some help installing the "requests-html" package so that it runs globally, for example from Sublime Text?
I am using Conda on Windows 10.

I have been trying to do that, but as far as I understand, it only runs in a virtual environment that Sublime cannot use? Correct me if I am wrong.

pyxelr

Please make a video building a webcrawler, would be very insightful!

SimonEliasen

Thanks for posting this. I've used BS4 and another module to do the JavaScript (render the page) on many projects, it's nice to have it in a concise package.

Btw, I think the pagination on HackerNews failed because it looks for one of three default "next" labels: "next", "more", "older" (DEFAULT_NEXT_SYMBOLS). The CNBC link has "more" in it.

kylek
