Intro To Web Scraping With Puppeteer

In this video, we will look at Puppeteer to scrape data from a web page.

💻 Code:

Puppeteer Docs:

⭐ All Courses:

💖 Show Support

👇 Follow Traversy Media On Social Media:

Timestamps:
0:00 - Intro
0:36 - Install & Setup
3:36 - Init Browser & Page Object
5:02 - Screenshot & PDF
6:54 - Targeting HTML, Text, and Links
11:22 - Scraping Courses
17:08 - $$eval()
18:40 - Save JSON Data
Comments
Author

No "promos". Yet awesome. Thanks, Brad.

P.S. (Dec 2023)
Use #cscourses instead of #courses.

P.P.S.
An advanced scraping tutorial would be amazing.

thomasnarkiss
Author

Updates:
- Use '#cscourses' instead of '#courses'. The promo code no longer exists, so omit that field to prevent an error.



const courses = await page.evaluate(() =>
  Array.from(document.querySelectorAll('#cscourses .card'), (e) => ({
    title: e.querySelector('.card-body h3').innerText,
    level: e.querySelector('.card-body .level').innerText,
    url: e.querySelector('.card-footer a').href,
  }))
)



- Also, to get formatted JSON when writing the file, pass an indent argument to JSON.stringify:

// Save data to JSON file
fs.writeFile('courses.json', JSON.stringify(courses, null, 4), (err) => {
  if (err) throw err
  console.log('File saved')
})

jritzeku
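
A minimal sketch of a more defensive variant of the update above, assuming the card markup quoted in this thread. Instead of dropping the promo field, it returns null for any node that is missing, so a removed element no longer throws. The name `extractCourses` is made up for illustration; it is written to be passed to `page.$$eval`, which runs it inside the page, so it avoids referencing outer variables.

```javascript
// Runs inside the page when passed to page.$$eval, so it must be self-contained.
const extractCourses = (elements) =>
  elements.map((e) => {
    // Trimmed text of the first match, or null when the node is missing.
    const text = (selector) => {
      const node = e.querySelector(selector);
      return node ? node.innerText.trim() : null;
    };
    const link = e.querySelector('.card-footer a');
    return {
      title: text('.card-body h3'),
      level: text('.card-body .level'),
      url: link ? link.href : null,
      promo: text('.card-footer .promo-code .promo'), // null once the promo block is gone
    };
  });
```

With a Puppeteer page this would be called as `const courses = await page.$$eval('#cscourses .card', extractCourses);`.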
Author

Thanks. Can you make more in-depth courses on Puppeteer scraping? Also, there are no convincing courses on developing Chrome extensions on the market. It would be great if you could make an in-depth course on Chrome extensions. Thanks.

Kodeispoetry
Author

After 11:00, whatever I try, I get the following error:

triggerUncaughtException(err, true /* fromPromise */);
Can someone help?

akshatmishra
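
That line usually appears when a promise rejects and nothing catches it, so the real cause is hidden further up in the output. A possible sketch, not the video's code: wrap the scraping step so the underlying message is printed and cleanup still runs. `runSafely` is a hypothetical helper name.

```javascript
// Run an async scraping task, report any failure, and always run cleanup.
async function runSafely(task, cleanup) {
  try {
    return await task();
  } catch (err) {
    // Print the underlying message instead of letting Node abort on an unhandled rejection.
    console.error('Scrape failed:', err.message);
    return null;
  } finally {
    if (cleanup) await cleanup();
  }
}
```

Usage would look something like `await runSafely(() => page.$$eval('#cscourses .card', fn), () => browser.close())`. A message such as "Cannot read properties of null" would point to a selector that no longer matches anything on the page.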
Author

Don't forget Testing Automation.
@Traversy Media

MightYoungJoe
Author

Can we scrape data from Facebook ads?

trongnhanle
Author

Anyone who can reliably scrape cargurus, please comment. I have work for you.

chrisl
Author

Does anyone know how to easily copy a variable from the .js file to the clipboard and paste it into a website?

Doug
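
One way to approach this without touching the clipboard at all, sketched under the assumption that `page` is a Puppeteer page and the target is an ordinary input field. `fillField` is a hypothetical helper; `page.waitForSelector` and `page.type` are standard Puppeteer page methods.

```javascript
// Type a value held in a JS variable into a form field; no clipboard needed.
async function fillField(page, selector, value) {
  await page.waitForSelector(selector); // wait until the field exists in the DOM
  await page.type(selector, String(value)); // sends key events, like a user typing
}
```

For example, `await fillField(page, '#search', myVariable)`, where `#search` is a placeholder selector for whatever field the site uses.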
Author

I have tried this too many times, but I still get a timeout error. Can someone help me fix this?
The timeout message from the command prompt is: 'TimeoutError: Timed out after 30000 ms while waiting for the WS endpoint URL to appear in stdout!'

goodluckoriuwa
Author

The problem is that it takes a lot of resources.

oopss
Author

How can I do this with websites that have a "paste URL here" field: submit my own URL and get a screenshot of the new page?

ITEngines
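
A rough sketch of that flow, assuming `page` is a Puppeteer page and that submitting the form triggers a full navigation. The helper name and every option name (`formUrl`, `inputSelector`, `submitSelector`, `targetUrl`, `outPath`) are placeholders, not anything from the video.

```javascript
// Open a form page, type a URL into it, submit, and screenshot the page that loads.
async function submitUrlAndScreenshot(page, opts) {
  const { formUrl, inputSelector, submitSelector, targetUrl, outPath } = opts;
  await page.goto(formUrl);
  await page.type(inputSelector, targetUrl);
  // Start waiting for navigation before clicking, so the event is not missed.
  await Promise.all([page.waitForNavigation(), page.click(submitSelector)]);
  await page.screenshot({ path: outPath, fullPage: true });
  return outPath;
}
```

If the site renders its result without navigating (for example via an AJAX call), the `waitForNavigation` step would likely need to be swapped for `page.waitForSelector` on an element of the result.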
Author

UPDATE! As of May 2023, you'll have to change #courses to #cscourses; otherwise the code will return an empty array at 14:00.

vitorbutkus
Author

Are you available for hire? I'm working with a group in need of a Laravel coder.

adev
Author

Hello guys, I want to save the file as a PDF instead of JSON. How can I get it done?
// Get courses using $$eval
const courses = await page.$$eval('#courses .card', (elements) =>
  elements.map((e) => ({
    title: e.querySelector('.card-body h3').innerText,
    level: e.querySelector('.card-body .level').innerText,
    url: e.querySelector('.card-footer a').href,
    promo: e.querySelector('.card-footer .promo-code .promo').innerText,
  }))
);

console.log(courses);

// Save data to JSON file
fs.writeFile('courses.json', JSON.stringify(courses), (err) => {
  if (err) throw err;
  console.log('File saved');
});
Thanks again

magxtopher
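
One possible approach, sketched here rather than taken from the video: render the scraped array as a small HTML table and let Puppeteer print it with `page.pdf`, the same call the Screenshot & PDF section uses. `coursesToHtml` and `saveCoursesAsPdf` are hypothetical helper names, and `page` is assumed to be a Puppeteer page running headless.

```javascript
// Escape text so scraped values cannot break the generated markup.
const escapeHtml = (s) =>
  String(s).replace(/[&<>"]/g, (c) =>
    ({ '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' }[c]));

// Turn the scraped courses array into a simple HTML table.
function coursesToHtml(courses) {
  const rows = courses
    .map((c) =>
      `<tr><td>${escapeHtml(c.title)}</td><td>${escapeHtml(c.level)}</td><td>${escapeHtml(c.url)}</td></tr>`)
    .join('');
  return `<table><tr><th>Title</th><th>Level</th><th>URL</th></tr>${rows}</table>`;
}

// Load the table into the page and print it to a PDF file.
async function saveCoursesAsPdf(page, courses, outPath) {
  await page.setContent(coursesToHtml(courses));
  await page.pdf({ path: outPath, format: 'A4' });
}
```

After the `$$eval` step this could be called as `await saveCoursesAsPdf(page, courses, 'courses.pdf')`. Reusing the scraping page replaces its content, so a fresh page from `browser.newPage()` may be the safer choice.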
Author

I wish everyone could make tutorials of this quality.

qwizzwizz
Author

Brad Schiff introduced me to Web Scraping. Great vid.

outpost
Author

Thanks. I am automating my work with Beautiful Soup.

codified
Author

How can you scrape Handlebars-injected values in HTML? Thanks for the help.

hermesmercuriustrismegistu
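
If the templates are rendered in the browser, the values only exist in the DOM after the script has run, so the usual idea is to wait for the rendered element and then read it. A hedged sketch, assuming `page` is a Puppeteer page; `readRenderedText` and its default timeout are illustrative choices.

```javascript
// Wait for a client-side template to render, then read the text it produced.
async function readRenderedText(page, selector, timeoutMs = 10000) {
  await page.waitForSelector(selector, { timeout: timeoutMs });
  return page.$eval(selector, (el) => el.innerText.trim());
}
```

This only waits for the element to exist. If an empty placeholder is present before the template fills it in, `page.waitForFunction` with a check on the element's text would probably be the better fit.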
Author

If you take a look at my search history, you'll find out that I was searching for scraping tutorials 2 days ago. I'm super happy that you released this video today. The timing is just perfect. Thank you so much!

thinotmandresy
Author

You should also show the Cheerio library for selecting elements jQuery-style with $().

marianivanov