Web scraping data with n8n and Puppeteer

This tutorial shows two methods for scraping data from websites with n8n and Puppeteer.

In this tutorial I cover:
1. Data scraping in n8n using the HTTP Request node
2. Retrieving email addresses from the scraped source code
3. Key features of Puppeteer and writing a basic script
4. Deploying a Puppeteer function to Google Cloud Functions
5. Using Puppeteer in n8n via Google Cloud Functions
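
Item 2 above, pulling email addresses out of scraped source code, can be sketched roughly as follows. This is a minimal illustration, not the code used in the video: the function name `extractEmails` and the regular expression are assumptions, and the pattern is deliberately simple, so it will miss obfuscated addresses and may match some strings that are not real mailboxes.

```typescript
// Hypothetical helper (not from the video): collect unique email
// addresses found in a string of raw HTML source.
// Assumption: a simple pattern is good enough for a first pass.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function extractEmails(html: string): string[] {
  // match() with a global regex returns every hit, or null when there are none.
  const matches: string[] = html.match(EMAIL_PATTERN) ?? [];
  // Lowercase before de-duplicating so "Info@x.com" and "info@x.com" collapse.
  return Array.from(new Set(matches.map((m) => m.toLowerCase())));
}

// Usage with a made-up HTML fragment.
const sampleHtml =
  '<a href="mailto:info@example.com">info@example.com</a> <p>Sales@Example.org</p>';
console.log(extractEmails(sampleHtml));
// prints [ 'info@example.com', 'sales@example.org' ]
```

In an n8n workflow, logic like this would typically run on the page source returned by the HTTP Request node; how the source is passed in depends on the workflow and is not shown here.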

Disclaimer: Please note that web scraping should be performed only on websites that explicitly grant permission. I cannot be held responsible for any actions or consequences resulting from the use of the information provided in this video. Please make sure you have proper authorization before engaging in web scraping activities.

0:00 Introduction
0:20 Step 1: Basic scrape with HTTP request
2:51 Step 2: Prepare Puppeteer script
5:28 Step 3: Adjust code to Google Cloud Functions
6:06 Step 4: Deploy function to Google Cloud Functions
8:16 Step 5: Scrape data with Puppeteer in n8n
Comments

Really cool stuff, I spent days trying to use Google Cloud Functions

happypixeldesign

I loved it. I’ve been working with web scraping in the old school way with simple fetch, cheerio and puppeteer, but with this video you gave me new ideas. Thanks a lot 💪

codesandtags

Thanks for sharing the tutorial. I would love to see how to use n8n to scrape emails from various Google search results. For example, "SEO agencies" in Poland, so I could get the companies from the first 10 results.

LifeWithoutOffice

Do you have a way of scraping entire sites with multiple pages using Puppeteer and n8n?

bupi.

I always get this error:
Error: Forbidden
Your client does not have permission to get URL /getPageContent?
Is there a solution?

AbdullahAmer-egmm

How does it work with live data? I want to scrape live data from a website that counts down 😮

maxmiddelhoven

Is there any software that we could use to mimic Google Cloud Functions on our own server?

GordonShamway