Python Scrapy Tutorial - 24 - Bypass Restrictions using Proxies
In the last video we bypassed scraping restrictions by using user agents, and in this video we will learn how to bypass them by using something known as proxies.
Before we go into proxies, you need to understand what an IP address is. An IP address is basically the address of your computer on the network. You can find your own public IP address by going to Google and typing in 'What is my IP'.
Whenever you connect to a website, you are automatically telling it your IP address. A website like Amazon can recognize your IP address and ban you if you try to scrape a lot of its data. But what if we used another IP address instead of our own? Even better, we can use a lot of IP addresses that are not our own and put them in rotation, so that every time we send a request to Amazon, it goes out with a different IP address.
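To make the rotation idea concrete, here is a minimal sketch using only the Python standard library. The proxy addresses are hypothetical placeholders (taken from an IP range reserved for documentation), and the helper names `proxy_rotation` and `next_proxy_settings` are made up for this illustration. The returned mapping follows the `{"http": ..., "https": ...}` shape that HTTP clients such as `requests` accept for their `proxies` argument.

```python
from itertools import cycle

# Hypothetical proxy addresses (documentation-only IP range).
# Replace them with proxies you are actually allowed to use.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


def proxy_rotation(proxies):
    """Return an endless iterator that yields the proxies in round-robin order."""
    return cycle(proxies)


rotation = proxy_rotation(PROXIES)


def next_proxy_settings():
    """Build the proxy mapping for the next outgoing request."""
    proxy = next(rotation)
    # Use the same proxy for both plain HTTP and HTTPS traffic.
    return {"http": proxy, "https": proxy}
```

Each call to `next_proxy_settings()` moves one step through the list and wraps around to the first proxy after the last one, so consecutive requests go out through different addresses.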
When you send your requests through an IP address that is not your own, that other address belongs to what is known as a proxy: a server that forwards your requests on your behalf. If we look up the definition of proxy on Google, it says 'the authority to represent someone else'. So basically we are hiding our address and using someone else's.
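In Scrapy, a request can be routed through a proxy by setting the `proxy` key in `request.meta`, which the built-in `HttpProxyMiddleware` picks up. The sketch below shows one way this might be done from a downloader middleware. The class name `RandomProxyMiddleware` and the addresses in `PROXY_POOL` are assumptions made for this illustration, not something taken from the video, and the class is written as plain Python so it can be read and tried without a Scrapy project.

```python
import random

# Hypothetical proxy addresses (documentation-only IP range).
PROXY_POOL = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]


class RandomProxyMiddleware:
    """Downloader-middleware-style class that assigns a random proxy to each request."""

    def __init__(self, proxies):
        self.proxies = list(proxies)

    def process_request(self, request, spider):
        # Scrapy's HttpProxyMiddleware reads the proxy URL from request.meta["proxy"].
        request.meta["proxy"] = random.choice(self.proxies)
        # Returning None tells Scrapy to keep processing the request as usual.
        return None
```

In a real project, a class like this would be registered under the `DOWNLOADER_MIDDLEWARES` setting and constructed with your own proxy list; how the list is supplied (settings, a file, a proxy provider) is left open here.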
Next video - Scraping multiple pages of Amazon
#python