Python 2.7 Tutorial Pt 13 Website Scraping

Here I show you how to scrape websites for information. I also introduce the urllib and Beautiful Soup modules.

Comments

You're very welcome :) I have a ton of MySQL tutorials, but I mainly cover using it with PHP and Java. I really need to improve my Python tutorial. It took me until I made my newest HTML tutorial to figure out how to teach this stuff. I have been making Java tutorials for almost a year now because I've always loved Java. I promise that when I finish with Java I'll make a new Python tutorial, and it will be much better.

derekbanas

Thank you for taking the time to tell me they helped :) When I sit down to cover a topic that I know very few people have an interest in (Design Patterns), I think about those few who want to see it. Thanks for taking the time to watch my videos.

derekbanas

@95Ners A couple of things: the module is urllib, and I noticed you typed urlib. Make sure you are using Python 2.7 or lower and not Python 3. To properly install Beautiful Soup, go to the Beautiful Soup download folder and type setup.py install in the Windows terminal. It will install everything in the right place. This also works for other modules that ship with a setup.py. To check your version of Python, just type python into the terminal and hit return. I hope that helps.

derekbanas
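Both points in the reply above (the spelling of urllib and the Python version) can be checked from inside the interpreter. This is only a minimal sketch: the helper name `interpreter_summary` is hypothetical, and the fallback import is there so the snippet also runs on Python 3, where `urlopen` lives in `urllib.request`.

```python
import sys

# urllib has two l's; "import urlib" raises ImportError.
try:
    from urllib import urlopen          # Python 2.x location
except ImportError:
    from urllib.request import urlopen  # Python 3.x location


def interpreter_summary():
    """Return (major_version, matches_tutorial) for the running interpreter."""
    major = sys.version_info[0]
    return major, major == 2


if __name__ == "__main__":
    major, matches = interpreter_summary()
    print("Python major version: %d" % major)
    if not matches:
        print("This tutorial targets Python 2.7; some imports differ on Python 3.")
```

If the first import fails on a Python 2 install, the module name is most likely misspelled.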

You know the teacher is good when you're watching a 7yo tutorial because the newer ones don't come close. Thanks a lot!

randmprecisin

You're very welcome :) I show how to scrape using PHP as well in my Web Design and Programming tutorial. The last video in that series gets into serious scraping.

derekbanas

Thank you :) I have to rerun this code. There is a chance that the sites I'm pulling from have changed since I made this tutorial. In that situation you'd have to understand regular expressions, which I also cover, to make the changes needed to properly pull from any given site. I hope that makes sense and helps

derekbanas
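As a rough illustration of the regular-expression approach mentioned above, the sketch below pulls link targets and titles out of a small HTML string. The markup in `SAMPLE_HTML`, the pattern, and the function name `extract_headlines` are all made up for this example; a real site's HTML will differ, and the pattern is exactly what you would need to adjust when the site changes.

```python
import re

# Hypothetical markup; a real site's HTML will differ and can change at any time.
SAMPLE_HTML = """
<div class="story"><h2><a href="/news/1">First headline</a></h2></div>
<div class="story"><h2><a href="/news/2">Second headline</a></h2></div>
"""

# Capture the href and the link text of every anchor wrapped in an <h2>.
HEADLINE_PATTERN = re.compile(r'<h2><a href="([^"]+)">([^<]+)</a></h2>')


def extract_headlines(html):
    """Return a list of (url, title) tuples found in the html string."""
    return HEADLINE_PATTERN.findall(html)


if __name__ == "__main__":
    for url, title in extract_headlines(SAMPLE_HTML):
        print("%s -> %s" % (title, url))
```

When a site is redesigned, the usual symptom is an empty result list, which is a hint that the pattern no longer matches the new markup.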

I'm just happy that you fixed it :) I'm not sure what you mean about the snakes?

derekbanas

You're very welcome :) Thanks for taking the time to tell me you liked it

derekbanas

You may need to adjust the number of results depending on the site

derekbanas

A string is expected. I'm sorry about this, but I think the problem is that the code on my site has backquotes. You'll need to do a find and replace on it to swap them for normal quotes. I bet that will solve the problem. Tell me if that doesn't work.

derekbanas
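The find and replace described above can also be scripted. This sketch assumes that "backquotes" refers to the typographic (curly) quotes a blog engine often substitutes for plain ones; the function name `normalize_quotes` is hypothetical. Backticks are left alone because they are valid syntax in Python 2.

```python
# Map typographic quotes to the plain ASCII quotes Python expects.
QUOTE_REPLACEMENTS = {
    u"\u201c": u'"',  # left double quotation mark
    u"\u201d": u'"',  # right double quotation mark
    u"\u2018": u"'",  # left single quotation mark
    u"\u2019": u"'",  # right single quotation mark
}


def normalize_quotes(source):
    """Return source with curly quotes replaced by plain quotes."""
    for fancy, plain in QUOTE_REPLACEMENTS.items():
        source = source.replace(fancy, plain)
    return source


if __name__ == "__main__":
    pasted = u"print \u201chello\u201d"
    print(normalize_quotes(pasted))
```

Most text editors can do the same thing with their own find and replace dialog, one quote character at a time.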

@ribsmcgee1 To install Beautiful Soup on Windows, go to the folder you downloaded and run the command setup.py install (assuming .py is registered in your path extensions, PATHEXT; otherwise run c:\python27\python setup.py install).

derekbanas

@entrevu What operating system are you using? On most Linux or Mac based systems you install by going to the directory in your terminal and typing python setup.py install. On a PC, open the command prompt and type setup.py install.

derekbanas
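After running the install command, one way to confirm it worked is to try the import. The sketch below is an assumption-laden check, not part of the video: the function name `find_beautiful_soup` is hypothetical, and it simply tries the two module names Beautiful Soup has shipped under (`bs4` for version 4, `BeautifulSoup` for version 3).

```python
def find_beautiful_soup():
    """Return the name of an importable Beautiful Soup module, or None."""
    for module_name in ("bs4", "BeautifulSoup"):
        try:
            __import__(module_name)
            return module_name
        except ImportError:
            continue
    return None


if __name__ == "__main__":
    found = find_beautiful_soup()
    if found is None:
        print("Beautiful Soup is not installed for this interpreter.")
    else:
        print("Beautiful Soup is importable as: %s" % found)
```

If the check reports nothing installed right after a successful setup.py install, the install probably went to a different Python interpreter than the one running the script.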

Thank you :) I don't mean to sound arrogant by any means. I think that may come across because often when I'm making these videos I'm thinking in the back of my head "Wow, isn't programming cool!" I'm definitely not thinking "Wow, I'm so awesome" :D

derekbanas

Change range(2, 16) to get more or less data. The Wikimedia error has something to do with a problem on their end, possibly caused by recent editing. It sounds like it is temporary.

derekbanas

The answer is most of the time. If a website puts security in your way it can be impossible to pull from a web page. PyQt allows you to print to PDF. To save to Excel I use DataFrame.to_excel from pandas. I hope that helps :)

derekbanas

Thank you for the input. I'll see if adding that to the description and on my site helps. It's a bit too late to put them in the video :)

derekbanas

@grizzlybayer listIterator is just a list of numbers. Think of a list as an array if that makes more sense. Assuming it is built with range(2, 16), listIterator is an array with 14 values stored in it, from 2 to 15, because range stops before its end value. This is used to cycle through the for loop. The first value of i will be 2, and the loop ends after 15, before it gets to 16. You can do anything with the information that BS gathers. Hope that answers your questions :)

derekbanas
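The contents of such a list are easy to confirm directly. In this sketch the name `list_iterator` and the helper `describe` are illustrative, and the list is assumed to be built with range(2, 16) as discussed in this thread; the stop value 16 is excluded.

```python
# Assumed to mirror the tutorial's list; the name list_iterator is illustrative.
list_iterator = list(range(2, 16))


def describe(values):
    """Return (count, first, last) for a non-empty list of numbers."""
    return len(values), values[0], values[-1]


if __name__ == "__main__":
    count, first, last = describe(list_iterator)
    print("%d values, from %d to %d" % (count, first, last))
    for i in list_iterator:
        # Placeholder for the per-item scraping work done in the video.
        pass
```

Run as a script, this prints "14 values, from 2 to 15". Raising the second argument of range gives the loop more items to work through.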

It is people like you who make the internet such a wonderful place to learn. Thanks a lot, and make more videos if possible.

bosepukur

I think Beautiful Soup isn't being worked on any longer. I'll have to double check everything to see what may be going on. I moved on to website scraping with PHP because I found it to be much easier.

derekbanas

Thank you, Derek, for sharing your expertise. Would you consider making a video on transferring scraped data to a WordPress database for publication?

jimmym