Python Web Crawler Tutorial - 3 - Adding and Deleting Links

Comments
Author

For anyone who is still facing this problem, replace write_file with the version below (for Python 3.2 or later):
    import os

    # Create a new file, creating its parent directory first if needed
    def write_file(path, data):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(data)

iEmrul
Author

If you are getting an error when creating a file, remember that you need to create the directory first and then pass the directory name to the create_file function.

tusharbarman
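
The order described in the comment above can be sketched roughly as follows. This is a minimal sketch, not the tutorial's exact code: create_project_dir is an assumed helper name, the crawled.txt file name and the example URL are assumptions for illustration, and the demo writes into a temporary folder.

```python
import os
import tempfile


def create_project_dir(directory):
    # Create the project folder; do nothing if it already exists.
    os.makedirs(directory, exist_ok=True)


def write_file(path, data):
    # open(..., "w") creates the file but not any missing parent folders.
    with open(path, "w") as f:
        f.write(data)


def create_data_files(project_name, base_url):
    # Assumed layout: one queue file and one crawled file per project.
    queue = os.path.join(project_name, "queue.txt")
    crawled = os.path.join(project_name, "crawled.txt")
    if not os.path.isfile(queue):
        write_file(queue, base_url)
    if not os.path.isfile(crawled):
        write_file(crawled, "")


if __name__ == "__main__":
    # Use a throwaway folder so the sketch leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        project = os.path.join(tmp, "thenewboston")
        create_project_dir(project)                         # directory first
        create_data_files(project, "https://example.com/")  # then the files
        print(sorted(os.listdir(project)))
```

Calling create_data_files before create_project_dir would fail inside open(), because the parent folder would not exist yet.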
Author

The way he said "I love you"... reminded me of Deadpool.

amrityamsrivastava
Author

🐐🐐 When he said that he was going to use multithreading to speed up the program: a definite sub 🔥🔥. I'm very hopeful that this tutorial will help me seal a deal soon.

bagoviggo
Author

Also not getting it to work at 0:57. Not getting an error, just not getting the new pages created. Running newest version of PyCharm with Anaconda3, if that would make a difference.

jonnygoth
Author

I want a web crawler robot analysis algorithm

o_minaq
Author

Hey, Bucky! I'm really confused by the way the file contents are deleted; it seems like a waste of memory. Do you mean that when a URL is crawled, the contents of the crawled file and the queue file need to be cleared and rewritten with new data? Is there a better way to do this?

yichenzhu
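
One possible answer to the question above, offered as a sketch rather than as what the tutorial does: data that only grows can be appended instead of rewritten, and a file that must change as a whole can be written to a temporary file and swapped in with os.replace (Python 3.3 or later). The function names append_line and replace_lines, the file names, and the URLs are all assumptions for illustration.

```python
import os
import tempfile


def append_line(path, line):
    # Grow-only data (e.g. crawled URLs) can be appended; no rewrite needed.
    with open(path, "a") as f:
        f.write(line + "\n")


def replace_lines(path, lines):
    # Write the new content to a temp file in the same folder, then swap it
    # in with os.replace so the old file is never left half-written.
    folder = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=folder, text=True)
    with os.fdopen(fd, "w") as f:
        for line in sorted(lines):
            f.write(line + "\n")
    os.replace(tmp_path, path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        crawled = os.path.join(tmp, "crawled.txt")
        queue = os.path.join(tmp, "queue.txt")
        append_line(crawled, "https://example.com/")
        replace_lines(queue, {"https://example.com/b", "https://example.com/a"})
        with open(queue) as f:
            print(f.read().splitlines())
```

The whole-file rewrite is still there for the queue, but it happens off to the side, so an interrupted run leaves the previous queue file intact.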
Author

Isn't it a better idea to store the information about the queue and the crawled links in separate classes? Reading and writing files takes a lot of time.

wBacz
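
The idea in the comment above could look something like this. It is a minimal sketch under assumptions: the Frontier class, its method names, and the URLs are hypothetical and not taken from the tutorial. The queue and the crawled links live in in-memory sets, and files are written only when save is called.

```python
class Frontier:
    """Hypothetical container that keeps crawler state in memory."""

    def __init__(self, start_url):
        self.queue = {start_url}
        self.crawled = set()

    def add_links(self, links):
        # Queue only links that are neither waiting nor already crawled.
        for link in links:
            if link not in self.queue and link not in self.crawled:
                self.queue.add(link)

    def mark_crawled(self, url):
        # Move a URL from the queue to the crawled set.
        self.queue.discard(url)
        self.crawled.add(url)

    def save(self, queue_path, crawled_path):
        # Persist occasionally instead of touching the disk for every link.
        for path, urls in ((queue_path, self.queue), (crawled_path, self.crawled)):
            with open(path, "w") as f:
                f.write("\n".join(sorted(urls)))


if __name__ == "__main__":
    frontier = Frontier("https://example.com/")
    frontier.add_links(["https://example.com/a", "https://example.com/"])
    frontier.mark_crawled("https://example.com/")
    print(sorted(frontier.queue), sorted(frontier.crawled))
```

Set membership checks are fast, so duplicate detection no longer needs a file read; the trade-off is that anything not yet saved is lost if the program stops unexpectedly.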
Author

I ran into a strange issue while writing the program. It works when he runs it at 0:57, but when I do the same, I end up with this error:

Traceback (most recent call last):
  File "C:/Users/theho/PycharmProjects/Crawler/general.py", line 24, in <module>
  File "C:/Users/theho/PycharmProjects/Crawler/general.py", line 14, in create_data_files
    write_file(queue, base_url)
  File "C:/Users/theho/PycharmProjects/Crawler/general.py", line 19, in write_file
    f = open(path, 'w')
FileNotFoundError: [Errno 2] No such file or directory: 'thenewboston/queue.txt'

jebprime
Author

Well... why can't we just right-click and create a new file in PyCharm? xD

amoghkulkarni