How to use Python to parse JSON sitemaps | Flatten nested dictionaries to get codes for WEB SCRAPING

If you are web scraping with Scrapy, you may want to scrape many specific categories rather than simply following every link with a crawler.

If you can find a sitemap in JSON format, you can flatten its nested lists and dictionaries into a new list, then use that list for your URLs or to form the query string parameters for the URLs you want to scrape.
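
As a rough idea of what that flattening can look like, here is a minimal sketch. It is not the code from the video: the sitemap layout, the 'children' key, the example.com address and the 'category' parameter are all made up for illustration, so adjust them to whatever your real sitemap contains.

```python
from urllib.parse import urlencode

# Hypothetical sitemap, as it might look after json.loads(); a real one will differ.
sitemap = {
    "categories": [
        {"code": "c1", "name": "Garden", "children": [
            {"code": "c1a", "name": "Tools", "children": []},
        ]},
        {"code": "c2", "name": "Kitchen", "children": []},
    ]
}


def flatten(node):
    """Yield a {'code', 'name'} dict for every dictionary that has both keys, at any depth."""
    if isinstance(node, dict):
        if "code" in node and "name" in node:
            yield {"code": node["code"], "name": node["name"]}
        # Keep walking: nested categories may sit under any key.
        for value in node.values():
            yield from flatten(value)
    elif isinstance(node, list):
        for item in node:
            yield from flatten(item)
    # Strings, numbers and None are leaves, so there is nothing to do for them.


categories = list(flatten(sitemap))

# Placeholder base URL and parameter name, assumed for this example only.
urls = [
    "https://example.com/search?" + urlencode({"category": c["code"]})
    for c in categories
]
```

The resulting list of URLs can then be handed to a Scrapy spider, for example as its start_urls.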

Sound like hard work? Not really: 8 lines of code inside a function and off you go. Just print the type() of the value you are working with regularly, to check what type you are iterating through...
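
The type check might look something like this. It is only a sketch: the JSON string and the describe() helper are invented for the example and are not taken from the video.

```python
import json

# Hypothetical JSON text standing in for a downloaded sitemap.
raw = '{"categories": [{"code": "c1", "name": "Garden"}]}'
data = json.loads(raw)


def describe(value):
    """Return the type name of a value, e.g. 'dict', 'list' or 'str'."""
    return type(value).__name__


# Check the type at each level before deciding how to loop over it.
print(describe(data))                           # dict
print(describe(data["categories"]))             # list
print(describe(data["categories"][0]))          # dict
print(describe(data["categories"][0]["code"]))  # str
```

A dict means you loop over keys and values, a list means you loop over items, and anything else is a value you can collect.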

Timings:
0:00 Intro - About sitemaps
4:05 - Start the code
19:00 - Using slice to get 'code' and 'name'

Any questions, add a comment, I'll be pleased to reply!
Dr Pi.

#webscraping #json #sitemap