Webscraping with Python: How to Save to CSV, JSON and Clean Data


This is the fourth video in the webscraping 101 series, looking at how to export our scraped data to JSON and CSV, along with some simple data cleaning pipelines.
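
As a rough illustration of what such an export step can look like, here is a minimal sketch using only the standard library. The product fields, the sample values, the `clean_price` and `clean_product` helpers, and the file names are all assumptions made for illustration; they are not the code from the video.

```python
import csv
import json

# Hypothetical scraped records; field names and values are assumptions.
products = [
    {"name": "Item Trail Shoe", "price": "$120.00"},
    {"name": "Item Rain Jacket", "price": "$89.50"},
]


def clean_price(raw):
    """Strip the currency symbol and whitespace, then convert to float."""
    return float(raw.replace("$", "").strip())


def clean_product(product):
    """Return a cleaned copy of one record without mutating the input."""
    return {"name": product["name"].strip(), "price": clean_price(product["price"])}


def save_json(rows, path):
    """Write all rows to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2)


def save_csv(rows, path):
    """Write all rows to a CSV file with a header line."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "price"])
        writer.writeheader()
        writer.writerows(rows)


cleaned = [clean_product(p) for p in products]
save_json(cleaned, "products.json")
save_csv(cleaned, "products.csv")
```

Cleaning before export keeps both output files consistent, since the JSON and CSV writers receive the same list of cleaned dictionaries.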

This is a series so make sure you subscribe to get the remaining episodes as they are released!

If you are new, welcome! I am John, a self-taught Python (and Go, kinda..) developer working in the web and data space. I specialize in data extraction and JSON web APIs, both server and client. If you like programming and web content as much as I do, you can subscribe for weekly content.

:: Links ::

:: Disclaimer ::
Some/all of the links above are affiliate links. By clicking on these links I receive a small commission should you choose to purchase any services or items.
Comments

I really enjoy this series and will probably need to replay it in the future. This is helpful and practical, as it shows the whole process of how to approach it.
Thank you, John.

AliceShisori

Thanks for another informative video!

There is one tiny concern with the append_to_csv code. The file lacks the normal (but optional per RFC 4180) header that some apps expect or that may be needed if there were more fields in the file. This small change would create the header line just once when the file is created. Before the with block simply add this little bit of code:

    # Check if the file exists (requires `import os` at the top of the script)
    if not os.path.exists('append.csv'):
        # Open file in write mode to write the header line
        with open('append.csv', 'w', newline='') as f:
            writer = csv.DictWriter(f, field_names)
            writer.writeheader()

PanFlute
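
The suggestion above can also be folded into the append function itself. This is a minimal sketch, assuming an `append_to_csv(row, path, field_names)` signature that may differ from the function shown in the video; the empty-file check is an extra assumption on top of the comment's idea.

```python
import csv
import os


def append_to_csv(row, path, field_names):
    """Append one row, writing the header only when the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    # newline="" lets the csv module control line endings itself.
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=field_names)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)
```

Calling it repeatedly with the same path produces one header line followed by one line per row, so the file stays readable by tools that expect a header.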

Still waiting for the Neovim setup video ❤

bakasenpaidesu

John, thanks a lot for your videos! They are really interesting and well made, and I learnt a lot from you! Many thanks! CHEERS!

andrepereira

John,

Another great presentation!

Also the program is very logically developed.

I liked seeing the list comprehensions.

Another idea: the program could have a GUI front end where the user inputs some conditions, product categories, names or whatever, and the program returns records matching those conditions, either one at a time or in a table on the form. Just a thought.

Thanks!

thebuggser

Wouldn't the clean_data function also remove the word "Item" and "$" from the name of the product too?

rajatkumar
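
Whether that happens depends on how `clean_data` is written, which this page does not show. As a hedged sketch: if the replacements are applied to every field, the product name is affected as well, while restricting them to the price field leaves the name alone. Both functions and the sample record below are hypothetical and only illustrate the difference.

```python
def clean_all_fields(record):
    """Apply the same replacements to every string value (changes the name too)."""
    return {
        key: value.replace("Item", "").replace("$", "").strip()
        for key, value in record.items()
    }


def clean_price_only(record):
    """Strip the currency symbol from the price and leave the name untouched."""
    cleaned = dict(record)
    cleaned["price"] = float(record["price"].replace("$", "").strip())
    return cleaned


# Hypothetical record whose name legitimately contains "Item" and "$".
record = {"name": "Item $5 Gift Card", "price": "$5.00"}
print(clean_all_fields(record))
print(clean_price_only(record))
```

Cleaning per field costs a little more code but avoids silently altering values that happen to contain the same characters.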

Thanks for the information, but I have a question about something similar to this topic. If I have a local web page with some graphics in JPG format, how do I scrape them or store them in a specific folder using a web scraper? Thanks a lot for all the info.

mohammedaldbag
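
One possible approach to the question above, sketched with the standard library only: parse the page for `<img>` tags, keep the sources that end in `.jpg` or `.jpeg`, and save each one into a folder. The class and function names are hypothetical, and the sketch assumes the images are referenced through plain `src` attributes rather than loaded by JavaScript.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag that points at a JPEG."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        # Compare only the path part so query strings do not hide the extension.
        if src and urlparse(src).path.lower().endswith((".jpg", ".jpeg")):
            self.sources.append(src)


def find_jpgs(html, base_url):
    """Return absolute URLs for the JPEG images referenced in the HTML."""
    collector = ImageCollector()
    collector.feed(html)
    return [urljoin(base_url, src) for src in collector.sources]


def download_images(urls, folder):
    """Save each image into the folder, named after the last URL path segment."""
    target = Path(folder)
    target.mkdir(parents=True, exist_ok=True)
    for url in urls:
        filename = Path(urlparse(url).path).name
        with urlopen(url) as response:
            (target / filename).write_bytes(response.read())
```

For a page served locally, usage might look like `download_images(find_jpgs(html, "http://localhost:8000/"), "images")`, where the address and folder name are placeholders.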

Thanks bro for sharing the great content. If you don't mind, could you make the same or another web scraping video using object-oriented programming concepts?

adarshjamwal

I am scraping an online shopping site.
For the last 10 days it hasn't been working properly.
After 3-4 scans it takes about 15-20x longer to scan.
After that it runs smoothly for 2-3 scans, and then it takes a long time again.

Why is this happening?

I'm using Scrapy.

lordlegendsss
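
The comment above does not include enough detail to diagnose the slowdown. One possibility, and it is only an assumption, is that the site throttles or delays responses after a burst of requests. If that is the case, Scrapy's AutoThrottle extension and its related settings are one thing to try. The fragment below uses real Scrapy setting names, but the values are illustrative guesses rather than tuned numbers.

```python
# settings.py fragment (values are illustrative assumptions, not recommendations)
AUTOTHROTTLE_ENABLED = True            # adapt the delay to observed server latency
AUTOTHROTTLE_START_DELAY = 1.0         # initial download delay in seconds
AUTOTHROTTLE_MAX_DELAY = 30.0          # upper bound on the delay when latency is high
AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0  # average parallel requests per remote server
AUTOTHROTTLE_DEBUG = True              # log throttling stats for every response
CONCURRENT_REQUESTS_PER_DOMAIN = 2     # cap parallel requests to a single domain
DOWNLOAD_DELAY = 0.5                   # minimum delay between requests to one site
```

With `AUTOTHROTTLE_DEBUG` enabled, the log shows the latency and delay for each response, which can help confirm whether the server itself is slowing down between runs.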