How To Get High Quality Data For Your Website Directory

A common question when people build a website directory is "where do I get the data?" I'll share why it's so important to collect high quality data, why you shouldn't take shortcuts here, and why I believe the quality of your data will ultimately reflect the usefulness of your directory.

Together, we'll scrape data from Google Maps, clean it, and enrich it.

Timestamps:
00:59 Importance of empathy when building a directory
02:27 Scraping Google Maps data
05:29 Using ChatGPT before cleaning the data
06:30 Cleaning data (by number of reviews)
07:08 Identifying the smartest way to clean 107k Excel Rows
08:11 Using Google Maps to filter by names
09:26 Examples of junk data
09:42 Cleaning the junk data (by filtering name column)
10:08 Cleaning data (by blank addresses)
11:09 Cleaning 52k rows of data (by name column)
12:29 Final Cleaning: Cherry picking the best data (by name column)
14:06 Enriching data: How it will differentiate your directory
14:38 Enriching data by Google Review Tags + Ahrefs
17:42 Reverse engineering how people search on Google Maps
18:13 Tip for finding manual data faster
20:20 Next steps to building your directory on WordPress
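
As a rough illustration of the cleaning passes listed in the timestamps (by number of reviews, by blank addresses, by the name column), here is a minimal Python sketch. The column names `name`, `address`, and `reviews`, the review threshold, and the `dog park` keyword are all assumptions made for the example; they are not the actual export format or the exact filters used in the video.

```python
import csv
import io


def clean_rows(rows, min_reviews=10, name_keyword="dog park"):
    """Keep rows that pass three simple filters.

    All defaults are hypothetical: tune them for your own niche and data.
    """
    kept = []
    for row in rows:
        name = (row.get("name") or "").strip()
        address = (row.get("address") or "").strip()
        # Review counts may be exported as text such as "1,204".
        try:
            reviews = int((row.get("reviews") or "0").replace(",", ""))
        except ValueError:
            reviews = 0
        if reviews < min_reviews:
            continue  # too few reviews to trust the listing
        if not address:
            continue  # blank address
        if name_keyword.lower() not in name.lower():
            continue  # name does not match the niche
        kept.append(row)
    return kept


# Tiny made-up sample standing in for a scraped export.
SAMPLE_CSV = """name,address,reviews
Riverside Dog Park,12 River Rd,85
Joe's Pizza,4 Main St,230
Hidden Dog Park,,40
Tiny Dog Park,9 Elm St,2
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = clean_rows(rows)
print([r["name"] for r in cleaned])
```

A keyword filter this blunt can discard good listings with unusual names, so it is worth spot-checking the dropped rows before deleting them.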

Want to connect? Follow me on Twitter/X @freychu
Comments

Edit: Ok so it appears that a lot of you have smarter ways to automate data enrichment, which is awesome. My email and Twitter/X DMs are open for anyone who is willing to chat about automating this step of the process! Would love to learn from you.

Before you start building your directory website, making a logo, choosing fonts in WordPress... here's the most important part of the whole directory building process.

If anyone discovers a better way to clean data, please share it below!


P.S. sorry for the bird that started chirping mid-video.

FreyChu

YOOOO WHATEVER YOU DO PLEASE PLEASE PLEASE FINISH OUT THIS SERIES from validating to building the website to programmatic SEO to taking the #1 Google Search result. YOU ARE THE MAN

MinhNguyen-wdwc

Happy New Year! Awesome video again. Looking forward to learning more technical stuff like this!

kohkoh

Thanks man, great video.

A couple thoughts:
1. The more data you have in your directory, the bigger the moat. While parsing through a massive number of listings is a lot of work, it also makes the data more valuable.
2. For the handful of directories I have, APIs have been incredible. Sometimes people have already collected the data and offered it publicly. Your job is then to present the data through a new lens. Yesterday I scraped data for 615 movies, and am building a directory in a movie niche. I used Claude to build a script that scraped from multiple data sources and combined it. I just told Claude in plain language what I wanted. Worked well.
3. My hypothesis is part of the added value of a directory is someone knowing that a real person created it. It isn’t some programmatic BOT spinning up a website. This perception adds value to the data.

I am looking forward to your video on getting traffic

marcusedvalson

Good job! Thanks for taking the time to share and teach us what you know. Looking forward to seeing how your directory turns out.

camiloaquino

Man, you came through again with a gold mine 🎉 thanks ❤

hashim.pakara

Enjoying your work! Trying to build an area directory

Debblogging

Fascinating video. I know even less about spreadsheets than you do, but I still want to learn how to make a directory with recent technology like AI. Approx 20 something years ago I made a local directory for doctors by manually copying and pasting the data. I enjoyed building it, but I eventually let it go because I knew nothing about getting traffic to it. Without traffic, it was a useless project.

luxurycardstore

Another banger video. Now I am getting greedy to see you build a whole WordPress website with this big data. I would love to see how you monetize and display ads on the website. Thank you for doing this for us. 😃

jddude

I just did a directory that started with 25,000 rows, refined to roughly 1,800. I was able to automate the data enrichment with Google Sheets add-ons. I think you could automate your review scraping pretty easily and clean the data with clever AI prompting. If you want some help I'd be happy to give some pointers.

BWBGarage

Thanks a lot for the video.
I am a web dev. I don't know much about WordPress, but there is a way to fill the data into your website easily using Puppeteer or Cypress.
Basically, it acts as a robot that controls your browser, reads your CSV file, fills in the form, and submits the data row by row.

myjamal

Thanks for the video. I really like the idea of a web directory even though it's so 1996. Might work. I'm having trouble understanding why someone would visit my web directory instead of just googling. I would like to see your take on this in a future video. Thanks bro!

veed

Hey @FreyChu, you mentioned copy-pasting data from Google Maps listings into the directory. Is it allowed to copy data like reviews and images and store them on the site, or is that against Google's terms? There is no clear answer for this anywhere.

triwebdigital

Great video, thanks for sharing all these behind-the-scenes details. This was very helpful. Question: Do you also get the photos when you scrape the data? If you do, how do you go about adding them to the directory? Is that a manual or automated process?

jwalkoviak

Hey Frey, thanks for the video! I think Outscraper might be skirting around the Places API TOS by not "caching or storing" the data they send to you, and not selling the data, instead selling the service of pulling the data. However, if you use the data you might violate Google's TOS, so you should probably be careful. Anyways GL!

mmxcrono

Hey Frey, I wanna ask: if I'm starting out learning SEO, are there any courses you'd recommend taking?

DannyChowmrchow

Amazing video! :) Looking forward to more videos like this! Very interesting.

If you manage to find a good way to automate this step more, please share it here on YouTube! Would be nice to see! Thanks man for the value.

PinkKoala-ks

Is 122,916 the max it can spit out? I get the same number for every pull

eNVy

Can you please share the website template design for the dog park niche?

einthwil

Hi Frey, what do you think about using a CMS like WordPress compared to using dedicated platforms like eDirectory, Brilliant Directories, etc. for building a directory?

zakhir