Address Book Data Cleanup

mkbw · January 10, 2025, 5:50pm

Hi,

I’m working on an address book solution that will become the system of record for addresses across multiple foreign systems. I have an export from another system with an incredible number of duplicates and near-duplicates: errors, typos, “Street” vs “St”, etc.

The total dataset is ~14,000 records. I suspect about 70% of these are duplicates or near-duplicates.

Have any of you tried using Glide or other tools to clean up something like this? I was thinking an LLM via API might be a solution.

The standard features in Excel are not up to the job on this.

ThinhDinh · January 11, 2025, 1:20am

My best guess would be something like this, or an autocomplete API, or maybe even a geocoding API?

You have to standardize all sorts of variations into a structured format, and also hope those APIs can correct the typos as well.
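
For the "Street" vs "St" kind of variation, a local pre-pass can shrink the list before anything is sent to an API. Below is a rough Python sketch of that idea, not a tested solution for this dataset: the abbreviation table, the function names, and the 0.85 threshold are all assumptions for illustration, and it uses only the standard library (`re` and `difflib`).

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table (an assumption); extend it for your own data.
ABBREVIATIONS = {
    "street": "st",
    "avenue": "ave",
    "road": "rd",
    "boulevard": "blvd",
    "drive": "dr",
    "suite": "ste",
    "apartment": "apt",
}


def normalize(address: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, abbreviate known words."""
    text = re.sub(r"[^\w\s]", " ", address.lower())
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)


def group_duplicates(addresses, threshold=0.85):
    """Greedy grouping: each address joins the first group whose normalized
    representative is at least `threshold` similar, else starts a new group."""
    groups = []  # list of (normalized representative, [original addresses])
    for address in addresses:
        norm = normalize(address)
        for representative, members in groups:
            if SequenceMatcher(None, representative, norm).ratio() >= threshold:
                members.append(address)
                break
        else:
            groups.append((norm, [address]))
    return [members for _, members in groups]


sample = ["123 Main Street", "123 Main St.", "123 Mian St", "77 Oak Avenue"]
for members in group_duplicates(sample):
    print(members)
```

Two caveats. Plain string similarity also scores "123 Main St" and "125 Main St" as near-identical, so the groups are candidates for review rather than safe automatic merges. And each record is compared against every existing group, which gets slow at ~14,000 rows; comparing only within the same postal code or city would cut that down considerably.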
