[Tech] Database Merge Duplicates Program nearly done
Jamie O'Keefe
jokeefe at jamesokeefe.org
Sun Oct 21 03:40:09 EDT 2007
I have a working program to merge duplicates from the effort to
combine all supporter records into one database. Dan will be happy to
know that each new record notes its voter id and contributor db id.
The whole process took about a minute to run.
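For anyone curious what such a merge can look like, here is a minimal sketch in Python. It is not the actual program: the field names (`name`, `zip`, `voter_id`, `contributor_id`, `email`) and the choice of matching on name plus zip code are assumptions made up for illustration. It groups records that share a match key and folds each group into one record, so that a merged record keeps both its voter id and its contributor db id.

```python
from collections import defaultdict


def normalize_name(name):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(name.lower().split())


def merge_duplicates(records):
    """Group records by a match key and fold each group into one record.

    Each record is a dict. The field names are hypothetical, and the match
    key (normalized name + zip) is an assumption, not the real matching rule.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (normalize_name(rec["name"]), rec.get("zip", ""))
        groups[key].append(rec)

    merged = []
    for recs in groups.values():
        out = {}
        for rec in recs:
            for field, value in rec.items():
                # Keep the first non-empty value seen for each field.
                if value and not out.get(field):
                    out[field] = value
        merged.append(out)
    return merged
```

With this approach a supporter who appears once in the voter file and once in the contributor file ends up as a single record carrying both ids, provided the two source records agree on the match key.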
I need to correct the phone number matching and we need to finish
reviewing the names for errors, but I am hopeful that it will be
finished by tomorrow night.
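One common way to make phone number matching more reliable is to reduce every number to its digits before comparing. The sketch below assumes US-style 10-digit numbers and a hypothetical helper name, `normalize_phone`; it is an illustration of the idea, not the fix that will go into the merge program.

```python
import re


def normalize_phone(raw):
    """Reduce a US phone number to 10 digits, or return "" if that fails."""
    # Strip everything that is not a digit: spaces, dashes, parentheses, dots.
    digits = re.sub(r"\D", "", raw or "")
    # Drop a leading country code of 1 if present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    # Anything that is not exactly 10 digits is treated as unusable.
    return digits if len(digits) == 10 else ""
```

Two records would then count as a phone match only when both normalized values are non-empty and equal, which keeps blank or partial numbers from matching each other.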
I started with 61856 records, and after duplicates were merged, 36182
records were left. We haven't corrected all of the names, so there
are more duplicates to be found. With this uncorrected data, here are
some stats:
Address info
  Bad Address    408
  Updated Add    5967
  Other Addr.    27000+

Party breakdown
  F    192
  G    1267
  J    8580
  D    2959
  R    112
  U    2468
Note that this only has the latest F/G/Js. We have not combed
through the voter database to correct the record of anyone who might
have moved out of state or changed party.
Email info
  email    9222
  blank    26960
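The figures quoted in this message can be cross-checked with a few lines of Python. This is only arithmetic on the numbers above; the variable names are made up for the example.

```python
start, after_merge = 61856, 36182

# Records removed by merging duplicates, and the share of the original set.
removed = start - after_merge
print(removed)                            # 25674
print(round(100 * removed / start, 1))    # 41.5

# The email tallies should add up to the merged record count.
email_counts = {"email": 9222, "blank": 26960}
assert sum(email_counts.values()) == after_merge
print(round(100 * email_counts["email"] / after_merge, 1))  # 25.5

# The party tallies cover only part of the merged records.
party_counts = {"F": 192, "G": 1267, "J": 8580, "D": 2959, "R": 112, "U": 2468}
party_total = sum(party_counts.values())
print(party_total)                        # 15578
```

So the merge removed 25674 records (about 41.5% of the starting set), roughly a quarter of the merged records have an email address, and the party tallies account for 15578 of the 36182 records.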
Anyway, this is great progress that I hope the campaigns will be able
to use soon. Once we have merged the duplicate records, I will load
them into our web db and then give out logins to the campaigns.
peace,
Jamie