[Tech] Dup detecting advice sought
john saylor
js0000 at gmail.com
Mon Oct 29 13:28:53 EST 2007
hi
On 10/29/07, Jamie O'Keefe <jokeefe at jamesokeefe.org> wrote:
> I think this ranking is a good one. However, how do I choose which to
> use if two or more records are from the contributor DB? Should I
> choose both?
can you write a list of these cases? i don't know if a computer can
make this decision, or if it's worth the time to program one to do it.
i think the optimal solution is to have the software do most of the
matching, but have a separate path for the matches that are too
difficult for it to decide [like the one you mention].
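
a rough sketch of that split, in python. everything here is an assumption for illustration: the record layout, the `triage` name, and the source ranking [only the contributor DB is actually named in this thread]. it lets the ranking pick a winner when it can, and hands ties back for a person to look at.

```python
# hypothetical ranking of sources: lower number wins.
# only "contributor" comes from this thread; the others are made up.
SOURCE_RANK = {"contributor": 0, "newsletter": 1, "event": 2}

def triage(duplicates):
    """Return (winner, None) if the ranking settles it,
    else (None, tied_records) so the tie can go to manual review."""
    best = min(SOURCE_RANK[r["source"]] for r in duplicates)
    top = [r for r in duplicates if SOURCE_RANK[r["source"]] == best]
    if len(top) == 1:
        return top[0], None
    return None, top
```

two records from the contributor DB would come back as a tie, which is the case being asked about.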
there may be no shortcut with these cases.
do any of the records have dates with them? that might be a possible solution.
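
if they do, picking the newest could look something like this. the `updated` field name and the ISO date format [YYYY-MM-DD] are assumptions, since i don't know what the records actually hold.

```python
from datetime import date

def newest(records, field="updated"):
    """Return the record with the latest date in `field`,
    or None if no record carries a date at all."""
    dated = [r for r in records if r.get(field)]
    if not dated:
        return None
    # assumes dates are stored as ISO strings like "2007-10-29"
    return max(dated, key=lambda r: date.fromisoformat(r[field]))
```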
also, you can check to see if one record has more data than another.
this may not be the most recent record, but if you combine the fields
in some way you may end up ahead.
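
one way to combine the fields, as a sketch only: start from the fullest record and fill its gaps from the others. the `filled` and `merge` names are made up, and treating None and the empty string as "no data" is an assumption.

```python
def filled(record):
    """Count the fields that actually hold data."""
    return sum(1 for v in record.values() if v not in (None, ""))

def merge(records):
    """Combine duplicates: the fullest record wins each field,
    and the others only fill in what is still empty."""
    ordered = sorted(records, key=filled, reverse=True)
    merged = {}
    for rec in ordered:
        for key, value in rec.items():
            if merged.get(key) in (None, "") and value not in (None, ""):
                merged[key] = value
    return merged
```

note that this never overwrites a value that is already there, so conflicting values [two different phone numbers, say] still need one of the other rules or a person to decide.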
maybe someone else has a better answer ...
--
\js [ http://or8.net/~johns/ ]