Re: 2 Million Records, Duplicate and SubString Removal

Re: 2 Million Records, Duplicate and SubString Removal

 

  

Hi Ranjit, you do realise this is a DB2 Forum?
Your question is too imprecise to answer, what do you mean by 'remove' and
'duplicates' for instance? the column value or the entire row? but clearly
you'll want to edit in a single pass rather than in two.




"Ranjit Singh
Sandhu "
<db2udbdba-ezmlms To
hield-x64140281.x "LazyDBA Discussion"
[Email Address Removed] <[Email address protected]
a.com> cc

28/06/2006 12:17 Subject
2 Million Records, Duplicate and
SubString Removal










Hello,
I have 2,000,000 records in an RDBMS (Oracle 9i/10g or SQL Server 2000)

Let me define the question in the following points:

1. I have 2 Million records like the following


TABLE NAME: NamesTable
NameId NameString length
R2ZRWRTY

RRRARAZZZRAAZRZARWZWZRWRZWZAZRRRZWRZRRRRZRWRRRZAARAZZRWRZWZZWZRRRRARZWRZRRRZ

AWRRZZRWRZRRZAZAARWRRRRANZRRRA 106


Now in the above structure of table assume you have 2 million records.
NameString length is never more than 200.

2. In the "NamesTable" Table Remove all the DUPLICATE "NameString".

3. In the table left after removing duplicates, REMOVE ALL THE SUB -
NameString.
For Example, If NameString1="WORLDISGREAT" and NameString2 = "GREAT",
then you can clearly see that NameString2 exists in NameString1. So REMOVE
NameString2 because it is contained in NameString1.
So remove all Sub-NameStrings from the table.

How do you think we should do this? We can use Oracle 9i/10g or Sql Server
2000. Considering that there is so much of data and there are so many
possible combinations.

What is the best possible and the fastest solution to solve this kind of
DUPLICATE AND SUB-STRING REMOVAL PROBLEM.

Please Help

Thanks and Regards
Ranjit



---------------------------------------------------------------------
PLEASE CLICK REPLY-ALL TO SEND A REPLY TO EVERYONE
website: http://www.LazyDBA.com
To unsubscribe: http://www.lazydba.com/unsubscribe.html




DB2 & UDB email list listserv db2-l LazyDBA home page