Deleting Fully Duplicated Records and Records Duplicated on Key Fields

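Duplicate records fall into two kinds. The first is fully duplicated records, in which every field is duplicated. The second is records duplicated on some key fields, for example where the Name field is duplicated while the other fields may or may not be.
    1. The first kind is easy to handle, and the approach is similar across database systems: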
    MySQL
    create table tmp select distinct * from tableName;
    drop table tableName;
    create table tableName select * from tmp;
    drop table tmp;
    SQL Server
    select distinct * into #Tmp from tableName;
    drop table tableName;
    select * into tableName from #Tmp;
    drop table #Tmp;
    Oracle
    create table tmp as select distinct * from tableName;
    drop table tableName;
    create table tableName as select * from tmp;
    drop table tmp;
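    Duplicates of this kind come from an inadequate table design; adding an indexed column that identifies each row (a primary key or unique index) prevents the problem.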
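
The copy-distinct-and-swap steps above can be tried end to end with a small script. This is only a sketch, assuming Python's built-in sqlite3 module and an in-memory database; SQLite accepts the same CREATE TABLE ... AS SELECT form shown for Oracle. The helper name rebuild_distinct, the index name ux_tableName and the sample rows are made up for illustration.

```python
import sqlite3

def rebuild_distinct(conn, table):
    """Rebuild `table` so that fully identical rows appear only once."""
    cur = conn.cursor()
    # Copy the distinct rows aside, drop the original, then restore it.
    cur.execute(f"CREATE TABLE tmp AS SELECT DISTINCT * FROM {table}")
    cur.execute(f"DROP TABLE {table}")
    cur.execute(f"CREATE TABLE {table} AS SELECT * FROM tmp")
    cur.execute("DROP TABLE tmp")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?)",
    [("Ann", "1 Main St"), ("Ann", "1 Main St"), ("Bob", "2 Oak Ave")],
)
rebuild_distinct(conn, "tableName")
rows = conn.execute(
    "SELECT Name, Address FROM tableName ORDER BY Name"
).fetchall()

# A unique index over the columns then rejects any further exact duplicate.
conn.execute("CREATE UNIQUE INDEX ux_tableName ON tableName (Name, Address)")
try:
    conn.execute("INSERT INTO tableName VALUES ('Ann', '1 Main St')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

After the rebuild, rows holds one Ann row and one Bob row, and the second attempt to insert Ann is refused by the unique index.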
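    2. For the second kind, the usual requirement is to keep the first record in each group of duplicates. The procedure is as follows. Assume the duplicated fields are Name and Address, and the goal is a result set in which each (Name, Address) pair appears only once.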
    MySQL
    -- an AUTO_INCREMENT column must be indexed, hence the unique key
    alter table tableName add autoID int not null auto_increment unique;
    create table tmp select min(autoID) as autoID from tableName group by Name,Address;
    create table tmp2 select tableName.* from tableName,tmp where tableName.autoID = tmp.autoID;
    drop table tableName;
    rename table tmp2 to tableName;
    drop table tmp;
    SQL Server
    select identity(int,1,1) as autoID, * into #Tmp from tableName;
    select min(autoID) as autoID into #Tmp2 from #Tmp group by Name,Address;
    drop table tableName;
    select * into tableName from #Tmp where autoID in(select autoID from #Tmp2);
    drop table #Tmp;
    drop table #Tmp2;
    Oracle
    DELETE FROM tableName t1 WHERE t1.ROWID > (SELECT MIN(t2.ROWID) FROM tableName t2 WHERE t2.Name = t1.Name and t2.Address = t1.Address);
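
The keep-the-smallest-ROWID delete can be exercised locally without an Oracle instance. The sketch below assumes Python's built-in sqlite3 module; SQLite's implicit rowid column stands in for Oracle's ROWID here, which is an analogy and not the same mechanism. The Phone column and the sample rows are made up to show that only Name and Address decide what counts as a duplicate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?)",
    [
        ("Ann", "1 Main St", "555-0100"),
        ("Ann", "1 Main St", "555-0199"),  # same Name and Address, other Phone
        ("Bob", "2 Oak Ave", "555-0111"),
    ],
)

# Delete every row whose rowid is larger than the smallest rowid
# among the rows sharing its (Name, Address) pair.
conn.execute(
    """
    DELETE FROM tableName
    WHERE rowid > (
        SELECT MIN(t2.rowid)
        FROM tableName AS t2
        WHERE t2.Name = tableName.Name
          AND t2.Address = tableName.Address
    )
    """
)
conn.commit()

remaining = conn.execute(
    "SELECT Name, Address, Phone FROM tableName ORDER BY rowid"
).fetchall()
```

Only the first Ann row and the Bob row survive; the later Ann row is removed even though its Phone differs.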
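    Notes:
    1. In the MySQL and SQL Server versions, the final select produces a result set with no duplicate (Name, Address) pairs. It carries an extra autoID column; in practice you can omit that column by listing the wanted columns explicitly in the select clause.
    2. Because MySQL and SQL Server do not provide a rowid mechanism, an autoID column is needed to make each row uniquely identifiable, whereas Oracle's ROWID makes this much more convenient. Using ROWID is also an efficient way to delete duplicate records.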
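
Both notes can be illustrated with a sketch of the autoID approach, again assuming Python's built-in sqlite3 module instead of MySQL or SQL Server. The rows are numbered with an auto-incrementing autoID, the smallest autoID per (Name, Address) group is kept, and the final select lists its columns explicitly so that autoID does not end up in the rebuilt table. The Phone column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableName (Name TEXT, Address TEXT, Phone TEXT)")
conn.executemany(
    "INSERT INTO tableName VALUES (?, ?, ?)",
    [
        ("Ann", "1 Main St", "555-0100"),
        ("Ann", "1 Main St", "555-0199"),  # duplicate on Name and Address
        ("Bob", "2 Oak Ave", "555-0111"),
        ("Ann", "9 Elm Rd", "555-0123"),   # same Name, different Address
    ],
)
cur = conn.cursor()

# Step 1: copy the rows into a table that numbers them with autoID.
cur.execute(
    "CREATE TABLE tmp (autoID INTEGER PRIMARY KEY AUTOINCREMENT, "
    "Name TEXT, Address TEXT, Phone TEXT)"
)
cur.execute(
    "INSERT INTO tmp (Name, Address, Phone) "
    "SELECT Name, Address, Phone FROM tableName ORDER BY rowid"
)

# Step 2: pick the smallest autoID in each (Name, Address) group.
cur.execute(
    "CREATE TABLE tmp2 AS "
    "SELECT MIN(autoID) AS autoID FROM tmp GROUP BY Name, Address"
)

# Step 3: rebuild the table from the kept rows, leaving autoID out.
cur.execute("DROP TABLE tableName")
cur.execute(
    "CREATE TABLE tableName AS "
    "SELECT Name, Address, Phone FROM tmp "
    "WHERE autoID IN (SELECT autoID FROM tmp2)"
)
cur.execute("DROP TABLE tmp")
cur.execute("DROP TABLE tmp2")
conn.commit()

deduped = conn.execute(
    "SELECT * FROM tableName ORDER BY Name, Address"
).fetchall()
```

The rebuilt table has three columns and three rows: the two Ann rows at different addresses are both kept, because duplicates are judged on the Name and Address pair, not on Name alone.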