5 ways to delete duplicate records Oracle

In Oracle there are many ways to delete duplicate records. Note that below example are described to just explain the different possibilities.

Consider the EMP table with below rows

create table emp(
EMPNNO  integer,
EMPNAME varchar2(20),
SALARY  number);

10    Bill    2000
11    Bill    2000
12    Mark    3000
12    Mark    3000
12    Mark    3000
13    Tom    4000
14    Tom    5000
15    Susan    5000

1. Using rowid

SQL > delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

This technique can be applied to almost scenarios. Group by operation should be on the columns which identify the duplicates.

2. Using self-join

SQL > delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );

3. Using row_number()

SQL > delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);

This is another efficient way to delete duplicates

4. Using dense_rank()

SQL > delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid) rn
from emp
)
where rn > 1
);

Here you can use both rank() and dens_rank() since both will give unique records when order by rowid.

5. Using group by

Consider the EMP table with below rows

10    Bill    2000
11    Bill    2000
12    Mark    3000
13    Mark    3000

SQL > delete from emp where
(empno,empname,salary) in
(
select max(empno),empname,salary from emp
group by empname,salary
);

This technique is only applicable in few scenarios.

Always take extra caution while deleting records. 

1. First identify the duplicates using select.

2. Double verify those are actual  ‘duplicates’ or not

3. Take backup if necessary

4. Apply commit only if you are sure.

Did you find above post useful ? Your comments are highly valuable.

Advertisements

62 thoughts on “5 ways to delete duplicate records Oracle

  1. Delete from employee a where ROWID >
    (select min(ROWID) from employee b where a,eid=b,eid);

    I think it is the most easy way to delete the newer records from table. 🙂
    #stayhappykeepcoding

  2. The last method using group by seems to be incorrect. It will delete all duplicate occurring rows instead of deleting all but one.

  3. Hi,
    By using first method total records are deleted not only the duplicate records can you please tell me the reason
    delete from piuswhere rowid not in (select min(rowid) from piusgorup by ename);
    here pius is tablename and ename is duplicate column

  4. My table has two records that have the same data (duplicate). It has no primary key, no AutoIncrement column, no ROWID column (almost every example I find talks about ROWID, but when I use it on a query it says error: “ROWID column not found”). How do I proceed in this case?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s