Troubleshooting Deleting duplicated records
Introduction
Your
organization has a lot of duplicated records
in their Account Employee table. This is a
daily problem in any organization and you
should be prepared to fix the problem. You
have been assigned to clean all of the
duplicated records.
Begin by
logging in as the oracle user.
SQL> CONNECT
oracle/learning
Set the
linesize to 100 and the pagesize to 55.
SQL> SET
LINESIZE 100 PAGESIZE 55
CREATE TABLE
AS �
First, let's
create a table with lots of duplicated
records. Create a table named dup_emp and
copy all of the EMP's records into it.
SQL> CREATE
TABLE dup_emp
AS (SELECT * FROM emp)
SQL> /
Query the
dup_emp table.
SQL> SELECT *
FROM dup_emp
SQL> /
INSERT INTO �
SELECT command
Insert all the
accounting department rows into the dup_emp
table.
SQL> INSERT
INTO dup_emp
SELECT * FROM emp WHERE deptno = 10
SQL> /
SQL> commit;
Query all the
duplicated records from the dup_emp table
order by the employee name.
SQL> SELECT *
FROM dup_emp
WHERE deptno = 10
ORDER BY ename
SQL> /
Notice that
all of the employees of department 10 have
been duplicated. Don�t try to understand the
following PL/SQL statements. You will learn
how to write a PL/SQL statement at the
PL/SQL book.
Using PL/SQL
to delete duplicated records
Write a
procedure to delete all of the duplicated
records.
SQL> DECLARE
SQL> -- Get the unique empno.
SQL> CURSOR c_empno IS
SQL> SELECT DISTINCT empno AS empno FROM
dup_emp;
SQL> -- Get all duplicated records for an
employee.
SQL> CURSOR c_duprec (p_empno NUMBER) IS
SQL> SELECT rowid FROM dup_emp WHERE empno =
p_empno;
SQL> first_one BOOLEAN;
SQL> BEGIN
SQL> FOR v_empno IN c_empno LOOP
SQL> first_one := TRUE;
SQL> FOR v_duprec IN c_duprec (v_empno.empno)
LOOP
SQL> IF NOT first_one THEN
SQL> DELETE FROM dup_emp
SQL> WHERE rowid = v_duprec.rowid;
SQL> COMMIT;
SQL> END IF;
SQL> first_one := FALSE;
SQL> END LOOP;
SQL> END LOOP;
SQL> END;
SQL> /
Query again
the dup_emp table order by the department
and employee number.
Then check for duplication.
SQL> SELECT * FROM
dup_emp
WHERE deptno = 10
ORDER BY ename
SQL> /
Notice that there are no
duplicated records.
Using SQL to
delete duplicated records
You could
write a delete statement to remove all
duplicated records. The following statement
will delete all records that have the same
employee number except the first one.
In this time, you did not have any
duplicated record to be deleted.
SQL> DELETE
FROM dup_emp
WHERE ROWID IN (SELECT MAX(ROWID)
FROM dup_emp
GROUP BY empno
HAVING COUNT (empno) > 1)
SQL> /
Drop the
dup_emp table.
SQL> DROP
TABLE dup_emp
SQL> /
Questions:
Q: Why it is
important to eliminate duplicate records?
Q: What does
the following SQL statement?
SQL> DELETE
FROM dup_emp
WHERE ROWID IN (SELECT MAX(ROWID)
FROM dup_emp
GROUP BY empno
HAVING COUNT (empno) > 1)
SQL> / |