Remove Duplicates From Data List
Duplicates are very common in not only Excel but also databases. And the one
of the easiest ways to work the data is through Excel. During our
Advanced
Excel Course, users always asked us how to remove these records in the database.
And one of the
most common problems is having muliple instance of the same email
address because the client share one email address or having the same mailing address
because your clients are staying in the same house.
So imagine when you are sending emails or printed materials to all your customers,
your customers' inbox or letter boxes will have a few copies of the same
emails or printed materials from you. Not only it is a waste of resources,
it also does not reflect very well on your company too.
There are 2 ways to remove duplicates in Excel. The first way is by using
Advanced Filter. This method is very effective if 2 records in your database
is exactly the same, meaning the content in all the fields of one record is
the exactly the same as another. Should one of the fields differs, then
the record is considered different and will not be filtered out. First,
let me show you what I mean by that.
You have a list of email addresses. In the list, the last 2 records are
repeats of the first 2 records. And you want to remove them using
advanced filter. And here is what you do.
- Go to the menu and click on Data -> Filter -> Advanced Filter.
- Select the option to copy the results into another location.
- Select the range of the list. In our case, it is A1 to A7, including the header (A must!).
- In the box for Copy to: select a cell away from the list e.g. cell F1
- Make sure that the box next to "unique records only" is checked.
- Click Ok and the results in cell F1 will show a list without the last 2 emails
which are duplicates.
Now if you have a list containing Emails and Names as shown:
Using Advanced Filter, you will get a list with 5 records instead of
4 in the previous example. This is because the names for the email address
jason@anc.com (A2 and A6) are different (Jason and Jason Khoo). As such,
they are recognised as 2 different records.
To remove multiple instances of the same email address from such a list,
you have to use the 2nd method, i.e. the IF
formula method. Here are the steps:
- Sort the list by emails.
- In column C, enter the IF formula to compare the email address in the current
row with that contained in the previous row. If the email address is the same,
i.e. A3 is the same as A2, the formula will return the word "duplicate"
in cell C3. If they are different (A3 is different from A2, return a
blank - denoted by double inverted comma "")
- Copy the formula down.
- Record 4 and 7 will show up as duplicate. In this example, record 4 is also marked
even though the Names (in B3 and B4) are different.
- Activate auto-filter and select blanks as the filter criteria for column C.
- All the emails will appear only one time in the filtered list.
- Copy and paste records to a new worksheet.
- You have a list of emails and names without duplicates in it. Even if you
have a list that contains many fields, this method will surely work for you.
|