Introduction
List is an interface in the Java Collections Framework that stores a group of objects. A List is ordered and allows duplicates. However, you will often run into scenarios where you need to eliminate the duplicate elements from a List.
There are several ways you can achieve this. In the next few sections, I will be going over each method in detail.
1) Using a For Loop
The simplest way to remove duplicates from a List is to use a for loop. The following code demonstrates this:
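import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public static void usingForLoop() {
    List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
    List<Integer> output = new ArrayList<>();
    // Copy an element only if it is not already in the output
    for (Integer num : input) {
        if (!output.contains(num)) {
            output.add(num);
        }
    }
    for (Integer num : output) {
        System.out.print(num + " ");
    }
}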
First, the code creates a new ArrayList to hold the output, i.e. a List without duplicates. A for loop then iterates through the input List. For each element, it checks whether the element is already present in the output List and adds it only if it is not. Finally, another for loop prints each element of the output List. Because contains scans the output List linearly, this approach takes quadratic time in the worst case, which is fine for small lists. This code prints the following output: 5 10 15 20 35 40 25
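The same loop can be wrapped in a generic method so that it works for any element type, not just Integer. The following is only a minimal sketch; the class name LoopDedup and the method name removeDuplicates are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LoopDedup {

    // Hypothetical helper: returns a new List holding the first occurrence of each element
    static <T> List<T> removeDuplicates(List<T> input) {
        List<T> output = new ArrayList<>();
        for (T item : input) {
            // contains() compares elements with equals() and scans the List linearly
            if (!output.contains(item)) {
                output.add(item);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b", "a", "c", "b");
        System.out.println(removeDuplicates(names)); // prints [a, b, c]
    }
}
```

Since the check relies on equals(), custom element classes need a suitable equals() implementation for this to work as expected.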
2) Using a Set
Set is also an interface in the Java Collections Framework. Unlike a List, a Set does not allow duplicates, so you can use a Set to eliminate the duplicates in a List. There are several ways to do this, as demonstrated below:
(a) Using a HashSet
The most commonly used implementation of Set is HashSet. HashSet has a constructor that accepts a Collection as a parameter. If you create a new HashSet using this constructor and pass in the List, the duplicates are eliminated. The following code demonstrates this:
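import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public static void usingHashSet() {
    List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
    // The constructor copies the List and drops duplicates
    Set<Integer> output = new HashSet<>(input);
    for (Integer num : output) {
        System.out.print(num + " ");
    }
}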
Here, a new HashSet is created by passing the List to its constructor. The constructor copies the elements of the List into the Set, and the duplicates are dropped along the way. This code prints the following output: 35 20 5 40 25 10 15
The problem with HashSet is that it does not guarantee any particular iteration order. As you can see, the order of elements in the output differs from that in the input List.
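If the order does not matter but the rest of your code expects a List, the resulting Set can be copied back into an ArrayList. The following is a rough sketch of that idea; HashSetDedup and removeDuplicates are hypothetical names chosen for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class HashSetDedup {

    // Hypothetical helper: the order of the returned List is unspecified
    static <T> List<T> removeDuplicates(List<T> input) {
        return new ArrayList<>(new HashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
        List<Integer> unique = removeDuplicates(input);
        System.out.println(unique.size()); // prints 7
    }
}
```

The example prints only the size because the order of the elements in the returned List should not be relied upon.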
(b) Using a LinkedHashSet
Another implementation of the Set interface is LinkedHashSet. LinkedHashSet maintains the insertion order of its elements, which overcomes the HashSet limitation. The following code demonstrates this:
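import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public static void usingLinkedHashSet() {
    List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
    // LinkedHashSet keeps the elements in insertion order
    Set<Integer> output = new LinkedHashSet<>(input);
    for (Integer num : output) {
        System.out.print(num + " ");
    }
}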
This code is quite similar to the one seen earlier, except that it uses a LinkedHashSet. The input List is passed as a parameter to the LinkedHashSet constructor. Since a LinkedHashSet is used, the elements keep the order in which they first appear in the input List, as can be seen from the output below: 5 10 15 20 35 40 25
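To make the ordering behaviour easier to see, here is a small sketch that uses Strings and returns a List. Each element stays at the position of its first occurrence, and adding it again later does not move it. The names LinkedHashSetDedup and removeDuplicates are assumptions made for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class LinkedHashSetDedup {

    // Hypothetical helper: removes duplicates and keeps first-occurrence order
    static <T> List<T> removeDuplicates(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(removeDuplicates(input)); // prints [b, a, c]
    }
}
```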
(c) Using Set.addAll
The Set interface has a method called addAll that accepts a Collection as a parameter. If you invoke this method and pass in the List, only the elements that are not already in the Set are added, so the duplicates are eliminated. The following code demonstrates this:
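import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public static void usingSetAddAll() {
    List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
    Set<Integer> output = new LinkedHashSet<>();
    // addAll skips elements that are already in the Set
    output.addAll(input);
    for (Integer num : output) {
        System.out.print(num + " ");
    }
}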
Here, an empty LinkedHashSet is created first, and the addAll method is then invoked with the List as input. This adds every element of the List to the Set, skipping the ones that are already present. Because a LinkedHashSet is used, this code prints the same output as the previous example: 5 10 15 20 35 40 25
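One advantage of addAll over the constructor is that it can be called more than once, so several Lists can be merged into a single Set without duplicates. The sketch below illustrates this under that assumption; AddAllDedup and mergeUnique are hypothetical names. It also shows that addAll returns a boolean telling you whether the Set changed:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class AddAllDedup {

    // Hypothetical helper: merges two Lists into one Set without duplicates
    static <T> Set<T> mergeUnique(List<T> first, List<T> second) {
        Set<T> output = new LinkedHashSet<>();
        output.addAll(first);
        output.addAll(second);
        return output;
    }

    public static void main(String[] args) {
        List<Integer> first = Arrays.asList(5, 10, 15, 10);
        List<Integer> second = Arrays.asList(15, 20, 5);
        Set<Integer> merged = mergeUnique(first, second);
        System.out.println(merged); // prints [5, 10, 15, 20]

        // addAll returns false when the call added nothing new
        boolean changed = merged.addAll(Arrays.asList(5, 20));
        System.out.println(changed); // prints false
    }
}
```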
3) Using Stream API
Java 8 added the Stream API, which makes it easy to perform bulk operations on collections. A new default method called stream() was added to the Collection interface, so every List and Set has it. It returns a Stream over the elements of the underlying collection.
The Stream interface has a method called distinct that can be used to eliminate duplicates. The following code demonstrates this:
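import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public static void usingStream() {
    List<Integer> input = Arrays.asList(5, 10, 15, 20, 10, 5, 35, 40, 10, 25);
    Stream<Integer> streamWithDuplicates = input.stream();
    // distinct() keeps only the first occurrence of each element
    Stream<Integer> streamWithoutDuplicates = streamWithDuplicates.distinct();
    List<Integer> output = streamWithoutDuplicates.collect(Collectors.toList());
    for (Integer num : output) {
        System.out.print(num + " ");
    }
}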
Here, first the stream() method is invoked on the input List. This returns a Stream corresponding to the List. Then the distinct() method is invoked on the Stream. This returns a new Stream that has only the unique elements from the input List.
Finally, the collect() method is invoked, which converts the Stream back to a List. This code prints the following output: 5 10 15 20 35 40 25
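The intermediate Stream variables are useful for explanation, but the three calls are usually chained into a single expression. Here is one possible sketch of that style as a generic method; the names StreamDedup and removeDuplicates are made up for this example:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamDedup {

    // Hypothetical helper: chains stream(), distinct() and collect() in one expression
    static <T> List<T> removeDuplicates(List<T> input) {
        return input.stream()
                .distinct() // compares elements with equals()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("x", "y", "x", "z", "y");
        System.out.println(removeDuplicates(words)); // prints [x, y, z]
    }
}
```

For a Stream created from a List, distinct() keeps the first occurrence of each element, so the original order is preserved.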