Set is a kind of collection that is widely used in Java programming. In this tutorial, we will help you understand and master Set collections with core information and plenty of code examples. You will learn about:
Basically, Set is a type of collection that does not allow duplicate elements. That means an element can only exist once in a Set. It models the set abstraction in mathematics. The following picture illustrates three sets of numbers in mathematics:
The following characteristics differentiate a Set collection from others in the Java Collections framework:
Based on the characteristics, consider using a Set collection when:
For example, you can use a Set to store unique integer numbers, to hold the cards of a card game in no particular order, or to store numbers whose order does not matter.
The Java Collections Framework provides three major implementations of the Set interface: HashSet, LinkedHashSet and TreeSet. The Set API is described in the following diagram:
Let’s look at the characteristics of each implementation in detail:
Therefore, besides the uniqueness of elements that every Set guarantees, consider using HashSet when ordering does not matter, LinkedHashSet when you want to keep elements in their insertion order, and TreeSet when you want elements sorted by their values.
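To make the ordering difference concrete, here is a minimal sketch that feeds the same input to all three implementations. The class name SetOrderingDemo and the helper iterationOrder() are our own names for illustration; only the LinkedHashSet and TreeSet results are predictable, because HashSet makes no ordering promise.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderingDemo {
    // Hypothetical helper: copies the set's iteration order into a list for inspection.
    static List<String> iterationOrder(Set<String> set) {
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "mango", "apple");

        Set<String> hashSet = new HashSet<>(input);         // no ordering guarantee
        Set<String> linkedSet = new LinkedHashSet<>(input); // insertion order
        Set<String> treeSet = new TreeSet<>(input);         // natural (sorted) order

        System.out.println("HashSet:       " + iterationOrder(hashSet));
        System.out.println("LinkedHashSet: " + iterationOrder(linkedSet)); // [pear, apple, mango]
        System.out.println("TreeSet:       " + iterationOrder(treeSet));   // [apple, mango, pear]
    }
}
```

In all three cases the duplicate "apple" is stored only once; the implementations differ only in the order in which they hand the elements back.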
The code examples in this tutorial mostly use the HashSet implementation.
Always use generics to declare a Set of a specific type, e.g. a Set of integer numbers:
Set<Integer> numbers = new HashSet<>();
Remember to use the interface type (Set) as the reference type, and a concrete implementation (HashSet, LinkedHashSet, TreeSet, etc.) as the actual object type:
Set<String> names = new LinkedHashSet<>();
We can create a Set from an existing collection. This is a trick to remove duplicate elements from a non-Set collection. Consider the following code snippet:
List<Integer> listNumbers = Arrays.asList(3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6);
System.out.println(listNumbers);
Set<Integer> uniqueNumbers = new HashSet<>(listNumbers);
System.out.println(uniqueNumbers);
Output:
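[3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
As you can see, the list listNumbers contains duplicate numbers, and the set uniqueNumbers contains each number only once. Note that the sorted-looking output is incidental: HashSet makes no guarantee about iteration order.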
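If the original order of the list matters, the same trick works with a LinkedHashSet, which keeps the first occurrence of each element in its original position. The following is a small sketch; the class OrderedDedupDemo and the method dedupKeepingOrder() are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class OrderedDedupDemo {
    // Removes duplicates while keeping the first occurrence of each element in place.
    static <T> List<T> dedupKeepingOrder(List<T> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    public static void main(String[] args) {
        List<Integer> listNumbers = Arrays.asList(3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6);
        System.out.println(dedupKeepingOrder(listNumbers)); // [3, 9, 1, 4, 7, 2, 5, 8, 6]
    }
}
```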
As of Java 8, we can use a stream with filter and collect operations to build a Set from a collection. The following code collects only the odd numbers from the listNumbers above into a Set:
Set<Integer> uniqueOddNumbers = listNumbers.stream()
    .filter(number -> number % 2 != 0)
    .collect(Collectors.toSet());
System.out.println(uniqueOddNumbers);
Output:
[1, 3, 5, 7, 9]
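Collectors.toSet() does not let us choose which Set implementation is returned. When a specific implementation is needed, Collectors.toCollection() accepts a supplier for it. Below is a minimal sketch that collects the odd numbers into a TreeSet so the result is guaranteed to be sorted; the class and method names are ours.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

class SortedOddNumbersDemo {
    // Collects the odd numbers into a TreeSet, so the result is unique and sorted.
    static Set<Integer> oddNumbersSorted(List<Integer> numbers) {
        return numbers.stream()
                .filter(number -> number % 2 != 0)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<Integer> listNumbers = Arrays.asList(3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6);
        System.out.println(oddNumbersSorted(listNumbers)); // [1, 3, 5, 7, 9]
    }
}
```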
Note that the default initial capacity of a HashSet and LinkedHashSet is 16 (with a default load factor of 0.75), so if you are sure that your Set will contain many more elements than that, it’s better to specify a capacity in the constructor to avoid repeated resizing. For example:
Set<String> bigNames = new HashSet<>(1000);
This creates a new HashSet with an initial capacity of 1000. For more ways of creating a Set object, refer to this article.
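Keep in mind that the capacity is not the same as the number of elements the set can hold before it grows: with the default load factor of 0.75, a HashSet resizes once its size exceeds capacity × 0.75. The sketch below shows one way to compute a capacity for an expected number of elements; capacityFor() is a hypothetical helper, not part of the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetCapacityDemo {
    // Default load factor that HashSet uses when none is specified.
    static final double DEFAULT_LOAD_FACTOR = 0.75;

    // Hypothetical helper: smallest initial capacity that fits expectedSize
    // elements without triggering a resize under the default load factor.
    static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / DEFAULT_LOAD_FACTOR);
    }

    public static void main(String[] args) {
        int expectedSize = 1000;
        Set<String> bigNames = new HashSet<>(capacityFor(expectedSize));
        System.out.println(capacityFor(expectedSize)); // 1334
        System.out.println(bigNames.isEmpty());        // true
    }
}
```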
The add() method adds the specified element and returns true if the set does not already contain it; it returns false and leaves the set unchanged if the set already contains the element:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
if (names.add("Peter")) {
    System.out.println("Peter is added to the set");
}
if (!names.add("Tom")) {
    System.out.println("Tom is already added to the set");
}
Output:
Peter is added to the set
Tom is already added to the set
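The boolean result of add() is handy for detecting duplicates in a single pass. The following sketch is one possible approach; the class DuplicateFinderDemo and the method findDuplicates() are names we made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class DuplicateFinderDemo {
    // Returns the elements that occur more than once, in the order they were first repeated.
    static <T> Set<T> findDuplicates(List<T> items) {
        Set<T> seen = new HashSet<>();
        Set<T> duplicates = new LinkedHashSet<>();
        for (T item : items) {
            // add() returns false when the element was already in the set
            if (!seen.add(item)) {
                duplicates.add(item);
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Tom", "Mary", "Peter", "Tom", "Mary", "Tom");
        System.out.println(findDuplicates(names)); // [Tom, Mary]
    }
}
```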
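A HashSet or LinkedHashSet can contain one null element (a TreeSet that uses natural ordering rejects null with a NullPointerException):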
names.add(null);
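Whether null is accepted depends on the implementation. As a quick check, the sketch below tries to add null to each of the three implementations; acceptsNull() is a hypothetical helper, and the TreeSet result assumes natural ordering (no custom Comparator) on Java 7 or later.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

class NullElementDemo {
    // Returns true if adding null succeeds, false if the set rejects it.
    static boolean acceptsNull(Set<String> set) {
        try {
            set.add(null);
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNull(new HashSet<>()));       // true
        System.out.println(acceptsNull(new LinkedHashSet<>())); // true
        System.out.println(acceptsNull(new TreeSet<>()));       // false
    }
}
```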
The remove() method removes the specified element from the set if it is present. It returns true if the element was removed, or false otherwise:
if (names.remove("Mary")) {
    System.out.println("Mary is removed");
}
Note that the objects in the Set should implement the equals() and hashCode() methods correctly so the Set can find and remove the objects.
The isEmpty() method returns true if the set contains no elements, otherwise returns false:
if (names.isEmpty()) {
    System.out.println("The set is empty");
} else {
    System.out.println("The set is not empty");
}
The clear() method removes all elements from the set. The set will be empty afterward:
names.clear();
if (names.isEmpty()) {
    System.out.println("The set is empty");
}
The size() method returns the number of elements contained in the set:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
names.add("Peter");
names.add("Alice");
System.out.printf("The set has %d elements", names.size());
Output:
The set has 4 elements
Note that the Set interface does not provide any API for retrieving a specific element, because a Set is unordered by nature. The exception is TreeSet, which allows retrieving the first and the last elements.
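For instance, a TreeSet exposes first() and last(), which return the lowest and highest elements in sorted order. Here is a short sketch; the class and helper names are ours, and both calls throw NoSuchElementException on an empty set.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

class TreeSetEndsDemo {
    // Returns the lowest value by sorting the input into a TreeSet.
    static int smallest(Collection<Integer> values) {
        return new TreeSet<>(values).first();
    }

    // Returns the highest value by sorting the input into a TreeSet.
    static int largest(Collection<Integer> values) {
        return new TreeSet<>(values).last();
    }

    public static void main(String[] args) {
        TreeSet<Integer> numbers = new TreeSet<>(Arrays.asList(20, 56, 89, 31, 8, 5));
        System.out.println(numbers.first()); // 5
        System.out.println(numbers.last());  // 89
    }
}
```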
If you want to dive deep into the Java collections framework, this famous Java collection book is a good read.
Using an iterator:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
names.add("Peter");
names.add("Alice");
Iterator<String> iterator = names.iterator();
while (iterator.hasNext()) {
    String name = iterator.next();
    System.out.println(name);
}
Output:
Tom
Alice
Peter
Mary
Using the enhanced for loop:
for (String name : names) {
    System.out.println(name);
}
Using the forEach() method with a method reference (or a lambda expression) in Java 8:
names.forEach(System.out::println);
For more information about the ways to iterate over collections, see: The 4 Methods for Iterating Collections in Java.
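One practical reason to use an explicit iterator is removing elements while iterating: calling remove() on the set itself inside a loop typically causes a ConcurrentModificationException, whereas Iterator.remove() is safe. The sketch below illustrates this; removeShortNames() is a hypothetical helper that works on a copy so the input is left untouched.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class SafeRemovalDemo {
    // Returns a copy of names without the entries shorter than minLength.
    static Set<String> removeShortNames(Set<String> names, int minLength) {
        Set<String> result = new HashSet<>(names);
        Iterator<String> iterator = result.iterator();
        while (iterator.hasNext()) {
            if (iterator.next().length() < minLength) {
                iterator.remove(); // safe: removes the element last returned by next()
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("Tom", "Mary", "Peter", "Alice"));
        // Prints the three remaining names (Mary, Peter, Alice) in no particular order
        System.out.println(removeShortNames(names, 4));
    }
}
```

Since Java 8, the same result can be written more compactly with removeIf(), e.g. result.removeIf(name -> name.length() < minLength).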
The contains(Object) method returns true if the set contains the specified element, or returns false otherwise. For example:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
names.add("Peter");
names.add("Alice");
if (names.contains("Mary")) {
    System.out.println("Found Mary");
}
Note that if the set contains custom objects of your own type, e.g. Student or Employee, that class should override the equals() and hashCode() methods correctly so the Set can find the objects.
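As an illustration, here is a minimal sketch of such a class. The Student class below is hypothetical, and we assume that two students are considered equal when their ids match; your own equality rule may differ.

```java
import java.util.HashSet;
import java.util.Set;

class StudentSetDemo {
    // Hypothetical value class: equality is based on the id only (an assumption).
    static final class Student {
        private final int id;
        private final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Student)) return false;
            return id == ((Student) other).id;
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals(): equal students produce equal hash codes
            return Integer.hashCode(id);
        }

        @Override
        public String toString() {
            return id + ":" + name;
        }
    }

    public static void main(String[] args) {
        Set<Student> students = new HashSet<>();
        students.add(new Student(1, "Tom"));

        // A different object with the same id is treated as the same element
        System.out.println(students.contains(new Student(1, "Tom"))); // true
        System.out.println(students.add(new Student(1, "Tom")));      // false
        System.out.println(students.size());                          // 1
    }
}
```

Without the two overrides, the default identity-based implementations would be used, and contains() would return false for the second object.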
We can perform mathematical set operations between two sets, such as subset, union, intersection and set difference. Suppose that we have two sets s1 and s2.
Subset operation:
s1.containsAll(s2) returns true if s2 is a subset of s1. (s2 is a subset of s1 if s1 contains all of the elements in s2.) Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(20, 56, 89, 31, 8, 5));
Set<Integer> s2 = new HashSet<>(Arrays.asList(8, 89));
if (s1.containsAll(s2)) {
    System.out.println("s2 is a subset of s1");
}
Output:
s2 is a subset of s1
Union operation:
s1.addAll(s2) transforms s1 into the union of s1 and s2. (The union of two sets is the set containing all of the elements contained in either set.) Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 3, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before union: " + s1);
s1.addAll(s2);
System.out.println("s1 after union: " + s1);
Output:
s1 before union: [1, 3, 5, 7, 9]
s1 after union: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Intersection operation:
s1.retainAll(s2) transforms s1 into the intersection of s1 and s2. (The intersection of two sets is the set containing only the elements common to both sets.) Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before intersection: " + s1);
s1.retainAll(s2);
System.out.println("s1 after intersection: " + s1);
Output:
s1 before intersection: [1, 2, 3, 4, 5, 7, 9]
s1 after intersection: [2, 4]
Set difference operation:
s1.removeAll(s2) transforms s1 into the (asymmetric) set difference of s1 and s2. (For example, the set difference of s1 minus s2 is the set containing all of the elements found in s1 but not in s2.) Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before difference: " + s1);
s1.removeAll(s2);
System.out.println("s1 after difference: " + s1);
Output:
s1 before difference: [1, 2, 3, 4, 5, 7, 9]
s1 after difference: [1, 3, 5, 7, 9]
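All of the bulk operations above modify s1 in place. When both original sets must stay intact, a common approach is to copy one set first and apply the operation to the copy. The helpers below are a sketch of that idea; union(), intersection() and difference() are our own names, not JDK methods.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetAlgebraDemo {
    // Each helper copies the first set, so neither argument is modified.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
        Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));

        System.out.println("union: " + union(s1, s2));               // the numbers 1 to 9
        System.out.println("intersection: " + intersection(s1, s2)); // 2 and 4
        System.out.println("difference: " + difference(s1, s2));     // 1, 3, 5, 7 and 9
        System.out.println("s1 still has " + s1.size() + " elements"); // s1 still has 7 elements
    }
}
```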
None of the three implementations (HashSet, LinkedHashSet and TreeSet) is synchronized. So if you use them in a concurrent context (multiple threads), you have to synchronize them externally, for example by wrapping them with the Collections.synchronizedSet() static method:
Set<Integer> numbers = Collections.synchronizedSet(new HashSet<Integer>());
The returned set is synchronized (thread-safe). However, remember that you must still manually synchronize on the returned set when iterating over it, because an iteration spans many separate method calls:
synchronized (numbers) {
    Iterator<Integer> iterator = numbers.iterator();
    while (iterator.hasNext()) {
        Integer number = iterator.next();
        System.out.println(number);
    }
}
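To see the wrapper at work, the following sketch lets several threads add distinct numbers to one synchronized set and then checks that nothing was lost. The class SynchronizedSetDemo and the method fillConcurrently() are hypothetical names, and the thread counts are arbitrary.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {
    // Starts `threads` workers that each add `perThread` distinct numbers,
    // waits for them to finish, and returns the final size of the shared set.
    static int fillConcurrently(int threads, int perThread) {
        Set<Integer> numbers = Collections.synchronizedSet(new HashSet<>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int offset = t * perThread;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    numbers.add(offset + i); // each add() call is synchronized by the wrapper
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for workers", e);
            }
        }
        return numbers.size();
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently(4, 1000)); // 4000
    }
}
```

With a plain HashSet in place of the wrapper, the same code could lose elements or corrupt the set, because concurrent add() calls would interfere with each other.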
The Collections utility class provides several methods that work with Set collections. Consult its Javadoc to check whether a useful operation is already available for reuse.
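A few of these utilities are sketched below: unmodifiableSet(), singleton(), emptySet() and disjoint() all exist in java.util.Collections. The class name and the isReadOnly() helper are ours; note that the helper really does add a "probe" element when the set turns out to be modifiable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class CollectionsSetUtilitiesDemo {
    // Returns true if the set refuses a new element (a modifiable set keeps the probe).
    static boolean isReadOnly(Set<String> set) {
        try {
            set.add("probe");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>(Arrays.asList("Tom", "Mary"));

        Set<String> readOnlyView = Collections.unmodifiableSet(names); // read-only view of names
        Set<String> single = Collections.singleton("Peter");           // immutable one-element set
        Set<String> empty = Collections.emptySet();                    // immutable empty set

        System.out.println(isReadOnly(readOnlyView));            // true
        System.out.println(isReadOnly(single));                  // true
        System.out.println(isReadOnly(empty));                   // true
        System.out.println(Collections.disjoint(names, single)); // true: no common elements
    }
}
```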
That's almost everything you need to know about Set in Java. I hope you enjoyed this tutorial. And as I mentioned above, this Java Generics and Collections book will help you dive deeper into the Java collections framework. I also recommend checking out this Java course to learn more.