Last Updated on 18 July 2024
Set is a kind of collection that is widely used in Java programming. In this tutorial, we will help you understand and master Set collections with core information and plenty of code examples.
Basically, Set is a type of collection that does not allow duplicate elements. That means an element can only exist once in a Set. It models the set abstraction in mathematics. The following picture illustrates three sets of numbers in mathematics:
Characteristics of a Set collection:
The following characteristics differentiate a Set collection from others in the Java Collections framework:
Duplicate elements are not allowed.
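Elements are not stored in any guaranteed order. That means that, in general, you cannot expect elements to be sorted in any particular order when iterating over a Set (some implementations, described below, do add an ordering guarantee).
Why and When to Use Sets?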
Based on the characteristics, consider using a Set collection when:
You want to store unique elements, without duplicates.
You don’t care about the order of elements.
For example, you can use a Set to store unique integer numbers; you can use a Set to store cards randomly in a card game; you can use a Set to store numbers in random order, etc.
The Java Collections Framework provides three major implementations of the Set interface: HashSet, LinkedHashSet and TreeSet. The Set API is described in the following diagram:
Let’s look at the characteristics of each implementation in detail:
HashSet: This is the best-performing implementation and a widely used Set implementation. It represents the core characteristics of sets: no duplicates and no ordering.
LinkedHashSet: This implementation orders its elements based on insertion order. So consider using a LinkedHashSet when you want to store unique elements in order.
TreeSet: This implementation orders its elements based on their values, either by their natural ordering, or by a Comparator provided at creation time.
Therefore, besides the uniqueness of elements that a Set guarantees, consider using HashSet when ordering does not matter, LinkedHashSet when you want to keep elements in their insertion order, and TreeSet when you want to order elements by their values.
The code examples in this tutorial mostly use the HashSet implementation.
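To make the difference between the three implementations concrete, here is a minimal sketch that loads the same sample data into each of them. The class name SetOrderingDemo and the sample names are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class SetOrderingDemo {
    // Sample data (an assumption for illustration); note the duplicate "Tom".
    static final List<String> INPUT = Arrays.asList("Tom", "Mary", "Peter", "Alice", "Tom");

    // HashSet: unique elements, no ordering guarantee.
    static Set<String> unordered() {
        return new HashSet<>(INPUT);
    }

    // LinkedHashSet: unique elements, kept in insertion order.
    static Set<String> insertionOrdered() {
        return new LinkedHashSet<>(INPUT);
    }

    // TreeSet: unique elements, sorted by natural (alphabetical) ordering.
    static Set<String> sorted() {
        return new TreeSet<>(INPUT);
    }

    public static void main(String[] args) {
        System.out.println("HashSet:       " + unordered());        // iteration order is not guaranteed
        System.out.println("LinkedHashSet: " + insertionOrdered()); // [Tom, Mary, Peter, Alice]
        System.out.println("TreeSet:       " + sorted());           // [Alice, Mary, Peter, Tom]
    }
}
```

In all three cases the duplicate "Tom" is stored only once; only the iteration order differs.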
Always use generics to declare a Set of specific type, e.g. a Set of integer numbers:
Set<Integer> numbers = new HashSet<>();
Remember to use the interface type (Set) as the reference type, and a concrete implementation (HashSet, LinkedHashSet, TreeSet, etc.) as the actual object type:
Set<String> names = new LinkedHashSet<>();
We can create a Set from an existing collection. This is a handy trick for removing duplicate elements from a non-Set collection. Consider the following code snippet:
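The snippet below is a minimal sketch of this technique. The variable names listNumbers and uniqueNumbers follow the text; the class name, the helper method and the sample values are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RemoveDuplicatesDemo {
    // Copies the list into a HashSet, which keeps each distinct value only once.
    static Set<Integer> toUniqueSet(List<Integer> listNumbers) {
        return new HashSet<>(listNumbers);
    }

    public static void main(String[] args) {
        // Sample values (assumed); several numbers appear more than once.
        List<Integer> listNumbers = Arrays.asList(3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6);
        System.out.println("List: " + listNumbers);

        Set<Integer> uniqueNumbers = toUniqueSet(listNumbers);
        System.out.println("Set:  " + uniqueNumbers);
    }
}
```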
You see, the list listNumbers contains duplicate numbers, and the set uniqueNumbers removes the duplicate ones.
As of Java 8, we can use a stream with the filter and collect operations to return a Set from a collection. The following code collects only odd numbers to a Set from the listNumbers above:
Note that the default initial capacity of a HashSet and LinkedHashSet is 16 (with a default load factor of 0.75), so if you know that your Set will contain many more elements than that, it’s better to specify a capacity in the constructor to avoid repeated resizing. For example:
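A minimal sketch of the stream approach could look like this. The class name, the helper method oddNumbers and the sample values are assumptions made for illustration; Collectors.toSet() makes no promise about which Set implementation is returned.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class OddNumbersDemo {
    // Keeps only the odd numbers and collects them into a Set (duplicates are dropped).
    static Set<Integer> oddNumbers(List<Integer> listNumbers) {
        return listNumbers.stream()
                .filter(number -> number % 2 != 0)
                .collect(Collectors.toSet());
    }

    public static void main(String[] args) {
        // Sample values (assumed), including duplicates.
        List<Integer> listNumbers = Arrays.asList(3, 9, 1, 4, 7, 2, 5, 3, 8, 9, 1, 3, 8, 6);
        Set<Integer> uniqueOddNumbers = oddNumbers(listNumbers);
        System.out.println("Unique odd numbers: " + uniqueOddNumbers);
    }
}
```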
Set<String> bigNames = new HashSet<>(1000);
This creates a new HashSet with an initial capacity of 1000. For more ways of creating a Set object, refer to this article.
The add() method returns true if the set does not contain the specified element, and returns false if the set already contains the specified element:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
if (names.add("Peter")) {
    System.out.println("Peter is added to the set");
}
if (!names.add("Tom")) {
    System.out.println("Tom is already added to the set");
}
Output:
Peter is added to the set
Tom is already added to the set
A HashSet or LinkedHashSet can contain a null element (a TreeSet that uses natural ordering throws a NullPointerException instead):
names.add(null);
Removing an element from a Set:
The remove() method removes the specified element from the set if it is present. It returns true if the element was removed, or false otherwise:
if (names.remove("Mary")) {
    System.out.println("Mary is removed");
}
Note that the objects in the Set should implement the equals() and hashCode() methods correctly so the Set can find and remove the objects.
Check if a Set is empty:
The isEmpty() method returns true if the set contains no elements, otherwise returns false:
if (names.isEmpty()) {
    System.out.println("The set is empty");
} else {
    System.out.println("The set is not empty");
}
Remove all elements from a Set:
The clear() method removes all elements from the set. The set will be empty afterward:
names.clear();
if (names.isEmpty()) {
    System.out.println("The set is empty");
}
Get total number of elements in a Set:
The size() method returns the number of elements contained in the set:
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
names.add("Peter");
names.add("Alice");
System.out.printf("The set has %d elements", names.size());
Output:
The set has 4 elements
Note that the Set interface does not provide any method for retrieving a specific element, because a Set is unordered by nature. The TreeSet implementation is an exception: it allows retrieving the first and the last elements.
If you want to dive deep into the Java Collections Framework, this famous Java collection book is a good read.
The contains(Object) method returns true if the set contains the specified element, or returns false otherwise. For example:
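To illustrate the TreeSet exception mentioned above, here is a minimal sketch using its first() and last() methods. The class name and the sample names are assumptions; note that the reference type must be TreeSet (or SortedSet), because the Set interface itself does not declare these methods.

```java
import java.util.Arrays;
import java.util.TreeSet;

public class TreeSetFirstLastDemo {
    // Builds a TreeSet of sample names (assumed values), sorted alphabetically.
    static TreeSet<String> sampleNames() {
        return new TreeSet<>(Arrays.asList("Tom", "Mary", "Peter", "Alice"));
    }

    public static void main(String[] args) {
        TreeSet<String> names = sampleNames();
        System.out.println("First: " + names.first()); // Alice
        System.out.println("Last:  " + names.last());  // Tom
    }
}
```

Both methods throw a NoSuchElementException when the set is empty, so check isEmpty() first if that can happen.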
Set<String> names = new HashSet<>();
names.add("Tom");
names.add("Mary");
names.add("Peter");
names.add("Alice");
if (names.contains("Mary")) {
    System.out.println("Found Mary");
}
Note that if the set contains custom objects of your own type, e.g. Student or Employee, the object should implement the equals() and hashCode() methods correctly so the Set can find the objects.
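As a rough illustration of this point, the sketch below defines a hypothetical Student class whose equality is assumed to be based on its id field only. With equals() and hashCode() overridden consistently, a HashSet can find and remove a student using a different but equal object.

```java
import java.util.HashSet;
import java.util.Set;

public class StudentSetDemo {
    // Hypothetical custom type; two students are assumed equal when their ids match.
    static final class Student {
        private final int id;
        private final String name;

        Student(int id, String name) {
            this.id = id;
            this.name = name;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Student)) return false;
            return id == ((Student) other).id;
        }

        @Override
        public int hashCode() {
            // Must be consistent with equals(): equal students produce equal hash codes.
            return Integer.hashCode(id);
        }

        @Override
        public String toString() {
            return id + ":" + name;
        }
    }

    static Set<Student> sampleStudents() {
        Set<Student> students = new HashSet<>();
        students.add(new Student(1, "Tom"));
        students.add(new Student(2, "Mary"));
        return students;
    }

    public static void main(String[] args) {
        Set<Student> students = sampleStudents();
        System.out.println(students.contains(new Student(1, "Tom"))); // true
        System.out.println(students.remove(new Student(2, "Mary")));  // true
        System.out.println(students.size());                          // 1
    }
}
```

Without these two overrides, the lookups above would compare object identity, and both contains() and remove() would return false for newly created objects.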
We can perform some mathematic-like operations between two sets such as subset, union, intersection and set difference. Suppose that we have two sets s1 and s2.
Subset operation:
s1.containsAll(s2) returns true if s2 is a subset of s1 (s2 is a subset of s1 if s1 contains all of the elements in s2).
Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(20, 56, 89, 31, 8, 5));
Set<Integer> s2 = new HashSet<>(Arrays.asList(8, 89));
if (s1.containsAll(s2)) {
    System.out.println("s2 is a subset of s1");
}
Output:
s2 is a subset of s1
Union operation:
s1.addAll(s2) — transforms s1 into the union of s1 and s2. (The union of two sets is the set containing all of the elements contained in either set.)
Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 3, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before union: " + s1);
s1.addAll(s2);
System.out.println("s1 after union: " + s1);
Output:
s1 before union: [1, 3, 5, 7, 9]
s1 after union: [1, 2, 3, 4, 5, 6, 7, 8, 9]
Intersection operation:
s1.retainAll(s2) — transforms s1 into the intersection of s1 and s2. (The intersection of two sets is the set containing only the elements common to both sets.)
Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before intersection: " + s1);
s1.retainAll(s2);
System.out.println("s1 after intersection: " + s1);
Output:
s1 before intersection: [1, 2, 3, 4, 5, 7, 9]
s1 after intersection: [2, 4]
Set difference operation:
s1.removeAll(s2) — transforms s1 into the (asymmetric) set difference of s1 and s2. (For example, the set difference of s1 minus s2 is the set containing all of the elements found in s1 but not in s2.)
Example:
Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));
System.out.println("s1 before difference: " + s1);
s1.removeAll(s2);
System.out.println("s1 after difference: " + s1);
Output:
s1 before difference: [1, 2, 3, 4, 5, 7, 9]
s1 after difference: [1, 3, 5, 7, 9]
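Note that addAll(), retainAll() and removeAll() all modify s1 in place. If both original sets must stay unchanged, one option is to copy s1 first and apply the operation to the copy. The following is a minimal sketch of that idea; the class name and the helper methods union, intersection and difference are assumptions, not part of the Set API.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SetAlgebraDemo {
    // Returns a new set holding every element of a or b; a and b are not modified.
    static <T> Set<T> union(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Returns a new set holding only the elements found in both a and b.
    static <T> Set<T> intersection(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Returns a new set holding the elements of a that are not in b.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> s1 = new HashSet<>(Arrays.asList(1, 2, 3, 4, 5, 7, 9));
        Set<Integer> s2 = new HashSet<>(Arrays.asList(2, 4, 6, 8));

        System.out.println("union:        " + union(s1, s2));
        System.out.println("intersection: " + intersection(s1, s2));
        System.out.println("difference:   " + difference(s1, s2));
        System.out.println("s1 still has " + s1.size() + " elements"); // 7
    }
}
```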
None of the three implementations (HashSet, LinkedHashSet and TreeSet) is synchronized. So if you use them in a concurrent context (multiple threads), you have to synchronize them externally, for instance with the Collections.synchronizedSet() static method. For example:
The Collections utility class provides several methods for working with sets. Consult its Javadoc to check whether a useful operation is already available for reuse:
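The sketch below shows one way to use Collections.synchronizedSet(). The class name, the helper method fillConcurrently and the two-thread scenario are assumptions made for illustration. Individual calls such as add() are synchronized by the wrapper, but iteration still has to be guarded manually with a synchronized block on the returned set.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class SynchronizedSetDemo {
    // Two threads add disjoint ranges of numbers to one synchronized set.
    static Set<Integer> fillConcurrently(int perThread) {
        Set<Integer> numbers = Collections.synchronizedSet(new HashSet<>());

        Thread first = new Thread(() -> {
            for (int i = 0; i < perThread; i++) {
                numbers.add(i);
            }
        });
        Thread second = new Thread(() -> {
            for (int i = perThread; i < 2 * perThread; i++) {
                numbers.add(i);
            }
        });

        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for worker threads", e);
        }
        return numbers;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = fillConcurrently(1000);
        System.out.println("Size: " + numbers.size()); // 2000

        // Iteration is not atomic, so lock on the set while traversing it.
        long sum = 0;
        synchronized (numbers) {
            for (int number : numbers) {
                sum += number;
            }
        }
        System.out.println("Sum: " + sum); // 1999000
    }
}
```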
checkedSet(): Returns a dynamically typesafe view of the specified set.
checkedSortedSet(): Returns a dynamically typesafe view of the specified sorted set.
emptySet(): Returns the empty set (immutable).
singleton(): Returns an immutable set containing only the specified object.
unmodifiableSet(): Returns an unmodifiable view of the specified set.
unmodifiableSortedSet(): Returns an unmodifiable view of the specified sorted set.
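A few of the methods listed above can be tried out with the following minimal sketch. The class name and the helper method rejectsModification are assumptions made for illustration; the Collections methods themselves are standard.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class CollectionsSetUtilitiesDemo {
    // Returns true if the given set refuses to accept a new element.
    static boolean rejectsModification(Set<String> set) {
        try {
            set.add("Peter");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> names = new HashSet<>();
        names.add("Tom");
        names.add("Mary");

        Set<String> readOnly = Collections.unmodifiableSet(names);
        Set<String> single = Collections.singleton("Tom");
        Set<String> empty = Collections.emptySet();

        System.out.println(rejectsModification(readOnly)); // true
        System.out.println(rejectsModification(single));   // true
        System.out.println(rejectsModification(empty));    // true

        // unmodifiableSet() returns a view: changes to the backing set show through.
        names.add("Alice");
        System.out.println(readOnly.contains("Alice"));    // true
    }
}
```

Because unmodifiableSet() only wraps the original set, keep the backing set private if callers must never see it change.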
Conclusion
That's almost everything you need to know about Set in Java. I hope you enjoy this tutorial. And as I mentioned above, this Java Generics and Collections book will help you dive deeper into the Java Collections Framework. I also recommend that you check this Java course to learn more.
Nam Ha Minh is a certified Java programmer (SCJP and SCWCD). He began programming with Java back in the days of Java 1.4 and has been passionate about it ever since. You can connect with him on Facebook and watch his Java videos on YouTube.