java-hashset-vs-treeset
HashSet and TreeSet Comparison
1. Introduction
2. Differences
2.1. Ordering
HashSet stores the objects in random order, whereas TreeSet applies the natural order of the elements. Let’s see the following example:
@Test
public void givenTreeSet_whenRetrievesObjects_thenNaturalOrder() {
Set<String> set = new TreeSet<>();
set.add("Baeldung");
set.add("is");
set.add("Awesome");
assertEquals(3, set.size());
assertTrue(set.iterator().next().equals("Awesome"));
}
After adding the String objects into TreeSet, we see that the first one is “Awesome”, even though it was added at the very end. A similar operation done with HashSet does not guarantee that the order of elements will remain constant over time.
2.2. Null Objects
@Test(expected = NullPointerException.class)
public void givenTreeSet_whenAddNullObject_thenNullPointer() {
Set<String> set = new TreeSet<>();
set.add("Baeldung");
set.add("is");
set.add(null);
}
@Test
public void givenHashSet_whenAddNullObject_thenOK() {
Set<String> set = new HashSet<>();
set.add("Baeldung");
set.add("is");
set.add(null);
assertEquals(3, set.size());
}
If we try to store the null object in a TreeSet, the operation will result in a thrown NullPointerException. The only exception was in Java 7 when it was allowed to have exactly one null element in the TreeSet.
2.3. Performance
HashSet provides constant-time performance for most operations like add(), remove() and contains(), versus the log(n) time offered by the TreeSet.
Usually, we can see that the execution time for adding elements into TreeSet is much better than for the HashSet.
Please remember that the JVM might be not warmed up, so the execution times can differ. A good discussion how to design and perform micro tests using various Set implementations is available here.
2.4. Implemented Methods
-
pollFirst() – to return the first element, or null if Set is empty
-
pollLast() – to retrieve and remove the last element, or return null if Set is empty
-
first() – to return the first item
-
last() – to return the last item
-
ceiling() – to return the least element greater than or equal to the given element, or null if there is no such element
-
lower() – to return the largest element strictly less than the given element, or null if there is no such element
The methods mentioned above make TreeSet much easier to use and more powerful than HashSet.
3. Similarities
Both TreeSet and HashSet guarantee a duplicate-free collection of elements, as it is a part of the generic Set interface:
@Test
public void givenHashSetAndTreeSet_whenAddDuplicates_thenOnlyUnique() {
Set<String> set = new HashSet<>();
set.add("Baeldung");
set.add("Baeldung");
assertTrue(set.size() == 1);
Set<String> set2 = new TreeSet<>();
set2.add("Baeldung");
set2.add("Baeldung");
assertTrue(set2.size() == 1);
}
3.2. Not synchronized
3.3. Fail-fast Iterators
That means that any modification of the Set at any time after the Iterator is created will throw a ConcurrentModificationException:
@Test(expected = ConcurrentModificationException.class)
public void givenHashSet_whenModifyWhenIterator_thenFailFast() {
Set<String> set = new HashSet<>();
set.add("Baeldung");
Iterator<String> it = set.iterator();
while (it.hasNext()) {
set.add("Awesome");
it.next();
}
}
4. Which Implementation to Use?
Both implementations fulfill the contract of the idea of a set so it’s up to the context which implementation we might use.
Here are few quick points to remember:
-
If we want to keep our entries sorted, we need to go for the TreeSet
-
If we value performance more than memory consumption, we should go for the HashSet
-
If we are short on memory, we should go for the TreeSet
-
If we want to access elements that are relatively close to each other according to their natural ordering, we might want to consider TreeSet because it has greater locality
-
HashSet‘s performance can be tuned using the initialCapacity and loadFactor, which is not possible for the TreeSet
-
If we want to preserve insertion order and benefit from constant time access, we can use the LinkedHashSet
5. Conclusion
As always, the code examples for this article are available over on GitHub.