If your Java list looks like a reunion for identical objects you did not invite, this guide is your clean-up crew. We will walk through practical ways to remove duplicates from a List, preserving insertion order when you care and squeezing out raw performance when you do not.
If speed is the goal and order is optional, use a HashSet for fast deduplication. Constructing a HashSet from a list takes advantage of hash-based lookup and gives you the unique elements in roughly linear time.
One simple pattern is to construct a set from the list and, if you need a list back, wrap it again in an ArrayList. For example, use new HashSet<>(list), or when you need a list back, new ArrayList<>(new HashSet<>(list)).
HashSet offers expected O(1) time per insert and membership check, assuming a good hashCode implementation, so deduplicating an entire list is roughly O(n). That makes it the go-to when you want to remove duplicates from very large lists without fuss.
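The HashSet approach can be sketched as a small helper. This is a minimal sketch; the class name HashSetDedup and the sample input are placeholders, not anything from a real library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

public class HashSetDedup {
    // Returns the unique elements of the input list.
    // Fast, but the resulting order is not guaranteed.
    static <T> List<T> dedup(List<T> list) {
        return new ArrayList<>(new HashSet<>(list));
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(3, 1, 3, 2, 1);
        List<Integer> unique = dedup(input);
        System.out.println(unique.size()); // 3 unique values: some ordering of 1, 2, 3
    }
}
```

Note that the order of the result depends on the hash layout, which is exactly why this variant is only appropriate when ordering does not matter.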
If someone will notice when the order changes, LinkedHashSet is your friend. It keeps insertion order while still dropping duplicates, and the idiom is familiar and low ceremony.
Example pattern: new ArrayList<>(new LinkedHashSet<>(list)) removes duplicates and keeps the original ordering in one line, with minimal explaining required later.
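The order-preserving idiom above might look like this in practice. The class name OrderedDedup and the sample strings are illustrative placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

public class OrderedDedup {
    // Removes duplicates while keeping the first occurrence of each
    // element in its original position.
    static <T> List<T> dedup(List<T> list) {
        return new ArrayList<>(new LinkedHashSet<>(list));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b", "a", "b", "c", "a");
        System.out.println(dedup(input)); // [b, a, c]
    }
}
```

LinkedHashSet pays a small memory cost for the linked list that tracks insertion order, which is usually a fair trade for predictable output.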
If you prefer fluent code, the Stream API reads nicely and is easy to reason about. Use the stream's distinct operation and collect back to a list with Collectors.toList().
Typical usage looks like list.stream().distinct().collect(Collectors.toList()), which is clean, expressive, and fits well in modern codebases. Expect performance similar to a set-based approach, with a bit of overhead from the stream machinery.
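As a self-contained sketch of the stream idiom (class name StreamDedup and the sample data are made up for illustration):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StreamDedup {
    // distinct() relies on equals/hashCode and, for an ordered stream,
    // keeps the first occurrence of each element.
    static List<Integer> dedup(List<Integer> list) {
        return list.stream()
                   .distinct()
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dedup(Arrays.asList(1, 2, 2, 3, 1))); // [1, 2, 3]
    }
}
```

On Java 16 and later you could end the pipeline with .toList() instead of the Collectors call, at the cost of getting an unmodifiable list back.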
When uniqueness depends on a derived key rather than object equality, use a seen set inside a filter. This keeps the first occurrence for each key and drops later clones.
An example pattern: use a concurrent set so the filter stays safe even if the stream runs in parallel, then filter by adding each element's derived key to the seen set. For instance, Set<Object> seen = ConcurrentHashMap.newKeySet(), then list.stream().filter(e -> seen.add(getKey(e))).collect(Collectors.toList()). Because Set.add returns false when the key is already present, this keeps the first item for each key and ignores later ones.
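A generic version of this pattern might look like the following. The class name DistinctByKey is made up, and the "first letter of a word" key is a hypothetical stand-in for whatever getKey would compute in your code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class DistinctByKey {
    // Keeps the first element seen for each derived key. The concurrent
    // set keeps the stateful predicate safe if the stream is parallel.
    static <T, K> List<T> dedupByKey(List<T> list, Function<T, K> keyFn) {
        Set<K> seen = ConcurrentHashMap.newKeySet();
        return list.stream()
                   .filter(e -> seen.add(keyFn.apply(e)))
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("apple", "avocado", "banana", "blueberry", "cherry");
        // Hypothetical key: first letter, so only one word per letter survives.
        System.out.println(dedupByKey(words, w -> w.charAt(0))); // [apple, banana, cherry]
    }
}
```

Be aware that stateful predicates are discouraged by the Stream documentation; this idiom works in practice for sequential streams, and the concurrent set is a pragmatic hedge rather than a full guarantee of deterministic results under parallelism.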
In short: pick HashSet for speed, LinkedHashSet when order matters, Stream distinct for readable code, and a seen-key filter for custom equality. Now go forth and declutter your Java lists with the right data structure and a tiny bit of dignity preserved.