How to Remove Duplicates from a List in Java | Duration: 11:11 · Language: EN

Practical methods to remove duplicates from Java lists using Sets, streams, and LinkedHashSet, with tips on preserving order and performance

If your Java list looks like a reunion for identical objects you did not invite, then this guide is your clean-up crew. We will walk through practical ways to remove duplicates from a List, preserving insertion order when you care and squeezing out raw performance when you do not.

Quick unique extraction with HashSet

If speed is the goal and order is optional, then use a HashSet for fast deduplication. Creating a HashSet from a list takes advantage of hash-based lookup and gives you unique elements in roughly linear time.

One simple pattern is to construct a set from the list, and if you need a list back, wrap it again in an ArrayList. For example, use new HashSet<>(list), or when you need a list, use new ArrayList<>(new HashSet<>(list)).
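Here is a minimal runnable sketch of that pattern; the class name and sample values are just illustrative.

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;

    public class HashSetDedup {
        public static void main(String[] args) {
            List<String> names = Arrays.asList("alice", "bob", "alice", "carol", "bob");

            // Building a HashSet drops duplicates in one pass.
            // The resulting iteration order is unspecified.
            List<String> unique = new ArrayList<>(new HashSet<>(names));

            System.out.println(unique); // e.g. [bob, alice, carol], order not guaranteed
        }
    }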

Why this is fast

HashSet offers expected O(1) time per insert and membership check, assuming good hashCode implementations, so deduplicating n elements takes roughly O(n) overall. That makes it the go-to when you want to remove duplicates from very large lists without fuss.

Preserve insertion order with LinkedHashSet

If someone will notice when the order changes, then LinkedHashSet is your friend. It keeps insertion order while still dropping duplicates. The idiom is familiar and low ceremony.

Example pattern: use new ArrayList<>(new LinkedHashSet<>(list)) to remove duplicates and keep the original ordering in one line that does the job with minimal explaining required later.
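A small sketch of that idiom, with illustrative values:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.LinkedHashSet;
    import java.util.List;

    public class LinkedHashSetDedup {
        public static void main(String[] args) {
            List<Integer> numbers = Arrays.asList(3, 1, 3, 2, 1, 4);

            // LinkedHashSet keeps the first occurrence of each element in insertion order
            List<Integer> unique = new ArrayList<>(new LinkedHashSet<>(numbers));

            System.out.println(unique); // [3, 1, 2, 4]
        }
    }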

Readable functional style with Stream distinct

If you prefer fluent code, then the Stream API reads nicely and is easy to reason about. Use the distinct operation on a Stream and collect back to a list with Collectors.toList().

Typical usage looks like list.stream().distinct().collect(Collectors.toList()), which is clean, expressive, and fits well in modern codebases. distinct() compares elements with equals() and, for an ordered stream, keeps the first occurrence in encounter order. Expect similar performance to a set-based approach with a bit of overhead from the stream machinery.
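A short runnable sketch, again with illustrative data:

    import java.util.Arrays;
    import java.util.List;
    import java.util.stream.Collectors;

    public class StreamDistinctDedup {
        public static void main(String[] args) {
            List<String> tags = Arrays.asList("java", "list", "java", "set", "list");

            // distinct() compares elements with equals() and keeps the first occurrence
            List<String> unique = tags.stream()
                    .distinct()
                    .collect(Collectors.toList());

            System.out.println(unique); // [java, list, set]
        }
    }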

Custom equality using a seen key filter

When uniqueness depends on a derived key rather than object equality, use a seen set inside a filter. This keeps the first occurrence for each key and drops later clones.

An example pattern looks like this: use a concurrent set so the filter stays safe if the stream ever runs in parallel, then filter by adding each element's derived key to the seen set. For instance, Set<K> seen = ConcurrentHashMap.newKeySet() followed by list.stream().filter(e -> seen.add(getKey(e))).collect(Collectors.toList()). Set.add returns false when the key is already present, so this keeps the first item for each key and ignores later ones.
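Here is a runnable sketch of the seen key filter, assuming Java 16+ for the record syntax; the Person type and its email key are hypothetical stand-ins for whatever drives uniqueness in your data.

    import java.util.Arrays;
    import java.util.List;
    import java.util.Set;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.stream.Collectors;

    public class SeenKeyDedup {
        // Hypothetical type: uniqueness is defined by email, not full object equality
        record Person(String name, String email) {}

        public static void main(String[] args) {
            List<Person> people = Arrays.asList(
                    new Person("Alice", "alice@example.com"),
                    new Person("Bob", "bob@example.com"),
                    new Person("Alice B.", "alice@example.com"));

            // Concurrent set so the stateful filter stays safe if the stream runs in parallel
            Set<String> seen = ConcurrentHashMap.newKeySet();

            // add() returns false for a key that was already seen, dropping later clones
            List<Person> unique = people.stream()
                    .filter(p -> seen.add(p.email()))
                    .collect(Collectors.toList());

            System.out.println(unique); // keeps the first Person for each email
        }
    }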

Performance and memory tips

  • Prefer HashSet for raw speed when order is irrelevant.
  • Use LinkedHashSet when insertion order matters and you still want simple code.
  • Stream distinct is great for readability, but profile if you care about micro-performance.
  • For custom equality derive a stable key and use a seen set to filter duplicates deterministically.
  • Watch memory use: set-based deduplication adds overhead when lists are huge, so profile and consider streaming or chunking for very large data sets.

In short: pick HashSet for speed, pick LinkedHashSet when order matters, pick Stream distinct for readable code, and use a seen key filter for custom equality. Now go forth and declutter your Java lists with the right data structure and a tiny bit of dignity preserved.
