Translated from: Java 8 Distinct by property

In Java 8, how can I filter a collection using the Stream API by checking the distinctness of a property of each object?

For example, I have a list of Person objects and I want to remove people with the same name. Calling persons.stream().distinct() will use the default equality check for a Person object, so I need something like:

persons.stream().distinct(p -> p.getName());

Unfortunately, the distinct() method has no such overload. Is it possible to do this succinctly without modifying the equality check inside the Person class?




You can wrap the Person objects in another class that compares only the names of the persons. Afterwards, you unwrap the wrapped objects to get a stream of Person again. The stream operations might look as follows:

persons.stream().map(Wrapper::new).distinct().map(Wrapper::unwrap)


The class Wrapper might look as follows:

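class Wrapper {
    private final Person person;

    public Wrapper(Person person) {
        this.person = person;
    }

    public Person unwrap() {
        return person;
    }

    // Two wrappers are equal when the wrapped persons have the same name.
    @Override
    public boolean equals(Object other) {
        if (other instanceof Wrapper) {
            return ((Wrapper) other).person.getName().equals(person.getName());
        } else {
            return false;
        }
    }

    // Must be consistent with equals, so it is derived from the name as well.
    @Override
    public int hashCode() {
        return person.getName().hashCode();
    }
}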
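To show the wrap, distinct, unwrap pipeline end to end, here is a minimal self-contained sketch. The nested Person class, its age field, and the names WrapperDemo, NameWrapper, and distinctByName are assumptions made for illustration; the question does not show the real Person class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class WrapperDemo {
    // Hypothetical minimal Person; only getName() is taken from the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Wrapper whose equality is based only on the wrapped person's name.
    static final class NameWrapper {
        private final Person person;

        NameWrapper(Person person) { this.person = person; }

        Person unwrap() { return person; }

        @Override
        public boolean equals(Object other) {
            return other instanceof NameWrapper
                    && ((NameWrapper) other).person.getName().equals(person.getName());
        }

        @Override
        public int hashCode() { return person.getName().hashCode(); }
    }

    static List<Person> distinctByName(List<Person> persons) {
        return persons.stream()
                .map(NameWrapper::new)    // wrap: equality now means "same name"
                .distinct()               // relies on NameWrapper.equals/hashCode
                .map(NameWrapper::unwrap) // back to a stream of Person
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41));
        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

Because the list yields an ordered, sequential stream, distinct() keeps the first element of each group of equal wrappers, so the first Ann is the one that is kept.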


An alternative would be to place the persons in a map using the name as the key:

persons.stream().collect(Collectors.toMap(Person::getName, p -> p, (p, q) -> p)).values();

Note that in case of a duplicate name, the Person that is kept will be the first one encountered.
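One detail worth spelling out: the three-argument toMap makes no guarantee about the type of map it returns, so the iteration order of values() is unspecified. If the result should follow the encounter order of the input, the four-argument overload with a LinkedHashMap supplier can be used. The following is a sketch under that assumption; the nested Person class and the names ToMapDemo and distinctByName are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class ToMapDemo {
    // Hypothetical minimal Person; only getName() is taken from the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    static List<Person> distinctByName(List<Person> persons) {
        Map<String, Person> byName = persons.stream()
                .collect(Collectors.toMap(
                        Person::getName,          // key: the property to deduplicate on
                        p -> p,                   // value: the person itself
                        (first, second) -> first, // on a duplicate key, keep the earlier person
                        LinkedHashMap::new));     // preserves insertion order of the keys
        return new ArrayList<>(byName.values());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41));
        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```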


The easiest way to implement this is to build on the sort feature, as it already accepts an optional Comparator, which can be created from an element's property. Then you have to filter out the duplicates, which can be done with a stateful Predicate that relies on the fact that in a sorted stream all equal elements are adjacent:

Comparator<Person> c = Comparator.comparing(Person::getName);
stream.sorted(c).filter(new Predicate<Person>() {
    Person previous;

    @Override
    public boolean test(Person p) {
        // After sorting, duplicates are adjacent, so comparing with the previous element is enough.
        if (previous != null && c.compare(previous, p) == 0) return false;
        previous = p;
        return true;
    }
})./* more stream operations here */;

Of course, a stateful Predicate is not thread-safe. If you need thread safety, you can move this logic into a Collector and let the stream take care of it when using your Collector. Whether that fits depends on what you want to do with the stream of distinct elements, which you didn't tell us in your question.


Consider distinct to be a stateful filter. Here is a function that returns a predicate that maintains state about what it has seen previously, and that returns whether the given element was seen for the first time:

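public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    // Concurrent set, so the predicate can also be used with parallel streams.
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    // add() returns true only the first time a given key is inserted.
    return t -> seen.add(keyExtractor.apply(t));
}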

Then you can write:

persons.stream().filter(distinctByKey(Person::getName))


Note that if the stream is ordered and is run in parallel, this will preserve an arbitrary element from among the duplicates, instead of the first one, as distinct() does.
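For completeness, here is a self-contained sketch of the stateful-filter approach with the imports it needs. The nested Person class and the names DistinctByKeyDemo and distinctByName are assumptions for illustration. Two caveats apply: ConcurrentHashMap does not accept null keys, so the key extractor must not return null, and each call to distinctByKey creates a fresh set, so a returned predicate should not be reused across streams.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class DistinctByKeyDemo {
    // Hypothetical minimal Person; only getName() is taken from the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    // Returns a predicate that is true only the first time a key is seen.
    static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
        Set<Object> seen = ConcurrentHashMap.newKeySet();
        return t -> seen.add(keyExtractor.apply(t));
    }

    static List<Person> distinctByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKey(Person::getName))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Ann", 30), new Person("Bob", 25), new Person("Ann", 41));
        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```

With a sequential stream, as here, elements are tested in encounter order, so the first person with each name is the one that passes the filter.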

(This is essentially the same as my answer to this question: Java Lambda Stream Distinct() on arbitrary key?)


There's a simpler approach using a TreeSet with a custom comparator:

persons.stream().collect(Collectors.toCollection(() -> new TreeSet<Person>((p1, p2) -> p1.getName().compareTo(p2.getName()))));
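A runnable sketch of the TreeSet variant follows, using Comparator.comparing in place of the hand-written lambda. The nested Person class and the names TreeSetDemo and distinctByName are assumptions for illustration. Note two consequences of this approach: the result is a TreeSet, so it comes out sorted by name rather than in the original order, and TreeSet.add leaves the set unchanged when an equal element is already present, so for a sequential stream the first person with each name is kept.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

final class TreeSetDemo {
    // Hypothetical minimal Person; only getName() is taken from the question.
    static final class Person {
        private final String name;
        private final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        int getAge() { return age; }
    }

    static TreeSet<Person> distinctByName(List<Person> persons) {
        // The comparator defines both ordering and equality inside the set:
        // two persons with the same name count as the same element.
        return persons.stream()
                .collect(Collectors.toCollection(
                        () -> new TreeSet<>(Comparator.comparing(Person::getName))));
    }

    public static void main(String[] args) {
        List<Person> persons = Arrays.asList(
                new Person("Bob", 25), new Person("Ann", 30), new Person("Bob", 41));
        for (Person p : distinctByName(persons)) {
            System.out.println(p.getName() + " " + p.getAge());
        }
    }
}
```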

