[Java] 浅析 Set.of(...) 方法

浅析 `java` 中的 `Set.of(...)` 方法

JDK 9 在 java.util.Set 接口中提供了一组 of(...) 静态方法(代码可参考 Set.java)。通过调用这些静态方法，我们可以构造不可变(immutable)集合。

这组方法涵盖了以下 <math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2 类情形 ⬇️

逐个提供元素作为参数：参数个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 从 <math xmlns="http://www.w3.org/1998/Math/MathML"> 0 0 </math>0 到 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10 共有 <math xmlns="http://www.w3.org/1998/Math/MathML"> 11 11 </math>11 种情况（这里只列出了 <math xmlns="http://www.w3.org/1998/Math/MathML"> 0 ≤ n ≤ 3 0\le n \le3 </math>0≤n≤3 的情况， <math xmlns="http://www.w3.org/1998/Math/MathML"> n ≥ 4 n \ge 4 </math>n≥4 的情况是类似的）
- of()
- of(E e1)
- of(E e1, E e2)
- of(E e1, E e2, E e3)
元素以数组方式提供
- of(E... elements)

当您调用这些方法时，有没有想过以下问题？ ⬇️

Set.of(...) 返回的 Set 具体是什么类型的，是 HashSet 吗，还是其他类型的 Set？
为什么 Set.of(...) 返回的 Set 具有不可变(immutable)的特点，它是如何实现的？
Set.of(...) 所返回的 Set，是如何存储元素的？

本文会从以上 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 个问题入手，对 Set.of(...) 方法进行探索。

要点

Set.of(...) 所返回的 Set 的精确类型与待处理的 元素个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 有关
- 如果 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 1 n=1 </math>n=1 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 2 n=2 </math>n=2，Set 的精确类型会是 java.util.ImmutableCollections.Set12
- 如果 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n ≥ 3 n \ge 3 </math>n≥3，Set 的精确类型会是 java.util.ImmutableCollections.SetN
AbstractImmutableCollection 中的 add(...)/remove(...) 等方法会抛出 UnsupportedOperationException，而 Set12 和 SetN 都间接继承了 AbstractImmutableCollection，所以 Set12/SetN 的实例就是不可变(immutable)的了
SetN 中有 elements 字段，它的 length 会是元素数量的 2 倍（元素个数总是 <math xmlns="http://www.w3.org/1998/Math/MathML"> > 0 \gt0 </math>>0）。在保存元素和查询元素时，会使用 开放寻址法(open addressing)

正文

问题一：`Set.of(...)` 返回的 `Set` 具体是什么类型的，是 `HashSet` 吗，还是其他类型的 `Set`？

我写了如下的代码来进行探索 ⬇️

java 复制代码

import java.util.Set;

public class Question1 {
    public static void main(String[] args) {
        System.out.println(Set.of().getClass().getName());
        System.out.println(Set.of(1).getClass().getName());
        System.out.println(Set.of(1, 2).getClass().getName());
        System.out.println(Set.of(1, 2, 3).getClass().getName());
    }
}

请将以上代码保存为 Question1.java，用如下的命令可以编译 Question1.java 并运行 Question1 类的 main 方法。

bash 复制代码

javac Question1.java
java Question1

运行结果如下

text 复制代码

java.util.ImmutableCollections$SetN
java.util.ImmutableCollections$Set12
java.util.ImmutableCollections$Set12
java.util.ImmutableCollections$SetN

看来 Set.of(...) 方法返回的值可以是以下两个类的实例 ⬇️

java.util.ImmutableCollections.Set12
java.util.ImmutableCollections.SetN

但仅凭以上几行代码所列举的情况，并不能说明 Set.of(...) 的返回值是否有可能是其他类的实例，我们还是去看源码吧。在 Set.java 中可以找到各个 Set.of(...) 方法的代码。这里有两种情形。

情形一：逐个提供元素作为 `Set.of(...)` 的参数

参数个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 从 <math xmlns="http://www.w3.org/1998/Math/MathML"> 0 0 </math>0 到 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10 共有 <math xmlns="http://www.w3.org/1998/Math/MathML"> 11 11 </math>11 种情况。其中 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 从 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 到 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10 的情况代码高度雷同，我就不复制过来了，以下只展示 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 , 1 , 2 , 3 n=0, 1, 2, 3 </math>n=0,1,2,3 的情况 ⬇️ （为了节约空间，对应的 javadoc 都略去了）

java 复制代码

    @SuppressWarnings("unchecked")
    static <E> Set<E> of() {
        return (Set<E>) ImmutableCollections.EMPTY_SET;
    }
    
    static <E> Set<E> of(E e1) {
        return new ImmutableCollections.Set12<>(e1);
    }

    static <E> Set<E> of(E e1, E e2) {
        return new ImmutableCollections.Set12<>(e1, e2);
    }

    static <E> Set<E> of(E e1, E e2, E e3) {
        return new ImmutableCollections.SetN<>(e1, e2, e3);
    }

可以将代码中的信息汇总成下方的表格 ⬇️

参数个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n	`Set.of(...)` 返回什么	说明
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0	`ImmutableCollections.EMPTY_SET`	⬅️ 它是 `ImmutableCollections.SetN` 类型的
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 1 n=1 </math>n=1	`ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 2 n=2 </math>n=2	`ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> 3 ≤ n ≤ 10 3\le n\le10 </math>3≤n≤10	`ImmutableCollections.SetN` 的实例

<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0 时，Set.of() 返回的也是 java.util.ImmutableCollections.SetN 的实例。可以把上方的表格再简化一下，变成下面这样 ⬇️

参数个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n	`Set.of(...)` 返回什么
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 1 n=1 </math>n=1 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 2 n=2 </math>n=2	`java.util.ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 ≤ n ≤ 10 3\le n\le10 </math>3≤n≤10	`java.util.ImmutableCollections.SetN` 的实例

情形二：元素以数组方式提供

这次我们直接看源码 ⬇️

java 复制代码

    /**
     * Returns an unmodifiable set containing an arbitrary number of elements.
     * See <a href="#unmodifiable">Unmodifiable Sets</a> for details.
     *
     * @apiNote
     * This method also accepts a single array as an argument. The element type of
     * the resulting set will be the component type of the array, and the size of
     * the set will be equal to the length of the array. To create a set with
     * a single element that is an array, do the following:
     *
     * <pre>{@code
     *     String[] array = ... ;
     *     Set<String[]> list = Set.<String[]>of(array);
     * }</pre>
     *
     * This will cause the {@link Set#of(Object) Set.of(E)} method
     * to be invoked instead.
     *
     * @param <E> the {@code Set}'s element type
     * @param elements the elements to be contained in the set
     * @return a {@code Set} containing the specified elements
     * @throws IllegalArgumentException if there are any duplicate elements
     * @throws NullPointerException if an element is {@code null} or if the array is {@code null}
     *
     * @since 9
     */
    @SafeVarargs
    @SuppressWarnings("varargs")
    static <E> Set<E> of(E... elements) {
        switch (elements.length) { // implicit null check of elements
            case 0:
                @SuppressWarnings("unchecked")
                var set = (Set<E>) ImmutableCollections.EMPTY_SET;
                return set;
            case 1:
                return new ImmutableCollections.Set12<>(elements[0]);
            case 2:
                return new ImmutableCollections.Set12<>(elements[0], elements[1]);
            default:
                return new ImmutableCollections.SetN<>(elements);
        }
    }

这个 switch 语句对元素个数的分类与 情形一 类似，我们可以把 情形二 的情况整理成如下表格 ⬇️

`elements` 参数的 `length`	`Set.of(E... elements)` 返回什么	说明
<math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h = 0 length=0 </math>length=0	`ImmutableCollections.EMPTY_SET`	⬅️ 它是 `ImmutableCollections.SetN` 类型的
<math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h = 1 length=1 </math>length=1	`ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h = 2 length=2 </math>length=2	`ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h ≥ 3 length\ge 3 </math>length≥3	`ImmutableCollections.SetN` 的实例

情形一和情形二的汇总

现在可以把情形一和情形二汇总成如下的表格 ⬇️

元素个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n	`Set.of(...)` 返回什么
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 1 n=1 </math>n=1 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 2 n=2 </math>n=2	`java.util.ImmutableCollections.Set12` 的实例
<math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n ≥ 3 n\ge3 </math>n≥3	`java.util.ImmutableCollections.SetN` 的实例

我给 Set12/SetN 画了张类图（图中把所有的泛型都省略了） ⬇️ 从类图中可以看到 Set12/SetN 实现了哪些接口。

classDiagram class `java.lang.Iterable` <> `java.lang.Iterable` class `java.io.Serializable` <> `java.io.Serializable` class `java.util.Collection` <> `java.util.Collection` class `java.util.Set` <> `java.util.Set` `java.lang.Iterable` <|-- `java.util.Collection` `java.util.Collection` <|.. `java.util.AbstractCollection` class `java.util.AbstractCollection` <> `java.util.AbstractCollection` `java.util.AbstractCollection` <| -- `java.util.ImmutableCollections.AbstractImmutableCollection` class `java.util.ImmutableCollections.AbstractImmutableCollection` <> `java.util.ImmutableCollections.AbstractImmutableCollection` `java.util.ImmutableCollections.AbstractImmutableCollection` <|-- `java.util.ImmutableCollections.AbstractImmutableSet` class `java.util.ImmutableCollections.AbstractImmutableSet` <> `java.util.ImmutableCollections.AbstractImmutableSet` `java.util.Collection` <|-- `java.util.Set` `java.util.Set` <|.. `java.util.ImmutableCollections.AbstractImmutableSet` `java.util.ImmutableCollections.AbstractImmutableSet` <|--`java.util.ImmutableCollections.Set12` `java.util.ImmutableCollections.AbstractImmutableSet` <|-- `java.util.ImmutableCollections.SetN` `java.io.Serializable` <|.. `java.util.ImmutableCollections.Set12` `java.io.Serializable` <|.. `java.util.ImmutableCollections.SetN`

小总结

Set.of(...) 所返回的 Set 的精确类型与待处理的元素个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 有关

如果 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 1 n=1 </math>n=1 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 2 n=2 </math>n=2，Set 的精确类型会是 java.util.ImmutableCollections.Set12
如果 <math xmlns="http://www.w3.org/1998/Math/MathML"> n = 0 n=0 </math>n=0 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> n ≥ 3 n \ge 3 </math>n≥3，Set 的精确类型会是 java.util.ImmutableCollections.SetN

问题二：为什么 `Set.of(...)` 返回的 `Set` 具有不可变(`immutable`)的特点，它是如何实现的？

我们可以写点代码来进行分析。请将以下代码保存为 Question2.java。

java 复制代码

import java.util.Set;

public class Question2 {
    public static void main(String[] args) {
        Set.of().add(1);
    }
}

如下的命令可以编译 Question2.java 以及运行 Question2 类中的 main 方法。

bash 复制代码

javac Question2.java
java Question2

运行上述命令后，会看到以下的报错

text 复制代码

Exception in thread "main" java.lang.UnsupportedOperationException
	at java.base/java.util.ImmutableCollections.uoe(ImmutableCollections.java:142)
	at java.base/java.util.ImmutableCollections$AbstractImmutableCollection.add(ImmutableCollections.java:147)
	at Question2.main(Question2.java:5)

根据上述报错信息，可以找到抛出异常的位置如下 ⬇️

图中 <math xmlns="http://www.w3.org/1998/Math/MathML"> 146 146 </math>146 行有如下注释 ⬇️

// all mutating methods throw UnsupportedOperationException

从图中的代码也可以看出 addAll(...)/clear()/remove(...)/removeAll(...) 等方法都会抛出 UnsupportedOperationException

我画了张简单的类图来辅助理解 ⬇️ （图中将所有泛型信息都省略了，和 问题二 无关的接口/字段/方法·也略去了）

classDiagram class `java.util.Collection` <> `java.util.Collection` `java.util.Collection` <|.. `java.util.AbstractCollection` class `java.util.AbstractCollection` <> `java.util.AbstractCollection` `java.util.AbstractCollection` <| -- `java.util.ImmutableCollections.AbstractImmutableCollection` class `java.util.ImmutableCollections.AbstractImmutableCollection` <> `java.util.ImmutableCollections.AbstractImmutableCollection` `java.util.ImmutableCollections.AbstractImmutableCollection` <|-- `java.util.ImmutableCollections.AbstractImmutableSet` class `java.util.ImmutableCollections.AbstractImmutableSet` <> `java.util.ImmutableCollections.AbstractImmutableSet` `java.util.ImmutableCollections.AbstractImmutableSet` <|--`java.util.ImmutableCollections.Set12` `java.util.ImmutableCollections.AbstractImmutableSet` <|-- `java.util.ImmutableCollections.SetN` `java.util.Collection`: add(E e) `java.util.Collection`: addAll(Collection c) `java.util.Collection`: clear() `java.util.Collection`: remove(Object o)` `java.util.Collection`: removeAll(Collection c)` `java.util.Collection`: removeIf(Predicate filter) `java.util.Collection`: retainAll(Collection c)` `java.util.ImmutableCollections.AbstractImmutableCollection`: add(E e) `java.util.ImmutableCollections.AbstractImmutableCollection`: addAll(Collection c) `java.util.ImmutableCollections.AbstractImmutableCollection`: clear() `java.util.ImmutableCollections.AbstractImmutableCollection`: remove(Object o) `java.util.ImmutableCollections.AbstractImmutableCollection`: removeAll(Collection c) `java.util.ImmutableCollections.AbstractImmutableCollection`: removeIf(Predicate filter) `java.util.ImmutableCollections.AbstractImmutableCollection`: retainAll(Collection c) note for `java.util.ImmutableCollections.AbstractImmutableCollection` "注意: 在 AbstractImmutableCollection 中展示的 7 个方法都会抛 UnsupportedOperationException"

小总结

AbstractImmutableCollection 中的 add(...)/remove(...) 等方法会抛出 UnsupportedOperationException，而 Set12 和 SetN 都间接继承了 AbstractImmutableCollection，所以 Set12/SetN 的实例就是不可变(immutable)的了

问题三：其中的元素是如何存储的？

这里需要区分 Set12/SetN 两种情况

`Set12`

在 ImmutableCollections.java 中，可以找到 Set12 的代码，其中部分代码如下 ⬇️

java 复制代码

    @jdk.internal.ValueBased
    static final class Set12<E> extends AbstractImmutableSet<E>
            implements Serializable {

        @Stable
        private final E e0;

        @Stable
        private final Object e1;

        Set12(E e0) {
            this.e0 = Objects.requireNonNull(e0);
            // Use EMPTY as a sentinel for an unused element: not using null
            // enable constant folding optimizations over single-element sets
            this.e1 = EMPTY;
        }

        Set12(E e0, E e1) {
            if (e0.equals(Objects.requireNonNull(e1))) { // implicit nullcheck of e0
                throw new IllegalArgumentException("duplicate element: " + e0);
            }

            this.e0 = e0;
            this.e1 = e1;
        }

        @Override
        public int size() {
            return (e1 == EMPTY) ? 1 : 2;
        }

        @Override
        public boolean isEmpty() {
            return false;
        }

        @Override
        public boolean contains(Object o) {
            return o.equals(e0) || e1.equals(o); // implicit nullcheck of o
        }

        @Override
        public int hashCode() {
            return e0.hashCode() + (e1 == EMPTY ? 0 : e1.hashCode());
        }

字段

Set12 中有如下两个字段 ⬇️

private final E e0
private final Object e1

构造函数

在调用 Set12(E e0) 这个构造函数时，

e0 参数会被赋给 e0 字段
e1 字段会是一个特殊值（不是 null）

在调用 Set12(E e0, E e1) 这个构造函数时，

e0 参数会被赋给 e0 字段
e1 参数会被赋给 e1 字段

Set12 中的 size()/isEmpty() 等方法的逻辑也比较直观，这里就不赘述了。

小总结

Set12 可以处理集合中元素个数 <math xmlns="http://www.w3.org/1998/Math/MathML"> n n </math>n 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1 或 <math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2 的情况。

`SetN`

在 ImmutableCollections.java 中，可以找到 SetN 的代码，其中部分代码如下 ⬇️

java 复制代码

    /**
     * An array-based Set implementation. The element array must be strictly
     * larger than the size (the number of contained elements) so that at
     * least one null is always present.
     * @param <E> the element type
     */
    @jdk.internal.ValueBased
    static final class SetN<E> extends AbstractImmutableSet<E>
            implements Serializable {

        @Stable
        final E[] elements;

        @Stable
        final int size;

        @SafeVarargs
        @SuppressWarnings("unchecked")
        SetN(E... input) {
            size = input.length; // implicit nullcheck of input

            elements = (E[])new Object[EXPAND_FACTOR * input.length];
            for (int i = 0; i < input.length; i++) {
                E e = input[i];
                int idx = probe(e); // implicit nullcheck of e
                if (idx >= 0) {
                    throw new IllegalArgumentException("duplicate element: " + e);
                } else {
                    elements[-(idx + 1)] = e;
                }
            }
        }

        @Override
        public int size() {
            return size;
        }

        @Override
        public boolean isEmpty() {
            return size == 0;
        }

在上面的代码中，可以看到 SetN 中的字段以及构造函数的逻辑。

字段

SetN 中有如下两个字段 ⬇️

final E[] elements
final int size

而这两个字段的作用是 ⬇️

在 SetN 的构造函数中，会向 elements 字段填充元素（具体是如何填充的，我们等一下再说），所以 elements 字段应该是保存集合中的元素的地方。
size 字段用于表示集合中的元素数量。

构造函数

我在构造函数的代码中加了点注释 ⬇️ (代码里原有的注释也没删除)

java 复制代码

        @SafeVarargs
        @SuppressWarnings("unchecked")
        SetN(E... input) {
            // 获取元素个数
            size = input.length; // implicit nullcheck of input

            // EXPAND_FACTOR 值为 2，所以 elements 的长度会是 2 * size
            elements = (E[])new Object[EXPAND_FACTOR * input.length];
            // 遍历处理，为每一个元素找到它该去的地方
            for (int i = 0; i < input.length; i++) {
                E e = input[i];
                // 利用开放寻址法(open addressing)找到 e 应该去的位置
                int idx = probe(e); // implicit nullcheck of e
                if (idx >= 0) {
                    throw new IllegalArgumentException("duplicate element: " + e);
                } else {
                    // -(idx + 1) 就是 e 应该去的位置
                    elements[-(idx + 1)] = e;
                }
            }
        }

这里大部分逻辑都很直观，只有 int idx = probe(e); 这一行看起来不太好懂。这里涉及 开放寻址法(open addressing) ⬇️

我按照自己的理解说一下它的思想 ⬇️ （这里只说保存元素和查询元素，不涉及删除元素）

用开放寻址法保存元素

我们在使用 hashCode 时，会遇到有冲突的情况。假设现在有一个 <math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h = 10 length=10 </math>length=10 的数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A，一开始数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 为空，我们想把如下的元素保存到数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 中。

E1 (假设它的 hashCode 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 564132213 564132213 </math>564132213)
E2 (假设它的 hashCode 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 315343729 315343729 </math>315343729)
E3 (假设它的 hashCode 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 6323193 6323193 </math>6323193)

一个办法是利用 hashCode % 10 来计算每个元素应该保存到哪里去。经过简单的计算，可以得到如下的表格 ⬇️

哪个元素	`hashCode`	`hashCode % 10`
`E1`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 564132213 564132213 </math>564132213	<math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3
`E2`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 315343729 315343729 </math>315343729	<math xmlns="http://www.w3.org/1998/Math/MathML"> 9 9 </math>9
`E3`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 6323193 6323193 </math>6323193	<math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3

最初数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 为空，其中是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10 个 <math xmlns="http://www.w3.org/1998/Math/MathML"> n u l l null </math>null ⬇️ 将 E1 保存到下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 的位置之后 ⬇️

将 E2 保存到下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 9 9 </math>9 的位置之后 ⬇️

尝试将 E3 保存到下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 的位置时，会发现这个位置已经被占了（被 E1 占了），于是再尝试把 E3 保存到下一个位置去（如果现在已经是下标最大的位置，那就尝试把 E3 保存到下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 0 0 </math>0 的位置）。下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置尚未被占用，所以我们将 E3 保存到下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置（如果下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置也被占用了，那就继续检查下一个位置，直到找到空位置为止） ⬇️

注意，由于 E3 和 E1 要去的位置有冲突，所以最终 E3 保存在了下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置 ⬇️

哪个元素	`hashCode`	`hashCode % 10`	实际保存在了哪里
`E1`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 564132213 564132213 </math>564132213	<math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 的位置
`E2`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 315343729 315343729 </math>315343729	<math xmlns="http://www.w3.org/1998/Math/MathML"> 9 9 </math>9	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 9 9 </math>9 的位置
`E3`	<math xmlns="http://www.w3.org/1998/Math/MathML"> 6323193 6323193 </math>6323193	<math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置

用开放寻址法查询元素是否存在

查询的过程也要用到 hashCode。下面举两个例子 ⬇️

我们现在要查询数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 中有 E2 元素，E2 的 hashCode 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 315343729 315343729 </math>315343729，hashCode % 10 得到的结果是 9，检查下标为 9 的位置，那里不是 null 而且保存的元素和 E2 相等，所以数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 中有 E2 元素。
我们现在要查询数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 中有 E3 元素，E3 的 hashCode 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 6323193 6323193 </math>6323193，hashCode % 10 得到的结果是 3，检查下标为 3 的位置，那里不是 null ，但是那里保存的元素(即 E1)和 E3 不相等 ，所以再检查下一个位置（即下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置，如果现在已经是下标最大的位置，则检查下标为 0 的位置），在下标为 4 的位置，那里不是 null，而且保存的元素和 E3 相等，所以数组 <math xmlns="http://www.w3.org/1998/Math/MathML"> A A </math>A 中有 E3 元素。

说完了 开放寻址法 的思想，我们接着看 SetN 的构造函数的逻辑。上文提到 SetN 的构造函数有 int idx = probe(e); 这样一行，我把 probe(Object) 方法的代码复制到了下方（加了点注释） ⬇️

java 复制代码

        // returns index at which element is present; or if absent,
        // (-i - 1) where i is location where element should be inserted.
        // Callers are relying on this method to perform an implicit nullcheck
        // of pe
        private int probe(Object pe) {
            // 在 Set.of(...) 调用 probe(...) 方法时，elements.length 总是正数，
            // 此时可以将下一行简单理解为 idx = pe.hashCode() % elements.length
            int idx = Math.floorMod(pe.hashCode(), elements.length);
            // 用开放寻址法找到 pe 应该去的地方
            while (true) {
                E ee = elements[idx];
                // 如果 ee == null，说明这个位置还没有被占，那么 idx 就是 pe 该去的地方（调用者会从 -idx - 1 这个返回值反推出 idx）
                if (ee == null) {
                    return -idx - 1;
                // 如果 pe.equals(ee)，说明 pe 已经在 elements 数组里了
                } else if (pe.equals(ee)) {
                    return idx;
                // 去下一个位置（如果下标已经是最大值了，就去下标为 0 的位置）
                } else if (++idx == elements.length) {
                    idx = 0;
                }
            }
        }

现在再看 SetN 的构造函数，应该能明白它的逻辑了。为了加深理解以及验证上文的说法是否正确，我写了个简单的例子来展示 elements 字段中的元素。由于要涉及求余数的运算，对 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10 求余数很直观，而 elements 的 <math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h length </math>length 会是元素个数的 2 倍，所以我打算在例子中用到 <math xmlns="http://www.w3.org/1998/Math/MathML"> 5 5 </math>5 个元素。考虑到 java.lang.Integer 的 hashCode 就是它对应的 int 值，这样 hashCode 的计算会很直观，所以我会用 <math xmlns="http://www.w3.org/1998/Math/MathML"> 5 5 </math>5 个 java.lang.Integer。具体的代码如下 ⬇️

java 复制代码

import java.lang.reflect.Field;
import java.util.Set;

public class Question3 {
    public static void main(String[] args) throws Exception {
        Set<Integer> set = Set.of(101, 42, 11, 22, 6);
        Field field = set.getClass().getDeclaredField("elements");
        field.setAccessible(true);

![image.png](https://p3-xtjj-sign.byteimg.com/tos-cn-i-73owjymdk6/63db3ee20e1d47d882a0e6eb6ead5c2b~tplv-73owjymdk6-jj-mark-v1:0:0:0:0:5o6Y6YeR5oqA5pyv56S-5Yy6IEAg6YeR6YqA6YqF6ZC1:q75.awebp?rk3s=f64ab15b&x-expires=1757563592&x-signature=H8MND%2Bp5PFmRoxstaxmmZ%2F4jaFk%3D)
        Object[] elements = (Object[]) field.get(set);
        for (int i = 0; i < elements.length; i++) {
            if (elements[i] != null) {
                String message = String.format("number: [%s] is at index: %s", elements[i], i);
                System.out.println(message);
            } else {
                System.out.println("null is at index: " + i);
            }
        }
    }
}

请将上方的代码保存为 Question3.java。用下方的命令可以编译 Question3.java 并运行 Question3 里的 main(...) 方法。

bash 复制代码

javac Question3.java
java --add-opens=java.base/java.util=ALL-UNNAMED Question3

运行结果如下

text 复制代码

null is at index: 0
number: [101] is at index: 1
number: [42] is at index: 2
number: [11] is at index: 3
number: [22] is at index: 4
null is at index: 5
number: [6] is at index: 6
null is at index: 7
null is at index: 8
null is at index: 9

我们来分析一下具体的过程。由于调用 Set.of(...) 时，提供了 <math xmlns="http://www.w3.org/1998/Math/MathML"> 5 5 </math>5 个参数，SetN 类中的 elements 的 <math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h length </math>length 会是 EXPAND_FACTOR * 5 也就是 2 * 5 = 10。通过在 Intellij IDEA 里打断点，也可以进行验证 ⬇️

elements 字段的 <math xmlns="http://www.w3.org/1998/Math/MathML"> l e n g t h length </math>length 是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 10 10 </math>10，然后我们可以手动模拟 101, 42, 11, 22, 6 这 <math xmlns="http://www.w3.org/1998/Math/MathML"> 5 5 </math>5 个元素的保存过程 ⬇️

是第几个元素（从 <math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1 开始）	值	`hashCode`	`hashCode % length`	最终保存在哪里
第 <math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1 个	<math xmlns="http://www.w3.org/1998/Math/MathML"> 101 101 </math>101	<math xmlns="http://www.w3.org/1998/Math/MathML"> 101 101 </math>101	<math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1 的位置
第 <math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2 个	<math xmlns="http://www.w3.org/1998/Math/MathML"> 42 42 </math>42	<math xmlns="http://www.w3.org/1998/Math/MathML"> 42 42 </math>42	<math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2 的位置
第 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 个	<math xmlns="http://www.w3.org/1998/Math/MathML"> 11 11 </math>11	<math xmlns="http://www.w3.org/1998/Math/MathML"> 11 11 </math>11	<math xmlns="http://www.w3.org/1998/Math/MathML"> 1 1 </math>1	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 的位置
第 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 个	<math xmlns="http://www.w3.org/1998/Math/MathML"> 22 22 </math>22	<math xmlns="http://www.w3.org/1998/Math/MathML"> 22 22 </math>22	<math xmlns="http://www.w3.org/1998/Math/MathML"> 2 2 </math>2	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的位置
第 <math xmlns="http://www.w3.org/1998/Math/MathML"> 5 5 </math>5 个	<math xmlns="http://www.w3.org/1998/Math/MathML"> 6 6 </math>6	<math xmlns="http://www.w3.org/1998/Math/MathML"> 6 6 </math>6	<math xmlns="http://www.w3.org/1998/Math/MathML"> 6 6 </math>6	下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 6 6 </math>6 的位置

请注意，在保存 <math xmlns="http://www.w3.org/1998/Math/MathML"> 11 11 </math>11 和 <math xmlns="http://www.w3.org/1998/Math/MathML"> 22 22 </math>22 的时候，都会遇到冲突的情况，所以它们分别被保存到了下标为 <math xmlns="http://www.w3.org/1998/Math/MathML"> 3 3 </math>3 和 <math xmlns="http://www.w3.org/1998/Math/MathML"> 4 4 </math>4 的地方。

小总结

SetN 中有 elements 字段，它的 length 会是元素数量的 2 倍（元素个数不会是 <math xmlns="http://www.w3.org/1998/Math/MathML"> 0 0 </math>0）。SetN 使用 开放寻址法(open addressing) 对元素进行保存和查询。

参考资料

OpenJDK 中的
- Set.java
- ImmutableCollections.java

[Java] 浅析 Set.of(...) 方法

浅析 java 中的 Set.of(...) 方法

要点

正文

问题一：Set.of(...) 返回的 Set 具体是什么类型的，是 HashSet 吗，还是其他类型的 Set？

情形一：逐个提供元素作为 Set.of(...) 的参数

情形二：元素以数组方式提供

情形一和情形二的汇总

小总结

问题二：为什么 Set.of(...) 返回的 Set 具有不可变(immutable)的特点，它是如何实现的？

小总结

问题三：其中的元素是如何存储的？

Set12

字段

构造函数

小总结

SetN

字段

构造函数

用开放寻址法保存元素

用开放寻址法查询元素是否存在

小总结

参考资料

浅析 `java` 中的 `Set.of(...)` 方法

问题一：`Set.of(...)` 返回的 `Set` 具体是什么类型的，是 `HashSet` 吗，还是其他类型的 `Set`？

情形一：逐个提供元素作为 `Set.of(...)` 的参数

问题二：为什么 `Set.of(...)` 返回的 `Set` 具有不可变(`immutable`)的特点，它是如何实现的？

`Set12`

`SetN`