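A Faster Way to Make Reflective Calls in Java

Background

Reflection is hard to avoid in Java development. Spring Boot, for example, uses reflection to invoke the matching handler method on a Controller when an HTTP request arrives, and Jackson uses reflection to assign values parsed from JSON to the corresponding fields. We can write a simple JMH benchmark to measure the gap between creating an object through reflection and calling its constructor directly: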


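import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

import java.util.concurrent.TimeUnit;

// Shared JMH configuration: average time per operation, reported in nanoseconds.
@BenchmarkMode(value = Mode.AverageTime)
@Warmup(iterations = 3, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public abstract class JmhTest {
    // Runs every benchmark whose class name matches the given launch class.
    public static void runTest(Class<?> launchClass) throws RunnerException {
        Options options = new OptionsBuilder().include(launchClass.getSimpleName()).build();
        new Runner(options).run();
    }
}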

package cn.zorcc.common.jmh;  
  
import org.openjdk.jmh.annotations.Benchmark;  
import org.openjdk.jmh.annotations.Param;  
import org.openjdk.jmh.infra.Blackhole;  
import org.openjdk.jmh.runner.RunnerException;  
  
import java.lang.invoke.MethodHandle;  
import java.lang.invoke.MethodHandles;  
import java.lang.invoke.MethodType;  
import java.lang.reflect.Constructor;  
import java.lang.reflect.Method;  
  
public class ReflectionTest extends JmhTest {  
    @Param({"10", "100", "1000", "10000"})  
    private int size;  
  
    static class Test {  
        private int integer;  
  
        public int getInteger() {  
            return integer;  
        }  
  
        public void setInteger(int integer) {  
            this.integer = integer;  
        }  
    }  
  
    @Benchmark  
    public void testDirectCall(Blackhole bh) {  
        for(int i = 0; i < size; i++) {  
            Test test = new Test();  
            bh.consume(test);  
            test.setInteger(i);  
            bh.consume(test.getInteger());  
        }  
    }  
  
    @Benchmark
    public void testNormalReflection(Blackhole bh) {
        try{
            // The Constructor and Method lookups run inside the measured method,
            // so their cost is part of every benchmark invocation.
            Constructor<Test> constructor = Test.class.getDeclaredConstructor();
            Method setter = Test.class.getDeclaredMethod("setInteger", int.class);
            Method getter = Test.class.getDeclaredMethod("getInteger");
            for(int i = 0; i < size; i++) {
                Test test = constructor.newInstance();
                bh.consume(test);
                setter.invoke(test, i);
                int integer = (int) getter.invoke(test);
                bh.consume(integer);
            }
        }catch (Throwable e) {
            throw new UnknownError();
        }
    }
  
  
    public static void main(String[] args) throws RunnerException {  
        runTest(ReflectionTest.class);  
    }  
}

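The Test class holds a single int field. We compare two paths: calling the constructor, setter, and getter directly, and doing the same through ordinary reflection with Constructor and Method. Note that every constructed object and every returned value must be passed to Blackhole.consume(); otherwise the JVM may eliminate the unused values as dead code and the benchmark will report misleading numbers. On the author's machine the code above produces the following results: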


Benchmark                                  (size)  Mode  Cnt       Score       Error  Units
ReflectionTest.testDirectCall                  10  avgt   50      10.584 ±     0.141  ns/op
ReflectionTest.testDirectCall                 100  avgt   50     108.301 ±     1.129  ns/op
ReflectionTest.testDirectCall                1000  avgt   50    1068.026 ±    12.312  ns/op
ReflectionTest.testDirectCall               10000  avgt   50   10660.596 ±   148.673  ns/op
ReflectionTest.testNormalReflection            10  avgt   50     145.483 ±     1.300  ns/op
ReflectionTest.testNormalReflection           100  avgt   50    1131.994 ±    19.586  ns/op
ReflectionTest.testNormalReflection          1000  avgt   50   13461.067 ±   130.624  ns/op
ReflectionTest.testNormalReflection         10000  avgt   50  148811.318 ±  5766.679  ns/op

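As the numbers show, reflection is far slower than a direct call, roughly an order of magnitude here, and the gap is especially visible in a trivially simple object-creation scenario like this one. Yet in many situations reflection is something we simply have to use. Is there any way to make reflective calls as fast as possible?

Let us first try the invocation model provided by MethodHandle. MethodHandle was introduced in JDK 7 as a newer alternative to the old reflective calls. Compared with classic reflection it offers more ways to interact with the target, and it provides a uniform model for invoking both Java methods and native methods. A simple test case looks like this: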


@Benchmark
public void testMethodHandleReflection(Blackhole bh) {
    try{
        // The handles are looked up inside the measured method and stored in
        // local variables, so the JIT cannot treat them as constants.
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodType constructorType = MethodType.methodType(void.class);
        MethodHandle constructorHandle = lookup.findConstructor(Test.class, constructorType);
        MethodHandle iSetter = lookup.findSetter(Test.class, "integer", int.class);
        MethodHandle iGetter = lookup.findGetter(Test.class, "integer", int.class);
        for(int i = 0; i < size; i++) {
            // invokeExact requires the call-site types to match the handle type exactly.
            Test test = (Test) constructorHandle.invokeExact();
            bh.consume(test);
            iSetter.invokeExact(test, i);
            int integer = (int) iGetter.invokeExact(test);
            bh.consume(integer);
        }
    }catch (Throwable e) {
        throw new UnknownError();
    }
}

The measured results are even more disappointing:


ReflectionTest.testMethodHandleReflection      10  avgt   50    1346.515 ±    17.347  ns/op
ReflectionTest.testMethodHandleReflection     100  avgt   50    2355.083 ±    37.358  ns/op
ReflectionTest.testMethodHandleReflection    1000  avgt   50  456694.572 ± 31415.118  ns/op
ReflectionTest.testMethodHandleReflection   10000  avgt   50  982008.110 ± 46807.572  ns/op

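As the numbers show, the gap between MethodHandle and ordinary reflection is about as large as the gap between ordinary reflection and a direct call. In fact, since JDK 18, JEP 416: Reimplement Core Reflection with Method Handles has reimplemented java.lang.reflect.Method, Constructor, and Field on top of java.lang.invoke method handles. Evidently the core reflection path carries more optimization work than a MethodHandle used naively as above. According to the official guidance, a MethodHandle should be held in a static final field wherever possible so that the JVM can constant-fold it, which brings a huge performance gain. Let us modify the test code accordingly: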


// Held in static final fields so the JIT can treat the handles as constants.
private static final MethodHandle constructorHandle;
private static final MethodHandle iSetter;
private static final MethodHandle iGetter;
static {  
    try{  
        MethodHandles.Lookup lookup = MethodHandles.lookup();  
        MethodType constructorType = MethodType.methodType(void.class);  
        constructorHandle = lookup.findConstructor(Test.class, constructorType);  
        iSetter = lookup.findSetter(Test.class, "integer", int.class);  
        iGetter = lookup.findGetter(Test.class, "integer", int.class);  
    }catch (Throwable e) {  
        throw new UnknownError();  
    }  
}

@Benchmark  
public void testMethodHandleReflection(Blackhole bh) {  
    try{  
        for(int i = 0; i < size; i++) {  
            Test test = (Test) constructorHandle.invokeExact();  
            bh.consume(test);  
            iSetter.invokeExact(test, i);  
            int integer = (int) iGetter.invokeExact(test);  
            bh.consume(integer);  
        }  
    }catch (Throwable e) {  
        throw new UnknownError();  
    }  
}

This yields the following numbers:


ReflectionTest.testMethodHandleReflection      10  avgt   50       9.825 ±    0.084  ns/op
ReflectionTest.testMethodHandleReflection     100  avgt   50      99.174 ±    1.128  ns/op
ReflectionTest.testMethodHandleReflection    1000  avgt   50     997.094 ±   11.961  ns/op
ReflectionTest.testMethodHandleReflection   10000  avgt   50   10212.014 ±  215.662  ns/op

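Suddenly the reflective calls perform the same as direct calls. Does that mean we have achieved what we wanted? Not quite. If every handle we need must be declared in a static final field, we give up most of the flexibility that made reflection attractive in the first place, so this approach is rarely practical.

We can run the same kind of experiment with ordinary reflection: build the Constructor and Method objects from java.lang.reflect first, cache them in fields, and measure again: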
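The flexibility trade-off can be sketched with a small standalone example. This is only an illustration under assumptions: the class name HandleFlexibilityDemo, its two methods, and the choice of String.length() and String.hashCode() as targets are hypothetical and not part of the benchmark. The first path fixes the target in a static final field, which is the shape the JIT can constant-fold. The second resolves the target by name at runtime and keeps it in a map, which is more flexible but leaves the handle as an ordinary, non-constant value.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HandleFlexibilityDemo {
    // Fixed when the class is initialized: the target must be known up front.
    private static final MethodHandle LENGTH;

    static {
        try {
            LENGTH = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Handles resolved on demand; each one is a plain map value, not a constant.
    private static final Map<String, MethodHandle> BY_NAME = new ConcurrentHashMap<>();

    // Calls String.length() through the static final handle.
    public static int lengthViaConstant(String s) {
        try {
            return (int) LENGTH.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    // Calls any no-argument, int-returning String method chosen by name at runtime.
    public static int callIntMethod(String methodName, String target) {
        MethodHandle handle = BY_NAME.computeIfAbsent(methodName, name -> {
            try {
                return MethodHandles.lookup()
                        .findVirtual(String.class, name, MethodType.methodType(int.class));
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException(e);
            }
        });
        try {
            return (int) handle.invokeExact(target);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

Both paths return the same values; the difference lies only in when the target is chosen and in whether the JIT is able to see the handle as a constant.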


private Constructor<Test> c;  
private Method setter;  
private Method getter;

// Requires: import org.openjdk.jmh.annotations.Setup;
@Setup
public void setup() {
    try{
        this.c = Test.class.getDeclaredConstructor();
        this.setter = Test.class.getDeclaredMethod("setInteger", int.class);
        this.getter = Test.class.getDeclaredMethod("getInteger");
    }catch (Throwable e) {
        throw new UnknownError();
    }
}

@Benchmark  
public void testNormalReflection(Blackhole bh) {  
    try{  
        for(int i = 0; i < size; i++) {  
            Test test = c.newInstance();  
            bh.consume(test);  
            setter.invoke(test, i);  
            int integer = (int) getter.invoke(test);  
            bh.consume(integer);  
        }  
    }catch (Throwable e) {  
        throw new UnknownError();  
    }  
}

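Unlike the MethodHandle test, where the handles were declared static final, here the reflective objects are plain private instance fields initialized in the @Setup method provided by JMH. This more closely simulates objects that are created at runtime. The results are as follows: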


ReflectionTest.testNormalReflection      10  avgt   50     152.242 ±    5.625  ns/op
ReflectionTest.testNormalReflection     100  avgt   50    1495.302 ±   21.467  ns/op
ReflectionTest.testNormalReflection    1000  avgt   50   16917.774 ±  420.810  ns/op
ReflectionTest.testNormalReflection   10000  avgt   50  143252.377 ± 2150.908  ns/op

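As the numbers show, ordinary reflection performs about the same whether we fetch fresh Constructor and Method objects on every invocation or cache them ahead of time. This is also why the general-purpose reflection API behaves reasonably well across a wide range of everyday scenarios.

Given these results, the only way to push reflective performance further appears to be class generation: produce, up front, a class that holds the MethodHandle in a static final field and let the JVM inline it automatically. Class generation can certainly deliver very good performance, but the code needed to generate classes with ByteBuddy or ASM is comparatively verbose. JEP 457: Class-File API (Preview) (openjdk.org/jeps/457) should make class generation on the JVM much simpler, but it is still in preview, so we cannot rely on it yet.

Solution

Lambda expressions are everywhere in day-to-day Java development, and their performance cannot be bad, otherwise the JDK itself would not use them so heavily. The way they are produced is not complicated either: at the core, LambdaMetafactory.metafactory() generates the corresponding call site. We can write the following code to convert a constructor, a getter, and a setter into lambda-style function objects: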


// Additional imports: java.lang.invoke.CallSite, java.lang.invoke.LambdaMetafactory,
// java.util.function.BiConsumer, java.util.function.Function, java.util.function.Supplier,
// org.openjdk.jmh.annotations.Setup
private Supplier<Test> constructor;
private BiConsumer<Object, Object> setConsumer;
private Function<Test, Integer> getFunction;

@Setup
public void setup() throws Throwable {
    MethodHandles.Lookup lookup = MethodHandles.privateLookupIn(ReflectionTest.class, MethodHandles.lookup());
    this.constructor = lambdaGenerateConstructor(lookup);
    this.setConsumer = lambdaGenerateSetter(lookup);
    this.getFunction = lambdaGenerateGetter(lookup);
}

@SuppressWarnings("unchecked")  
private Supplier<Test> lambdaGenerateConstructor(MethodHandles.Lookup lookup) throws Throwable {  
    MethodHandle cmh = lookup.findConstructor(Test.class, MethodType.methodType(void.class));  
    CallSite c1 = LambdaMetafactory.metafactory(lookup,  
            "get",  
            MethodType.methodType(Supplier.class),  
            MethodType.methodType(Object.class), cmh, MethodType.methodType(Test.class));  
    return (Supplier<Test>) c1.getTarget().invokeExact();  
}  
  
@SuppressWarnings("unchecked")  
private BiConsumer<Object, Object> lambdaGenerateSetter(MethodHandles.Lookup lookup) throws Throwable {  
    MethodHandle setHandle = lookup.findVirtual(Test.class, "setInteger", MethodType.methodType(void.class, int.class));  
    CallSite callSite = LambdaMetafactory.metafactory(lookup,  
            "accept",  
            MethodType.methodType(BiConsumer.class),  
            MethodType.methodType(void.class, Object.class, Object.class),  
            setHandle,  
            MethodType.methodType(void.class, Test.class, Integer.class));  
    return (BiConsumer<Object, Object>) callSite.getTarget().invokeExact();  
}  
  
@SuppressWarnings("unchecked")  
private Function<Test, Integer> lambdaGenerateGetter(MethodHandles.Lookup lookup) throws Throwable {  
    MethodHandle getHandle = lookup.findVirtual(Test.class, "getInteger", MethodType.methodType(int.class));  
    CallSite getSite = LambdaMetafactory.metafactory(  
            lookup,  
            "apply",  
            MethodType.methodType(Function.class),  
            MethodType.methodType(Object.class, Object.class),  
            getHandle,  
            MethodType.methodType(Integer.class, Test.class)  
    );  
    return (Function<Test, Integer>) getSite.getTarget().invokeExact();  
}

@Benchmark  
public void testLambda(Blackhole bh) {  
    for(int i = 0; i < size; i++) {  
        Test test = constructor.get();  
        bh.consume(test);  
        setConsumer.accept(test, i);  
        int integer = getFunction.apply(test);  
        bh.consume(integer);  
    }  
}  
  
@Benchmark  
public void testLambdaGeneration(Blackhole bh) throws Throwable {  
    MethodHandles.Lookup lookup = MethodHandles.privateLookupIn(ReflectionTest.class, MethodHandles.lookup());  
    bh.consume(lambdaGenerateConstructor(lookup));  
    bh.consume(lambdaGenerateSetter(lookup));  
    bh.consume(lambdaGenerateGetter(lookup));  
}

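The test has two parts: one measures how long it takes to generate the lambda objects, the other measures how fast they run once generated. Both figures matter to us. The results are as follows: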


ReflectionTest.testLambdaGeneration   10000  avgt   50   92486.909 ±  62638.147  ns/op


Benchmark                  (size)  Mode  Cnt      Score     Error  Units
ReflectionTest.testLambda      10  avgt   50     10.720 ±   0.087  ns/op
ReflectionTest.testLambda     100  avgt   50    105.001 ±   1.312  ns/op
ReflectionTest.testLambda    1000  avgt   50   1020.406 ±   9.990  ns/op
ReflectionTest.testLambda   10000  avgt   50  10198.842 ± 143.259  ns/op

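As the numbers show, when we mimic the way lambda expressions are generated, calling the constructor, getter, and setter performs almost exactly like a direct call, which is the effect we were after. The generation cost, however, is far from encouraging: about 92 µs per invocation to generate the three function objects in the run above, with a very large error margin, which is worlds apart from writing a lambda expression directly in source code. Fortunately, if the lambda captures no external variables, as is the case for the getter and setter in our example, the generated object can be cached and reused. When the number of calls is large, the one-time initialization cost becomes negligible once it is amortized over them.

Summary

This article described a new way to make reflective calls in Java: generating function objects the same way lambda expressions are generated. Simple methods such as getters and setters can be converted directly into the corresponding lambda-style objects and invoked through them. This achieves the same performance as a direct call, but the generation cost is relatively high, so the generated objects need to be cached in frequently invoked code paths for the approach to pay off.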
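One way the caching idea could look is sketched below. It is a minimal illustration rather than the author's implementation: the GetterCache class, its getterFor method, and the Point example type are hypothetical names, and the sketch assumes the getters are public, take no arguments, and belong to classes visible to GetterCache. Each getter is converted once through LambdaMetafactory and then served from a ConcurrentHashMap keyed by its Method.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

public class GetterCache {
    // Hypothetical example type with one public getter.
    public static class Point {
        private final int x;
        public Point(int x) { this.x = x; }
        public int getX() { return x; }
    }

    // One generated Function per getter; generation happens at most once per key.
    private static final Map<Method, Function<Object, Object>> CACHE = new ConcurrentHashMap<>();

    // Returns a cached Function that invokes the named public no-arg getter.
    public static Function<Object, Object> getterFor(Class<?> owner, String getterName) {
        try {
            Method getter = owner.getMethod(getterName);
            return CACHE.computeIfAbsent(getter, GetterCache::generate);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    @SuppressWarnings("unchecked")
    private static Function<Object, Object> generate(Method getter) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodHandle handle = lookup.unreflect(getter);
            // handle.type() is (Owner)ReturnType; wrap() replaces a primitive
            // return type with its wrapper so the result fits Function.apply.
            MethodType dynamicType = handle.type().wrap();
            CallSite site = LambdaMetafactory.metafactory(
                    lookup,
                    "apply",
                    MethodType.methodType(Function.class),
                    MethodType.methodType(Object.class, Object.class),
                    handle,
                    dynamicType);
            return (Function<Object, Object>) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException("Cannot generate accessor for " + getter, t);
        }
    }
}
```

A caller would pay the generation cost on the first `GetterCache.getterFor(GetterCache.Point.class, "getX")` and receive the same Function instance on every later request, so only the first lookup per getter goes through LambdaMetafactory.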