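4.4.3.3. The X86TargetLowering Subobject
On line 314 of the X86Subtarget constructor, the X86TargetLowering constructor is called next to build TLInfo, the subobject of that type inside X86Subtarget.
This TargetLowering subclass is used by the SelectionDAG-based instruction selector to describe how LLVM code is lowered to SelectionDAG operations. Among other things, the class exposes:
- an initial register class to use for each ValueType,
- which operations the target natively supports,
- the return type of setcc operations,
- the type to use for shift amounts, and
- various high-level characteristics, such as whether it is profitable to turn division by a constant into a sequence of multiplications.
4.4.3.3.1. TargetLowering
First, look at the constructor of the base class TargetLowering.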
TargetLowering::TargetLowering(const TargetMachine &tm)
    : TargetLoweringBase(tm) {}
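The TargetLoweringBase constructor is defined as follows. It provides baseline settings for all targets; each target can then reset the relevant parameters in the constructor of its own TargetLowering subclass.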
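The "baseline in the base class, overrides in the subclass" arrangement can be sketched as below. This is a minimal illustration rather than LLVM code: LoweringBaseSketch and ToyTargetLowering are hypothetical names, only three of the members are modeled, and the override values are made up.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for TargetLoweringBase.
// Only the constructor pattern is illustrated.
struct LoweringBaseSketch {
  unsigned MaxStoresPerMemset;
  unsigned MaxStoresPerMemcpy;
  bool PredictableSelectIsExpensive;

  // Baseline values shared by every target.
  LoweringBaseSketch()
      : MaxStoresPerMemset(8), MaxStoresPerMemcpy(8),
        PredictableSelectIsExpensive(false) {}
};

// Hypothetical target-specific subclass.
struct ToyTargetLowering : LoweringBaseSketch {
  ToyTargetLowering() {
    // The base constructor has already run, so only the settings that
    // differ from the baseline need to be touched (values are made up).
    MaxStoresPerMemset = 16;
    PredictableSelectIsExpensive = true;
  }
};
```

Anything the subclass does not mention, such as MaxStoresPerMemcpy here, keeps the base-class default.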
532TargetLoweringBase::TargetLoweringBase(const TargetMachine &tm) : TM(tm) {
533 initActions();
534
535 // Perform these initializations only once.
536 MaxStoresPerMemset = MaxStoresPerMemcpy = MaxStoresPerMemmove =
537 MaxLoadsPerMemcmp = 8;
538 MaxGluedStoresPerMemcpy = 0;
539 MaxStoresPerMemsetOptSize = MaxStoresPerMemcpyOptSize
540 = MaxStoresPerMemmoveOptSize = MaxLoadsPerMemcmpOptSize = 4;
541 UseUnderscoreSetJmp = false;
542 UseUnderscoreLongJmp = false;
543 HasMultipleConditionRegisters = false;
544 HasExtractBitsInsn = false;
545 JumpIsExpensive = JumpIsExpensiveOverride;
546 PredictableSelectIsExpensive = false;
547 EnableExtLdPromotion = false;
548 HasFloatingPointExceptions = true;
549 StackPointerRegisterToSaveRestore = 0;
550 BooleanContents = UndefinedBooleanContent;
551 BooleanFloatContents = UndefinedBooleanContent;
552 BooleanVectorContents = UndefinedBooleanContent;
553 SchedPreferenceInfo = Sched::ILP;
554 JumpBufSize = 0;
555 JumpBufAlignment = 0;
556 MinFunctionAlignment = 0;
557 PrefFunctionAlignment = 0;
558 PrefLoopAlignment = 0;
559 GatherAllAliasesMaxDepth = 18;
560 MinStackArgumentAlignment = 1;
561 // TODO: the default will be switched to 0 in the next commit, along
562 // with the Target-specific changes necessary.
563 MaxAtomicSizeInBitsSupported = 1024;
564
565 MinCmpXchgSizeInBits = 0;
566 SupportsUnalignedAtomics = false;
567
568 std::fill(std::begin(LibcallRoutineNames ), std::end(LibcallRoutineNames), nullptr);
569
570 InitLibcalls(TM.getTargetTriple());
571 InitCmpLibcallCCs(CmpLibcallCCs);
572 }
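On line 533, initActions() is called to initialize the various action tables. As the code below shows, these are OpActions, LoadExtActions, TruncStoreActions, IndexedModeActions, and CondCodeActions. They are all arrays of integer type whose elements hold LegalizeAction enumeration values. Such an enumeration indicates whether a given operation is legal on a target and, if it is not, what action should be taken to make it legal: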
namespace LegalizeActions {
enum LegalizeAction : std::uint8_t {
  /// The operation is expected to be selectable directly by the target, and
  /// no transformation is necessary.
  Legal,

  /// The operation should be synthesized from multiple instructions acting on
  /// a narrower scalar base-type. For example a 64-bit add might be
  /// implemented in terms of 32-bit add-with-carry.
  NarrowScalar,

  /// The operation should be implemented in terms of a wider scalar
  /// base-type. For example a <2 x s8> add could be implemented as a <2
  /// x s32> add (ignoring the high bits).
  WidenScalar,

  /// The (vector) operation should be implemented by splitting it into
  /// sub-vectors where the operation is legal. For example a <8 x s64> add
  /// might be implemented as 4 separate <2 x s64> adds.
  FewerElements,

  /// The (vector) operation should be implemented by widening the input
  /// vector and ignoring the lanes added by doing so. For example <2 x i8> is
  /// rarely legal, but you might perform an <8 x i8> and then only look at
  /// the first two results.
  MoreElements,

  /// The operation itself must be expressed in terms of simpler actions on
  /// this target. E.g. a SREM replaced by an SDIV and subtraction.
  Lower,

  /// The operation should be implemented as a call to some kind of runtime
  /// support library. For example this usually happens on machines that don't
  /// support floating-point operations natively.
  Libcall,

  /// The target wants to do something special with this combination of
  /// operand and type. A callback will be issued when it is needed.
  Custom,

  /// This operation is completely unsupported on the target. A programming
  /// error has occurred.
  Unsupported,

  /// Sentinel value for when no action was found in the specified table.
  NotFound,

  /// Fall back onto the old rules.
  /// TODO: Remove this once we've migrated
  UseLegacyRules,
};
} // end namespace LegalizeActions
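Note that the listing above is the GlobalISel variant of the enumeration. The arrays discussed here belong to TargetLoweringBase, whose own LegalizeAction enumeration has the values Legal, Promote, Expand, LibCall, and Custom; that is where the Expand used by initActions() comes from. With that in mind, the arrays are defined as follows:
- LegalizeAction OpActions[MVT::LAST_VALUETYPE][ISD::BUILTIN_OP_END]
For each operation and each value type, this holds a LegalizeAction that tells instruction selection how to handle the operation. Most operations are legal (natively supported by the target), but those that are not must be described. Operations on illegal value types are not considered here.
- uint16_t LoadExtActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]
For each load-extension type and each value type, this holds a LegalizeAction that tells instruction selection how to handle a load involving the given value type and its extended type. Each extension type takes 4 bits, so the actions for the 4 load-extension types are packed together into one uint16_t.
- LegalizeAction TruncStoreActions[MVT::LAST_VALUETYPE][MVT::LAST_VALUETYPE]
For each pair of value types, this holds a LegalizeAction that indicates whether a truncating store from the given value type to the given truncated type is legal.
- uint8_t IndexedModeActions[MVT::LAST_VALUETYPE][ISD::LAST_INDEXED_MODE]
Here ISD::LAST_INDEXED_MODE is the number of memory-address indexed modes. For each indexed mode and each value type, this holds a pair of LegalizeAction values that tell instruction selection how to handle stores and loads. The first dimension is the value type being accessed; the second dimension is the indexed mode of the load or store.
- uint32_t CondCodeActions[ISD::SETCC_INVALID][(MVT::LAST_VALUETYPE + 7) / 8]
Here ISD::SETCC_INVALID is the number of condition codes. For each condition code (ISD::CondCode) and each value type, this holds a LegalizeAction that tells instruction selection how to handle that condition code. Each action takes 4 bits, so one uint32_t covers 8 value types, which explains the (MVT::LAST_VALUETYPE + 7) / 8 in the second dimension.
- In addition, TargetDAGCombineArray is one more array. Its type is:
unsigned char TargetDAGCombineArray[(ISD::BUILTIN_OP_END+CHAR_BIT-1)/CHAR_BIT]
It is a bitmap with one bit per SelectionDAG (ISD) opcode. A bit set to 1 means that the target wants its PerformDAGCombine() callback to be invoked for that opcode during DAG combining.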
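The 4-bit packing used by tables such as CondCodeActions can be sketched as follows. This is a standalone illustration, not the LLVM implementation: PackedActions, ToyAction, and the table sizes are hypothetical, and the enumerator values simply mirror the order Legal, Promote, Expand, LibCall, Custom.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical action values; 0 must mean "Legal" so that a zeroed table
// reads as "everything is legal".
enum ToyAction : std::uint8_t {
  Legal = 0, Promote = 1, Expand = 2, LibCall = 3, Custom = 4
};

constexpr unsigned NumTypes = 20;     // illustrative number of value types
constexpr unsigned NumCondCodes = 4;  // illustrative number of condition codes

// One 32-bit word stores the actions of eight value types, 4 bits each.
struct PackedActions {
  std::uint32_t Words[NumCondCodes][(NumTypes + 7) / 8] = {};

  void set(unsigned CC, unsigned Ty, ToyAction A) {
    assert(CC < NumCondCodes && Ty < NumTypes);
    unsigned Shift = 4 * (Ty & 0x7);         // position inside the word
    std::uint32_t &W = Words[CC][Ty >> 3];   // which word holds this type
    W &= ~(std::uint32_t(0xF) << Shift);     // clear the old 4-bit field
    W |= std::uint32_t(A) << Shift;          // store the new action
  }

  ToyAction get(unsigned CC, unsigned Ty) const {
    assert(CC < NumCondCodes && Ty < NumTypes);
    unsigned Shift = 4 * (Ty & 0x7);
    return static_cast<ToyAction>((Words[CC][Ty >> 3] >> Shift) & 0xF);
  }
};
```

Four bits per entry leave room for up to 16 distinct actions while keeping the whole table small enough to clear with a single memset.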
574 void TargetLoweringBase::initActions() {
575 // All operations default to being supported.
576 memset(OpActions, 0, sizeof(OpActions));
577 memset(LoadExtActions, 0, sizeof(LoadExtActions));
578 memset(TruncStoreActions, 0, sizeof(TruncStoreActions));
579 memset(IndexedModeActions, 0, sizeof(IndexedModeActions));
580 memset(CondCodeActions, 0, sizeof(CondCodeActions));
581 std::fill(std::begin(RegClassForVT), std::end(RegClassForVT), nullptr);
582 std::fill(std::begin(TargetDAGCombineArray),
583 std::end(TargetDAGCombineArray), 0);
584
585 // Set default actions for various operations.
586 for (MVT VT : MVT::all_valuetypes()) {
587 // Default all indexed load / store to expand.
588 for (unsigned IM = (unsigned)ISD::PRE_INC;
589 IM != (unsigned)ISD::LAST_INDEXED_MODE; ++IM) {
590 setIndexedLoadAction(IM, VT, Expand);
591 setIndexedStoreAction(IM, VT, Expand);
592 }
593
594 // Most backends expect to see the node which just returns the value loaded.
595 setOperationAction(ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, VT, Expand);
596
597 // These operations default to expand.
598 setOperationAction(ISD::FGETSIGN, VT, Expand);
599 setOperationAction(ISD::CONCAT_VECTORS, VT, Expand);
600 setOperationAction(ISD::FMINNUM, VT, Expand);
601 setOperationAction(ISD::FMAXNUM, VT, Expand);
602 setOperationAction(ISD::FMINNAN, VT, Expand);
603 setOperationAction(ISD::FMAXNAN, VT, Expand);
604 setOperationAction(ISD::FMAD, VT, Expand);
605 setOperationAction(ISD::SMIN, VT, Expand);
606 setOperationAction(ISD::SMAX, VT, Expand);
607 setOperationAction(ISD::UMIN, VT, Expand);
608 setOperationAction(ISD::UMAX, VT, Expand);
609 setOperationAction(ISD::ABS, VT, Expand);
610
611 // Overflow operations default to expand
612 setOperationAction(ISD::SADDO, VT, Expand);
613 setOperationAction(ISD::SSUBO, VT, Expand);
614 setOperationAction(ISD::UADDO, VT, Expand);
615 setOperationAction(ISD::USUBO, VT, Expand);
616 setOperationAction(ISD::SMULO, VT, Expand);
617 setOperationAction(ISD::UMULO, VT, Expand);
618
619 // ADDCARRY operations default to expand
620 setOperationAction(ISD::ADDCARRY, VT, Expand);
621 setOperationAction(ISD::SUBCARRY, VT, Expand);
622 setOperationAction(ISD::SETCCCARRY, VT, Expand);
623
624 // ADDC/ADDE/SUBC/SUBE default to expand.
625 setOperationAction(ISD::ADDC, VT, Expand);
626 setOperationAction(ISD::ADDE, VT, Expand);
627 setOperationAction(ISD::SUBC, VT, Expand);
628 setOperationAction(ISD::SUBE, VT, Expand);
629
630 // These default to Expand so they will be expanded to CTLZ/CTTZ by default.
631 setOperationAction(ISD::CTLZ_ZERO_UNDEF, VT, Expand);
632 setOperationAction(ISD::CTTZ_ZERO_UNDEF, VT, Expand);
633
634 setOperationAction(ISD::BITREVERSE, VT, Expand);
635
636 // These library functions default to expand.
637 setOperationAction(ISD::FROUND, VT, Expand);
638 setOperationAction(ISD::FPOWI, VT, Expand);
639
640 // These operations default to expand for vector types.
641 if (VT.isVector()) {
642 setOperationAction(ISD::FCOPYSIGN, VT, Expand);
643 setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);
644 setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);
645 setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);
646 }
647
648 // For most targets @llvm.get.dynamic.area.offset just returns 0.
649 setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);
650 }
651
652 // Most targets ignore the @llvm.prefetch intrinsic.
653 setOperationAction(ISD::PREFETCH, MVT::Other, Expand);
654
655 // Most targets also ignore the @llvm.readcyclecounter intrinsic.
656 setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);
657
658 // ConstantFP nodes default to expand. Targets can either change this to
659 // Legal, in which case all fp constants are legal, or use isFPImmLegal()
660 // to optimize expansions for certain constants.
661 setOperationAction(ISD::ConstantFP, MVT::f16, Expand);
662 setOperationAction(ISD::ConstantFP, MVT::f32, Expand);
663 setOperationAction(ISD::ConstantFP, MVT::f64, Expand);
664 setOperationAction(ISD::ConstantFP, MVT::f80, Expand);
665 setOperationAction(ISD::ConstantFP, MVT::f128, Expand);
666
667 // These library functions default to expand.
668 for (MVT VT : {MVT::f32, MVT::f64, MVT::f128}) {
669 setOperationAction(ISD::FLOG , VT, Expand);
670 setOperationAction(ISD::FLOG2, VT, Expand);
671 setOperationAction(ISD::FLOG10, VT, Expand);
672 setOperationAction(ISD::FEXP , VT, Expand);
673 setOperationAction(ISD::FEXP2, VT, Expand);
674 setOperationAction(ISD::FFLOOR, VT, Expand);
675 setOperationAction(ISD::FNEARBYINT, VT, Expand);
676 setOperationAction(ISD::FCEIL, VT, Expand);
677 setOperationAction(ISD::FRINT, VT, Expand);
678 setOperationAction(ISD::FTRUNC, VT, Expand);
679 setOperationAction(ISD::FROUND, VT, Expand);
680 }
681
682 // Default ISD::TRAP to expand (which turns it into abort).
683 setOperationAction(ISD::TRAP, MVT::Other, Expand);
684
685 // On most systems, DEBUGTRAP and TRAP have no difference. The "Expand"
686 // here is to inform DAG Legalizer to replace DEBUGTRAP with TRAP.
687 setOperationAction(ISD::DEBUGTRAP, MVT::Other, Expand);
688 }
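Lines 576-580 zero all of the action tables. Since Legal is the first enumerator and therefore has the value 0, every action starts out as Legal. Lines 582-583 clear the TargetDAGCombineArray bitmap, so no operation requests the PerformDAGCombine callback. The code that follows sets individual operations to Expand; as shown later, the X86 subclass then applies its own overrides.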
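The following sketch shows why zeroing the tables is enough to make everything Legal, and how a one-bit-per-opcode bitmap can record which opcodes want a combine callback. ToyLowering, ToyOpcode, and ToyType are hypothetical miniatures, not LLVM types.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Hypothetical miniature enums; Legal must be 0 for the memset trick to work.
enum ToyAction : unsigned char { Legal = 0, Promote, Expand, LibCall, Custom };
enum ToyOpcode { OP_ADD, OP_FMAD, OP_ABS, OP_COUNT };
enum ToyType { TY_I32, TY_F32, TY_COUNT };

struct ToyLowering {
  ToyAction OpActions[TY_COUNT][OP_COUNT];
  // One bit per opcode, rounded up to whole bytes.
  unsigned char CombineBits[(OP_COUNT + CHAR_BIT - 1) / CHAR_BIT];

  ToyLowering() {
    // All-zero bytes mean "Legal" everywhere and "no combine callback".
    std::memset(OpActions, 0, sizeof(OpActions));
    std::memset(CombineBits, 0, sizeof(CombineBits));
    // Selected operations then default to Expand for every type.
    for (int Ty = 0; Ty != TY_COUNT; ++Ty) {
      OpActions[Ty][OP_FMAD] = Expand;
      OpActions[Ty][OP_ABS] = Expand;
    }
  }

  void setTargetDAGCombine(ToyOpcode Op) {
    CombineBits[Op / CHAR_BIT] |= 1 << (Op % CHAR_BIT);
  }

  bool hasTargetDAGCombine(ToyOpcode Op) const {
    return (CombineBits[Op / CHAR_BIT] & (1 << (Op % CHAR_BIT))) != 0;
  }
};
```

A target subclass would call something like setTargetDAGCombine() from its own constructor for each opcode it wants to inspect during DAG combining.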
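After initActions() returns, the TargetLoweringBase constructor goes on to initialize the following member parameters.
- MaxStoresPerMemset
- MaxLoadsPerMemcmp
- MaxStoresPerMemcpy
- MaxStoresPerMemmove
When lowering @llvm.memset/@llvm.memcpy/@llvm.memmove, these fields give the maximum number of store operations that may be substituted for the memset/memcpy/memmove call; MaxLoadsPerMemcmp plays the same role for the loads used to expand memcmp. Targets must set these values based on their cost threshold. A target should assume that, subject to alignment restrictions, as many of the largest store operations as possible are used first, followed by smaller ones if necessary. For example, storing 9 bytes on a 32-bit machine with 16-bit alignment results in four 2-byte stores and one 1-byte store. This only applies to setting a constant array of a constant size.
- MaxStoresPerMemcpyOptSize
- MaxLoadsPerMemcmpOptSize
- MaxStoresPerMemmoveOptSize
The corresponding maximum numbers of stores (or loads, for memcmp) used to replace these calls in functions that carry the OptSize attribute.
- UseUnderscoreSetJmp, UseUnderscoreLongJmp
Indicate whether _setjmp or _longjmp should be used to implement llvm.setjmp or llvm.longjmp.
- MaxGluedStoresPerMemcpy
When memcpy is inlined under the MaxStoresPerMemcpy limit, this specifies the maximum number of store instructions to keep together. This helps later pairing and vectorization.
- HasMultipleConditionRegisters
Tells the code generator whether the target has multiple (allocatable) condition registers for holding comparison results. If it does, the code generator will not aggressively sink comparisons into the basic blocks of their users.
- HasExtractBitsInsn
Tells the code generator whether the target has BitExtract instructions. If it does, the code generator will aggressively sink shifts into the basic blocks of their users when those users produce and instructions that can be combined with the shift into a BitExtract instruction.
- JumpIsExpensive
Tells the code generator not to emit extra control-flow instructions and to try to combine control-flow instructions via predication instead.
- PredictableSelectIsExpensive
Tells the code generator that a select is more expensive than a branch if the branch is usually predicted correctly.
- EnableExtLdPromotion
Indicates whether the target wants the optimization that turns ext(promotableInst1(...(promotableInstN(load)))) into promotedInst1(...(promotedInstN(ext(load)))).
- HasFloatingPointExceptions
Indicates whether the target supports or cares about preserving floating-point exception behavior.
- StackPointerRegisterToSaveRestore
If set to a physical register, this specifies the register that llvm.savestack and llvm.restorestack should save and restore.
- BooleanContents
- BooleanFloatContents
- BooleanVectorContents
All three are of the BooleanContent enumeration type, which is defined as follows: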
enum BooleanContent {
  UndefinedBooleanContent,        // Only bit 0 counts, the rest can hold garbage.
  ZeroOrOneBooleanContent,        // All bits zero except for bit 0.
  ZeroOrNegativeOneBooleanContent // All bits equal to bit 0.
};
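Each of them describes the contents of the high bits of a Boolean value that is held in a type wider than i1.
- SchedPreferenceInfo
Describes the target's scheduling preference, typically aimed at either the shortest total cycle count or the lowest register pressure. Its type is Sched::Preference, an enumeration that lists the scheduler kinds LLVM currently supports.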
enum Preference {
  None,        // No preference
  Source,      // Follow source order.
  RegPressure, // Scheduling for lowest register pressure.
  Hybrid,      // Scheduling for both latency and register pressure.
  ILP,         // Scheduling for ILP in low register pressure mode.
  VLIW         // Scheduling for VLIW targets.
};
} // end namespace Sched
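- JumpBufSize
- JumpBufAlignment
The size in bytes and the alignment requirement of the target's jmp_buf buffer.
- MinFunctionAlignment
- PrefFunctionAlignment
- PrefLoopAlignment
Respectively: the minimum function alignment (used when optimizing for size, and to prevent an explicitly provided alignment from leading to incorrect code), the preferred function alignment (used when no alignment is specified and the function is optimized for speed), and the preferred loop alignment.
- MinStackArgumentAlignment
The minimum alignment required for any argument on the stack.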
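The 9-byte example given for MaxStoresPerMemset can be reproduced with a small greedy planner. This is a simplified sketch under stated assumptions: planStores is a hypothetical helper, the alignment is assumed to be a power of two that also bounds the widest usable store, and the real lowering takes further target-specific factors into account.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: plans the stores needed to set Size bytes when the
// destination is aligned to Align bytes (assumed to be a power of two >= 1).
// Returns false if more than MaxStores stores would be needed, in which
// case a real code generator would keep the library call instead.
bool planStores(std::size_t Size, unsigned Align, unsigned MaxStores,
                std::vector<unsigned> &Widths) {
  Widths.clear();
  unsigned Width = Align;             // widest store the alignment permits
  while (Size != 0) {
    while (Width > Size)              // fall back to narrower stores
      Width /= 2;
    if (Widths.size() == MaxStores)   // budget exhausted
      return false;
    Widths.push_back(Width);
    Size -= Width;
  }
  return true;
}
```

With Size = 9 and Align = 2 the plan is four 2-byte stores followed by one 1-byte store, five in total, which fits under the default limit of 8 but not under the OptSize limit of 4.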
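The container LibcallRoutineNames on line 568 is defined as: const char *LibcallRoutineNames[RTLIB::UNKNOWN_LIBCALL], where RTLIB::UNKNOWN_LIBCALL is the number of runtime library calls the backend can emit. These library functions are described by the RTLIB::Libcall enumeration. The table is filled in from a definition file by the following method: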
void TargetLoweringBase::InitLibcalls(const Triple &TT) {
#define HANDLE_LIBCALL(code, name) \
  setLibcallName(RTLIB::code, name);
#include "llvm/IR/RuntimeLibcalls.def"
#undef HANDLE_LIBCALL
  // Initialize calling conventions to their default.
  for (int LC = 0; LC < RTLIB::UNKNOWN_LIBCALL; ++LC)
    setLibcallCallingConv((RTLIB::Libcall)LC, CallingConv::C);

  // A few names are different on particular architectures or environments.
  if (TT.isOSDarwin()) {
    // For f16/f32 conversions, Darwin uses the standard naming scheme, instead
    // of the gnueabi-style _gnu*_ieee.
    // FIXME: What about other targets?
    setLibcallName(RTLIB::FPEXT_F16_F32, "__extendhfsf2");
    setLibcallName(RTLIB::FPROUND_F32_F16, "__truncsfhf2");

    // Some darwins have an optimized __bzero/bzero function.
    switch (TT.getArch()) {
    case Triple::x86:
    case Triple::x86_64:
      if (TT.isMacOSX() && !TT.isMacOSXVersionLT(10, 6))
        setLibcallName(RTLIB::BZERO, "__bzero");
      break;
    case Triple::aarch64:
      setLibcallName(RTLIB::BZERO, "bzero");
      break;
    default:
      break;
    }

    if (darwinHasSinCos(TT)) {
      setLibcallName(RTLIB::SINCOS_STRET_F32, "__sincosf_stret");
      setLibcallName(RTLIB::SINCOS_STRET_F64, "__sincos_stret");
      if (TT.isWatchABI()) {
        setLibcallCallingConv(RTLIB::SINCOS_STRET_F32,
                              CallingConv::ARM_AAPCS_VFP);
        setLibcallCallingConv(RTLIB::SINCOS_STRET_F64,
                              CallingConv::ARM_AAPCS_VFP);
      }
    }
  } else {
    setLibcallName(RTLIB::FPEXT_F16_F32, "__gnu_h2f_ieee");
    setLibcallName(RTLIB::FPROUND_F32_F16, "__gnu_f2h_ieee");
  }

  if (TT.isGNUEnvironment() || TT.isOSFuchsia()) {
    setLibcallName(RTLIB::SINCOS_F32, "sincosf");
    setLibcallName(RTLIB::SINCOS_F64, "sincos");
    setLibcallName(RTLIB::SINCOS_F80, "sincosl");
    setLibcallName(RTLIB::SINCOS_F128, "sincosl");
    setLibcallName(RTLIB::SINCOS_PPCF128, "sincosl");
  }

  if (TT.isOSOpenBSD()) {
    setLibcallName(RTLIB::STACKPROTECTOR_CHECK_FAIL, nullptr);
  }
}
The definition file RuntimeLibcalls.def lists the runtime library calls that the backend can generate. Its entries look like this:
HANDLE_LIBCALL(SHL_I16, "__ashlhi3")
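The macro defined at the top of InitLibcalls() expands each entry into a call that refers to the matching enumerator, here RTLIB::SHL_I16. The enumerators themselves are also generated from the contents of RuntimeLibcalls.def (in RuntimeLibcalls.h). setLibcallName() therefore records each function name using these enumerators as indices.
setLibcallCallingConv() likewise records the calling convention used by each function, indexed by the same enumerators. The default for all of them is CallingConv::C, the LLVM default calling convention, which is compatible with the C calling convention.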
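The technique behind HANDLE_LIBCALL is commonly called an X-macro: one list of entries is expanded several times with different macro definitions. The sketch below keeps the list in a macro instead of a separate .def file so that it fits in one translation unit; TOY_LIBCALLS, the toy namespace, and LibcallTable are hypothetical names, and only three entries are listed.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the contents of a .def file: one line per library call.
#define TOY_LIBCALLS(HANDLE)   \
  HANDLE(SHL_I16, "__ashlhi3") \
  HANDLE(SHL_I32, "__ashlsi3") \
  HANDLE(SHL_I64, "__ashldi3")

namespace toy {

// First expansion: turn every entry into an enumerator.
enum Libcall {
#define AS_ENUM(code, name) code,
  TOY_LIBCALLS(AS_ENUM)
#undef AS_ENUM
  UNKNOWN_LIBCALL // number of entries, also usable as the array size
};

struct LibcallTable {
  const char *Names[UNKNOWN_LIBCALL];

  LibcallTable() {
    for (const char *&N : Names)
      N = nullptr;
    // Second expansion: record every name under its enumerator.
#define AS_NAME(code, name) Names[code] = name;
    TOY_LIBCALLS(AS_NAME)
#undef AS_NAME
  }
};

} // namespace toy
```

Because the enumeration and the table come from the same list, adding an entry keeps both in sync automatically.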
CmpLibcallCCs on line 571 is defined as: ISD::CondCode CmpLibcallCCs[RTLIB::UNKNOWN_LIBCALL]. InitCmpLibcallCCs() thus uses CmpLibcallCCs to associate the comparison functions in RTLIB::Libcall with the ISD::CondCode values that reflect their Boolean results.
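One way to picture this association: a comparison library routine returns an integer, and the recorded condition code says how that integer must be compared with zero to obtain the Boolean answer. The sketch below is an assumption-laden illustration, not LLVM code: the enums are miniatures, toyCompare is a hypothetical three-way helper standing in for every soft-float routine, and NaN handling is ignored.

```cpp
#include <cassert>

// Hypothetical miniature versions of ISD::CondCode and RTLIB::Libcall.
enum ToyCondCode { SETEQ, SETNE, SETLT, SETGT, SETCC_INVALID };
enum ToyLibcall { OEQ_F32, UNE_F32, OLT_F32, OGT_F32, UNKNOWN_LIBCALL };

// Table indexed by library call; non-comparison calls would stay invalid.
struct CmpTable {
  ToyCondCode CCs[UNKNOWN_LIBCALL];

  CmpTable() {
    for (ToyCondCode &CC : CCs)
      CC = SETCC_INVALID;
    CCs[OEQ_F32] = SETEQ; // equal      <=> result == 0
    CCs[UNE_F32] = SETNE; // not equal  <=> result != 0
    CCs[OLT_F32] = SETLT; // less than  <=> result <  0
    CCs[OGT_F32] = SETGT; // greater    <=> result >  0
  }
};

// Assumed stand-in for a runtime comparison routine (NaNs not handled).
int toyCompare(float A, float B) { return A < B ? -1 : (A > B ? 1 : 0); }

// Compares the routine's integer result with zero as the code dictates.
bool applyCC(ToyCondCode CC, int Result) {
  switch (CC) {
  case SETEQ: return Result == 0;
  case SETNE: return Result != 0;
  case SETLT: return Result < 0;
  case SETGT: return Result > 0;
  default:    return false;
  }
}

bool lowerCompare(const CmpTable &T, ToyLibcall LC, float A, float B) {
  return applyCC(T.CCs[LC], toyCompare(A, B));
}
```

Keeping the condition code in a table lets a target whose runtime library uses a different return convention change the entry without touching the lowering logic.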