This is the fourth article in a series on reading the LeakCanary source code. If you haven't read the first three yet, I recommend starting with them:
LeakCanary Source Code Notes (Part 1)
LeakCanary Source Code Notes (Part 2)
LeakCanary Source Code Notes (Part 3)
This article covers how LeakCanary parses the HPROF file. If you are not familiar with the HPROF file format, I strongly recommend first reading my earlier article on it: Android HPROF Memory Snapshot Files Explained.
Before Parsing the HPROF File
Picking up from the previous article: once the HPROF file has been dumped successfully, a HeapDump event is sent to InternalLeakCanary. Let's look at its sendEvent() method:
Kotlin
fun sendEvent(event: Event) {
  for (listener in LeakCanary.config.eventListeners) {
    listener.onEvent(event)
  }
}
As the code shows, each listener registered in LeakCanary.config.eventListeners is invoked in turn.
Kotlin
val eventListeners: List<EventListener> = listOf(
  LogcatEventListener,
  ToastEventListener,
  LazyForwardingEventListener {
    if (InternalLeakCanary.formFactor == TV) TvEventListener else NotificationEventListener
  },
  when {
    RemoteWorkManagerHeapAnalyzer.remoteLeakCanaryServiceInClasspath ->
      RemoteWorkManagerHeapAnalyzer
    WorkManagerHeapAnalyzer.validWorkManagerInClasspath -> WorkManagerHeapAnalyzer
    else -> BackgroundThreadHeapAnalyzer
  }
)
There are quite a few listeners here; the default one that analyzes the HPROF file is BackgroundThreadHeapAnalyzer.
Kotlin
object BackgroundThreadHeapAnalyzer : EventListener {
  internal val heapAnalyzerThreadHandler by lazy {
    val handlerThread = HandlerThread("HeapAnalyzer")
    handlerThread.start()
    Handler(handlerThread.looper)
  }

  override fun onEvent(event: Event) {
    if (event is HeapDump) {
      heapAnalyzerThreadHandler.post {
        val doneEvent = AndroidDebugHeapAnalyzer.runAnalysisBlocking(event) { event ->
          InternalLeakCanary.sendEvent(event)
        }
        InternalLeakCanary.sendEvent(doneEvent)
      }
    }
  }
}
Straightforward code: it simply calls AndroidDebugHeapAnalyzer#runAnalysisBlocking() on the HeapAnalyzer thread to analyze the HPROF file.
Kotlin
fun runAnalysisBlocking(
heapDumped: HeapDump,
isCanceled: () -> Boolean = { false },
progressEventListener: (HeapAnalysisProgress) -> Unit
): HeapAnalysisDone<*> {
// Progress listener
val progressListener = OnAnalysisProgressListener { step ->
val percent = (step.ordinal * 1.0) / OnAnalysisProgressListener.Step.values().size
progressEventListener(HeapAnalysisProgress(heapDumped.uniqueId, step, percent))
}
val heapDumpFile = heapDumped.file
val heapDumpDurationMillis = heapDumped.durationMillis
val heapDumpReason = heapDumped.reason
// Check that the file exists
val heapAnalysis = if (heapDumpFile.exists()) {
// Run the analysis
analyzeHeap(heapDumpFile, progressListener, isCanceled)
} else {
missingFileFailure(heapDumpFile)
}
// Post-process the analysis result
val fullHeapAnalysis = when (heapAnalysis) {
is HeapAnalysisSuccess -> heapAnalysis.copy(
dumpDurationMillis = heapDumpDurationMillis,
metadata = heapAnalysis.metadata + ("Heap dump reason" to heapDumpReason)
)
is HeapAnalysisFailure -> {
val failureCause = heapAnalysis.exception.cause!!
if (failureCause is OutOfMemoryError) {
heapAnalysis.copy(
dumpDurationMillis = heapDumpDurationMillis,
exception = HeapAnalysisException(
RuntimeException(
"""
Not enough memory to analyze heap. You can:
- Kill the app then restart the analysis from the LeakCanary activity.
- Increase the memory available to your debug app with largeHeap=true: https://developer.android.com/guide/topics/manifest/application-element#largeHeap
- Set up LeakCanary to run in a separate process: https://square.github.io/leakcanary/recipes/#running-the-leakcanary-analysis-in-a-separate-process
- Download the heap dump from the LeakCanary activity then run the analysis from your computer with shark-cli: https://square.github.io/leakcanary/shark/#shark-cli
""".trimIndent(), failureCause
)
)
)
} else {
heapAnalysis.copy(dumpDurationMillis = heapDumpDurationMillis)
}
}
}
progressListener.onAnalysisProgress(REPORTING_HEAP_ANALYSIS)
val analysisDoneEvent = ScopedLeaksDb.writableDatabase(application) { db ->
val id = HeapAnalysisTable.insert(db, heapAnalysis)
when (fullHeapAnalysis) {
is HeapAnalysisSuccess -> {
val showIntent = LeakActivity.createSuccessIntent(application, id)
val leakSignatures = fullHeapAnalysis.allLeaks.map { it.signature }.toSet()
val leakSignatureStatuses = LeakTable.retrieveLeakReadStatuses(db, leakSignatures)
val unreadLeakSignatures = leakSignatureStatuses.filter { (_, read) ->
!read
}.keys
// keys returns LinkedHashMap$LinkedKeySet which isn't Serializable
.toSet()
HeapAnalysisSucceeded(
heapDumped.uniqueId,
fullHeapAnalysis,
unreadLeakSignatures,
showIntent
)
}
is HeapAnalysisFailure -> {
val showIntent = LeakActivity.createFailureIntent(application, id)
HeapAnalysisFailed(heapDumped.uniqueId, fullHeapAnalysis, showIntent)
}
}
}
return analysisDoneEvent
}
The code above looks like a lot, but most of it is bookkeeping you can safely skim. Let's go straight to how analyzeHeap() processes the HPROF file.
Kotlin
private fun analyzeHeap(
heapDumpFile: File,
progressListener: OnAnalysisProgressListener,
isCanceled: () -> Boolean
): HeapAnalysis {
val config = LeakCanary.config
// Analyzer for the heap
val heapAnalyzer = HeapAnalyzer(progressListener)
val proguardMappingReader = try {
// Reader for the ProGuard mapping file
ProguardMappingReader(application.assets.open(PROGUARD_MAPPING_FILE_NAME))
} catch (e: IOException) {
null
}
progressListener.onAnalysisProgress(PARSING_HEAP_DUMP)
// Provider for the HPROF file stream
val sourceProvider =
ConstantMemoryMetricsDualSourceProvider(ThrowingCancelableFileSourceProvider(heapDumpFile) {
if (isCanceled()) {
throw RuntimeException("Analysis canceled")
}
})
val closeableGraph = try {
// Parse the HPROF file
sourceProvider.openHeapGraph(proguardMapping = proguardMappingReader?.readProguardMapping())
} catch (throwable: Throwable) {
return HeapAnalysisFailure(
heapDumpFile = heapDumpFile,
createdAtTimeMillis = System.currentTimeMillis(),
analysisDurationMillis = 0,
exception = HeapAnalysisException(throwable)
)
}
return closeableGraph
.use { graph ->
// The parsed HPROF data lives in graph; HeapAnalyzer#analyze() is then called to find the leaks.
val result = heapAnalyzer.analyze(
heapDumpFile = heapDumpFile,
graph = graph,
leakingObjectFinder = config.leakingObjectFinder,
referenceMatchers = config.referenceMatchers,
computeRetainedHeapSize = config.computeRetainedHeapSize,
objectInspectors = config.objectInspectors,
metadataExtractor = config.metadataExtractor
)
if (result is HeapAnalysisSuccess) {
val lruCacheStats = (graph as HprofHeapGraph).lruCacheStats()
val randomAccessStats =
"RandomAccess[" +
"bytes=${sourceProvider.randomAccessByteReads}," +
"reads=${sourceProvider.randomAccessReadCount}," +
"travel=${sourceProvider.randomAccessByteTravel}," +
"range=${sourceProvider.byteTravelRange}," +
"size=${heapDumpFile.length()}" +
"]"
val stats = "$lruCacheStats $randomAccessStats"
result.copy(metadata = result.metadata + ("Stats" to stats))
} else result
}
}
From this code we can see that LeakCanary also supports deobfuscation through a ProGuard mapping file. The logic above has two main parts: first, openHeapGraph() parses the contents of the HPROF file into the graph variable; then HeapAnalyzer#analyze() finds the reference chains that retain the leaking objects.
All of the HPROF handling is done in the shark module, which is why some versions of the LeakCanary logo feature a shark; the shark library can also be used on its own.
Parsing the HPROF File
Let's look at openHeapGraph() directly:
Kotlin
fun DualSourceProvider.openHeapGraph(
  proguardMapping: ProguardMapping? = null,
  indexedGcRootTypes: Set<HprofRecordTag> = HprofIndex.defaultIndexedGcRootTags()
): CloseableHeapGraph {
  // TODO We can probably remove the concept of DualSourceProvider. Opening a heap graph requires
  // a random access reader which is built from a random access source + headers.
  // Also require headers, and the index.
  // So really we're:
  // 1) Reading the headers from an okio source
  // 2) Reading the whole source streaming to create the index. Wondering if we really need to parse
  // the headers, close the file then parse / skip the header part. Can't the parsing + indexing give
  // us headers + index?
  // 3) Using the index + headers + a random access source on the content to create a closeable
  // abstraction.
  // Note: should see if Okio has a better abstraction for random access now.
  // Also Use FileSystem + Path instead of File as the core way to open a file based heap dump.
  // Parse the header
  val header = openStreamingSource().use { HprofHeader.parseHeaderOf(it) }
  // Parse the records
  val index = HprofIndex.indexRecordsOf(this, header, proguardMapping, indexedGcRootTypes)
  // Wrap the parse results in a HprofHeapGraph
  return index.openHeapGraph()
}
The header is parsed by HprofHeader#parseHeaderOf(), and the records are parsed by HprofIndex#indexRecordsOf().
Let's start with the header parsing:
Kotlin
fun parseHeaderOf(source: BufferedSource): HprofHeader {
  require(!source.exhausted()) {
    throw IllegalArgumentException("Source has no available bytes")
  }
  // Position where the version string ends
  val endOfVersionString = source.indexOf(0)
  // Read the version string
  val versionName = source.readUtf8(endOfVersionString)
  // Check that this HPROF version is supported; fail fast if it isn't
  val version = supportedVersions[versionName]
  checkNotNull(version) {
    "Unsupported Hprof version [$versionName] not in supported list ${supportedVersions.keys}"
  }
  // Skip the 0 at the end of the version string.
  source.skip(1)
  // Read the byte size of IDs and references
  val identifierByteSize = source.readInt()
  // Read the heap dump timestamp
  val heapDumpTimestamp = source.readLong()
  return HprofHeader(heapDumpTimestamp, version, identifierByteSize)
}
This code is straightforward; it reads the following data:
- The version string. For reference, the supported versions are:
Kotlin
enum class HprofVersion(val versionString: String) {
  JDK1_2_BETA3("JAVA PROFILE 1.0"),
  JDK1_2_BETA4("JAVA PROFILE 1.0.1"),
  JDK_6("JAVA PROFILE 1.0.2"),
  ANDROID("JAVA PROFILE 1.0.3")
}
On Android this is always JAVA PROFILE 1.0.3.
- The byte size of IDs and references
- The heap dump timestamp
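To make the header layout concrete, here is a minimal, self-contained sketch (my own illustration, not shark's actual implementation) that parses the three header fields out of a raw byte array, assuming big-endian encoding as in the real format:

```kotlin
import java.nio.ByteBuffer

data class SimpleHprofHeader(
  val versionName: String,
  val identifierByteSize: Int,
  val heapDumpTimestamp: Long
)

// Parses: version string, a 0 terminator, a 4-byte identifier size,
// and an 8-byte dump timestamp (all big-endian, like the real format).
fun parseSimpleHeader(bytes: ByteArray): SimpleHprofHeader {
  val endOfVersion = bytes.indexOf(0.toByte())
  require(endOfVersion != -1) { "Missing version string terminator" }
  val versionName = String(bytes, 0, endOfVersion, Charsets.US_ASCII)
  val buffer = ByteBuffer.wrap(bytes, endOfVersion + 1, bytes.size - endOfVersion - 1)
  val identifierByteSize = buffer.int // ID / reference size in bytes
  val heapDumpTimestamp = buffer.long // milliseconds since epoch
  return SimpleHprofHeader(versionName, identifierByteSize, heapDumpTimestamp)
}

fun main() {
  // Build a fake Android header: "JAVA PROFILE 1.0.3", 0, idSize = 4, timestamp = 1234
  val header = ByteBuffer.allocate("JAVA PROFILE 1.0.3".length + 1 + 4 + 8)
    .put("JAVA PROFILE 1.0.3".toByteArray(Charsets.US_ASCII))
    .put(0.toByte())
    .putInt(4)
    .putLong(1234L)
    .array()
  val parsed = parseSimpleHeader(header)
  println(parsed.versionName)        // JAVA PROFILE 1.0.3
  println(parsed.identifierByteSize) // 4
}
```

The real parseHeaderOf() does the same reads over an Okio BufferedSource instead of a byte array.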
Next, the HprofIndex.indexRecordsOf() method:
Kotlin
fun indexRecordsOf(
  hprofSourceProvider: DualSourceProvider,
  hprofHeader: HprofHeader,
  proguardMapping: ProguardMapping? = null,
  indexedGcRootTags: Set<HprofRecordTag> = defaultIndexedGcRootTags()
): HprofIndex {
  val reader = StreamingHprofReader.readerFor(hprofSourceProvider, hprofHeader)
  val index = HprofInMemoryIndex.indexHprof(
    reader = reader,
    hprofHeader = hprofHeader,
    proguardMapping = proguardMapping,
    indexedGcRootTags = indexedGcRootTags
  )
  return HprofIndex(hprofSourceProvider, hprofHeader, index)
}
StreamingHprofReader is the core reading class; HprofInMemoryIndex.indexHprof() builds on its output and keeps the data it needs in memory. It calls StreamingHprofReader#readRecords() twice to read the records, so let's look at readRecords() first:
Kotlin
@Suppress("ComplexMethod", "NestedBlockDepth")
fun readRecords(
// The record tags to handle
recordTags: Set<HprofRecordTag>,
// Records that need handling are passed to this listener
listener: OnHprofRecordTagListener
): Long {
return sourceProvider.openStreamingSource().use { source ->
val reader = HprofRecordReader(header, source)
// Skip the header
reader.skip(header.recordsPosition)
// Local ref optimizations
val intByteSize = INT.byteSize
val identifierByteSize = reader.sizeOf(REFERENCE_HPROF_TYPE)
// Read records in a loop
while (!source.exhausted()) {
// type of the record
// Read the record's tag
val tag = reader.readUnsignedByte()
// number of microseconds since the time stamp in the header
// Skip the timestamp
reader.skip(intByteSize)
// number of bytes that follow and belong to this record
// Byte length of the record's body
val length = reader.readUnsignedInt()
// Handle each kind of record
when (tag) {
STRING_IN_UTF8.tag -> {
if (STRING_IN_UTF8 in recordTags) {
listener.onHprofRecord(STRING_IN_UTF8, length, reader)
} else {
reader.skip(length)
}
}
UNLOAD_CLASS.tag -> {
if (UNLOAD_CLASS in recordTags) {
listener.onHprofRecord(UNLOAD_CLASS, length, reader)
} else {
reader.skip(length)
}
}
LOAD_CLASS.tag -> {
if (LOAD_CLASS in recordTags) {
listener.onHprofRecord(LOAD_CLASS, length, reader)
} else {
reader.skip(length)
}
}
STACK_FRAME.tag -> {
if (STACK_FRAME in recordTags) {
listener.onHprofRecord(STACK_FRAME, length, reader)
} else {
reader.skip(length)
}
}
STACK_TRACE.tag -> {
if (STACK_TRACE in recordTags) {
listener.onHprofRecord(STACK_TRACE, length, reader)
} else {
reader.skip(length)
}
}
HEAP_DUMP.tag, HEAP_DUMP_SEGMENT.tag -> {
// Read the sub-records
val heapDumpStart = reader.bytesRead
var previousTag = 0
var previousTagPosition = 0L
// Read sub-records in a loop
while (reader.bytesRead - heapDumpStart < length) {
val heapDumpTagPosition = reader.bytesRead
val heapDumpTag = reader.readUnsignedByte()
when (heapDumpTag) {
ROOT_UNKNOWN.tag -> {
if (ROOT_UNKNOWN in recordTags) {
listener.onHprofRecord(ROOT_UNKNOWN, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_JNI_GLOBAL.tag -> {
if (ROOT_JNI_GLOBAL in recordTags) {
listener.onHprofRecord(ROOT_JNI_GLOBAL, -1, reader)
} else {
reader.skip(identifierByteSize + identifierByteSize)
}
}
ROOT_JNI_LOCAL.tag -> {
if (ROOT_JNI_LOCAL in recordTags) {
listener.onHprofRecord(ROOT_JNI_LOCAL, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize + intByteSize)
}
}
ROOT_JAVA_FRAME.tag -> {
if (ROOT_JAVA_FRAME in recordTags) {
listener.onHprofRecord(ROOT_JAVA_FRAME, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize + intByteSize)
}
}
ROOT_NATIVE_STACK.tag -> {
if (ROOT_NATIVE_STACK in recordTags) {
listener.onHprofRecord(ROOT_NATIVE_STACK, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize)
}
}
ROOT_STICKY_CLASS.tag -> {
if (ROOT_STICKY_CLASS in recordTags) {
listener.onHprofRecord(ROOT_STICKY_CLASS, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_THREAD_BLOCK.tag -> {
if (ROOT_THREAD_BLOCK in recordTags) {
listener.onHprofRecord(ROOT_THREAD_BLOCK, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize)
}
}
ROOT_MONITOR_USED.tag -> {
if (ROOT_MONITOR_USED in recordTags) {
listener.onHprofRecord(ROOT_MONITOR_USED, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_THREAD_OBJECT.tag -> {
if (ROOT_THREAD_OBJECT in recordTags) {
listener.onHprofRecord(ROOT_THREAD_OBJECT, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize + intByteSize)
}
}
ROOT_INTERNED_STRING.tag -> {
if (ROOT_INTERNED_STRING in recordTags) {
listener.onHprofRecord(ROOT_INTERNED_STRING, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_FINALIZING.tag -> {
if (ROOT_FINALIZING in recordTags) {
listener.onHprofRecord(ROOT_FINALIZING, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_DEBUGGER.tag -> {
if (ROOT_DEBUGGER in recordTags) {
listener.onHprofRecord(ROOT_DEBUGGER, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_REFERENCE_CLEANUP.tag -> {
if (ROOT_REFERENCE_CLEANUP in recordTags) {
listener.onHprofRecord(ROOT_REFERENCE_CLEANUP, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_VM_INTERNAL.tag -> {
if (ROOT_VM_INTERNAL in recordTags) {
listener.onHprofRecord(ROOT_VM_INTERNAL, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
ROOT_JNI_MONITOR.tag -> {
if (ROOT_JNI_MONITOR in recordTags) {
listener.onHprofRecord(ROOT_JNI_MONITOR, -1, reader)
} else {
reader.skip(identifierByteSize + intByteSize + intByteSize)
}
}
ROOT_UNREACHABLE.tag -> {
if (ROOT_UNREACHABLE in recordTags) {
listener.onHprofRecord(ROOT_UNREACHABLE, -1, reader)
} else {
reader.skip(identifierByteSize)
}
}
CLASS_DUMP.tag -> {
if (CLASS_DUMP in recordTags) {
listener.onHprofRecord(CLASS_DUMP, -1, reader)
} else {
reader.skipClassDumpRecord()
}
}
INSTANCE_DUMP.tag -> {
if (INSTANCE_DUMP in recordTags) {
listener.onHprofRecord(INSTANCE_DUMP, -1, reader)
} else {
reader.skipInstanceDumpRecord()
}
}
OBJECT_ARRAY_DUMP.tag -> {
if (OBJECT_ARRAY_DUMP in recordTags) {
listener.onHprofRecord(OBJECT_ARRAY_DUMP, -1, reader)
} else {
reader.skipObjectArrayDumpRecord()
}
}
PRIMITIVE_ARRAY_DUMP.tag -> {
if (PRIMITIVE_ARRAY_DUMP in recordTags) {
listener.onHprofRecord(PRIMITIVE_ARRAY_DUMP, -1, reader)
} else {
reader.skipPrimitiveArrayDumpRecord()
}
}
PRIMITIVE_ARRAY_NODATA.tag -> {
throw UnsupportedOperationException("$PRIMITIVE_ARRAY_NODATA cannot be parsed")
}
HEAP_DUMP_INFO.tag -> {
if (HEAP_DUMP_INFO in recordTags) {
listener.onHprofRecord(HEAP_DUMP_INFO, -1, reader)
} else {
reader.skipHeapDumpInfoRecord()
}
}
else -> throw IllegalStateException(
"Unknown tag ${
"0x%02x".format(
heapDumpTag
)
} at $heapDumpTagPosition after ${
"0x%02x".format(
previousTag
)
} at $previousTagPosition"
)
}
previousTag = heapDumpTag
previousTagPosition = heapDumpTagPosition
}
}
HEAP_DUMP_END.tag -> {
if (HEAP_DUMP_END in recordTags) {
listener.onHprofRecord(HEAP_DUMP_END, length, reader)
}
}
else -> {
reader.skip(length)
}
}
}
reader.bytesRead
}
}
The code above reads every kind of record, but only handles the ones requested via the recordTags parameter, delegating the actual handling to the supplied listener. For details on how each record type is read, see my previous article.
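The outer record layout that drives this loop (a one-byte tag, a four-byte timestamp delta, a four-byte body length, then the body) can be sketched with a self-contained toy reader. This is my own illustration; the real reader treats the length as unsigned and dispatches each record to a listener instead of just counting tags:

```kotlin
import java.nio.ByteBuffer

// Counts how many records of each tag appear in a stream laid out as
// [tag: u1][timestamp delta: u4][body length: u4][body: length bytes].
fun countRecordTags(bytes: ByteArray): Map<Int, Int> {
  val buffer = ByteBuffer.wrap(bytes)
  val counts = mutableMapOf<Int, Int>()
  while (buffer.hasRemaining()) {
    val tag = buffer.get().toInt() and 0xff
    buffer.position(buffer.position() + 4)      // skip the timestamp delta
    val length = buffer.int                     // body length
    buffer.position(buffer.position() + length) // skip the body, like reader.skip(length)
    counts[tag] = (counts[tag] ?: 0) + 1
  }
  return counts
}

fun main() {
  // Two fake records: tag 0x01 with a 3-byte body, tag 0x0c with an empty body.
  val stream = ByteBuffer.allocate((1 + 4 + 4 + 3) + (1 + 4 + 4))
    .put(0x01.toByte()).putInt(0).putInt(3).put(byteArrayOf(1, 2, 3))
    .put(0x0c.toByte()).putInt(0).putInt(0)
    .array()
  println(countRecordTags(stream)) // {1=1, 12=1}
}
```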
Now let's look at the implementation of HprofInMemoryIndex.indexHprof():
Kotlin
fun indexHprof(
reader: StreamingHprofReader,
hprofHeader: HprofHeader,
proguardMapping: ProguardMapping?,
indexedGcRootTags: Set<HprofRecordTag>
): HprofInMemoryIndex {
// First pass to count and correctly size arrays once and for all.
var maxClassSize = 0L
var maxInstanceSize = 0L
var maxObjectArraySize = 0L
var maxPrimitiveArraySize = 0L
var classCount = 0
var instanceCount = 0
var objectArrayCount = 0
var primitiveArrayCount = 0
var classFieldsTotalBytes = 0
val stickyClassGcRootIds = LongScatterSet()
// First pass over the records: only instance dumps and ROOT_STICKY_CLASS
val bytesRead = reader.readRecords(
EnumSet.of(
CLASS_DUMP,
INSTANCE_DUMP,
OBJECT_ARRAY_DUMP,
PRIMITIVE_ARRAY_DUMP,
ROOT_STICKY_CLASS
)
) { tag, _, reader ->
val bytesReadStart = reader.bytesRead
when (tag) {
CLASS_DUMP -> {
// Count the classes
classCount++
reader.skipClassDumpHeader()
val bytesReadStaticFieldStart = reader.bytesRead
reader.skipClassDumpStaticFields()
reader.skipClassDumpFields()
// Track the largest single class record size
maxClassSize = max(maxClassSize, reader.bytesRead - bytesReadStart)
// Accumulate the total bytes taken by all class fields
classFieldsTotalBytes += (reader.bytesRead - bytesReadStaticFieldStart).toInt()
}
INSTANCE_DUMP -> {
// Count the ordinary instances
instanceCount++
reader.skipInstanceDumpRecord()
// Track the largest single instance record size
maxInstanceSize = max(maxInstanceSize, reader.bytesRead - bytesReadStart)
}
OBJECT_ARRAY_DUMP -> {
// Count the object arrays
objectArrayCount++
reader.skipObjectArrayDumpRecord()
// Track the largest single object array record size
maxObjectArraySize = max(maxObjectArraySize, reader.bytesRead - bytesReadStart)
}
PRIMITIVE_ARRAY_DUMP -> {
// Count the primitive arrays
primitiveArrayCount++
reader.skipPrimitiveArrayDumpRecord()
// Track the largest single primitive array record size
maxPrimitiveArraySize = max(maxPrimitiveArraySize, reader.bytesRead - bytesReadStart)
}
ROOT_STICKY_CLASS -> {
// StickyClass has only 1 field: id. Our API 23 emulators in CI are creating heap
// dumps with duplicated sticky class roots, up to 30K times for some objects.
// There's no point in keeping all these in our list of roots, 1 per each is enough
// so we deduplicate with stickyClassGcRootIds.
val id = reader.readStickyClassGcRootRecord().id
if (id != ValueHolder.NULL_REFERENCE) {
// Record the ROOT_STICKY_CLASS id, which is effectively the GC root for a class's static fields.
stickyClassGcRootIds += id
}
}
else -> {
// Not interesting.
}
}
}
// Use the maximums gathered in the first pass to compute how many bytes are needed to store each value; the second pass uses these sizes when storing the data.
val bytesForClassSize = byteSizeForUnsigned(maxClassSize)
val bytesForInstanceSize = byteSizeForUnsigned(maxInstanceSize)
val bytesForObjectArraySize = byteSizeForUnsigned(maxObjectArraySize)
val bytesForPrimitiveArraySize = byteSizeForUnsigned(maxPrimitiveArraySize)
val indexBuilderListener = Builder(
longIdentifiers = hprofHeader.identifierByteSize == 8,
maxPosition = bytesRead,
classCount = classCount,
instanceCount = instanceCount,
objectArrayCount = objectArrayCount,
primitiveArrayCount = primitiveArrayCount,
bytesForClassSize = bytesForClassSize,
bytesForInstanceSize = bytesForInstanceSize,
bytesForObjectArraySize = bytesForObjectArraySize,
bytesForPrimitiveArraySize = bytesForPrimitiveArraySize,
classFieldsTotalBytes = classFieldsTotalBytes,
stickyClassGcRootIds
)
val recordTypes = EnumSet.of(
STRING_IN_UTF8,
LOAD_CLASS,
CLASS_DUMP,
INSTANCE_DUMP,
OBJECT_ARRAY_DUMP,
PRIMITIVE_ARRAY_DUMP
) + HprofRecordTag.rootTags.intersect(indexedGcRootTags)
// Second pass: read strings, LOAD_CLASS, all instance dumps and all GC roots.
reader.readRecords(recordTypes, indexBuilderListener)
return indexBuilderListener.buildIndex(proguardMapping, hprofHeader)
}
}
The first pass only handles instance records and ROOT_STICKY_CLASS (the GC roots for classes' static fields). Along the way it counts each kind of record and tracks several maximums: the number of classes and the largest class record, the total bytes of all class fields, the number of ordinary instances and the largest instance record, the number of object arrays and the largest object array record, and the number of primitive arrays and the largest primitive array record. These maximums determine how many bytes the second pass needs to store each value; the computation is done by byteSizeForUnsigned(), which we won't examine in detail here.
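The sizing trick is simple: find the smallest number of bytes that can hold the largest unsigned value observed in the first pass. Here is a sketch of what byteSizeForUnsigned() computes, written from the description above rather than copied from shark:

```kotlin
// Smallest number of bytes needed to store maxValue as an unsigned integer.
// E.g. values up to 255 fit in 1 byte, up to 65535 in 2 bytes, and so on.
fun byteSizeForUnsigned(maxValue: Long): Int {
  var value = maxValue
  var byteCount = 0
  while (value != 0L) {
    value = value shr 8
    byteCount++
  }
  return byteCount
}

fun main() {
  println(byteSizeForUnsigned(255L))    // 1
  println(byteSizeForUnsigned(256L))    // 2
  println(byteSizeForUnsigned(70_000L)) // 3
}
```

So if the largest class record seen in the first pass is 70,000 bytes, every class record size can be stored in 3 bytes instead of a full 8-byte long.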
The second pass handles strings, all kinds of instance records, and all GC roots. The listener that processes them is Builder; let's see what it does.
Kotlin
override fun onHprofRecord(
tag: HprofRecordTag,
length: Long,
reader: HprofRecordReader
) {
when (tag) {
STRING_IN_UTF8 -> {
// Cache the string in hprofStringCache
hprofStringCache[reader.readId()] = reader.readUtf8(length - identifierSize)
}
LOAD_CLASS -> {
// classSerialNumber
reader.skip(INT.byteSize)
val id = reader.readId()
// stackTraceSerialNumber
reader.skip(INT.byteSize)
val classNameStringId = reader.readId()
// Map the class id to the class name's string id
classNames[id] = classNameStringId
}
ROOT_UNKNOWN -> {
reader.readUnknownGcRootRecord().apply {
// If the id is not null, store the GC root in gcRoots
if (id != ValueHolder.NULL_REFERENCE) {
gcRoots += this
}
}
}
// ... many other GC root branches omitted; they are all handled the same way as ROOT_UNKNOWN.
CLASS_DUMP -> {
val bytesReadStart = reader.bytesRead
val id = reader.readId()
// stack trace serial number
reader.skip(INT.byteSize)
val superclassId = reader.readId()
reader.skip(5 * identifierSize)
// instance size (in bytes)
// Useful to compute retained size
val instanceSize = reader.readInt()
reader.skipClassDumpConstantPool()
val startPosition = classFieldsIndex
val bytesReadFieldStart = reader.bytesRead
// Read all static fields into classFieldBytes
reader.copyToClassFields(2)
val staticFieldCount = lastClassFieldsShort().toInt() and 0xFFFF
for (i in 0 until staticFieldCount) {
reader.copyToClassFields(identifierSize)
reader.copyToClassFields(1)
val type = classFieldBytes[classFieldsIndex - 1].toInt() and 0xff
if (type == PrimitiveType.REFERENCE_HPROF_TYPE) {
reader.copyToClassFields(identifierSize)
} else {
reader.copyToClassFields(PrimitiveType.byteSizeByHprofType.getValue(type))
}
}
// Read all member fields into classFieldBytes
reader.copyToClassFields(2)
val fieldCount = lastClassFieldsShort().toInt() and 0xFFFF
for (i in 0 until fieldCount) {
reader.copyToClassFields(identifierSize)
reader.copyToClassFields(1)
}
// Size taken by the static and member fields
val fieldsSize = (reader.bytesRead - bytesReadFieldStart).toInt()
// Size of the current record
val recordSize = reader.bytesRead - bytesReadStart
// Write the class's basic info into classIndex, keyed by its id
classIndex.append(id)
.apply {
// Position where this record starts in the HPROF file
writeTruncatedLong(bytesReadStart, positionSize)
// Superclass id
writeId(superclassId)
// Instance size of the class
writeInt(instanceSize)
// Size of the record
writeTruncatedLong(recordSize, bytesForClassSize)
// Position where the fields start (within the classFieldBytes array)
writeTruncatedLong(startPosition.toLong(), classFieldsIndexSize)
}
require(startPosition + fieldsSize == classFieldsIndex) {
"Expected $classFieldsIndex to have moved by $fieldsSize and be equal to ${startPosition + fieldsSize}"
}
}
INSTANCE_DUMP -> {
val bytesReadStart = reader.bytesRead
val id = reader.readId()
reader.skip(INT.byteSize)
val classId = reader.readId()
val remainingBytesInInstance = reader.readInt()
// Skip the instance data (the field values)
reader.skip(remainingBytesInInstance)
val recordSize = reader.bytesRead - bytesReadStart
// Write the ordinary instance's data into instanceIndex.
instanceIndex.append(id)
.apply {
// Position where this record starts in the HPROF file
writeTruncatedLong(bytesReadStart, positionSize)
// ClassId
writeId(classId)
writeTruncatedLong(recordSize, bytesForInstanceSize)
}
}
OBJECT_ARRAY_DUMP -> {
val bytesReadStart = reader.bytesRead
val id = reader.readId()
// stack trace serial number
reader.skip(INT.byteSize)
val size = reader.readInt()
val arrayClassId = reader.readId()
// Skip the array contents
reader.skip(identifierSize * size)
// record size - (ID+INT + INT + ID)
val recordSize = reader.bytesRead - bytesReadStart
// Write the object array into objectArrayIndex
objectArrayIndex.append(id)
.apply {
// Position where this record starts in the HPROF file
writeTruncatedLong(bytesReadStart, positionSize)
writeId(arrayClassId)
writeTruncatedLong(recordSize, bytesForObjectArraySize)
}
}
PRIMITIVE_ARRAY_DUMP -> {
val bytesReadStart = reader.bytesRead
val id = reader.readId()
reader.skip(INT.byteSize)
val size = reader.readInt()
val type = PrimitiveType.primitiveTypeByHprofType.getValue(reader.readUnsignedByte())
// Skip the array contents
reader.skip(size * type.byteSize)
val recordSize = reader.bytesRead - bytesReadStart
// Write the primitive array into primitiveArrayIndex
primitiveArrayIndex.append(id)
.apply {
// Position where this record starts in the HPROF file
writeTruncatedLong(bytesReadStart, positionSize)
// The primitive type
writeByte(type.ordinal.toByte())
writeTruncatedLong(recordSize, bytesForPrimitiveArraySize)
}
}
else -> {
// Not interesting.
}
}
}
The second pass records the important data from the HPROF file, all of it stored on the Builder object:
- hprofStringCache maps string ids to their String values.
- classNames maps class ids to the string ids of their names.
- gcRoots holds all of the GC roots.
- classIndex holds each class's basic info.
- classFieldBytes, a byte array, holds every class's static and member fields; classIndex records where each class's fields start within it.
- instanceIndex holds ordinary instances, objectArrayIndex holds object arrays, and primitiveArrayIndex holds primitive arrays.
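The writeTruncatedLong() calls above are where the first pass pays off: each position or size is stored in only as many bytes as the first pass proved necessary. A self-contained sketch of the idea (my own version; shark packs these into flat byte arrays rather than a list):

```kotlin
// Writes the lowest `byteCount` bytes of `value` (big-endian) into a sink,
// and reads them back. This is the idea behind writeTruncatedLong():
// positions and record sizes are stored in just enough bytes to save memory.
fun writeTruncatedLong(sink: MutableList<Byte>, value: Long, byteCount: Int) {
  for (shift in (byteCount - 1) downTo 0) {
    sink.add(((value shr (shift * 8)) and 0xff).toByte())
  }
}

fun readTruncatedLong(source: List<Byte>, offset: Int, byteCount: Int): Long {
  var value = 0L
  for (i in 0 until byteCount) {
    value = (value shl 8) or (source[offset + i].toLong() and 0xff)
  }
  return value
}

fun main() {
  val sink = mutableListOf<Byte>()
  // A file position of 70_000 fits in 3 bytes instead of a full 8-byte long.
  writeTruncatedLong(sink, 70_000L, 3)
  println(sink.size)                     // 3
  println(readTruncatedLong(sink, 0, 3)) // 70000
}
```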
Since all of this data lives in the Builder, let's look at its buildIndex() method:
Kotlin
fun buildIndex(
proguardMapping: ProguardMapping?,
hprofHeader: HprofHeader
): HprofInMemoryIndex {
require(classFieldsIndex == classFieldBytes.size) {
"Read $classFieldsIndex into fields bytes instead of expected ${classFieldBytes.size}"
}
val sortedInstanceIndex = instanceIndex.moveToSortedMap()
val sortedObjectArrayIndex = objectArrayIndex.moveToSortedMap()
val sortedPrimitiveArrayIndex = primitiveArrayIndex.moveToSortedMap()
val sortedClassIndex = classIndex.moveToSortedMap()
// Passing references to avoid copying the underlying data structures.
return HprofInMemoryIndex(
positionSize = positionSize,
hprofStringCache = hprofStringCache,
classNames = classNames,
classIndex = sortedClassIndex,
instanceIndex = sortedInstanceIndex,
objectArrayIndex = sortedObjectArrayIndex,
primitiveArrayIndex = sortedPrimitiveArrayIndex,
gcRoots = gcRoots,
proguardMapping = proguardMapping,
bytesForClassSize = bytesForClassSize,
bytesForInstanceSize = bytesForInstanceSize,
bytesForObjectArraySize = bytesForObjectArraySize,
bytesForPrimitiveArraySize = bytesForPrimitiveArraySize,
useForwardSlashClassPackageSeparator = hprofHeader.version != ANDROID,
classFieldsReader = ClassFieldsReader(identifierSize, classFieldBytes),
classFieldsIndexSize = classFieldsIndexSize,
stickyClassGcRootIds = stickyClassGcRootIds,
)
}
}
All of the scanned data is then wrapped in a HprofInMemoryIndex.
Kotlin
fun indexRecordsOf(
  hprofSourceProvider: DualSourceProvider,
  hprofHeader: HprofHeader,
  proguardMapping: ProguardMapping? = null,
  indexedGcRootTags: Set<HprofRecordTag> = defaultIndexedGcRootTags()
): HprofIndex {
  val reader = StreamingHprofReader.readerFor(hprofSourceProvider, hprofHeader)
  val index = HprofInMemoryIndex.indexHprof(
    reader = reader,
    hprofHeader = hprofHeader,
    proguardMapping = proguardMapping,
    indexedGcRootTags = indexedGcRootTags
  )
  return HprofIndex(hprofSourceProvider, hprofHeader, index)
}
The HprofInMemoryIndex is in turn wrapped in a HprofIndex.
Later, HprofIndex#openHeapGraph() is called:
Kotlin
fun openHeapGraph(): CloseableHeapGraph {
  val reader = RandomAccessHprofReader.openReaderFor(sourceProvider, header)
  return HprofHeapGraph(header, reader, index)
}
All of the parsed data ends up wrapped in a HprofHeapGraph, and most of the later work of finding leaking instances goes through it.
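The overall design, a compact in-memory index mapping object ids to file positions plus a random-access reader that re-reads full records lazily, can be illustrated with a toy sketch. This is my own simplified record layout, not the real HPROF format:

```kotlin
import java.nio.ByteBuffer

// Toy "heap dump": records laid out as [id: u4][payload length: u4][payload].
// The index keeps only id -> position; payloads are re-read lazily on demand,
// which is the same memory/IO trade-off HprofHeapGraph makes.
class ToyHeapGraph(private val dump: ByteArray) {
  private val index: Map<Int, Int> = run {
    val map = mutableMapOf<Int, Int>()
    val buffer = ByteBuffer.wrap(dump)
    while (buffer.hasRemaining()) {
      val position = buffer.position()
      val id = buffer.int
      val length = buffer.int
      buffer.position(buffer.position() + length) // skip the payload while indexing
      map[id] = position
    }
    map
  }

  // Random-access read of one record's payload, using the index.
  fun payloadOf(id: Int): ByteArray {
    val buffer = ByteBuffer.wrap(dump)
    buffer.position(index.getValue(id) + 4) // skip the id, land on the length
    val length = buffer.int
    val payload = ByteArray(length)
    buffer.get(payload)
    return payload
  }
}

fun main() {
  val dump = ByteBuffer.allocate((4 + 4 + 2) + (4 + 4 + 3))
    .putInt(7).putInt(2).put(byteArrayOf(10, 20))
    .putInt(9).putInt(3).put(byteArrayOf(1, 2, 3))
    .array()
  val graph = ToyHeapGraph(dump)
  println(graph.payloadOf(9).toList()) // [1, 2, 3]
}
```

Keeping only ids and positions in memory is what lets shark index very large heap dumps without loading every record's contents at once.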
Conclusion
This article covered how LeakCanary parses the HPROF file. I originally planned to also explain how the leaking objects are found, but this article is already long enough, so that will be the topic of the next one.