learning_gem5 part2_06: Creating a simple cache object

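In this chapter, we take the memory object framework created in the previous chapter and add caching logic to it. [Note: this is caching only, for a single-core CPU, so coherence does not need to be considered.]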

SimpleCache SimObject

After creating the SConscript file (which you can download here), we can create the SimObject Python file. We will call this simple memory object SimpleCache and create the SimObject Python file in the src/learning_gem5/simple_cache directory.

from m5.params import *
from m5.proxy import *
from m5.objects.ClockedObject import ClockedObject

class SimpleCache(ClockedObject):
    type = 'SimpleCache'
    cxx_header = "learning_gem5/simple_cache/simple_cache.hh"

    cpu_side = VectorResponsePort("CPU side port, receives requests")
    mem_side = RequestPort("Memory side port, sends requests")

    latency = Param.Cycles(1, "Cycles taken on a hit or to resolve a miss")

    size = Param.MemorySize('16kB', "The size of the cache")

    system = Param.System(Parent.any, "The system this cache is part of")

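This SimObject file differs from the one in the previous chapter in a few ways. First, there are a couple of extra parameters: the latency of a cache access and the size of the cache. For more details on this kind of SimObject parameter, see parameters-chapter.

Next, we include a system parameter, which is a pointer to the main system this cache is connected to. It is needed so that we can get the cache block size from the system object when initializing the cache. To reference the system object this cache is connected to, we use a special proxy parameter. In this case, we use Parent.any.

In the Python config file, when a SimpleCache is instantiated, this proxy parameter searches through all of the parents of the SimpleCache instance to find a SimObject that matches the System type. Since we often use a System as the root SimObject, you will often see the system parameter resolved with this proxy parameter.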
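
The lookup performed by Parent.any can be pictured with a small plain-Python model. This is only a sketch of the idea described above, not gem5's proxy implementation: the Node class and the resolve_parent_any function are hypothetical names, and the model only inspects the ancestors themselves.

```python
class Node:
    """Toy stand-in for a SimObject in the configuration tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


class System(Node):
    pass


class Cache(Node):
    pass


def resolve_parent_any(obj, wanted_type):
    """Walk up the ancestors and return the first one of wanted_type."""
    node = obj.parent
    while node is not None:
        if isinstance(node, wanted_type):
            return node
        node = node.parent
    raise LookupError(f"no ancestor of type {wanted_type.__name__}")


root = Node("root")
system = System("system", parent=root)
cache = Cache("cache", parent=system)

resolved = resolve_parent_any(cache, System)
```

In this toy tree the nearest ancestor of type System is the system object, which is what the cache's system parameter would resolve to.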

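The third and final difference between SimpleCache and SimpleMemobj is that instead of two named CPU ports (inst_port and data_port), SimpleCache uses another special parameter: the VectorPort. VectorPorts behave like regular ports (for example, they are resolved via getPort), but they allow this object to connect to multiple peers. In the resolution function, the parameter we ignored before (PortID idx) is then used to tell the different ports apart. By using a vector port, this cache can be connected into the system more flexibly than SimpleMemobj.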

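Implementing SimpleCache

Most of the code for SimpleCache is the same as for SimpleMemobj. There are a few changes in the constructor and in the key memory object functions.

First, we need to create the CPU-side ports dynamically in the constructor and initialize the extra member variables based on the SimObject parameters.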

SimpleCache::SimpleCache(SimpleCacheParams *params) :
    ClockedObject(params),
    latency(params->latency),
    blockSize(params->system->cacheLineSize()),
    capacity(params->size / blockSize),
    memPort(params->name + ".mem_side", this),
    blocked(false), outstandingPacket(nullptr), waitingPortId(-1)
{
    for (int i = 0; i < params->port_cpu_side_connection_count; ++i) {
        cpuPorts.emplace_back(name() + csprintf(".cpu_side[%d]", i), i, this);
    }
}

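[Note: how is params->port_cpu_side_connection_count determined? See the explanation that follows.]

In this function, we use cacheLineSize from the system parameter to set the blockSize for this cache. We also initialize the capacity from the block size and the size parameter, and initialize the other member variables needed below. Finally, we must create a number of CPUSidePorts based on the number of connections made to this object. Because the cpu_side port was declared as a VectorResponsePort in the SimpleCache Python file, the params object automatically has a variable port_cpu_side_connection_count. The name is derived from the Python name of the port, cpu_side. [Note: this is the port declared as a VectorResponsePort.] For each connection, we add a new CPUSidePort to the cpuPorts vector declared in the SimpleCache class.

We also add an extra member variable to CPUSidePort to hold its id [Note: int id], and we add the id as a parameter to its constructor.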
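
To make the connection count concrete, here is a plain-Python model of what happens when the config script connects two CPU ports to cpu_side. ToyVectorPort and ToyCPUSidePort are hypothetical classes invented for this sketch. They only imitate the bookkeeping described above and are not gem5 classes.

```python
class ToyVectorPort:
    """Toy vector port: every assignment in the config adds one connection."""

    def __init__(self, name):
        self.name = name
        self.peers = []

    def connect(self, peer_name):
        self.peers.append(peer_name)
        return len(self.peers) - 1  # index of the new connection


class ToyCPUSidePort:
    """Stand-in for CPUSidePort: remembers its own id, like the int id member."""

    def __init__(self, name, port_id):
        self.name = name
        self.port_id = port_id


cpu_side = ToyVectorPort("cpu_side")
cpu_side.connect("cpu.icache_port")
cpu_side.connect("cpu.dcache_port")

# Plays the role of params->port_cpu_side_connection_count.
connection_count = len(cpu_side.peers)

# Mirrors the constructor loop: one port object per connection.
cpu_ports = [
    ToyCPUSidePort(f"cache.cpu_side[{i}]", i) for i in range(connection_count)
]
```

With the instruction and data ports both connected, the count is 2 and the cache ends up with two CPU-side ports whose ids are 0 and 1.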

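Next, we need to implement getPort. On the memory side this is simple, since there is only one port. On the CPU side, however, we now need to return the port that corresponds to the requested id.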

Port &
SimpleCache::getPort(const std::string &if_name, PortID idx)
{
    if (if_name == "mem_side") {
        panic_if(idx != InvalidPortID,
                 "Mem side of simple cache not a vector port");
        return memPort;
    } else if (if_name == "cpu_side" && idx < cpuPorts.size()) {
        return cpuPorts[idx];
    } else {
        return ClockedObject::getPort(if_name, idx);
    }
}

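The implementations of CPUSidePort and MemSidePort are almost the same as in SimpleMemobj. The only difference is that handleRequest needs an extra parameter: the id of the port the request came from. Without this id, we could not forward the response to the correct port. SimpleMemobj decided which port to reply on based on whether the original request was an instruction or a data access. That information is of no use to SimpleCache, because it uses a vector of ports rather than named ports.

The new handleRequest function differs from the one in SimpleMemobj in two ways. First, as mentioned above, it stores the port id of the request. Since SimpleCache is blocking and allows only one outstanding request at a time, we only need to save a single port id.

Second, accessing a cache takes time. We therefore need to account for the latency of accessing the cache tags and the cache data for a request. We added an extra parameter to the cache object for this, and in handleRequest we now use an event to stall the request for the required amount of time. We schedule a new event latency cycles in the future. The clockEdge function returns the tick at which the nth cycle in the future occurs.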

bool
SimpleCache::handleRequest(PacketPtr pkt, int port_id)
{
    if (blocked) {
        return false;
    }
    DPRINTF(SimpleCache, "Got request for addr %#x\n", pkt->getAddr());

    blocked = true;
    waitingPortId = port_id;

    schedule(new AccessEvent(this, pkt), clockEdge(latency));

    return true;
}
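
The relationship between cycles and ticks used by clockEdge can be sketched in plain Python. This is a simplified model, assuming the clock edges fall on multiples of the clock period. The clock_edge function and the 1000-tick period (a 1 GHz clock at 1 tick per picosecond) are assumptions made for this illustration.

```python
def clock_edge(cur_tick, clock_period, cycles=0):
    """Tick of the clock edge that lies `cycles` cycles in the future.

    The current tick is first rounded up to the next clock edge (or kept
    as-is if it already sits on one), then whole cycles are added.
    """
    next_edge = -(-cur_tick // clock_period) * clock_period  # ceiling division
    return next_edge + cycles * clock_period


on_edge = clock_edge(5000, 1000, 1)   # request arrives exactly on an edge
mid_cycle = clock_edge(5300, 1000, 1)  # request arrives in the middle of a cycle
```

Under these assumptions a request arriving at tick 5000 with a latency of one cycle is processed at tick 6000, while one arriving at tick 5300 is processed at tick 7000.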

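AccessEvent is slightly more complicated than the EventWrapper we used in events-chapter. In SimpleCache, we use a new class instead of EventWrapper. We cannot use EventWrapper because we need to pass the packet (pkt) from handleRequest to the event handler function. The following code is the AccessEvent class. We only need to implement the process function, which calls the function we want to use as the event handler, in this case accessTiming. We also pass the AutoDelete flag to the Event constructor, so we do not need to worry about freeing the memory of the dynamically created object. The event code automatically deletes the object after its process function has executed.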

class AccessEvent : public Event
{
  private:
    SimpleCache *cache;
    PacketPtr pkt;
  public:
    AccessEvent(SimpleCache *cache, PacketPtr pkt) :
        Event(Default_Pri, AutoDelete), cache(cache), pkt(pkt)
    { }
    void process() override {
        cache->accessTiming(pkt);
    }
};
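
The reason for a dedicated event class is that the handler needs the packet that was current when the event was scheduled. A plain-Python model shows the same idea with a closure. ToyEventQueue is a hypothetical class written for this sketch and is unrelated to gem5's real event queue.

```python
import heapq
import itertools


class ToyEventQueue:
    """Minimal time-ordered queue of callbacks."""

    def __init__(self):
        self._heap = []
        self._order = itertools.count()  # tie-breaker for equal ticks
        self.cur_tick = 0

    def schedule(self, when, callback):
        heapq.heappush(self._heap, (when, next(self._order), callback))

    def run(self):
        while self._heap:
            when, _, callback = heapq.heappop(self._heap)
            self.cur_tick = when
            callback()  # the popped entry is dropped afterwards


handled = []


def access_timing(pkt, queue):
    handled.append((queue.cur_tick, pkt))


queue = ToyEventQueue()
for when, pkt in [(2000, "pkt B"), (1000, "pkt A")]:
    # Binding pkt as a default argument plays the role of the pkt member
    # stored inside AccessEvent.
    queue.schedule(when, lambda pkt=pkt: access_timing(pkt, queue))
queue.run()
```

Each callback carries its own packet, and once it has run nothing refers to it any more, which loosely corresponds to what AutoDelete achieves for the C++ object.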

Now, we need to implement the event handler, accessTiming.

void
SimpleCache::accessTiming(PacketPtr pkt)
{
    bool hit = accessFunctional(pkt);
    if (hit) {
        pkt->makeResponse();
        sendResponse(pkt);
    } else {
        // Miss handling goes here (full version later in this section).
    }
}

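This function first performs a functional access of the cache [Note: this is the function that embodies what the module actually does]. The accessFunctional function (described below) performs the functional access: on a hit it reads or writes the cache, and on a miss it reports that the access missed.

If the access is a hit, we simply need to respond to the packet. To respond, we first call makeResponse on the packet, which converts it from a request packet into a response packet. For example, if the memory command in the packet was ReadReq, it becomes ReadResp. Writes behave similarly. Then we can send the response back to the CPU.

The sendResponse function does the same thing as handleResponse in SimpleMemobj, except that it uses waitingPortId to send the packet to the correct port. In this function, we need to mark SimpleCache as unblocked before calling sendPacket, in case the peer on the CPU side immediately calls sendTimingReq. Then, now that SimpleCache can accept requests again, we give each CPU-side port a chance to send a retry if it needs to.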

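void SimpleCache::sendResponse(PacketPtr pkt)
{
    // Remember which CPU-side port is waiting for this response.
    int port = waitingPortId;

    // Unblock before sending, in case the CPU-side peer issues a new
    // request immediately from within the same call chain.
    blocked = false;
    waitingPortId = -1;

    cpuPorts[port].sendPacket(pkt);

    // Any CPU-side port that was refused earlier may retry now.
    for (auto& cpu_port : cpuPorts) {
        cpu_port.trySendRetry();
    }
}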
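
The ordering in sendResponse matters when the peer reacts to a response by sending its next request right away. The following plain-Python model compares the two orderings. ToyCache and EagerCPU are hypothetical classes for this sketch only, and the model ignores the retry mechanism that a real port would use after a refusal.

```python
class ToyCache:
    """Blocking cache model: accepts one request at a time."""

    def __init__(self):
        self.blocked = False
        self.peer = None

    def handle_request(self, pkt):
        if self.blocked:
            return False
        self.blocked = True
        return True

    def send_response(self, pkt, unblock_first=True):
        if unblock_first:
            self.blocked = False
            self.peer.recv_response(pkt)
        else:
            self.peer.recv_response(pkt)
            self.blocked = False


class EagerCPU:
    """Peer that sends its next request from inside the response handler."""

    def __init__(self, cache, follow_up):
        self.cache = cache
        self.follow_up = list(follow_up)
        self.results = []

    def recv_response(self, pkt):
        if self.follow_up:
            next_pkt = self.follow_up.pop(0)
            self.results.append(self.cache.handle_request(next_pkt))


def run(unblock_first):
    cache = ToyCache()
    cpu = EagerCPU(cache, ["req 2"])
    cache.peer = cpu
    cache.handle_request("req 1")
    cache.send_response("resp 1", unblock_first=unblock_first)
    return cpu.results
```

Calling run(True) shows the follow-up request being accepted, while run(False) shows it being refused because the cache still looks busy at the moment the response is delivered.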

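Returning to accessTiming, we now need to handle the cache miss case. On a miss, we first check whether the missing packet is for an entire cache block. If the packet is aligned and the size of the request equals the size of a cache block, we can simply forward the request to memory, just as in SimpleMemobj.

However, if the packet is smaller than a cache block, we need to create a new packet that reads the entire cache block from memory. Here, whether the packet is a read or a write request, we send a read request to memory to load the data for the cache block into the cache. In the case of a write, the write happens in the cache after the data has been loaded from memory.

We then create a new packet of size blockSize and call allocate to allocate memory in the Packet object for the data we will read from memory. Note: this memory is freed when we free the packet. We use the original request object in the packet so that memory-side objects know the original requestor and the original request type for statistics.

Finally, we save the original packet pointer (pkt) in the member variable outstandingPacket, so that we can recover it when SimpleCache receives a response. Then we send the new packet out through the memory-side port.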

void
SimpleCache::accessTiming(PacketPtr pkt)
{
    bool hit = accessFunctional(pkt);
    if (hit) {
        pkt->makeResponse();
        sendResponse(pkt);
    } else {
        Addr addr = pkt->getAddr();
        Addr block_addr = pkt->getBlockAddr(blockSize);
        unsigned size = pkt->getSize();
        if (addr == block_addr && size == blockSize) {
            DPRINTF(SimpleCache, "forwarding packet\n");
            memPort.sendPacket(pkt);
        } else {
            DPRINTF(SimpleCache, "Upgrading packet to block size\n");
            panic_if(addr - block_addr + size > blockSize,
                     "Cannot handle accesses that span multiple cache lines");

            assert(pkt->needsResponse());
            MemCmd cmd;
            if (pkt->isWrite() || pkt->isRead()) {
                cmd = MemCmd::ReadReq;
            } else {
                panic("Unknown packet type in upgrade size");
            }

            PacketPtr new_pkt = new Packet(pkt->req, cmd, blockSize);
            new_pkt->allocate();

            outstandingPacket = pkt;

            memPort.sendPacket(new_pkt);
        }
    }
}
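
The decision made on a miss depends only on the address, the size, and the block size. A plain-Python sketch of that arithmetic follows, assuming the block size is a power of two. The function names block_addr and classify_miss are hypothetical and exist only for this illustration.

```python
def block_addr(addr, block_size):
    """Block-aligned address, assuming block_size is a power of two."""
    return addr & ~(block_size - 1)


def classify_miss(addr, size, block_size):
    """Return "forward" for a full-block access, "upgrade" for a smaller one."""
    base = block_addr(addr, block_size)
    if addr - base + size > block_size:
        raise ValueError("access spans multiple cache lines")
    if addr == base and size == block_size:
        return "forward"
    return "upgrade"


full_block = classify_miss(0x1000, 64, 64)  # aligned, block-sized
partial = classify_miss(0x1008, 8, 64)      # 8 bytes at offset 8
```

An aligned 64-byte access is forwarded unchanged, while an 8-byte access at offset 8 needs a block-sized read of the line starting at 0x1000.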

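When we receive a response from memory, we know it was caused by a cache miss. The first step is to insert the responding packet into the cache.

Then, either there is an outstandingPacket, in which case we need to forward that packet to the original requestor, or there is no outstandingPacket, which means we should forward the pkt in the response to the original requestor. [Note: in the second case the request was never replaced by a full-line packet, so there is nothing to swap back.]

If the packet we receive as a response is an upgrade packet (because the original request was smaller than a cache line), we need to copy the new data into the outstandingPacket packet, or write to the cache on a write. We then need to delete the new packet that we created in the miss handling logic [Note: it is only a temporary packet].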

bool
SimpleCache::handleResponse(PacketPtr pkt)
{
    assert(blocked);
    DPRINTF(SimpleCache, "Got response for addr %#x\n", pkt->getAddr());
    insert(pkt);

    if (outstandingPacket != nullptr) {
        accessFunctional(outstandingPacket);
        outstandingPacket->makeResponse();
        delete pkt;
        pkt = outstandingPacket;
        outstandingPacket = nullptr;
    } // else, pkt contains the data it needs

    sendResponse(pkt);

    return true;
}

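Functional cache logic

Now we need to implement two more functions: accessFunctional and insert. These two functions make up the key components of the cache logic. [Note: insert belongs to the functional logic as well; in this chapter it is called from handleResponse.]

First, to functionally update the cache, we need storage for the cache contents. The simplest possible cache storage is a map (hash table) from addresses to data [Note: a hash table is used here only to make the software implementation fast; it has nothing to do with the cache hardware itself, and a linked list would work too]. So we add the following member to SimpleCache.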

std::unordered_map<Addr, uint8_t*> cacheStore;

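To access the cache, we first check whether there is an entry in the map that matches the address in the packet. We use the getBlockAddr function of the Packet type to get the block-aligned address, and then simply look that address up in the map. If the address is not found, the function returns false: the data is not in the cache, so the access is a miss.

Otherwise, if the packet is a write request, we need to update the data in the cache. To do so, we write the data from the packet into the cache using the writeDataToBlock function, which writes the packet's data at the write offset into a potentially larger block of data. This function takes a pointer to the block and the block size, computes the offset within the block, and writes at that offset into the block passed as the first argument.

If the packet is a read request, we need to update the packet's data with the data from the cache. The setDataFromBlock function performs the same offset calculation as writeDataToBlock, but writes into the packet using the data from the pointer passed as the first argument.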

bool
SimpleCache::accessFunctional(PacketPtr pkt)
{
    Addr block_addr = pkt->getBlockAddr(blockSize);
    auto it = cacheStore.find(block_addr);
    if (it != cacheStore.end()) {
        if (pkt->isWrite()) {
            pkt->writeDataToBlock(it->second, blockSize);
        } else if (pkt->isRead()) {
            pkt->setDataFromBlock(it->second, blockSize);
        } else {
            panic("Unknown packet type!");
        }
        return true;
    }
    return false;
}
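
The offset handling done by writeDataToBlock and setDataFromBlock can be modeled in plain Python with a dict of bytearrays. This is a sketch under the assumption of a 64-byte, power-of-two block size. The access_functional function below is a hypothetical stand-in and does not use any gem5 types.

```python
def access_functional(cache_store, addr, size, write_data=None, block_size=64):
    """Return (hit, data); data holds the bytes read on a read hit."""
    base = addr & ~(block_size - 1)
    block = cache_store.get(base)
    if block is None:
        return False, None  # miss: the block is not in the store
    offset = addr - base
    if write_data is not None:
        assert len(write_data) == size
        # Same role as writeDataToBlock: copy into the block at the offset.
        block[offset:offset + size] = write_data
        return True, None
    # Same role as setDataFromBlock: copy out of the block at the offset.
    return True, bytes(block[offset:offset + size])


store = {0x1000: bytearray(64)}
write_result = access_functional(store, 0x1008, 4, write_data=b"\xde\xad\xbe\xef")
read_result = access_functional(store, 0x1008, 4)
miss_result = access_functional(store, 0x2000, 4)
```

The write lands at offset 8 of the block stored under 0x1000, the following read returns the same four bytes, and an access to a block that is not in the store reports a miss.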

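Finally, we also need to implement the insert function. It is called every time the memory-side port responds to a request.

The first step is to check whether the cache is currently full. If the number of entries (blocks) in the cache has reached the capacity set by the SimObject parameter, we need to evict something. The following code evicts a random entry by leveraging the hash table implementation of the C++ unordered_map [Note: a random replacement policy is used here].

On an eviction, we need to write the data back to the backing memory in case it has been updated. For this, we create a new Request-Packet pair. The packet uses a new memory command: MemCmd::WritebackDirty. Then we send the packet out through the memory-side port (memPort) and erase the entry from the cache storage map.

Then, after a block has potentially been evicted, we add the new address to the cache. For this, we simply allocate space for the block and add an entry to the map. Finally, we write the data from the response packet into the newly allocated block. This data is guaranteed to be the size of a cache block, since the miss logic creates a new block-sized packet whenever the original packet is smaller than a cache block.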

void
SimpleCache::insert(PacketPtr pkt)
{
    if (cacheStore.size() >= capacity) {
        // Select random thing to evict. This is a little convoluted since we
        // are using a std::unordered_map. See http://bit.ly/2hrnLP2
        int bucket, bucket_size;
        do {
            bucket = random_mt.random(0, (int)cacheStore.bucket_count() - 1);
        } while ( (bucket_size = cacheStore.bucket_size(bucket)) == 0 );
        auto block = std::next(cacheStore.begin(bucket),
                               random_mt.random(0, bucket_size - 1));

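        // RequestPtr is a std::shared_ptr<Request>, so use make_shared.
        RequestPtr req = std::make_shared<Request>(
            block->first, blockSize, 0, 0);
        PacketPtr new_pkt = new Packet(req, MemCmd::WritebackDirty, blockSize);
        new_pkt->dataDynamic(block->second); // This will be deleted later

        DPRINTF(SimpleCache, "Writing packet back %s\n", new_pkt->print());
        // Use sendPacket so a rejected writeback is retried, not dropped.
        memPort.sendPacket(new_pkt);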

        cacheStore.erase(block->first);
    }
    uint8_t *data = new uint8_t[blockSize];
    cacheStore[pkt->getAddr()] = data;

    pkt->writeDataToBlock(data, blockSize);
}
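
The insert logic, stripped of gem5 types, is a capacity check followed by a random eviction. Here is a plain-Python model of it. The insert function and the writebacks list are hypothetical names for this sketch. As in the C++ code above, every evicted block is written back, because the model keeps no dirty bit.

```python
import random


def insert(cache_store, capacity, block_addr, data, writebacks, rng=random):
    """Insert a block, evicting a random victim first if the store is full."""
    if len(cache_store) >= capacity:
        victim = rng.choice(list(cache_store))
        # The evicted data moves into the writeback, like dataDynamic().
        writebacks.append((victim, cache_store.pop(victim)))
    cache_store[block_addr] = bytearray(data)


store, writebacks = {}, []
rng = random.Random(0)  # fixed seed so the run is repeatable
for i in range(6):
    insert(store, 4, i * 64, bytes(64), writebacks, rng)
```

After six insertions into a four-block store, exactly two blocks have been evicted and queued for writeback, and the most recently inserted block is always present.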

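Creating a config file for the cache

The last step in our implementation is to create a new Python config script that uses our cache. We can use the outline from the previous chapter as a starting point. The only differences are that we may want to set the parameters of this cache (for example, set the size of the cache to 1kB), and that instead of the named ports (data_port and inst_port) we just use the cpu_side port twice. Since cpu_side is a VectorPort, it automatically creates multiple port connections.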

import m5
from m5.objects import *

...

system.cache = SimpleCache(size='1kB')

system.cpu.icache_port = system.cache.cpu_side
system.cpu.dcache_port = system.cache.cpu_side

system.membus = SystemXBar()

system.cache.mem_side = system.membus.cpu_side_ports

...

The Python config file can be downloaded here.

Running this script should produce the expected output from the hello binary.

gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Jan 10 2017 17:38:15
gem5 started Jan 10 2017 17:40:03
gem5 executing on chinook, pid 29031
command line: build/X86/gem5.opt configs/learning_gem5/part2/simple_cache.py

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
warn: CoherentXBar system.membus has no snooping ports attached!
warn: ClockedObject: More than one power state change request encountered within the same simulation tick
Beginning simulation!
info: Entering event queue @ 0.  Starting simulation...
Hello world!
Exiting @ tick 56082000 because target called exit()

Changing the size of the cache, for example to 128 KB, should improve the performance of the system.

gem5 Simulator System.  http://gem5.org
gem5 is copyrighted software; use the --copyright option for details.

gem5 compiled Jan 10 2017 17:38:15
gem5 started Jan 10 2017 17:41:10
gem5 executing on chinook, pid 29037
command line: build/X86/gem5.opt configs/learning_gem5/part2/simple_cache.py

Global frequency set at 1000000000000 ticks per second
warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (512 Mbytes)
0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
warn: CoherentXBar system.membus has no snooping ports attached!
warn: ClockedObject: More than one power state change request encountered within the same simulation tick
Beginning simulation!
info: Entering event queue @ 0.  Starting simulation...
Hello world!
Exiting @ tick 32685000 because target called exit()
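
To see what the size parameter means in terms of blocks, and how much the two runs above differ, the arithmetic can be written out in plain Python. The 64-byte cache line is an assumption of this sketch, since the real value comes from the system's cacheLineSize. The exit ticks are taken from the two logs above.

```python
def capacity_in_blocks(size_bytes, block_size=64):
    """Number of blocks the cache can hold, as in size / blockSize."""
    return size_bytes // block_size


small_blocks = capacity_in_blocks(1 * 1024)    # 1 KB cache
large_blocks = capacity_in_blocks(128 * 1024)  # 128 KB cache

# Exit ticks reported by the two runs above.
speedup = 56082000 / 32685000
```

Under the 64-byte assumption the 1 KB cache holds only 16 blocks while the 128 KB cache holds 2048, and the larger cache finishes the hello binary roughly 1.7 times faster.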

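Adding statistics to the cache

The overall execution time of the system is an important metric. However, you may also want to include other statistics, such as the hit and miss rates of the cache. To do so, we need to add some statistics to the SimpleCache object.

First, we need to declare the statistics in the SimpleCache object. They are part of the Stats namespace. In this case, we create four statistics. The number of hits and the number of misses are just simple Scalar counts. We also add missLatency, a histogram of the time it takes to satisfy a miss. Finally, we add a special statistic called a Formula for hitRatio, which is a combination of other statistics (the number of hits and misses).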

class SimpleCache : public ClockedObject
{
  private:
    ...

    Tick missTime; // To track the miss latency

    Stats::Scalar hits;
    Stats::Scalar misses;
    Stats::Histogram missLatency;
    Stats::Formula hitRatio;

  public:
    ...

    void regStats() override;
};

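Next, we have to define our override of regStats so that the statistics are registered with gem5's statistics infrastructure. Here, for each statistic, we give it a name based on the "parent" SimObject name, and a description. For the histogram statistic, we also need to initialize it with the number of buckets we want. Finally, for the Formula, we simply write the formula down in code.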

void
SimpleCache::regStats()
{
    // If you don't do this you get errors about uninitialized stats.
    ClockedObject::regStats();

    hits.name(name() + ".hits")
        .desc("Number of hits")
        ;

    misses.name(name() + ".misses")
        .desc("Number of misses")
        ;

    missLatency.name(name() + ".missLatency")
        .desc("Ticks for misses to the cache")
        .init(16) // number of buckets
        ;

    hitRatio.name(name() + ".hitRatio")
        .desc("The ratio of hits to the total accesses to the cache")
        ;

    hitRatio = hits / (hits + misses);

}

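Finally, we need to update the statistics in our code. In the accessTiming function, we increment hits on a hit and misses on a miss. Additionally, on a miss, we save the current time so that we can measure the latency.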

void
SimpleCache::accessTiming(PacketPtr pkt)
{
    bool hit = accessFunctional(pkt);
    if (hit) {
        hits++; // update stats
        pkt->makeResponse();
        sendResponse(pkt);
    } else {
        misses++; // update stats
        missTime = curTick();
        ...

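Then, when we get a response, we need to add the measured latency to our histogram. For this we use the sample function, which adds a single data point to the histogram. The histogram automatically resizes its buckets to fit the data it receives.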

bool
SimpleCache::handleResponse(PacketPtr pkt)
{
    insert(pkt);

    missLatency.sample(curTick() - missTime);
    ...
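
The automatic bucket resizing can be illustrated with a small plain-Python model. ToyHistogram is a hypothetical class, and its doubling strategy is an assumption of this sketch rather than a description of the gem5 source. It is, however, consistent with the 32768-tick-wide buckets that appear in the statistics output later in this chapter.

```python
class ToyHistogram:
    """Fixed number of buckets whose width doubles when a sample is too big."""

    def __init__(self, buckets=16):
        self.width = 1
        self.counts = [0] * buckets

    def sample(self, value):
        """Record one non-negative integer sample."""
        n = len(self.counts)
        while value >= self.width * n:
            # Merge neighbouring buckets pairwise, then double the width.
            merged = [self.counts[i] + self.counts[i + 1] for i in range(0, n, 2)]
            self.counts = merged + [0] * (n - len(merged))
            self.width *= 2
        self.counts[value // self.width] += 1


hist = ToyHistogram(16)
for latency in [40000, 70000, 330000]:
    hist.sample(latency)
```

After these three samples the bucket width has grown to 32768 ticks, and each earlier sample has been carried along into the wider bucket that now covers it.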

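The complete code for the SimpleCache header file can be downloaded here, and the complete code for the SimpleCache implementation can be downloaded here.

Now, if we run the config file above, we can check the statistics in the stats.txt file. For the 1 KB case, we get the following statistics. About 91% of the accesses are hits, and the average miss latency is 53334 ticks (about 53 ns).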

system.cache.hits                                8431                       # Number of hits
system.cache.misses                               877                       # Number of misses
system.cache.missLatency::samples                 877                       # Ticks for misses to the cache
system.cache.missLatency::mean           53334.093501                       # Ticks for misses to the cache
system.cache.missLatency::gmean          44506.409356                       # Ticks for misses to the cache
system.cache.missLatency::stdev          36749.446469                       # Ticks for misses to the cache
system.cache.missLatency::0-32767                 305     34.78%     34.78% # Ticks for misses to the cache
system.cache.missLatency::32768-65535             365     41.62%     76.40% # Ticks for misses to the cache
system.cache.missLatency::65536-98303             164     18.70%     95.10% # Ticks for misses to the cache
system.cache.missLatency::98304-131071             12      1.37%     96.47% # Ticks for misses to the cache
system.cache.missLatency::131072-163839            17      1.94%     98.40% # Ticks for misses to the cache
system.cache.missLatency::163840-196607             7      0.80%     99.20% # Ticks for misses to the cache
system.cache.missLatency::196608-229375             0      0.00%     99.20% # Ticks for misses to the cache
system.cache.missLatency::229376-262143             0      0.00%     99.20% # Ticks for misses to the cache
system.cache.missLatency::262144-294911             2      0.23%     99.43% # Ticks for misses to the cache
system.cache.missLatency::294912-327679             4      0.46%     99.89% # Ticks for misses to the cache
system.cache.missLatency::327680-360447             1      0.11%    100.00% # Ticks for misses to the cache
system.cache.missLatency::360448-393215             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::393216-425983             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::425984-458751             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::458752-491519             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::491520-524287             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::total                   877                       # Ticks for misses to the cache
system.cache.hitRatio                        0.905780                       # The ratio of hits to the total access

With a 128 KB cache, we get a somewhat higher hit ratio of about 96%. It seems our cache is working as expected!

system.cache.hits                                8944                       # Number of hits
system.cache.misses                               364                       # Number of misses
system.cache.missLatency::samples                 364                       # Ticks for misses to the cache
system.cache.missLatency::mean           64222.527473                       # Ticks for misses to the cache
system.cache.missLatency::gmean          61837.584812                       # Ticks for misses to the cache
system.cache.missLatency::stdev          27232.443748                       # Ticks for misses to the cache
system.cache.missLatency::0-32767                   0      0.00%      0.00% # Ticks for misses to the cache
system.cache.missLatency::32768-65535             254     69.78%     69.78% # Ticks for misses to the cache
system.cache.missLatency::65536-98303             106     29.12%     98.90% # Ticks for misses to the cache
system.cache.missLatency::98304-131071              0      0.00%     98.90% # Ticks for misses to the cache
system.cache.missLatency::131072-163839             0      0.00%     98.90% # Ticks for misses to the cache
system.cache.missLatency::163840-196607             0      0.00%     98.90% # Ticks for misses to the cache
system.cache.missLatency::196608-229375             0      0.00%     98.90% # Ticks for misses to the cache
system.cache.missLatency::229376-262143             0      0.00%     98.90% # Ticks for misses to the cache
system.cache.missLatency::262144-294911             2      0.55%     99.45% # Ticks for misses to the cache
system.cache.missLatency::294912-327679             1      0.27%     99.73% # Ticks for misses to the cache
system.cache.missLatency::327680-360447             1      0.27%    100.00% # Ticks for misses to the cache
system.cache.missLatency::360448-393215             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::393216-425983             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::425984-458751             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::458752-491519             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::491520-524287             0      0.00%    100.00% # Ticks for misses to the cache
system.cache.missLatency::total                   364                       # Ticks for misses to the cache
system.cache.hitRatio                        0.960894                       # The ratio of hits to the total access
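
The hitRatio values and the tick-to-nanosecond conversion quoted above can be checked with a few lines of plain Python. The input numbers are copied from the two statistics dumps and from the "Global frequency" line of the simulation log. The helper names are hypothetical.

```python
TICKS_PER_SECOND = 10**12  # from the "Global frequency" line in the log


def hit_ratio(hits, misses):
    """Same formula as the Formula stat: hits / (hits + misses)."""
    return hits / (hits + misses)


def ticks_to_ns(ticks):
    return ticks * 1e9 / TICKS_PER_SECOND


small_ratio = hit_ratio(8431, 877)         # 1 KB cache
large_ratio = hit_ratio(8944, 364)         # 128 KB cache
small_miss_ns = ticks_to_ns(53334.093501)  # mean miss latency, 1 KB cache
large_miss_ns = ticks_to_ns(64222.527473)  # mean miss latency, 128 KB cache
```

Both ratios agree with the hitRatio lines in the dumps to six decimal places, and the mean miss latencies come out at roughly 53 ns and 64 ns.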