What is the exact code to get the last-level cache miss count on the Intel Kaby Lake architecture?

Problem description

I read an interesting paper, entitled "A High-Resolution Side-Channel Attack on Last-Level Cache", and wanted to find out the index hash function for my own machine—i.e., Intel Core i7-7500U (Kaby Lake architecture)—following the leads from this work.

To reverse-engineer the hash function, the paper mentions the first step as:

 for (n=16; ; n++) 
 {
   // ignore any miss on first run
   for (fill=0; !fill; fill++) 
   {
     // set pmc to count LLC miss
     reset_pmc();
     for (a=0; a<n; a++)
       // set_count*line_size=2^19
       load(a*2^19);
   }

   // get the LLC miss count
   if (read_pmc()>0) 
   {
     min = n;
     break;
   }
 }

How can I code the reset_pmc() and read_pmc() in C++? From all that I read online so far, I think it requires inline assembly code, but I have no clue what instructions to use to get the LLC miss count. I would be obliged if someone can specify the code for these two steps.

I am running Ubuntu 16.04.1 (64-bit) on VMware workstation.

P.S.: I found mention of these LONGEST_LAT_CACHE.REFERENCES and LONGEST_LAT_CACHE.MISSES in Chapter-18 Volume 3B of the Intel Architectures Software Developer's Manual, but I do not know how to use them.

Recommended answer

You can use perf as Cody suggested to measure the events from outside the code, but I suspect from your code sample that you need fine-grained, programmatic access to the performance counters.

To do that, you need to enable user-mode reading of the counters, and also have a way to program them. Since those are restricted operations, you need at least some help from the OS kernel to do that. Rolling your own solution is going to be pretty difficult, but luckily there are several existing solutions for Ubuntu 16.04:

Andi Kleen's jevents library, which among other things lets you read PMU events from user space. I haven't personally used this part of pmu-tools, but the stuff I have used has been high quality. It seems to use the existing perf_events syscalls for counter programming, so it doesn't need a kernel module.

The libpfc library is a from-scratch implementation of a kernel module and userland code that allows userland reading of the performance counters. I've used this and it works well. You install the kernel module, which allows you to program the PMU, and then use the API exposed by libpfc to read the counters from userspace (the calls boil down to rdpmc instructions). It is the most accurate and precise way to read the counters, and it includes "overhead subtraction" functionality which can give you the true PMU counts for the measured region by subtracting out the events caused by the PMU read code itself. You need to pin to a single core for the counts to make sense, and you will get bogus results if your process is interrupted.

Intel's open-sourced Processor Counter Monitor library. I haven't tried this on Linux, but I used its predecessor library, the very similarly named Performance Counter Monitor, on Windows, and it worked. On Windows it needs a kernel driver, but on Linux it seems you can either use a driver or have it go through perf_events.

Use the likwid library's Marker API functionality. Likwid has been around for a while and seems well supported. I have used likwid in the past, but only to measure whole processes in a manner similar to perf stat and not with the marker API. To use the marker API you still need to run your process as a child of the likwid measurement process, but you can programmatically read the counter values within your process, which is what you need (as I understand it). I'm not sure how likwid is setting up and reading the counters when the marker API is used.

So you've got a lot of options! I think all of them could work, but I can personally vouch for libpfc since I've used it myself for the same purpose on Ubuntu 16.04. The project is actively developed and probably the most accurate (least overhead) of the above. So I'd probably start with that one.
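
For comparison, the perf_events route mentioned in the list can be sketched directly in C, without any of the libraries. This is a minimal sketch, not the method of libpfc or jevents: the helper names pmc_open, reset_pmc and read_pmc are hypothetical (chosen to match the question's pseudocode), and only perf_event_open, the PERF_EVENT_IOC_* ioctls and read() are real kernel interfaces. It assumes Linux, and it assumes the kernel maps the generic PERF_COUNT_HW_CACHE_MISSES event to the LLC-miss event on this CPU. The open call can fail, for example when perf_event_paranoid is restrictive or when a virtual machine does not expose the PMU.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: open a last-level-cache miss counter for the
 * calling thread. Returns a file descriptor, or -1 on failure. */
static int pmc_open(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CACHE_MISSES; /* generic "cache misses" event */
    attr.disabled = 1;                        /* start stopped */
    attr.exclude_kernel = 1;                  /* count user-mode events only */
    attr.exclude_hv = 1;
    /* pid = 0 and cpu = -1: this thread, on whichever CPU it runs.
     * There is no glibc wrapper, so the raw syscall is used. */
    return (int)syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical helper: zero the counter and start counting.
 * Returns 0 on success, -1 on failure. */
static int reset_pmc(int fd)
{
    if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1)
        return -1;
    return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Hypothetical helper: stop counting and return the count,
 * or -1 on failure. */
static int64_t read_pmc(int fd)
{
    uint64_t count = 0;
    if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1)
        return -1;
    if (read(fd, &count, sizeof count) != (ssize_t)sizeof count)
        return -1;
    return (int64_t)count;
}
```

A caller would open the counter once with pmc_open(), call reset_pmc(fd) before the loads, and call read_pmc(fd) after them. Each of these goes through a system call, so the measurement carries more overhead than an rdpmc-based read and is better suited to coarse checks than to counting a handful of misses.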

All of the solutions above should be able to work for Kaby Lake, since the functionality of each successive "Performance Monitoring Architecture" seems to generally be a superset of the prior one, and the API is generally preserved. In the case of libpfc, however, the author has restricted it to only support Haswell's architecture (PMA v3), but you just need to change one line of code locally to fix that.
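
To check which Performance Monitoring Architecture version a machine actually reports, the version can be read from CPUID leaf 0AH. The following is a minimal sketch assuming GCC or Clang on x86 (for the <cpuid.h> helper __get_cpuid); the struct and function names are hypothetical. The field layout of EAX (version in bits 7:0, general-purpose counters per logical processor in bits 15:8, counter width in bits 23:16) follows the Intel manual. A Kaby Lake part would be expected to report version 4, but inside a virtual machine the leaf may report 0 if the hypervisor does not expose the PMU to the guest.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Hypothetical container for the fields of CPUID.0AH:EAX. */
struct pma_info {
    unsigned version;      /* bits 7:0, architectural perfmon version */
    unsigned gp_counters;  /* bits 15:8, general-purpose counters per logical CPU */
    unsigned counter_bits; /* bits 23:16, width of each general-purpose counter */
};

/* Decode a raw EAX value returned by CPUID leaf 0AH. */
static struct pma_info decode_pma(uint32_t eax)
{
    struct pma_info info;
    info.version = eax & 0xffu;
    info.gp_counters = (eax >> 8) & 0xffu;
    info.counter_bits = (eax >> 16) & 0xffu;
    return info;
}

/* Query the running CPU. Returns 1 and fills *out when leaf 0AH is
 * available, 0 otherwise (including on non-x86 builds). */
static int query_pma(struct pma_info *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned eax, ebx, ecx, edx;
    if (!__get_cpuid(0x0A, &eax, &ebx, &ecx, &edx))
        return 0;
    *out = decode_pma(eax);
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

If query_pma reports a version other than the one a library was written for, that is the value a version check such as the one in libpfc would be comparing against.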

Indeed, both Processor Counter Monitor and Performance Counter Monitor are commonly called by their acronym, PCM, and I suspect that the new project is simply the officially open-sourced continuation of the old PCM project (which was also available in source form, but without a mechanism for community contribution).
