
Cachegrind: a cache and branch profiler

Posted by steinlee on 2008-11-01
(This post was highlighted by XChinux on 2009-01-06)
Cache and branch profiling

To use this tool, you must specify --tool=cachegrind on the Valgrind command line.

Cachegrind is a tool for finding places where programs interact badly with typical modern superscalar processors and run slowly as a result. In particular, it will do a cache simulation of your program, and optionally a branch-predictor simulation, and can then annotate your source line-by-line with the number of cache misses and branch mispredictions. The following statistics are collected:

    * L1 instruction cache reads and misses;
    * L1 data cache reads and read misses, writes and write misses;
    * L2 unified cache reads and read misses, writes and write misses;
    * Conditional branches and mispredicted conditional branches;
    * Indirect branches and mispredicted indirect branches. An indirect branch is a jump or call to a destination only known at run time.
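
To give a feel for what a cache simulation counts, here is a minimal sketch of a direct-mapped cache model. It is not Cachegrind's actual simulator: the cache size (1024 bytes), line size (64 bytes), 8-byte elements and the helper names count_misses and matrix_addresses are all assumptions chosen for illustration. It compares walking a 64 x 64 array row by row against walking it column by column.

```python
def count_misses(addresses, cache_size=1024, line_size=64):
    """Count misses in a toy direct-mapped cache (sizes are illustrative assumptions)."""
    num_lines = cache_size // line_size
    resident = [None] * num_lines      # memory block currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_size      # memory block containing this address
        index = block % num_lines      # cache line that this block maps to
        if resident[index] != block:   # block is not in the cache: count a miss
            misses += 1
            resident[index] = block
    return misses


def matrix_addresses(rows, cols, elem_size=8, row_major=True):
    """Byte addresses touched when walking an array that is stored row by row."""
    if row_major:
        return [(r * cols + c) * elem_size for r in range(rows) for c in range(cols)]
    return [(r * cols + c) * elem_size for c in range(cols) for r in range(rows)]


row_misses = count_misses(matrix_addresses(64, 64, row_major=True))
col_misses = count_misses(matrix_addresses(64, 64, row_major=False))
print(row_misses, col_misses)  # 512 4096
```

In this toy model the row-by-row walk misses once per 64-byte line, while the column-by-column walk misses on every access. That is the kind of difference a line-by-line miss count makes visible in real code.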

On a modern machine, an L1 miss will typically cost around 10 cycles, an L2 miss can cost as much as 200 cycles, and a mispredicted branch costs in the region of 10 to 30 cycles. Detailed cache and branch profiling can be very useful for improving the performance of your program.
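
As a back-of-the-envelope illustration of these figures, the sketch below turns miss counts into a rough cycle penalty. The per-event costs are taken from the paragraph above (10 cycles per L1 miss, the 200-cycle upper figure per L2 miss, and 20 cycles as an assumed midpoint of the 10 to 30 range for a mispredicted branch). The function name and the sample counts are hypothetical, and this is not a formula that Cachegrind itself reports.

```python
L1_MISS_CYCLES = 10        # "around 10 cycles" per L1 miss
L2_MISS_CYCLES = 200       # "as much as 200 cycles" per L2 miss, so an upper figure
MISPREDICT_CYCLES = 20     # assumed midpoint of the quoted 10 to 30 cycle range


def estimated_penalty_cycles(l1_misses, l2_misses, mispredicts):
    """Rough cycle penalty implied by miss counts (illustrative only)."""
    return (l1_misses * L1_MISS_CYCLES
            + l2_misses * L2_MISS_CYCLES
            + mispredicts * MISPREDICT_CYCLES)


# Hypothetical counts, not taken from a real run.
print(estimated_penalty_cycles(l1_misses=1000, l2_misses=50, mispredicts=200))  # 24000
```

With these made-up counts, 50 L2 misses weigh as much as 1000 L1 misses, which is why L2 misses are usually the first thing to look at.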

Also, since one instruction cache read is performed per instruction executed, you can find out how many instructions are executed per line, which can be useful for traditional profiling and test coverage.

Branch profiling is not enabled by default. To use it, you must additionally specify --branch-sim=yes on the command line.
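
The two options mentioned in this post can be combined on one command line. The small helper below only assembles that command as a list of arguments. The helper name cachegrind_command and the program name ./myprog are placeholders, and only the --tool=cachegrind and --branch-sim=yes options quoted above are used.

```python
def cachegrind_command(program, args=(), branch_sim=False):
    """Build the argument list for running a program under Cachegrind."""
    cmd = ["valgrind", "--tool=cachegrind"]
    if branch_sim:
        # Branch profiling is off by default, so it has to be requested explicitly.
        cmd.append("--branch-sim=yes")
    cmd.append(program)
    cmd.extend(args)
    return cmd


# ./myprog is a placeholder for your own executable.
print(" ".join(cachegrind_command("./myprog", branch_sim=True)))
# valgrind --tool=cachegrind --branch-sim=yes ./myprog
```

The returned list can be passed to subprocess.run as is, assuming valgrind is installed and on the PATH.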