Performance Measuring using Hardware Performance Monitoring Counters

Student: Alex Ray
Mentor: Arun Thomas
SVN branch name: src.20090525.r4372.alexray

Abstract

I'll be porting the PMCTools kit from FreeBSD to Minix, reimplementing pieces where needed. PMCTools uses the processor's on-board hardware performance counters to track down and identify performance sinks. That information can then be used to improve the speed and efficiency of Minix.

Design

This project is broken up into three parts, following the three main PMCTools packages (hwpmc, libpmc, and pmcstat):

The first part will be a lot of assembly (and even some raw binary/hex encodings for the instructions the assembler doesn't support). The rest should be mostly C.

Finer design points (I'll add to this as I get to the later parts of the project):

hwpmc

hwpmc is a kernel module: a set of hooks, functions, structures, and data that gets loaded into the FreeBSD kernel. When the kernel catches a PMC call, it hands control to the hwpmc code. My pmc server is going to be used as a surrogate kernel: it will catch PMC messages and invoke the appropriate hwpmc code. Initially I'll cut out most of the functionality and just get a working model that implements system-wide counting. Once I finish an end-to-end prototype, I'll expand functionality from there.
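As a rough illustration of the surrogate-kernel idea, here is a minimal sketch of a server routing incoming PMC requests to handlers. Everything in it is an assumption made for the example: the `pmc_msg` layout, the `PMC_REQ_*` codes, and the `pmc_dispatch` name are hypothetical, and are neither the Minix message format nor the FreeBSD hwpmc interface. The handlers only track software state; in a real server they would call into the ported hwpmc code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request codes, invented for this sketch. */
enum pmc_req { PMC_REQ_ALLOCATE, PMC_REQ_START, PMC_REQ_STOP, PMC_REQ_READ };

/* Hypothetical message layout; a stand-in for a real Minix message. */
struct pmc_msg {
    enum pmc_req type;  /* which operation is requested */
    int pmc_id;         /* counter handle (filled in by ALLOCATE) */
    uint64_t value;     /* reply payload for READ */
};

#define PMC_OK      0
#define PMC_EINVAL (-1)
#define MAX_PMCS    4

/* Software state only; a real handler would touch the hardware via hwpmc. */
static struct { int allocated; int running; uint64_t count; } pmc_table[MAX_PMCS];

static int pmc_valid(int id)
{
    return id >= 0 && id < MAX_PMCS && pmc_table[id].allocated;
}

static int do_allocate(struct pmc_msg *m)
{
    int i;
    for (i = 0; i < MAX_PMCS; i++) {
        if (!pmc_table[i].allocated) {
            pmc_table[i].allocated = 1;
            pmc_table[i].running = 0;
            pmc_table[i].count = 0;
            m->pmc_id = i;
            return PMC_OK;
        }
    }
    return PMC_EINVAL; /* no free counter */
}

/* Route one request to its handler and return a status code. */
int pmc_dispatch(struct pmc_msg *m)
{
    if (m == NULL)
        return PMC_EINVAL;
    if (m->type == PMC_REQ_ALLOCATE)
        return do_allocate(m);
    if (!pmc_valid(m->pmc_id))
        return PMC_EINVAL;
    switch (m->type) {
    case PMC_REQ_START: pmc_table[m->pmc_id].running = 1; return PMC_OK;
    case PMC_REQ_STOP:  pmc_table[m->pmc_id].running = 0; return PMC_OK;
    case PMC_REQ_READ:  m->value = pmc_table[m->pmc_id].count; return PMC_OK;
    default:            return PMC_EINVAL;
    }
}
```

A server main loop would receive a message, pass it to `pmc_dispatch`, and send the status (plus any reply fields) back to the caller.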

libpmc

This is a set of library procedures that make the appropriate system calls to the FreeBSD kernel. I'll have to modify these to make their calls to my surrogate kernel (my pmc server) instead. I'll have more news on that later.

Test Plan and Evaluation

Test-driven development is the plan, so I'll try to list tested and proven features here.

Schedule and Deliverables

Throughout the Summer:

(pre-Coding and through midterm and final exam periods and afterwards as well)

Deliverables: Create a blog for the purpose of day-to-day work tracking, so my mentor(s) and the community can keep tabs on where I am and what I'm doing.

Update the Wiki with information I find useful that may be of use to other users/developers.

Pre-Coding Period:

Become very familiar with the Intel and AMD processor manuals (http://www.intel.com/products/processor/manuals/). These are the references on which hwpmc is based. hwpmc is the hardware driver that actually touches the processor's performance monitoring counters, so it is VERY architecture-specific. I will only be dealing with a subset of the architectures hwpmc currently covers, as it already has the x86 architectures covered (as well as ARM and a bunch of others).

Contact the PMCTools developers about possibly collaborating on some of this project. One of the bigger to-do items for PMCTools is finding better ways of presenting and analysing the data, which is what I propose to do in the latter part of the summer. It would be awesome to contribute to two open source projects with one summer project.

Thoroughly read through the PMCTools source code. After final exams I will switch my primary workstation over to FreeBSD/Minix to facilitate development. Here is something I can do that touches on the three main packages (hwpmc, libpmc, and pmcstat) without actually coding for my project:

Week 0: Install and use PMCTools under FreeBSD. Write some simple unit tests that confirm that the hwpmc driver can successfully communicate with my processor. Also write some unit tests that demonstrate functions in libpmc. Finally use pmcstat to measure the performance of specific processes, multiple processes of the same program, and the system as a whole.

Deliverables: Demonstrated use of PMCTools (hwpmc, libpmc, pmcstat) in FreeBSD via simple unit tests.

Coding begins:

Week 1: Start porting hwpmc for i686 (http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/dev/hwpmc/).

Method:

Deliverable:

Accomplished:

ToDo:

Week 2:

Week 3-4: System-wide Counting

Currently Counting: Branching (total branches versus mispredicted branches). I'd really love to hear what people want to have measured, so please let me know if you have ideas.

Testing Scenarios: Idle (just sitting there) and during my regular workflow.
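To show how two branch counts turn into a single figure, here is a minimal sketch that computes a misprediction rate from raw counter readings. The function names are hypothetical and not part of hwpmc or libpmc. The counter width is passed in by the caller because it differs between processor families, and the delta calculation assumes the counter wrapped at most once between the two readings.

```c
#include <stdint.h>

/* Difference between two raw counter readings, tolerating one wraparound.
 * width_bits is the hardware counter width, supplied by the caller. */
uint64_t pmc_delta(uint64_t before, uint64_t after, unsigned width_bits)
{
    uint64_t mask = (width_bits >= 64)
        ? UINT64_MAX
        : ((UINT64_C(1) << width_bits) - 1);
    return (after - before) & mask;
}

/* Fraction of branches that were mispredicted; 0.0 if no branches ran. */
double mispredict_rate(uint64_t branches, uint64_t mispredicts)
{
    if (branches == 0)
        return 0.0;
    return (double)mispredicts / (double)branches;
}
```

Sampling both counters at the start and end of a test scenario, taking `pmc_delta` of each pair, and passing the two deltas to `mispredict_rate` gives the rate for that interval.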

Known Issues: (I'm not listing all of the C99 and similar issues with the FreeBSD code here)

Week 5: Started porting libpmc

Week 6: userland PMC allocation via libpmc API

Userland programs can now call the proc_allocate_pmc functions for the 'p4' and 'iaf' processors (Pentium 4 and Intel Core/Core2 Fixed Function counters).

Explained in more detail at (Documentation): http://ajray.wordpress.com/2009/07/07/proc_allocate_pmc-functionality/

Week 7: More libpmc allocation functions (it goes without saying, but there's more detail at http://ajray.wordpress.com)

The last arch I'm going to port (libpmc-wise) for now is the other half of Core(2), called 'iap'. These are the general-purpose counters on Core(2) processors (there are only two of them), and they have one event register each, similar to the AMD processors. After that I can start porting userland code that does something useful with the output of the libpmc functions (pmcstat, etc.).
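For a concrete picture of what "one event register each" means, here is a minimal sketch of building the value for an IA32_PERFEVTSELx event-select register, using the bit layout and the architectural branch events described in the Intel manuals. The `iap_evtsel` helper and the macro names are hypothetical and are not taken from hwpmc or libpmc. The sketch only computes the register value; actually writing it requires a privileged MSR write, which is left out.

```c
#include <stdint.h>

/* Architectural MSR addresses for the first general-purpose counter. */
#define MSR_IA32_PMC0        0x0C1u
#define MSR_IA32_PERFEVTSEL0 0x186u

/* IA32_PERFEVTSELx bit fields. */
#define EVTSEL_EVENT(e) ((uint64_t)((e) & 0xFFu))        /* bits 0-7  */
#define EVTSEL_UMASK(u) ((uint64_t)((u) & 0xFFu) << 8)   /* bits 8-15 */
#define EVTSEL_USR      (UINT64_C(1) << 16)  /* count in user mode   */
#define EVTSEL_OS       (UINT64_C(1) << 17)  /* count in kernel mode */
#define EVTSEL_EN       (UINT64_C(1) << 22)  /* enable the counter   */

/* Architectural branch events (both use unit mask 0x00). */
#define EV_BRANCH_INSTR_RETIRED  0xC4u
#define EV_BRANCH_MISSES_RETIRED 0xC5u

/* Build an enabled event-select value (hypothetical helper). */
uint64_t iap_evtsel(unsigned event, unsigned umask,
                    int count_user, int count_kernel)
{
    uint64_t v = EVTSEL_EVENT(event) | EVTSEL_UMASK(umask) | EVTSEL_EN;
    if (count_user)
        v |= EVTSEL_USR;
    if (count_kernel)
        v |= EVTSEL_OS;
    return v;
}
```

With two general-purpose counters, one could be programmed with `iap_evtsel(EV_BRANCH_INSTR_RETIRED, 0, 1, 1)` and the other with `iap_evtsel(EV_BRANCH_MISSES_RETIRED, 0, 1, 1)` to count total and mispredicted branches side by side.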

Week 8

Week 9

Week 10

Week 11

Updates

I'll be updating my branch constantly as I work on the code, and I'll be keeping track of my progress in my blog. I'll also be haunting the IRC channel and the mailing list all summer, so feel free to contact me there as well (ajray on irc.freenode.net).

Weekly Status

Blog: http://ajray.wordpress.com/

Daily Status

Twitter: http://twitter.com/alexjray

License Info

I will be keeping a clear distinction between the PMCTools code and my own code, and I'll be releasing my code under the BSD license.