DPCs execute on their own call stack (x86 Edition)

April 29th, 2010

Deferred Procedure Calls (DPCs) are callbacks that run in an arbitrary thread context at IRQL DISPATCH_LEVEL. There is a DPC queue per processor, and queueing a DPC performs two steps:

1) Inserts the DPC onto the DPC queue of the current processor.

2) Requests a DISPATCH_LEVEL software interrupt on the current processor.

Note that there are exceptions to both of those, though I’m not interested in talking about them at this moment.

When the operating system is about to return to an IRQL < DISPATCH_LEVEL, the DISPATCH_LEVEL software interrupt is delivered to the processor. On XP, the ISR for this interrupt is hal!HalpDispatchInterrupt, which does some interrupt management work and calls nt!KiDispatchInterrupt. You can get a feel for how this works by setting a breakpoint on KiDispatchInterrupt and checking out a few call stacks, which should look like the following:

kidispatch

While KiDispatchInterrupt serves a few different purposes, for our discussion all we care about is the beginning of the function shown here:

kidispatch_asm

Note the call near the end of the listing to nt!KiRetireDpcList. This is the function that will sit in a loop dequeuing DPCs from the current processor’s DPC queue and calling the callbacks. There’s some interesting code leading up to that call though, so let’s go line by line and figure out exactly what this code is doing.

nt!KiDispatchInterrupt:
mov     ebx,dword ptr fs:[1Ch]

This line is moving the contents of offset 0x1c from the FS segment into EBX. In kernel mode, the base of the FS segment is the base address of what is called the Processor Control Region (PCR) for the current processor:

fs_pcr

Thus, this code is grabbing whatever field is at offset 0x1c from the base of the PCR structure. Luckily we have the type information for the PCR, which is nt!_KPCR so we can easily see what is at that offset in the structure:

pcr_1c

That is the SelfPcr field, which is just the flat address of the PCR (in this case that would be 0xffdff000). Let’s move on to the next fragment:

nt!KiDispatchInterrupt+0x7:
lea  eax,[ebx+980h]
cli
cmp  eax,dword ptr [eax]
je   nt!KiDispatchInterrupt+0x2f (805459df)

Here, we add 0x980 to the base address of the PCR and store the result in EAX. We then disable interrupts on the current processor and check to see if the contents of the pointer match the pointer address.

The CMP instruction will do a logical subtract of the two values and set the Z-Flag to one if the result is zero, which would mean that the two values are the same. The JE instruction will, “Jump if the Z-Flag Equals one”, so if the contents of the pointer match the address of the pointer then this code will jump over the code segment that calls KiRetireDpcList.

If you’ve never looked at much assembly that might seem a bit weird, so let’s see what’s at offset 0x980 from the PCR and see if we can figure out what this code is doing.

If you go to a full listing of the PCR structure, you’ll notice that the last offset given is 0x120 and that is the PrcbData field:

pcr_prcb

Thus, in order to figure out what’s at offset 0x980 from the base of the PCR we’ll need to go to offset 0x860 into the PRCB (0x980 - 0x120 = 0x860). We’ll find this by doing a dt nt!_kprcb and scanning the output:

prcb_queue

Aha! That field is labeled as the DpcListHead (a.k.a. the DPC queue) and the type is a LIST_ENTRY, which is the standard type for a doubly linked list in the kernel.
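If you'd rather not do the hex subtraction in your head, here is a minimal Python sketch of the conversion. The helper name is made up for illustration; the only fact it relies on is the 0x120 PrcbData offset shown in the listing above.

```python
# Offset of the PrcbData field within nt!_KPCR, taken from the dt output above.
PRCB_OFFSET_IN_PCR = 0x120

def pcr_to_prcb_offset(pcr_offset):
    # Convert an offset from the PCR base into an offset from the PRCB base.
    # (Hypothetical helper, not a debugger command.)
    return pcr_offset - PRCB_OFFSET_IN_PCR

# The lea in KiDispatchInterrupt used PCR+0x980.
print(hex(pcr_to_prcb_offset(0x980)))  # 0x860, the DpcListHead field
```

The same subtraction works for any other PCR-relative offset that lands inside the embedded PRCB.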

LIST_ENTRY structures have two fields, a Flink field that points to the next entry and a Blink field that points to the previous entry. When a list is empty, the Flink field points back to the address of the head of the list. So our previous check above is testing the value of the Flink field against the address of the list head, in other words it is checking to see if the list is empty. If it is, the code avoids draining the DPC queue (which makes sense).
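To make the empty-list convention concrete, here is a small Python model of a LIST_ENTRY style list. It is only a sketch: the class and function names are my own stand-ins, and object identity plays the role of the pointer comparison that the cmp instruction performs.

```python
class ListEntry:
    """Toy stand-in for LIST_ENTRY: a Flink/Blink pair."""
    def __init__(self):
        # An initialized, empty list head points back at itself.
        self.flink = self
        self.blink = self

def is_list_empty(head):
    # Same idea as `cmp eax, dword ptr [eax]`: compare the head with head->Flink.
    return head.flink is head

def insert_tail(head, entry):
    # Link `entry` in just before the head, i.e. at the end of the queue.
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry

def remove_head(head):
    # Unlink and return the first entry; the caller checks for empty first.
    first = head.flink
    head.flink = first.flink
    first.flink.blink = head
    return first

dpc_list_head = ListEntry()
print(is_list_empty(dpc_list_head))   # True: nothing queued

dpc = ListEntry()
insert_tail(dpc_list_head, dpc)
print(is_list_empty(dpc_list_head))   # False: there is work to drain

remove_head(dpc_list_head)
print(is_list_empty(dpc_list_head))   # True again
```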

If the list is not empty, then the code sets up to call KiRetireDpcList:

nt!KiDispatchInterrupt+0x12:
push    ebp
push    dword ptr [ebx]
mov     dword ptr [ebx],0FFFFFFFFh
mov     edx,esp
mov     esp,dword ptr [ebx+988h]
push    edx
mov     ebp,eax
call    nt!KiRetireDpcList (80545e0e)

I’m going to save the first three instructions for another time if I ever get to talk about Structured Exception Handling (SEH). Right now it’s sufficient to say that the code there prevents kernel mode exceptions from being raised to user mode exception handlers.

The next two instructions are interesting though:

mov     edx,esp
mov     esp,dword ptr [ebx+988h]

Note that the code saves the current stack pointer and then overwrites ESP with a different pointer value from the PCR. We saw previously that the last offset in the PCR is 0x120, which is the beginning of the PRCB. So, whatever value is at offset 0x868 from the PRCB (0x988 - 0x120) is what we put into the stack pointer register. If you scroll up to the previous graphic, you’ll see that field labeled as DpcStack:

   +0x868 DpcStack         : Ptr32 Void

Thus, each processor has its own DPC stack that is used when DPCs are executed. Shortly this is going to lead to an unexpected problem that this post will hopefully help you solve.

Lastly, the old stack pointer is pushed onto the stack and finally the call to KiRetireDpcList occurs. When it completes, the old stack is restored and all is right in the World.
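The swap is easier to picture with a toy model. The Python sketch below treats memory as a dictionary and replays the four instructions; every address in it is a made-up value for illustration, not something read from a real system.

```python
def push(mem, esp, value):
    # x86 push: decrement ESP by 4, then store the value at the new top.
    esp -= 4
    mem[esp] = value
    return esp

def enter_dpc_stack(mem, thread_esp, dpc_stack_base, return_address):
    """Model of: mov edx,esp / mov esp,[ebx+988h] / push edx / call."""
    saved = thread_esp                    # mov edx, esp
    esp = dpc_stack_base                  # mov esp, dword ptr [ebx+988h]
    esp = push(mem, esp, saved)           # push edx
    esp = push(mem, esp, return_address)  # call pushes the return address
    return esp

mem = {}
# Hypothetical addresses, chosen only to make the example readable.
esp = enter_dpc_stack(mem,
                      thread_esp=0xF7000F00,
                      dpc_stack_base=0x80550000,
                      return_address=0x805459DA)

# On entry to the callee, [esp] is the return address and the slot just
# above it holds the thread's original stack pointer.
print(hex(mem[esp]))      # 0x805459da
print(hex(mem[esp + 4]))  # 0xf7000f00
```

The takeaway is that the thread's original ESP ends up in the very first slot of the DPC stack, right next to the return address into KiDispatchInterrupt.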

However, there’s an interesting issue that can arise in your crash analysis. What if the system crashes inside a DPC? Due to the stack swap that occurs just before the call to KiRetireDpcList, you’ll get this when you try to dump the call stack:

stackswapped

In other words, you’ll get a listing for the DPC stack and you won’t necessarily be able to see the actual kernel stack of the current thread. While in 99% of the cases the DPC stack will be the only stack that you care about, there’s that 1% where knowing the current thread stack will provide the insight necessary to solve the crash (in almost 10 years I’ve seen two). Luckily, it’s going to be relatively straightforward to get the stack back. Even more luckily, it’s mostly formulaic so even if you’re not sure why you can get it back you’ll still be able to :)

First thing you need is the old stack pointer, which is the first thing on the stack before the return address in the call to nt!KiRetireDpcList:

oldesp

Then we’re going to dump this out with the dps command and find the return address into hal!HalpDispatchInterrupt that nt!KiDispatchInterrupt will return to. We’ll also want the first thing on the stack after the return address:

prevebp_halp

In my case, I have 0xf715da0c and hal!HalpDispatchInterrupt+0xbb. Now all that’s left is to feed those two values into the special k syntax that allows you to specify your own EBP, ESP, and EIP overrides:

origstack

Note that there’s a cheater shortcut: I could have just done k = f715da00 f715da00 @eip in this case and gotten a slightly busted but still legible stack. The technique above gives a more attractive and correct stack in the end.

Possibly we can cover why this command works in the future, but for now hopefully that’s enough of a guide for you to go experiment yourselves. Don’t forget that you can always play with this on a live system where you can verify your results by simply stepping out of nt!KiRetireDpcList.

Random Other Points

1) The DISPATCH_LEVEL software interrupt isn’t always requested, so the DPC queue isn’t always drained when returning to an IRQL < DISPATCH_LEVEL.

2) The Idle thread also checks the DPC queue and, if it isn’t empty, drains the queue by dequeuing entries and calling the callbacks. In this case, the DPCs execute on the Idle thread’s stack.

3) It is possible to target a DPC to a processor other than the current processor.

Driver Speak: WinDBG

April 26th, 2010

Everyone loves the Windows Kernel Debugger WinDBG, but how do you pronounce it when you’re talking about it? There are, in fact, two commonly accepted pronunciations:

Win D-B-G

And:

Wind bag

I exclusively use the latter as it rolls off of the tongue a bit better in my opinion.

Bonus Material: Driver Speak Spelling Bee

WinDBG also has the distinction of having two spellings: WinDBG and WinDbg. WinDbg appears to be the “correct” spelling, as it is how the name appears in the application’s title bar. However, for no good reason other than aesthetics I prefer WinDBG, so that’s what I use.

New post series coming up: Driver Speak

April 26th, 2010

I’m going to start a new mini-post series called Driver Speak. In this series I’m going to write posts that describe how to pronounce all those strange words and acronyms you’ll come across in your driver writing. First up will be everyone’s favorite kernel debugger, “WinDBG.”

Getting WER crash dumps on Windows 7

April 16th, 2010

I had an application crashing on my Windows 7 system and couldn’t find a resulting DMP file anywhere. After some fruitless Googling, I finally found the magic incantation that I needed to get the Windows Error Reporting mechanism to write out a dump for me.

The trick for me was the DumpFolder registry value, described here:

http://msdn.microsoft.com/en-us/library/bb787181(VS.85).aspx

Setting this value to a path on the local machine and setting the DumpType value to 2 finally got me the crash dump that I was looking for.
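For anyone who wants to script this, here is a hedged Python sketch that generates a .reg file with those two values. The source post only names DumpFolder and DumpType; the LocalDumps key path and the REG_EXPAND_SZ type are taken from my recollection of Microsoft's "Collecting User-Mode Dumps" documentation, and the C:\CrashDumps folder is purely a placeholder, so check all three against the linked page before importing anything.

```python
# Key path as I recall it from Microsoft's documentation (an assumption here).
LOCAL_DUMPS_KEY = (r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows"
                   r"\Windows Error Reporting\LocalDumps")

def reg_expand_sz(text):
    # .reg files store REG_EXPAND_SZ as hex(2): NUL-terminated UTF-16LE bytes.
    raw = (text + "\0").encode("utf-16-le")
    return "hex(2):" + ",".join(f"{byte:02x}" for byte in raw)

def local_dumps_reg(dump_folder, dump_type=2):
    # dump_type=2 matches the setting used above (a full dump).
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{LOCAL_DUMPS_KEY}]",
        f'"DumpFolder"={reg_expand_sz(dump_folder)}',
        f'"DumpType"=dword:{dump_type:08x}',
    ]
    return "\r\n".join(lines) + "\r\n"

# Placeholder folder; substitute a path that exists on your machine.
print(local_dumps_reg(r"C:\CrashDumps"))
```

Generating the file rather than writing the registry directly keeps the sketch side-effect free, so you can inspect the output before deciding to import it.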

Undocumented !verifier flags value (!verifier 0x200)

April 14th, 2010

Starting with Windows Vista, Driver Verifier has been updated to include circular trace buffers for interesting events. My favorite up until this point has been the pool allocate and free log, which records the call stack, calling thread, and address of pool allocations and frees. If the system then crashes due to a double free or access to a freed pool block, the debugger’s !verifier 0x80 command can be used to dump the alloc/free log. Even better, the command takes an optional address value that will show only the allocations and frees of the pool block containing that address.

You can see the results in this example from the WinDBG docs:

0: kd> !verifier 80 a2b1cf20
Parsing 00004000 array entries, searching for address a2b1cf20.
=======================================
Pool block a2b1ce98, Size 00000168, Thread a2b1ce98
808f1be6 ndis!ndisFreeToNPagedPool+0x39
808f11c1 ndis!ndisPplFree+0x47
808f100f ndis!NdisFreeNetBufferList+0x3b
8088db41 NETIO!NetioFreeNetBufferAndNetBufferList+0xe
8c588d68 tcpip!UdpEndSendMessages+0xdf
8c588cb5 tcpip!UdpSendMessagesDatagramsComplete+0x22
8088d622 NETIO!NetioDereferenceNetBufferListChain+0xcf
8c5954ea tcpip!FlSendNetBufferListChainComplete+0x1c
809b2370 ndis!ndisMSendCompleteNetBufferListsInternal+0x67
808f1781 ndis!NdisFSendNetBufferListsComplete+0x1a
8c04c68e pacer!PcFilterSendNetBufferListsComplete+0xb2
809b230c ndis!NdisMSendNetBufferListsComplete+0x70
8ac4a8ba test1!HandleCompletedTxPacket+0xea
=======================================
Pool block a2b1ce98, Size 00000164, Thread a2b1ce98
822af87f nt!VerifierExAllocatePoolWithTagPriority+0x5d
808f1c88 ndis!ndisAllocateFromNPagedPool+0x1d
808f11f3 ndis!ndisPplAllocate+0x60
808f1257 ndis!NdisAllocateNetBufferList+0x26
80890933 NETIO!NetioAllocateAndReferenceNetBufferListNetBufferMdlAndData+0x14
8c5889c2 tcpip!UdpSendMessages+0x503
8c05c565 afd!AfdTLSendMessages+0x27
8c07a087 afd!AfdTLFastDgramSend+0x7d
8c079f82 afd!AfdFastDatagramSend+0x5ae
8c06f3ea afd!AfdFastIoDeviceControl+0x3c1
8217474f nt!IopXxxControlFile+0x268
821797a1 nt!NtDeviceIoControlFile+0x2a
8204d16a nt!KiFastCallEntry+0x127

In the output, the most recent event is at the top. Thus, here you can see that the buffer was allocated with ndisAllocateFromNPagedPool and freed with ndisFreeToNPagedPool.

In addition to the pool allocation log, !verifier 0x100 shows the IRP log, which logs all IoCallDriver, IoCompleteRequest, and IoCancelIrp calls.

Based on the docs you’d think that’s all there is, but there’s an undocumented log that can be accessed with !verifier 0x200 and that is the critical region log.

This is not to be confused with the user mode concept of critical sections. In a driver, one can call KeEnterCriticalRegion and KeLeaveCriticalRegion in order to disable and re-enable APC delivery. Without getting too much into why a driver needs to disable APC delivery, what’s important to note is that every call to KeEnterCriticalRegion must be matched with a call to KeLeaveCriticalRegion. If a driver gets this wrong, then the system will crash with an APC_INDEX_MISMATCH bugcheck when it notices that the enter/leave count is off.

The way this works is that entering a critical region decrements a field of the KTHREAD structure and exiting a critical region increments the field of the structure. At various points in the O/S, the field of the KTHREAD is checked to make sure that it is zero. If it isn’t, then the system crashes with the previously mentioned APC_INDEX_MISMATCH bugcheck code. One such place that this is checked is in the system service dispatcher before returning back to the caller, which is why you’ll see these bugchecks come from KiSystemServiceExit.
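The bookkeeping is simple enough to model in a few lines of Python. This is a toy illustration of the counting scheme described above, not the kernel's code: the class, the field name, and the exception are all stand-ins I made up.

```python
class ToyThread:
    """Stand-in for the one KTHREAD counter we care about (name is illustrative)."""
    def __init__(self):
        self.apc_disable = 0

    def enter_critical_region(self):
        self.apc_disable -= 1   # entering decrements the field

    def leave_critical_region(self):
        self.apc_disable += 1   # leaving increments it back

def system_service_exit(thread):
    # Model of the sanity check made before returning to the caller.
    if thread.apc_disable != 0:
        raise RuntimeError("APC_INDEX_MISMATCH")

t = ToyThread()
t.enter_critical_region()
t.leave_critical_region()
system_service_exit(t)        # balanced calls: nothing happens

t.enter_critical_region()     # buggy path: the matching leave never runs
try:
    system_service_exit(t)
except RuntimeError as err:
    print(err)                # APC_INDEX_MISMATCH
```

Notice that the failure shows up at the exit check, long after the unbalanced call was made, which is exactly why these crashes are hard to chase without a log.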

What makes these crashes particularly difficult to track down is that the crash is a secondary failure: by the time the system notices that the count field is incorrect, the code that caused the bad state is gone. Enter the critical region log, which will trace every call to KeEnterCriticalRegion and KeLeaveCriticalRegion for the verified drivers. Now, when the system crashes you can just type !verifier 0x200 in the debugger and find the mismatched call.

Note that this only works with Driver Verifier enabled, which is just another reason to make sure that you’re always testing with Verifier!

Expressing negative decimal numbers in WinDBG

April 6th, 2010

Surprisingly, after almost a decade of using WinDBG I have never had to use a negative decimal value in a WinDBG expression. Until recently, that is…Because the MASM syntax was a bit trickier than expected, I thought I’d record it here for posterity.

When using the MASM evaluator (which is the default in WinDBG), hex values are the default and decimal numbers are indicated by using the 0n override.  Thus, for example, if I wanted to evaluate 123 in an expression I could use 0n123:

3: kd> ?0n123
Evaluate expression: 123 = 00000000`0000007b

However, if I want to use -123 there’s a bit of a catch. My natural inclination was to put the minus sign to the right of the 0n and the left of the 123, such as this: 0n-123. However, this yields an unexpected result:

3: kd> ?0n-123
Evaluate expression: -291 = ffffffff`fffffedd

That’s actually -0x123, not -123. To make matters even worse, the latest debugger even shows this syntax when displaying negative decimal values:

3: kd> dt nt!_mdl fffffa80077f1df0
   +0x000 Next             : (null)
   +0x008 Size             : 0n56
   +0x00a MdlFlags         : 0n-32701
   +0x010 Process          : 0xfffffa80`069dab30 _EPROCESS
   +0x018 MappedSystemVa   : 0xfffffa80`06797294 Void
   +0x020 StartVa          : 0x00000000`0404f000 Void
   +0x028 ByteCount        : 0x1c
   +0x02c ByteOffset       : 0x79c

However, the correct syntax is to put the minus sign to the left of the 0n:

3: kd> ?-0n123
Evaluate expression: -123 = ffffffff`ffffff85
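If you want to double check what the debugger printed, the two's-complement arithmetic is easy to reproduce outside of WinDBG. Here is a small Python sketch; the helper names are mine, and 0x8043 is simply the 16-bit pattern I worked out for the MdlFlags value shown earlier.

```python
def as_unsigned(value, bits=64):
    # Two's-complement bit pattern of a signed integer at the given width.
    return value & ((1 << bits) - 1)

def as_signed(value, bits):
    # Interpret an unsigned bit pattern as a signed integer.
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit

print(f"{as_unsigned(-123):016x}")    # ffffffffffffff85, matches ?-0n123
print(f"{as_unsigned(-0x123):016x}")  # fffffffffffffedd, what ?0n-123 really produced
print(as_signed(0x8043, 16))          # -32701, a 16-bit field shown as signed decimal
```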

For the C++ evaluator, things are much easier because the default radix is decimal. Thus, all you need to do is specify -123:

3: kd> ? @@c++(-123)
Evaluate expression: -123 = ffffffff`ffffff85

The trick now comes when you want to specify a negative hex number, which also requires the minus sign to be on the left of the modifier:

3: kd> ? @@c++(0x-123)
Evaluate expression: -123 = ffffffff`ffffff85
3: kd> ? @@c++(-0x123)
Evaluate expression: -291 = ffffffff`fffffedd

WinDBG caches data from the target (.cache)

April 5th, 2010

Been a bit of an unexpected  break for a while now, but hopefully back to regular posting…

You might never need to know this, but WinDBG will actually cache data read from the target. For example, this means that if you dd a memory location multiple times only the first dump of the memory will actually be read from the target. Obviously this cache is invalidated when the target is resumed, so most of us won’t have an issue with this caching. However, if the memory you’re dumping is something like mapped device memory then this is an issue as the cache could be stale.

Enter the .cache command, which controls the size and state of the local cache (amongst other things, we’ve seen .cache before). Turning off the cache is as easy as executing .cache 0, which sets the size of the local cache to zero. This causes all of your reads to hit the target and ensure that you’re seeing the latest data. There are also the flushall, flushu, and flush parameters, which allow for flushing all or some of the cache from the local machine.
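To see why a stale cache matters for device memory, here is a toy read-through cache in Python. It is only a model of the behavior described above, with names I invented; it says nothing about how WinDBG actually implements its cache.

```python
class CachingReader:
    """Toy read-through cache in front of a 'target' that can change underneath us."""
    def __init__(self, target, enabled=True):
        self.target = target      # dict of address -> value, standing in for the target
        self.enabled = enabled    # enabled=False behaves like a zero-sized cache
        self.cache = {}

    def read(self, address):
        if self.enabled and address in self.cache:
            return self.cache[address]    # served locally, possibly stale
        value = self.target[address]      # goes out to the target
        if self.enabled:
            self.cache[address] = value
        return value

    def flush(self):
        self.cache.clear()

device = {0x1000: 1}              # pretend this is a mapped device register
reader = CachingReader(device)
first = reader.read(0x1000)
device[0x1000] = 2                # hardware updates the register behind our back
stale = reader.read(0x1000)       # cache hit: still the old value
reader.flush()
fresh = reader.read(0x1000)       # re-read from the target
print(first, stale, fresh)        # 1 1 2
```

Constructing the reader with enabled=False gives the same effect as flushing before every read, which is the toy equivalent of turning the cache off.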

New WinDBG is finally here

February 27th, 2010

After almost a year without updates, the latest version of WinDBG (6.12.2.633) is available and ready for consumption.

Unfortunately, in order to get it you’re going to need to sit through a 600MB download since WinDBG is now only distributed along with the Windows Driver Kit. There are also rumblings that it is available with the SDK, though I have not confirmed that yet.

Direct link to WDK download

Update

The WinDBG installer is available from the SDK as well as the WDK. However, the SDK ships with an older version and not the latest. So, as of now the WDK is going to be the only way to get the latest WinDBG.

x64 Crash Dump Analysis: Every bit counts

February 14th, 2010

Happy Valentine’s Day! And what’s more romantic than a post about analyzing an x64 crash dump? If you haven’t picked up a card already, feel free to print this out and hand it to your significant other.

Way back in December, we started looking at the fundamentals of x64 crash analysis so that we could work up to analyzing an actual x64 crash. If you haven’t already, I suggest that you read them in order since we’ll put all of those posts in practice here:

x64 Trap Frames

x64 Calling Convention

x64 Stack Frame layout

Reconstructing parameters from x64 crash dumps

With that out of the way, we can start our analysis the way we always do with any crash by running !analyze -v:

pfnpa

The bugcheck code in this case is PAGE_FAULT_IN_NONPAGED_AREA (0x50). In order to solve this crash we should probably cover what exactly this crash code is indicating.

The kernel virtual address space in Windows contains lots of different memory regions that serve different purposes. Regardless of the purpose of the region, one characteristic that all kernel memory has is whether or not the memory is pageable. When a memory region is pageable, the Memory Manager (Mm) is free to take the contents of the physical page of memory, write it out to disk, and then invalidate the virtual address. The next time someone tries to read the contents of that address, a page fault occurs and the contents are brought back into memory from disk. Once the memory is again resident, the Mm fixes the virtual address pointer and resumes the thread. When a memory region is non-pageable, it means that the Mm promises to never page out the memory and invalidate the virtual address in this manner.

Having non-pageable memory is important in Windows because it is the only kind of memory that you are allowed to access at IRQL DISPATCH_LEVEL or above (see my previous post here for more on IRQL). The reason for this is that you are not allowed to perform any wait operations at IRQL DISPATCH_LEVEL or above and by using pageable memory you’re implicitly stating that you can wait if the memory that you’re trying to access is not resident in memory.

With that out of the way, we can understand what the particular bugcheck code means. These non-pageable regions are only guaranteed to be valid if you have a valid outstanding resource allocation from the Mm. Take, for example, non-paged pool, which is the kernel equivalent to the user mode heap with the exception that memory allocated from this pool is guaranteed to never be paged out to disk. However, that does not mean that every address within the non-paged pool area is valid at all times. The Mm may delay programming a particular virtual address in this region until it is going to return a pool allocation to a particular caller, or mark the virtual address as invalid when someone frees a valid pool allocation. If someone tries to access one of these invalid addresses, a page fault will occur and the Mm will inspect the invalid virtual address to decide what needs to be done with it. If this virtual address corresponds to a region that is guaranteed to not page fault when valid, the Mm calls KeBugCheck with a bugcheck code of PAGE_FAULT_IN_NONPAGED_AREA. If you think about it, this is the only reasonable thing that the Mm can do since there is no solution to this state (you can argue that there are things that could be done, but that’s in the realm of fault tolerant systems and not relevant to the discussion).

We can now break down the text associated with the bugcheck code and understand a bit more of what it means:

Invalid system memory was referenced.  This cannot be protected by try-except, it must be protected by a Probe.  Typically the address is just plain bad or it is pointing at freed memory.

Invalid system memory was referenced.

This bugcheck is always the result of dereferencing a bad kernel virtual address.

This cannot be protected by try-except, it must be protected by a Probe.

If you dereference a bad user virtual address, a structured exception is raised that your driver can catch in a structured exception handling (SEH) block. When an invalid kernel address is accessed, there is no structured exception raised and the system simply bugchecks. The comment here about being protected by a probe is misleading. There is no way to validate a kernel address other than dereferencing it and hoping for the best. The idea is that kernel callers are trusted, thus if a kernel component hands you a kernel virtual address you must assume that it is valid. What the comment here is referring to is that you should not be touching kernel virtual addresses that originated from user mode. You can avoid this situation by calling ProbeForRead or ProbeForWrite on any address handed to you from user mode, which will raise an exception if the address is a kernel virtual address. This is only useful if you’re performing METHOD_NEITHER I/O and is not relevant to our conversation.

Typically the address is just plain bad or it is pointing at freed memory.

This is the short version of what we’ve been talking about up until now. If you have page faulted on an address in the non-paged area it means that you have dereferenced something that is not a valid memory allocation. Generally this means that it’s a garbage value (e.g. uninitialized pointer), freed memory, or some kind of corruption (e.g. a pointer value from a corrupted data structure).

Now we should have a much better idea as to what we’re looking at and the kind of bug that we’re looking for. According to the !analyze output, the invalid address that caused the wreck was 0xffffba80`07122a88 and we were attempting to write to the address (parameter 2 from the bugcheck information). Let’s look at the trap frame output and attempt to identify the invalid address in the faulting instruction:

traprtl

If you haven’t read my previous post about x64 trap frames, the output above will likely be confusing. There are two pointer dereferences in the above output, the instruction pointer RIP and RBX. Neither one of these values is 0xffffba80`07122a88 and, in fact, this looks to be a NULL pointer dereference since RBX is zero. However, as we know, the trap frames on the x64 do not contain non-volatile register state and RBX is a non-volatile register. So, in order to get the value of RBX at the time of the crash back we’ll need to scroll back through the assembly and find a volatile register that either shadows RBX in this frame or that we can use to derive RBX.

The first step to getting this information will be to execute the .trap command and get ourselves into the correct trap frame. We don’t need to find the trap frame address ourselves since it was already present in the !analyze output:

usetrap

Now that our registers are back, we can go back through the disassembly and figure out where RBX came from prior to the dump. I generally do this by bringing up the disassembly window (Alt+7) since I find that a bit more convenient in this situation than trying to use the keyboard shortcuts to navigate the assembly. Bringing up the window and scrolling back a bit shows RBX coming from RDX a few instructions earlier:

rdxrbx

If we go back and view the RDX register value contents, we’ll see that they match the pointer value from the first parameter to the bugcheck:

rdxbugcheck

We could have intuited the value of RBX based on the bugcheck information and our knowledge of x64 trap frames, though I like to take this extra step to make sure that I understand these things and also get my bearings with the dump. Also, in this case I know that the RBX value came from RDX, which is likely the second parameter to the function. Thus, if I can find a function prototype I can know what the type of the structure should be. This all provides me with greater context for the dump and a greater chance that I’ll have some success in analyzing it.

Let’s review what we have up to this point:

1) Someone has tried to write to an invalid system address

2) The address was passed as the second parameter to this routine

The next logical step that we need to explore is, “what kind of address is this supposed to be and why is this address invalid?”

For the first part of this question, we can typically use the !address extension and figure out what region this address lies in. Unfortunately, at the time of this writing that command does not work on Windows 7 and determining this without that extension is beyond the scope of this article. There is one thing we can quickly check though and that is whether or not this address lies in one of the various executive pools, which we can do with the !pool command:

notpool

Based on that output, it’s likely that this is not a pool allocation. The fact that it states that it is corrupt or free pool doesn’t really mean much, it should really say, “I have no idea what this is, but it doesn’t look like pool to me.” While that doesn’t provide us much positive information, we at least know that it’s not likely a bad pool address due to being used after it was freed. This at least removes a class of bugs and narrows our search a bit.

Since it’s not pool and !address doesn’t work, the best we can do now is inspect the PTE contents and see if there’s anything interesting there:

reallybadptr

According to the !pte output, this address is not only bad it’s really bad. At no level of the page translation process does this address have valid information. To me, this screams that this address is the result of an uninitialized pointer reference or a data corruption. Since the crash occurs in a Microsoft supplied component, the fact that this crash would come from an uninitialized pointer is very unlikely and so my sights are set firmly on some type of corruption. But what kind?

When I first analyzed this dump, I stayed at this point for almost 24 hours (luckily not straight!). I just couldn’t see what the corruption was or where it came from. I spent the time to go through every other thread in the system and even searched the kernel address space for other references to this address hoping for some light to shine on a clue. Since this was a Filter Manager structure that was corrupt, I also checked all of the current Filter Manager mini-filters with the !fltkd.filters command and was bummed to find only in-box Microsoft supplied filters running.

At this point I took the advice that I give to all students: I walked away from the dump. Anyone who claims that crash dump analysis isn’t difficult is either lying or doesn’t get presented with many challenging dumps. Sometimes walking away from the dump gives you the fresh perspective and eyes that you need to spot something minuscule, such as a missing bit.

Before calling it quits on the dump, I gave it one last look and noticed something curious about the faulting pointer value when compared to the other values in the trap frame: only the high four hex digits at the top of the address were 0xf, not the high five hex digits like the other registers.

fournotfive

I felt like Archimedes in the bathtub, though I expressed my excitement in a slightly less dramatic fashion (and without all of the nudity). What if this was a single bit flip error? It’s possible that due to some sort of cosmic error there was a single bit in this address that should have come back as 1 but instead read as 0. So, I flipped the bit and found what was an entirely plausible pool address that indicated it was a valid Filter Manager allocation:

flipthebit
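A quick XOR makes this kind of check mechanical. In the Python sketch below, the faulting address is the one from the bugcheck, while the candidate is my assumption of what the repaired pointer looks like (the same value with a fifth leading f); the helper name is made up.

```python
def differing_bits(a, b):
    # Bit positions (0 = least significant) where two 64-bit values differ.
    diff = a ^ b
    return [bit for bit in range(64) if (diff >> bit) & 1]

faulting = 0xFFFFBA8007122A88    # address reported by the bugcheck
candidate = 0xFFFFFA8007122A88   # assumed repair: one more bit set near the top

print(differing_bits(faulting, candidate))  # [46]
```

A result with exactly one entry means the two values are a single bit apart, which is the signature you would expect from a one-bit memory error rather than a software scribble.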

This gave me a plausible explanation for the dump: hardware failure. I wanted to collect at least one other piece of information to support this, so I decided to check to see if there were any physical pages marked as bad in this system. While this wouldn’t provide any type of solution, it would at least be another indicator that this machine was having hardware related issues. Thus, I inspected the state of the Page Frame Database in this machine using the !memusage command and did indeed find ~7MB of bad pages:

badpages

How to install xperf

February 13th, 2010

I’m finally living the full Windows 7 lifestyle, with everything from our home computers to my workstation running some flavor of Windows 7. Since I now don’t have any legacy platforms to deal with, I’m buying in to using all of the various features and utilities that are available on Vista and later (I skipped Vista entirely and have been clinging to XP).

One of the things that I’ve been most interested in getting a chance to play with is xperf, which is a performance measurement utility that leverages the Event Tracing for Windows (ETW) instrumentation that has been added to Windows over the years. I figured that with all of these Win7 systems around it would be worthwhile to have xperf installed on every machine in case there is some mysterious slowdown that needed investigation. While I thought this would be a straightforward download and install, it turns out that one needs to take a fairly circuitous route to get the necessary data capture and viewing tools installed.

The first thing you’ll need to do is install the Windows SDK, available here. Yes, the SDK…For some reason, the installation package of the tool is provided only in the SDK’s \Bin directory. Thus, in order to install the tool you’ll need to install the SDK so that you can get access to the MSI package that will actually install xperf.

While this is a pain, I suspect that you can get away with a fairly minimal SDK install (though I haven’t bothered to try that yet). Also, once you have the SDK installed in one location you’ll have the MSI you need and can just carry that around with you.

Once installed, you’ll need to navigate to the Bin directory and actually find the installation package. This was not intuitive to me, which is why I figured I’d bother writing this post. The files that you’re interested in are the \Bin\wpt_arch.msi files, where arch is whatever version of Windows you’re currently running: x86, x64, or IA64. Just double click on the file that is appropriate for your architecture and you’re on your way to drilling down to whatever performance issue you’re having.

For a good intro to xperf, check out the following Ntdebugging Blog post.