Tuesday, June 26, 2007

Debugging Tools for Windows - Crash Analysis

Once you’ve created a snap, you’re ready to analyze the data! See Debugging Tools for Windows - Snapping a Process, Part 1 and Part 2 for steps that led up to this point.

Open Windbg, select File | Open Crash Dump… and select the *.dmp you want to analyze.

Since we are analyzing managed code we are going to need a little help in the form of a Windbg extension. The extension we are interested in is the SOS.DLL (Son of Strike), which is included in the installation of Debugging Tools for Windows, allows Windbg to understand CLR constructs such as the managed heap.

To load the SOS extension, type the following in Windbg:

0:068> .load clr10\sos.dll

Alternatively, you can path the explicit path to SOS for the version of the .NET framework you’re targeting.

0:068> .load C:\WINDOWS\Microsoft.NET\Framework\v1.1.4322\sos.dll

Typically, when a snap is created as the result of an exception (-crash), the debugger will identify which thread has the exception and position the Windbg context to that thread. One of the first things you want to do is take a look at the call stack. The managed call stack can be viewed by using !clrstack. This will produce something similar to below.

0:068> !clrstack
Thread 68
ESP EIP
0x0edfe7fc 0x77e55dea [FRAME: GCFrame]
0x0edfeb2c 0x77e55dea [FRAME: ECallMethodFrame] [DEFAULT] R8 System.Number.ParseDouble(String,ValueClass System.Globalization.NumberStyles,Class System.Globalization.NumberFormatInfo)
0x0edfeb3c 0x79a14bef [DEFAULT] R8 System.Double.Parse(String,ValueClass System.Globalization.NumberStyles,Class System.IFormatProvider)
0x0edfeb78 0x79a0e1e1 [DEFAULT] R8 System.Convert.ToDouble(String,Class System.IFormatProvider)
0x0edfeb80 0x00c54d35 [DEFAULT] [hasThis] Boolean Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1.IsNumeric(String)
0x0edfeba8 0x00c55e05 [DEFAULT] [hasThis] Boolean Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1.LogicalEq(String,String)
0x0edfec68 0x791b33cc [FRAME: GCFrame]
0x0edfeddc 0x791b33cc [FRAME: ECallMethodFrame] [DEFAULT] [hasThis] Object System.Reflection.RuntimeMethodInfo.InternalInvoke(Object,ValueClass System.Reflection.BindingFlags,Class System.Reflection.Binder,SZArray Object,Class System.Globalization.CultureInfo,Boolean,Class System.Reflection.Assembly,Boolean)
0x0edfee04 0x799dd299 [DEFAULT] [hasThis] Object System.Reflection.RuntimeMethodInfo.InternalInvoke(Object,ValueClass System.Reflection.BindingFlags,Class System.Reflection.Binder,SZArray Object,Class System.Globalization.CultureInfo,Boolean)
0x0edff4c0 0x00000000 [FRAME: ContextTransitionFrame]


Note, that in order to get great detail around the stack, you’ll need to have debug symbols available for Windbg to resolve some of the names and information. If you have source code available as well, you can get to the actual source and line data!

A quick review of the managed threads will show if there are any threads that have exceptions. For example, the output of !threads shows that we have two threads with exceptions; TID’s 63 & 68. Notice how the IDE has already positions us to TID 68?

0:068> !threads
ThreadCount: 32
UnstartedThread: 0
BackgroundThread: 11
PendingThread: 0
DeadThread: 4

PreEmptive GC Alloc Lock
ID ThreadOBJ State GC Context Domain Count APT Exception
1 0xc0c 0x000f2c10 0xa220 Enabled 0x00000000:0x00000000 0x000e7400 0 MTA
6 0xc54 0x000f2dd8 0xb220 Enabled 0x00000000:0x00000000 0x000e7400 0 MTA (Finalizer)
13 0xe18 0x000f3330 0x220 Enabled 0x00000000:0x00000000 0x000e7400 0 MTA
[...snip...]
62 0xde0 0x0bf1ae48 0x2001020 Enabled 0x00000000:0x00000000 0x000e7400 0 Ukn
63 0x1a8 0x0bf1bc88 0x1800220 Enabled 0x051bb3c8:0x051bd120 0x000e92a8 0 MTA (Threadpool Worker) System.FormatException
17 0xf54 0x0bf1be50 0x220 Enabled 0x00000000:0x00000000 0x000e7400 0 Ukn
19 0xf5c 0x0bf1c1e0 0x220 Enabled 0x00000000:0x00000000 0x000e7400 0 Ukn
67 0x168c 0x0bf1bac0 0x220 Enabled 0x00000000:0x00000000 0x000e7400 0 Ukn
68 0x10a0 0x0bf1b3a0 0x1800220 Enabled 0x0110fa48:0x0110ffc4 0x000e92a8 0 MTA (Threadpool Worker) System.FormatException
69 0x101c 0x0bf1c018 0x1800220 Enabled 0x00000000:0x00000000 0x000e7400 0 MTA (Threadpool Worker)


To locate the FormatException on the threads stack, execute the !dumpstackobjects command.

0:068> !dumpstackobjects
Thread 68
ESP/REG Object Name
0xedfe720 0x110f8f0 System.FormatException
0xedfe738 0x110f8f0 System.FormatException
0xedfe74c 0x110f8f0 System.FormatException
0xedfe7b4 0x110f8f0 System.FormatException
0xedfe7dc 0x110f8f0 System.FormatException
0xedfe7ec 0x110f8f0 System.FormatException

0xedfe84c 0x110e24c System.String Format_InvalidString
0xedfe850 0x110e15c System.Resources.ResourceManager
0xedfe890 0x110f8f0 System.FormatException
0xedfe894 0x110f9e4 System.String Input string was not in a correct format
0xedfe89c 0x79b96950 System.Exception
0xedfe91c 0x110e24c System.String Format_InvalidString
0xedfe9d4 0x110e24c System.String Format_InvalidString
0xedfea48 0x110f88c System.String Input string was not in a correct format
0xedfea70 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeabc 0x110e09c System.String val2
0xedfeb08 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeb10 0x51a1a28 System.String MC
0xedfeb18 0x51a1a28 System.String MC
[…snip…]


Here we notice that we have multiple System.FormatException on the stack at address 0x110f8f0. We can dump the object by using the !dumpobj command, passing it the address of the object in question.

0:068> !dumpobj 0x110f8f0
Name: System.FormatException
MethodTable 0x79bacb74
EEClass 0x79bacbec
Size 64(0x40) bytes
GC Generation: 0
mdToken: 0x02000090 (c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll)
FieldDesc*: 0x00000000
MT Field Offset Type Attr Value Name
0x79b96824 0x400001d 0x4 CLASS instance 0x00000000 _className
0x79b96824 0x400001e 0x8 CLASS instance 0x00000000 _exceptionMethod
0x79b96824 0x400001f 0xc CLASS instance 0x00000000 _exceptionMethodString
0x79b96824 0x4000020 0x10 CLASS instance 0x0110f9e4 _message
0x79b96824 0x4000021 0x14 CLASS instance 0x00000000 _innerException
0x79b96824 0x4000022 0x18 CLASS instance 0x00000000 _helpURL
0x79b96824 0x4000023 0x1c CLASS instance 0x00000000 _stackTrace
0x79b96824 0x4000024 0x20 CLASS instance 0x00000000 _stackTraceString
0x79b96824 0x4000025 0x24 CLASS instance 0x00000000 _remoteStackTraceString
0x79b96824 0x4000026 0x2c System.Int32 instance 0 _remoteStackIndex
0x79b96824 0x4000027 0x30 System.Int32 instance -2146233033 _HResult
0x79b96824 0x4000028 0x28 CLASS instance 0x00000000 _source
0x79b96824 0x4000029 0x34 System.Int32 instance 0 _xptrs
0x79b96824 0x400002a 0x38 System.Int32 instance -532459699 _xcode
-----------------
Exception 0110f8f0 in MT 79bacb74: System.FormatException
_message: Input string was not in a correct format.


Notice that we’ve already got useful information, the _message object has already been dumped for us. However, we could use !dumpobject 0x0110f9e4 and inspect it ourselves. If we do so, we find that it’s actually a reference to a System.String.

0:068> !dumpobj 0x0110f9e4
Name: System.String
MethodTable 0x79b94638
EEClass 0x79b94984
Size 100(0x64) bytes
GC Generation: 0
mdToken: 0x0200000f (c:\windows\microsoft.net\framework\v1.1.4322\mscorlib.dll)
String: Input string was not in a correct format.
FieldDesc*: 0x79b949e8
MT Field Offset Type Attr Value Name
0x79b94638 0x4000013 0x4 System.Int32 instance 42 m_arrayLength
0x79b94638 0x4000014 0x8 System.Int32 instance 41 m_stringLength
0x79b94638 0x4000015 0xc System.Char instance 0x49 m_firstChar
0x79b94638 0x4000016 0 CLASS shared static Empty
>> Domain:Value 0x000e7400:0x05010224 0x000e92a8:0x05010224 <<>> Domain:Value 0x000e7400:0x05010238 0x000e92a8:0x0509c078 <<>!clrstack -a
Thread 68
ESP EIP
ESP/REG Object Name
0x0edfe7fc 0x77e55dea [FRAME: GCFrame]
ESP/REG Object Name
0xedfe84c 0x110e24c System.String Format_InvalidString
0xedfe850 0x110e15c System.Resources.ResourceManager
0xedfe890 0x110f8f0 System.FormatException
0xedfe894 0x110f9e4 System.String Input string was not in a correct format
0xedfe91c 0x110e24c System.String Format_InvalidString
0xedfe9d4 0x110e24c System.String Format_InvalidString
0xedfea48 0x110f88c System.String Input string was not in a correct format
0xedfea70 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeabc 0x110e09c System.String val2
0xedfeb08 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeb10 0x51a1a28 System.String MC
0xedfeb18 0x51a1a28 System.String MC
0xedfeb1c 0x51a1a28 System.String MC
0x0edfeb2c 0x77e55dea [FRAME: ECallMethodFrame] [DEFAULT] R8 System.Number.ParseDouble(String,ValueClass System.Globalization.NumberStyles,Class System.Globalization.NumberFormatInfo)
ESP/REG Object Name
0x0edfeb3c 0x79a14bef [DEFAULT] R8 System.Double.Parse(String,ValueClass System.Globalization.NumberStyles,Class System.IFormatProvider)
EDI 0x051a1a28 ESI 0x000000e7 EBX 0x0bf1b3a0 EDX 0x00000000 ECX 0x0bf1b3a0
EAX 0x0edfe704 EBP 0x0edfeb6c ESP 0x0edfeb3c EIP 0x79a14bef
ESP/REG Object Name
0xedfeb3c 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeb44 0x51a1a28 System.String MC
0xedfeb48 0x51a1a28 System.String MC
0xedfeb4c 0x5163a38 System.Globalization.NumberFormatInfo
0xedfeb50 0x51a1a28 System.String MC
0xedfeb58 0x501590c System.RuntimeType
0xedfeb5c 0x509b7f0 System.Globalization.CultureInfo
0xedfeb68 0x51a1a28 System.String MC
0xedfeb74 0x5163a38 System.Globalization.NumberFormatInfo
0x0edfeb78 0x79a0e1e1 [DEFAULT] R8 System.Convert.ToDouble(String,Class System.IFormatProvider)
EDI 0x051a1a28 ESI 0x051a1a28 EBX 0x0bf1b3a0 EDX 0x00000000 ECX 0x0bf1b3a0
EAX 0x0edfe704 EBP 0x0edfeba0 ESP 0x0edfeb78 EIP 0x79a0e1e1
ESP/REG Object Name
0xedfeb78 0x51a1a28 System.String MC
0x0edfeb80 0x00c54d35 [DEFAULT] [hasThis] Boolean Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1.IsNumeric(String)
EDI 0x051a1a28 ESI 0x051a1a28 EBX 0x0bf1b3a0 EDX 0x00000000 ECX 0x0bf1b3a0
EAX 0x0edfe704 EBP 0x0edfeba0 ESP 0x0edfeb80 EIP 0x00c54d35
ESP/REG Object Name
0xedfeb84 0x10f2524 Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1
0xedfeb88 0x51a1a28 System.String MC
0xedfeb9c 0x10f2524 Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1
0x0edfeba8 0x00c55e05 [DEFAULT] [hasThis] Boolean Microsoft.Xslt.CompiledScripts.CSharp.ScriptClass_1.LogicalEq(String,String)
EDI 0x051a1a28 ESI 0x010f2524 EBX 0x0bf1b3a0 EDX 0x00000000 ECX 0x0bf1b3a0
EAX 0x0edfe704 EBP 0x0edfebb8 ESP 0x0edfeba8 EIP 0x00c55e05


Looking at the contents of the EDI register, we have the address 0x051a1a28. If we dump the address in this register, we see that it is indeed a string with the value of ‘MC’. This value obviously cannot be converted into a double, so a System.FormatException is thrown. The astute observer will notice that the address can be traced back up the call stack to the very top. Recognizing that sort of pattern, and a little insight as to what’s going on, will go a long way!



Conclusion

I actually contrived this issue just a little. Not much granted, an unhandled System.FormatException will terminate a process just as fast as a System.OutOfMemoryException. However, I didn’t have any sample code illustrating an unhandled exception. Each of us can create that in our sleep. What I did have was a live test case in our BizTalk staging environment where, while doing a little issue resolution on another exception type, I noticed this little guy being thrown. Not often, but I was seeing them.

This System.FormatException obviously wasn’t terminating the process; we’d have noticed it much sooner if it had! That means it was being handled somewhere, either in our code base, or in BizTalk. By attaching Windbg to the BTSNTSvc.exe process, and giving the debugger first chance at any exceptions (Adplus –crash –p xxxx -FullOnFirst) I was able to catch the exception in action. Reviewing the call stack produced shows that this exception was happening during the execution of a BizTalk map, from an orchestration, specifically when a particular functoid was being invoked: IsNumeric. It seems that IsNumeric is simply attempting to parse the value into a System.Double. Presumably, if it can, it returns true. If it can’t, the System.FormatException is thrown, but then caught in IsNumeric and then IsNumeric returns false. This catch uncovered a bug in the map that hadn't yet been flushed out in testing.

No comments:

Post a Comment