Monday, November 15, 2010

What the "-1.#QNB"? Debugging We Go

I recently compiled the code I have been writing about and ran a regression data set (aka an old operational run). The original had been run on a Linux box using Lahey's Fortran compiler, and we also had an older Solaris version.

Aside from the expected differences (end of line, time and date stamps, CPU time, etc.), I ran across the text in the title of this post. Specifically, an average of some reasonable parameters was coming out as:

"-1.#QNB" instead of "0.1327"

WTF?

The rest of the run looked fine, and the final estimates of the parameters were all correct.
I googled the "#QNB", and got a number of hits for the Qatar National Bank, Google not paying attention to the "#" at the beginning.

Adding the "1." to the search yielded something useful. I found an old discussion thread at "www.rhinocerous.net" (apparently a site about parallel computing). The actual page is gone, but I found a cached version of the thread. Long story short, this is a "not a number" (NaN) message.

Both the original poster and I were using MinGW. Specifically, I'm using g77. The number in question was printed using an F8.4 format. Apparently Cygwin does show this as NaN. It turns out MinGW will too, but only if the format has enough decimal places. When I print the quantity in question using WRITE(*,*), I get "-1.#QNAN". With only 4 decimal places, however, it appears that this gets rounded to "-1.#QNB". I never would have guessed.

For the record, I tried varying the format from F8.0 to F10.8, and here's what I found:

F8.0 -1.
F8.1 -1.$
F8.2 -1.#R
F8.3 -1.#QO
F8.4 -1.#QNB
F8.5 -1.#QNAN
F10.6 -1.#QNAN0
F10.7 -1.#QNAN00
F10.8 ************

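which makes sense, if you think of the text "-1.#QNAN" being rounded as though its characters were digits, compared by their ASCII codes. With F8.4, for example, the first dropped character is the final "N", which sorts above "5", so the last kept character "A" gets bumped up to "B". With F8.0 the first dropped character is "#", which sorts below "5", so nothing is rounded. Presumably more precision will simply yield more trailing 0's, until the field is too narrow and fills with asterisks.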
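To check that the rounding idea really reproduces the table, here is a minimal Python sketch. It is only my guess at the mechanism, not the actual runtime library code: the function name round_like_digits is made up for illustration, and it assumes the library rounds the last kept character up by one ASCII code whenever the first dropped character is "5" or higher. It ignores the field width, so it does not model the asterisks in the F10.8 case.

```python
def round_like_digits(text, decimals):
    """Keep `decimals` characters after the '.', treating every character
    as if it were a digit: if the first dropped character compares >= '5',
    bump the last kept character up by one ASCII code.

    Hypothetical model of the observed output, not the real library routine.
    """
    point = text.index(".")
    keep = point + 1 + decimals
    if keep >= len(text):
        # Nothing to drop: pad with zeros, as for an ordinary number.
        return text + "0" * (keep - len(text))
    kept = list(text[:keep])
    if text[keep] >= "5":
        # "Round up" the last kept character (no carry handling).
        kept[-1] = chr(ord(kept[-1]) + 1)
    return "".join(kept)


if __name__ == "__main__":
    # Print the modeled output for 0 through 7 decimal places.
    for d in range(8):
        print(d, round_like_digits("-1.#QNAN", d))
```

Under that assumption, the output for 0 through 7 decimal places matches the entries in the table above, which is at least consistent with the explanation.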

The F8.0 result is concerning, because in that case there is no indication that there is anything wrong.

And people wonder why I mumble when they ask me what I did all day.
