FLServer & VirtualBox & crashing every 18 minutes
-
A while ago I posted about a crash I was having. Eventually I gave up trying to figure out what was wrong. Now that I understand a bit more about how FLServer works and about debugging, I have had another look at the issue.
The setup is FLServer on a Windows XP guest running on VirtualBox on a 64-bit Windows Server 2008 host.
After roughly 18 minutes FLServer would stop responding. The update event loop was freezing on a call to SleepEx in FLServer@0xCDCC. This is the main event loop for FLServer. It turns out that FLServer was being told to sleep for over 100000 ms. This is not good. The normal sleep varies between about 0 and 20 ms.
It seems that the Timing::read_ticks() function is calculating invalid times. This function uses the QueryPerformanceCounter functions, and it seems likely that VirtualBox and XP are not entirely happy with each other. Specifically, QueryPerformanceFrequency is only called once, when the application starts.
After patching flhook to stop the stupid call to SleepEx, I observed a crash in engbase.dll at offset 0xb8ae. I have seen this before and it has been reported by other people here.
I understand why there is a crash in Engbase - it’s called out of ElapseTime and ElapseTime has a silly time as a result of my quick flhook patch.
I’ll hack away at this for a bit longer. I’ll probably need to patch the Timing functions I guess. If anybody feels motivated to fix this for me, feel free to do so.
-
-
This is just a small test server and it’s not worth upgrading it or moving it to a dedicated host, i.e. it’s easier for me to patch flserver/hook than build a new VM.
I suspect that this issue occurs on many multiprocessor/multicore machines, although much less often. I have seen crashes in exactly this area on server machines. MS states that QueryPerformanceCounter may return invalid data on some multiprocessor machines due to HAL/BIOS bugs.
The patch is pretty much done and is simple. I guess I’ll post it in a few minutes.
Besides, I can run VirtualBox on Linux and everybody knows Linux is better than Windows.
puts on flame suit and runs away
-
Well, if you don't use QueryPerformanceCounter, then what do you use? Because IIRC all the other timing functions are a lot less precise:
Function                 Units                      Resolution
---------------------------------------------------------------------------
Now, Time, Timer         seconds                    1 second
GetTickCount             milliseconds               approx. 10 ms
TimeGetTime              milliseconds               approx. 10 ms
QueryPerformanceCounter  QueryPerformanceFrequency  same
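One possible answer is to keep the high-resolution counter but cross-check each delta against a coarse millisecond clock of the GetTickCount kind, and to fall back to the coarse value when the two disagree badly. What follows is a minimal portable sketch of that idea. The function name, its parameters and the 500 ms tolerance are assumptions made up for illustration; none of this is something FLServer or FLHook is known to do.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not part of FLServer or FLHook.
// hires_delta_ticks / hires_frequency: delta and frequency from a
//   QueryPerformanceCounter-style timer.
// coarse_delta_ms: delta from a GetTickCount-style millisecond clock,
//   sampled at the same two moments.
// Returns the elapsed time in milliseconds for one loop iteration.
double checked_elapsed_ms(std::int64_t hires_delta_ticks,
                          std::int64_t hires_frequency,
                          double coarse_delta_ms,
                          double tolerance_ms = 500.0) {
    // Without a usable frequency the high-resolution delta means nothing.
    if (hires_frequency <= 0) {
        return coarse_delta_ms;
    }
    const double hires_ms = static_cast<double>(hires_delta_ticks) * 1000.0
                            / static_cast<double>(hires_frequency);
    // A large disagreement means one clock jumped; trust the coarse one.
    if (std::fabs(hires_ms - coarse_delta_ms) > tolerance_ms) {
        return coarse_delta_ms;
    }
    return hires_ms;
}
```

The trade-off is that a rejected sample only has the roughly 10 ms resolution listed in the table, which is usually acceptable for a single loop iteration.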
-
Here’s the patch - I adjust the tick delta when I detect an invalid value. I patch the call to the function that converts the delta ticks to seconds. This is convenient as I have a reference to where the delta is stored in the main loop stack.
static double __cdecl HkCb_TimingSeconds(__int64 &ticks_delta)
{
    double seconds = Timing::seconds(ticks_delta);
    if (seconds < 0 || seconds > 2.0)
    {
        AddLog("ERROR: Time delta invalid seconds=%f ticks_delta=%I64i\n", seconds, ticks_delta);
        ConPrint(L"ERROR: Time delta invalid seconds=%f ticks_delta=%I64i\n", seconds, ticks_delta);
        ticks_delta = 1000000;
        seconds = Timing::seconds(ticks_delta);
    }
    return seconds;
}

In DllMain...()
{
    // Patch the time functions to work around bugs on multiprocessor
    // and virtual machines.
    FARPROC fpTimingSeconds = (FARPROC)HkCb_TimingSeconds;
    ReadProcMem((char*)GetModuleHandle(0) + 0x1B0A0, &fpOldTimingSeconds, 4);
    WriteProcMem((char*)GetModuleHandle(0) + 0x1B0A0, &fpTimingSeconds, 4);
}
Every ~9 minutes I get:
ERROR: Time delta invalid seconds=1199.879824 ticks_delta=4295023823
alternating, 9 minutes later, with:
ERROR: Time delta invalid seconds=-1199.839638 ticks_delta=-4294879978
Interesting eh? (Well I think it is interesting but I wouldn’t recommend using this to impress women)
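A quick sanity check on those two log lines, offered as a sketch rather than a diagnosis: dividing ticks_delta by seconds gives the tick frequency that Timing::seconds appears to be using, and removing 2^32 from each delta shows how far it is from a clean 32-bit wrap. The helper names below are made up for illustration; only the four numbers come from the logs.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: tick frequency (ticks per second) implied by one log line.
double implied_frequency(std::int64_t ticks_delta, double seconds) {
    return static_cast<double>(ticks_delta) / seconds;
}

// Hypothetical helper: what remains of the delta after removing one
// 32-bit wrap (2^32 ticks) in the direction of the jump.
std::int64_t residual_after_wrap(std::int64_t ticks_delta) {
    const std::int64_t wrap = std::int64_t{1} << 32;  // 4294967296
    return ticks_delta > 0 ? ticks_delta - wrap : ticks_delta + wrap;
}

// Prints the analysis for the two deltas copied from the log lines.
void print_log_analysis() {
    const std::int64_t forward = 4295023823LL;
    const std::int64_t backward = -4294879978LL;
    const double freq = implied_frequency(forward, 1199.879824);

    std::printf("implied frequency: %.0f Hz\n", freq);
    std::printf("forward residual:  %lld ticks (%.1f ms)\n",
                static_cast<long long>(residual_after_wrap(forward)),
                residual_after_wrap(forward) * 1000.0 / freq);
    std::printf("backward residual: %lld ticks (%.1f ms)\n",
                static_cast<long long>(residual_after_wrap(backward)),
                residual_after_wrap(backward) * 1000.0 / freq);
}
```

Both log lines imply a frequency of about 3579545 Hz, which is the nominal rate of the ACPI power management timer. The residuals are 56527 and 87318 ticks, about 16 ms and 24 ms at that rate, so each bad delta looks like an ordinary loop interval with exactly 2^32 ticks added or subtracted. That the guest's counter is backed by the ACPI timer is an inference from the numbers, not something confirmed here.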
-
And so, to finally lay this whole issue to rest, I have discovered the following after speaking to a colleague at work, who mentioned that VirtualBox, XP and host time synchronisation don’t work well together.
Disable host-to-guest time sync and the problem goes away:
On a Linux host:
vboxmanage setextradata <vmname> "VBoxInternal/Devices/VMMDev/0/Config/GetHostTimeDisabled" "1"
On a Windows host:
vboxmanage setextradata <vmname> VBoxInternal/Devices/VMMDev/0/Config/GetHostTimeDisabled 1
Replace <vmname> with the name of your VM.
Crappy VirtualBox and crappy Windows :)