Crash Offsets
-
/ONECPU - use one CPU on a multiprocessor system
http://en.wikipedia.org/wiki/NTLDR
I still have one problem - flserver freezes and its window turns white, while the flhook window looks fine but of course does not respond to commands.
All the same - no logs, no events, nothing.
Because all my servers are on a physical machine, I don't know how they will behave after a restart with the /usepmtimer switch in boot.ini.
I will also try disabling the APIC/ACPI functions in the BIOS - Google says that should help.
-
/usepmtimer and /onecpu did not help with my problem.
Still crashing in dalib.dll.
What causes hash collisions?
-
Well, here’s my issue in a nutshell and the specific post which refers to my interest in the causes for a ‘hash collision’.
-
content.dll - 0x490a5 : formation errors, check faction_prop.ini formation values
-
I am still chasing this crash at engbase.dll (+0x0124bd). I attached a debugger, and it always halts with this stack back trace:
# Memory ChildEBP RetAddr Args to Child
00 00129544 066123db 00d1cec8 08e5c568 0012d5e4 EngBase+0x124bd
01 20 00129564 0661ae06 06612567 08e5c568 00c46318 EngBase+0x23db
02 4 00129568 06612567 08e5c568 00c46318 08e07aac EngBase+0xae06
03 a8 00129610 4fdf4e22 00002800 00000000 0b662220 EngBase+0x2567
04 14 00129624 4fd9c47d 0b662220 00000000 00000008 d3d9!CD3DDDIDX8::LockVB+0x32 (FPO: [2,0,0])
05 14 00129638 06d12996 0b662220 00006300 00000120 d3d9!CDriverVertexBuffer::Lock+0x4d (FPO: [5,0,4])
06 40a8 0012d6e0 7c9201db 77bfc3c9 00330000 00000000 RP8+0x12996
07 234 0012d914 77bfc3c9 00330000 00000000 77bfc3ce ntdll!RtlAllocateHeap+0xeac (FPO: [Non-Fpo])
08 40 0012d954 77bfc3e7 00000014 0012d970 77bf9cd4 msvcrt!_heap_alloc+0xe0 (FPO: [Non-Fpo])
09 c 0012d960 77bf9cd4 00000014 00000001 06be0038 msvcrt!_nh_malloc+0x13 (FPO: [2,0,0])
0a 10 0012d970 06b73f73 00000014 00000000 09061c28 msvcrt!operator new+0xf (FPO: [1,0,0])
0b 14 0012d984 06b71935 0012d9a8 090603c0 0012d9ac ReadFile+0x3f73
0c 00000000 00000000 00000000 00000000 00000000 ReadFile+0x1935
As far as I could find out, the ReadFile part has something to do with hudframe021004123005; I don't know, maybe something comes from there…
Does anyone have an idea what is being done here? Does it have something to do with graphical problems, given d3d9!CD3DDDIDX8?
–> that seems to be addressed when the server connection is lost
Correction: the stack back trace is:
# Memory ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00 00127434 0628a428 00d1cec8 0932d098 00000000 EngBase+0x124bd
01 18 0012744c 06293cce 0932d098 00000000 0932cff0 Common!PhySys::FindSphereCollisions+0x438
02 2c 00127478 0629ed87 00127464 77c05c94 001274f0 Common!CNonPhysAttachment::Disconnect+0x26e
03 1c 00127494 77bfc2e3 0932cff0 0629ed87 00000000 Common!CExternalEquip::IsConnected+0x7
04 70 00127504 0629b344 00000001 093292bc 093291d8 msvcrt!free+0xc8 (FPO: [Non-Fpo])
05 20 00127524 062a922e 093293a8 093291d8 09329548 Common!CEquipManager::Clear+0x94
06 34 00127558 062b0c21 00000000 093291d8 093233d4 Common!CEqObj::~CEqObj+0x9e
07 2c 00127584 06288222 00000000 00000000 062af65b Common!CShip::~CShip+0x271
08 c 00127590 062af65b 00000001 09323344 00539599 Common!BaseWatcher::~BaseWatcher+0x52
09 00000000 00000000 00000000 00000000 00000000 Common!CObject::Release+0x1b
I still have no clue what happens in engbase.dll. Is there a way to find out which routines are at work there? The disassembly is below.
The crash offset is this line:
066224bd 8b4010 mov eax,dword ptr [eax+10h] ds:0023:09054e20=00000000
066224a9 90 nop
066224aa 90 nop
066224ab 90 nop
066224ac 90 nop
066224ad 90 nop
066224ae 90 nop
066224af 90 nop
066224b0 8b442408 mov eax,dword ptr [esp+8]
066224b4 83f8ff cmp eax,0FFFFFFFFh
066224b7 740b je EngBase+0x124c4 (066224c4)
066224b9 85c0 test eax,eax
066224bb 7407 je EngBase+0x124c4 (066224c4)
066224bd 8b4010 mov eax,dword ptr [eax+10h] ds:0023:09054e20=00000000
066224c0 85c0 test eax,eax
066224c2 7503 jne EngBase+0x124c7 (066224c7)
066224c4 83c8ff or eax,0FFFFFFFFh
066224c7 c20800 ret 8
066224ca 90 nop
066224cb 90 nop
066224cc 90 nop
066224cd 90 nop
066224ce 90 nop
066224cf 90 nop
Any hints would be appreciated. I am bad at understanding this assembler stuff ;(
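For anyone matching crash offsets against a live debugger session: a module-relative offset such as EngBase+0x124bd and the absolute address shown in the disassembly (066224bd) differ only by the address the DLL was loaded at. A minimal Python sketch of that arithmetic, using only values from the traces above; the function names are made up for illustration:

```python
def module_base(absolute_address: int, module_offset: int) -> int:
    """Return the load address of a module, given one address inside it
    and that address's module-relative offset."""
    return absolute_address - module_offset


def to_module_offset(absolute_address: int, base: int) -> int:
    """Convert an absolute address back into a module-relative offset."""
    return absolute_address - base


# Values taken from the disassembly and the first stack trace in this thread.
engbase = module_base(0x066224BD, 0x124BD)
print(f"EngBase base: {engbase:#010x}")
print(f"066123db is EngBase+{to_module_offset(0x066123DB, engbase):#x}")
```

With these numbers the base comes out as 0x06610000, which agrees with the return address 066123db being listed as EngBase+0x23db in the first trace. The base can differ between runs or machines, which is why the offsets are what get compared.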
-
Are these servers running mods at all? Some of these, if they are mods, sound like simple testing should have picked them up and been recognised…
I know it’s a non-contributory post, just this whole thread seemed an interesting concept and made me immediately suspect it’d replace testing and experience (not trying to offend folks).
-
@Huor: FindSphereCollisions could suggest a sur problem. Then again, looking at it, it looks like the warning is right, so I wouldn’t put too much credence on the trace. It’s a really strange error, since it already tests for ERROR and NULL, so eax appears legitimate, but is not. Furthermore, it looks like it’s telling you what is there, so it really is legitimate, so where’s the error coming from? Or is this just from a breakpoint, not a crash? Is there something I can test myself, or a remote connection?
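For readers who don't follow assembler, here is a rough Python model of what the disassembled routine at EngBase+0x124b0 appears to do, read straight off the listing. It is only a sketch: the function name, the `INVALID` constant and the use of a dict as "memory" are illustrative assumptions, not anything from the game's code.

```python
INVALID = 0xFFFFFFFF  # the -1 value the routine compares against and returns


def lookup_at_124b0(memory: dict, handle: int) -> int:
    """Model of the routine: 'memory' maps addresses to 32-bit values."""
    # cmp eax,0FFFFFFFFh / test eax,eax: reject -1 and NULL up front.
    if handle == INVALID or handle == 0:
        return INVALID
    # mov eax,[eax+10h]: the read at +0x124bd. With a dangling pointer this
    # is where a real process faults; here a missing key raises KeyError.
    value = memory[handle + 0x10]
    # test eax,eax / jne: a zero field is also reported as -1.
    return value if value != 0 else INVALID


# Values from the breakpoint above: [09054e20] = 00000000, so eax = 09054e10.
memory = {0x09054E20: 0x00000000}
print(hex(lookup_at_124b0(memory, 0x09054E10)))
```

The point of the model is that the two checks only catch -1 and NULL. Any other garbage value, for example a pointer to an object that has already been freed, passes both tests and is dereferenced anyway.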
-
I am not a server operator - just someone who might understand a bit of that coding stuff - but not at the level of Adoxa ;D
The stack back trace was taken at a breakpoint, which I set at the offset where the server crashes (engbase.dll + 0x0124bd) and which we have been hunting for some weeks now. I stepped past the breakpoint several times, and the stack trace always looked nearly the same. So I assume that when it really crashes, it must be one of these calling routines that leads to the crash. Also, I only tested it on the client side, while the crash happens in flserver, so what I wrote may be wrong anyway.
We tried several things, and it seems this crash offset is the only one remaining. We have been using vanilla surs on the server for some weeks, so normally it should not be related to them. Spheres are used for several things - so could it also have something to do with NPCs crashing into a planet or something like that? While we had NPCs disabled for some time, the error did not occur. So it is really annoying not to find the reason for this crash.
-
After tracing it myself, it appears to be related to cmp reading - it seems to return the parent object. It’s been called from GetRoot, Hierarchy::GetDepth and CEGun::ComputeTurretFrame. It appears there’s either something wrong with your cmp file, or with something that uses it. I’m afraid I can’t be more specific, without knowing where it’s actually crashing. If you look at [eax+0x0C], that should point you to which object is going wrong. For example, my current breakpoint has EAX = 0xA478770, [0xA47877C] is 0x9FF2550; [0x9FF2554] is 0x9FF25A1, a pointer to “equipment\models\weapons\li_laser_beam.cmp”. [eax+0x08] is similar, pointing to the particular .3db within the .cmp.
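The pointer chase described above (read [eax+0x0C], then read the dword 4 bytes into that object, then treat the result as a string pointer) can be sketched like this. It is a toy model, assuming a dict stands in for process memory and another for the strings found there; the function name is invented, and only the addresses and the file name come from the post.

```python
def resolve_cmp_name(memory: dict, strings: dict, eax: int) -> str:
    """Follow eax -> [eax+0x0C] -> [that+4] -> file name string."""
    obj = memory[eax + 0x0C]    # e.g. [0xA47877C] = 0x9FF2550
    name_ptr = memory[obj + 4]  # e.g. [0x9FF2554] = 0x9FF25A1
    return strings[name_ptr]    # text stored at that address


# Values from the breakpoint quoted in the post.
memory = {0xA47877C: 0x9FF2550, 0x9FF2554: 0x9FF25A1}
strings = {0x9FF25A1: r"equipment\models\weapons\li_laser_beam.cmp"}

print(resolve_cmp_name(memory, strings, 0xA478770))
```

In a real debugger session the same three reads would be done by hand at the breakpoint. If the crash is caused by a stale pointer, one of those reads may itself fail or show garbage, which is a hint in its own right.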
-
Here’s a plugin to log what’s happening with engbase at 0x124bd. Add it to dacomsrv.ini and you’ll get EXE\EngBase-0124BD-YYYY-MM-DD.hhmmss.txt (the time when the server was started). Since there’s a lot of data, I reset it every 100 calls, so there’s a slight possibility the crash will occur with no context. I also try another test for a bad address (thus preventing the crash); if it occurs, the file is renamed as *-bad_N_.txt (at least, I hope it is, didn’t actually test it).
-
Now we sometimes get 000c45a2 in content.dll - something is wrong with an NPC, but what?
-
adoxa wrote:
That’s a really strange address for a crash - cmp dword[ecx+34], 1 when there’s mov [ecx+2c], eax a few instructions earlier.
I use the http://the-starport.net/freelancer/forum/viewtopic.php?post_id=31645#forumpost31645 patch, but I think it is caused by wrong encounter parameters.
-
adoxa wrote:
Ah, that explains it. I did a better patch in an IM: 0C457F, 9981E2FF->7411EB05. Don’t forget to undo the other one.
Undo #34 and apply this?
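A patch written as "0C457F, 9981E2FF->7411EB05" can be applied with a hex editor, but a small script that first checks the old bytes makes it harder to patch the wrong file or apply it twice. The sketch below assumes the first number is a byte offset into content.dll and that the two hex strings are the bytes before and after; that reading is an assumption, and `apply_patch` is a made-up helper, not part of any Freelancer tool. It works on an in-memory buffer so it can be tried without touching the DLL.

```python
def apply_patch(data: bytes, offset: int, old_hex: str, new_hex: str) -> bytes:
    """Return a copy of data with old bytes at offset replaced by new bytes."""
    old = bytes.fromhex(old_hex)
    new = bytes.fromhex(new_hex)
    if len(old) != len(new):
        raise ValueError("patch must not change the file size")
    found = data[offset:offset + len(old)]
    if found != old:
        # Wrong file, wrong version, or the patch is already applied.
        raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex()}")
    return data[:offset] + new + data[offset + len(new):]


# Synthetic stand-in for the file: zeros with the original bytes at 0x0C457F.
image = bytes(0x0C457F) + bytes.fromhex("9981E2FF") + bytes(16)

patched = apply_patch(image, 0x0C457F, "9981E2FF", "7411EB05")
restored = apply_patch(patched, 0x0C457F, "7411EB05", "9981E2FF")
print(patched[0x0C457F:0x0C4583].hex(), restored == image)
```

Undoing a patch is the same call with the two hex strings swapped, as the last line shows. For the real file, the buffer would be read from and written back to a copy of content.dll, keeping the original as a backup.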
-
adoxa wrote:
Ah, that explains it. I did a better patch in an IM: 0C457F, 9981E2FF->7411EB05. Don’t forget to undo the other one.
Tried it on a vanilla content.dll without patches - crash at 000c458f.