Opened 8 months ago
Closed 8 months ago
#15015 closed defect (fixed)
COMMON: AGS games lead to a reproducable crash on exiting ScummVM
Reported by: | raziel- | Owned by: | PushmePullyu |
---|---|---|---|
Priority: | normal | Component: | Engine: AGS |
Version: | Keywords: | ||
Cc: | Game: |
Description
ScummVM 2.9.0git (Mar 11 2024 21:02:00)
Using SDL backend with SDL 2.30.1
Features compiled in: Vorbis MP3 RGB zLib MPEG2 MikMod Theora VPX AAC A/52 FreeType2 FriBiDi JPEG PNG GIF cloud (servers, local) ENet SDL2 TinyGL OpenGL (with shaders)
Loading any AGS game (tried with "A Golden Wake" and "Gemini Rue"), while only loading into the options/start menu of the game, then RTL and quitting ScummVM, will lead to a certain crash.
Crash and serial logs attached
Any AGS game
AmigaOS4 - PPC - BE
gcc (adtools build 11.3.0) 11.3.0
Attachments (5)
Change History (28)
by , 8 months ago
Attachment: | AGS crash.txt added |
---|
by , 8 months ago
Attachment: | Crashlog_scummvm_2024-03-11_18-17-02.txt added |
---|
comment:1 by , 8 months ago
I did some digging and now have a suspicion of what could be causing the crash.
Could you test with this branch?
https://github.com/PushmePullyu/scummvm/tree/amigaos-agi-crash
Note: This also includes the atexit fixup from https://github.com/scummvm/scummvm/pull/5696
comment:2 by , 8 months ago
I built ScummVM with one engine (AGS) to test and it threw up this very strange error on start
ELF library Unable to resolve symbol '' in Development:Porting/ScummVM/scummvm-amigaos-ags-crash/install/plugins/ags.plugin
followed by
ELF library Failed to resolve symbol at runtime. Process 'Shell Process' bas been suspended.
followed by a crash, see log
After that i built the AGI engine with the exact same branch and it worked
by , 8 months ago
Attachment: | Crashlog_scummvm_2024-03-14_16-22-14.txt added |
---|
comment:4 by , 8 months ago
i *think* it has to do with a libsdl update...wasted a whole day already and slowly getting there, new builds always crash like that...trying to get back a revision and go from there
comment:5 by , 8 months ago
@PushmePullyu
Would it be possible to split the two branches?
one with the AGS exit fix and one with the unfreed signals?
something smelly hit the fan really hard over here and right now im trying to figure out why only this branch of yours is driving me mad...
could those changes affect plugin behaviour?
thank you
comment:6 by , 8 months ago
@raziel-
atexit fixup only:
https://github.com/PushmePullyu/scummvm/tree/amigaos-atexit-fixup-only
ags fix only, without any atexit cleanup (i.e. this includes your revert commit):
https://github.com/PushmePullyu/scummvm/tree/ags-fix-without-atexit-cleanup
ags fix with original atexit cleanup:
https://github.com/PushmePullyu/scummvm/tree/ags-fix-with-orig-atexit
The only thing the AGS fix does is to replace a local static with a field, which removes the need to run its dtor on program exit / DSO unload.
My branches are based on 8366119da3ff867621b9164575c9df36225eabcd "AMIGAOS: clean up compiler flags", so if you were testing with a previous commit, maybe the problems are caused by some other recent changes to master.
Might be
06a87ccaf07716f3498eaa988bfe1fb83381bcb1 "AGS: Add MAD to Engine Dependencies".
Did you try a commit before that?
This branch has the atexit fixup and the AGS fix, but rebased onto the parent commit of the MAD change: https://github.com/PushmePullyu/scummvm/commits/amigaos-agi-crash-rebase-before-mad
comment:7 by , 8 months ago
Thank you for all the test branches, i'll get to it
yes, i was only testing branches earlier to your branch, but after the compiler flag changes, but those are not the culprit, i reverted those locally and it still broke.
i'll test with a fresh master branch and yours (without mad) and report back
Thanks a lot
comment:8 by , 8 months ago
@PushmePullyu
First of all...great news...and great work :-D
The "amigaos-agi-crash-rebase-before-mad" branch works and the AGS exit crash is gone as well :-)
Need to test a current master build to see if it really was the MAD implementation.
Your other branches i could not (yet) download, will do eventually
Thanks a ton
comment:9 by , 8 months ago
Yep, current master is bugged
I'll bisect tomorrow...but what would i do if it really is the MAD addition?
I do have libmad.so in sobjs/
comment:10 by , 8 months ago
@PushmePullyu
Your assumption was spot on :-)
https://github.com/scummvm/scummvm/commit/4ea65b1a963fb6988117cf7b7ed94a03c6d7d101
is fine (before AGS: Add MAD to engine dependancies)
https://github.com/scummvm/scummvm/commit/66bc819014bb675b685eabec60781442a7b2a660
is bugged (after AGS: Add MAD to engine dependancies)
Should i open a new tracker item?
NB:
Funny though https://github.com/scummvm/scummvm/commit/06a87ccaf07716f3498eaa988bfe1fb83381bcb1 which introduced the change works and starts fine...it's the one commit AFTER that crashes :-/
comment:11 by , 8 months ago
@PushmePullyu
Please go ahead and PR the unfreed and AGS exit fix...maybe it can be backported, but i guess not, master is already frozen
...and disregard the last three posts regarding libmad and/or crashes...the reason lies somewhere else and definitely stems from my compiler flag changes/PR.
Investigating...but i think it might be the shared detection that causes it...maybe AmigaOS cannot work with accessing two different plugins at once (detection and ags.plugin in this case)?
Thank you
comment:12 by , 8 months ago
For reference: PR is at https://github.com/scummvm/scummvm/pull/5713
Since the removal of the static fixed the AGS crash, there seems to be some problem with DSO unloading and dtors of statics.
Dtors registered from a DSO _seem_ to run on program exit instead of SO unload (or maybe they are run twice). Perhaps your libc / build is not using __cxa_atexit?
From the gcc manpage:
-fuse-cxa-atexit
Register destructors for objects with static storage duration with the "__cxa_atexit" function rather than the "atexit" function. This option is required for fully standards-compliant handling of static destructors, but only works if your C library supports "__cxa_atexit".
comment:13 by , 8 months ago
should -fuse-cxa-exit be default or do I need to add it to the compiler flags?
but it wouldnt do anything if the c lib doesn't support it...
comment:14 by , 8 months ago
I have pushed some code that tests if statics are destroyed on DSO unload here:
https://github.com/PushmePullyu/sotest/tree/main
You will have to adjust the compiler flags in build-linux.sh to your system (and maybe change the SONAME define in sotest.cpp).
Output on my system:
./sotest
sotest.cpp:35: int main(): Registering cleanup function sotest.cpp:43: int main(): Loading SO sotest.cpp:48: int main(): Getting function pointer from SO sotest.cpp:53: int main(): Calling SO function libfoo.cpp:11: void someSOFunc(): This is someSOFunc() libfoo.cpp:6: void someSOFuncWithStatic(): This is someSOFuncWithStatic() testclass.cpp:9: TestClass::TestClass(const char*): TestClass ctor: context: libfoo, this=0x7f5647ca4080 sotest.cpp:56: int main(): Calling local function sotest.cpp:27: void localFuncWithStatic(): This is localFuncWithStatic() testclass.cpp:9: TestClass::TestClass(const char*): TestClass ctor: context: sotest, this=0x4040c0 sotest.cpp:59: int main(): Unloading SO testclass.cpp:13: TestClass::~TestClass(): TestClass dtor: context: libfoo, this=0x7f5647ca4080 sotest.cpp:62: int main(): Calling SDL_Quit sotest.cpp:65: int main(): Returning from main testclass.cpp:13: TestClass::~TestClass(): TestClass dtor: context: sotest, this=0x4040c0 sotest.cpp:23: void cleanup(): This is cleanup()
To see whether gcc was built with cxa_atexit, you can use -v:
g++ -v 2>&1 | grep -io '[^[:space:]]*cxa_atexit'
Output:
--enable-__cxa_atexit
If it was, -fuse-cxa-atexit
should be passed by default.
Or:
g++ -Q --help=c++ | grep -iF atexit
Output:
-fuse-cxa-atexit [available in C++, ObjC++]
You could also check your C lib / scummvm executable for the cxa_atexit symbol.
E.g. on a Linux system using glibc:
nm -D /lib64/libc.so.6 | grep -iF cxa_atexit
Output:
000000000003fb70 T __cxa_atexit@@GLIBC_2.2.5
comment:15 by , 8 months ago
with just a little addition the script ran perfectly :-)
unfortunately, not the same output
/Development/Porting/sotest-main> ./sotest sotest.cpp:35: int main(): Registering cleanup function sotest.cpp:43: int main(): Loading SO sotest.cpp:18: void fail(): error: DLOpen failed for './libfoo.so' sotest.cpp:23: void cleanup(): This is cleanup()
---
++ -v 2>&1 | grep -io '[^[:space:]]*cxa_atexit'
that doesn't work here, but i get this with a different method
Binary file SDK:gcc/bin/g++ matches
so i guess it's inside?
---
g++ -Q --help=c++ | grep -iF atexit
Output:
-fuse-cxa-atexit [available in C++, ObjC++]
---
nm -D /lib64/libc.so.6 | grep -iF cxa_atexit
that doesn't work either, so i guess not?
I can find a "atexit" when looking with a hex editor, but nothing "cxa"
comment:16 by , 8 months ago
sotest.cpp:18: void fail(): error: DLOpen failed for './libfoo.so'
Does libfoo.so exist after building? If yes, you probably need to change sotest.cpp:8
#define SONAME "./libfoo.so"
to an AmigaOS path (can be absolute or relative).
comment:17 by , 8 months ago
i missed the path remark, sorry, went a little further :-)
sotest.cpp:35: int main(): Registering cleanup function sotest.cpp:43: int main(): Loading SO sotest.cpp:48: int main(): Getting function pointer from SO sotest.cpp:53: int main(): Calling SO function libfoo.cpp:11: void someSOFunc(): This is someSOFunc() libfoo.cpp:6: void someSOFuncWithStatic(): This is someSOFuncWithStatic() testclass.cpp:9: TestClass::TestClass(const char*): TestClass ctor: context: libfoo, this=0x5c65d714 sotest.cpp:56: int main(): Calling local function sotest.cpp:27: void localFuncWithStatic(): This is localFuncWithStatic() testclass.cpp:9: TestClass::TestClass(const char*): TestClass ctor: context: sotest, this=0x5c660870 sotest.cpp:59: int main(): Unloading SO sotest.cpp:62: int main(): Calling SDL_Quit sotest.cpp:65: int main(): Returning from main testclass.cpp:13: TestClass::~TestClass(): TestClass dtor: context: sotest, this=0x5c660870
after that line it crashes, no cleanup()
comment:19 by , 8 months ago
Yes, the output confirms that static dtors are not run on DSO unload, but at program exit. Since the SO is already unloaded, this causes a crash.
Could you test if adding -fuse-cxa-atexit
to both g++ invocations changes anything?
comment:20 by , 8 months ago
sure, linker error though
/T/ccfTyWod.o:(.got2+0x20): undefined reference to `__dso_handle' Development:Coding/SDK/gcc/ppc-amigaos/bin/ld: libfoo.so: hidden symbol `__dso_handle' isn't defined Development:Coding/SDK/gcc/ppc-amigaos/bin/ld: final link failed: Bad value
---
#!/bin/sh g++ -Wall -fuse-cxa-atexit -fPIC -shared -o libfoo.so libfoo.cpp testclass.cpp -fuse-cxa-atexit && \ g++ -Wall -fuse-cxa-atexit -o sotest sotest.cpp testclass.cpp $(sdl2-config --cflags) $(sdl2-config --libs) -athread=native
comment:21 by , 8 months ago
Alright, thanks! Looks like -fuse-cxa-atexit
only works for glibc currently.
So, on your platform, unloading a DSO that registers destructors/functions to be run on exit will always crash when the main program exits.
The good news is that the ScummVM coding conventions forbid constructs which require such registration (see https://wiki.scummvm.org/index.php?title=Coding_Conventions#Use_of_global_.2F_static_variables).
They still creep into the code sometimes, so should you get similar on-exit crashes in the future, this might be the cause.
In case you want to open an issue/feature request for this somewhere (adtools maybe), there is a simplified version of the repro code at https://github.com/PushmePullyu/sotest/tree/repro-dlclose-dtors, which removes the SDL dependency. Not sure if it builds on AmigaOS without modifications. Note: The build script needs a Bourne-compatible shell.
I suppose this ticket can be closed then?
P.S. Sorry for the delay, has been a super busy week.
comment:23 by , 8 months ago
Owner: | set to |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Thanks a lot for your work! Closing then :)
Grim Reaper log 1