Opened 14 months ago

Last modified 10 months ago

#14633 new defect

AGS: AVX2 optimizations are not enabled in Win 2.8.0 release builds

Reported by: tag2015 Owned by: fracturehill
Priority: high Component: Engine: AGS
Version: Keywords:
Cc: Game:

Description

This is an already known issue, I'm opening this bugreport so users are aware of the problem, and to track the status of relevant games.

Daily builds of ScummVM 2.8.0 include optimizations (using processor extensions) for the graphics code in the AGS engine, which increase noticeably the FPS but also introduce various problems.

The games have been tested on Win10 with an i7 processor.

The following games are unplayable. Segfaults occur in drawInner4BppWithConv, unless otherwise noted:

  • Strangeland (Steam): Won't even load and crashes silently (gdb reports a heap corruption error)
  • Old Skies (Steam Demo): Crashes after the Wadjet Eye logo
  • Rock'n'Roll will never die! (Steam): Crashes during the intro, while the two guys are talking
  • The Cat Lady (Steam): Crashes in main menu
  • The Excavation of Hob's Barrow (GOG): Intro works. Crashes in the first location (station) while talking to the old woman
  • Murder Cases (Steam Demo): Main menu works. Crashes during intro
  • Downfall (Demo): Crashes when leaving the room
  • Nightmare Frames (Steam): Crashes when starting a new game
  • Unavowed (GOG): Crashed in main menu or during intro
  • 5 Days a Stranger (SE): Crashes during intro
  • King's Quest III (AGDI): Crashes during first talk with the wizard

(...)

Other games basically work but may occasionally show glitches (typically in the mouse cursor or sprites walkcycles) that may lead to crashes:

  • If on a winter's night... (Steam): Loading "flame" may glitch, game may crash with "invalid color depth!"
  • Dreams in Witch House: main character sprite glitches
  • Heroine's Quest: sprite glitches (es. snowflakes)
  • Ben There, Dan That!

(...)

Attachments (1)

asanOutputHQ1.txt (7.9 KB ) - added by antoniou79 13 months ago.

Download all attachments as: .zip

Change History (22)

comment:1 by antoniou79, 13 months ago

I'm currently trying to run Heroine's Quest (latest Steam version, which uses the AGS 3.6.0 engine) and get some crashes during the starting sequence/options of the game. This is on Windows 10 x64 and on Linux (Ubuntu 22.04.3) with a local dev build of ScummVM 2.8git from current master HEAD.

It does not always crash on the same spot. It once crashed after the landslide intro sequence, once after character configuration and once after character selection.

So, I think this might be related to this ticket, albeit I haven't yet noticed glitches or graphical artifacts.

I'm attaching the output from running a build on Linux with ASAN, after crashing (at character selection screen)

by antoniou79, 13 months ago

Attachment: asanOutputHQ1.txt added

comment:2 by tag2015, 13 months ago

Quick update, using ScummVM 2.8.0git8302-ga67dd406dd4 (Oct 8 2023 04:51:30).

Strangeland: no change. Still crashes on boot
Old Skies: seems to work (played till the diner). Occasionally affected by bug #14634
Rock n Roll: no change. Always crashes during initial dialogue
The cat lady: Now goes in-game, but had an early crash in the fields/forest
Hob Barrow: First test (watched intro, etc) crashed when entering the inn. Tried again skipping a bit and managed to get further, should be tested more extensively. Affected by bug #14634
Murder cases: no change. Crashes during intro after the line "darling, are you hurt?" or something
Nightmare frames: no change. Crashes when starting a new game
Unavowed: seems to work, but I only tested the intro/character "creation"
5 days a stranger: no change. still crashes on start/main menu
king's quest III: no change. crashes on first interaction

Version 0, edited 13 months ago by tag2015 (next)

comment:3 by tag2015, 12 months ago

Update, using daily build ScummVM 2.8.0git9003-g2db0c24ce62 (Nov 3 2023 05:04:01) on Win10

  • Strangeland: Now goes in game, but crashes soon after entering the carnival
  • Old Skies: seems to work
  • Rock n' Roll will never die: Seems to work, but crashed in the first location (about 5 min in-game)
  • The Cat Lady: crashed earlier than before, after the tunnel collapse
  • Hob Barrow: Seems to work, played till the second day (after the cat cutscene)
  • Murder Cases: Seems to work
  • Nightmare Frames: Seems to work
  • Unavowed: Seems to work, but crashed during the first puzzle (dumpster)

New bug: Games using flipped sprites are glitchy, such as (but not limited to): Maniac Mansion Deluxe, 5 days a stranger (and the 2 sequels), Apprentice, Twelve Thirteen, Broken Windows, Terror of the Vampire, etc.

comment:4 by sev-, 12 months ago

Priority: blockerhigh

Would be nice to get fixed before 2.8.0 release.

If it still crashes, then we disable the CPU optimizations by default

comment:5 by tag2015, 12 months ago

Owner: set to fracturehill
Resolution: fixed
Status: newclosed

Following the latest fixes, everything seems to be stable and glitch-free!

Closing

comment:6 by tag2015, 12 months ago

Resolution: fixed
Status: closednew

Well, that was premature. Reopening due to recent crash reports

comment:7 by fracturehill, 12 months ago

Yes, I take full responsibility for those; my "fixes" for stretched/flipped blits ended up being broken in a REALLY stupid way. I just pushed some changes and can confirm that both crashes that were reported today have been cleared.

comment:8 by criezy, 12 months ago

There is still some issues with the latest code.
Commit: 8c9bf72f50 (Monday Nov 27)
Platform: macOS M1
Game: QFGII AGDI

How to reproduce: In the hero type selection screen before starting a game select a type (for example fighter) and then press 'h'

ASAN report:

=================================================================
==98302==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x0001383366bb at pc 0x000105adaf5c bp 0x00016b948f20 sp 0x00016b948f18
READ of size 16 at 0x0001383366bb thread T0
    #0 0x105adaf58 in void AGS3::DrawInnerImpl_NEON::drawPixelSIMD<4, 2>(unsigned char*, unsigned char const*, __simd128_uint32_t, __simd128_uint32_t, __simd128_uint32_t, __simd128_uint32_t, int, int, int, int, bool, bool, __simd128_uint32_t) surface_neon.cpp:399
    #1 0x105ac105c in void AGS3::DrawInnerImpl_NEON::drawInner4BppWithConv<4, 2, false>(AGS3::BITMAP::DrawInnerArgs&) surface_neon.cpp:518
    #2 0x105ab7bf4 in void AGS3::BITMAP::drawNEON<false>(AGS3::BITMAP::DrawInnerArgs&) surface_neon.cpp:929
    #3 0x105292914 in AGS3::BITMAP::draw(AGS3::BITMAP const*, Common::Rect const&, int, int, bool, bool, bool, int, int, int, int) surface.cpp:205
    #4 0x105284448 in AGS3::blit(AGS3::BITMAP const*, AGS3::BITMAP*, int, int, int, int, int, int) gfx.cpp:106
    #5 0x105360504 in AGS3::AGS::Shared::Bitmap::CreateCopy(AGS3::AGS::Shared::Bitmap*, int) allegro_bitmap.cpp:105
    #6 0x105364be4 in AGS3::AGS::Shared::BitmapHelper::CreateBitmapCopy(AGS3::AGS::Shared::Bitmap*, int) bitmap.cpp:71
    #7 0x105499aa4 in AGS3::AdjustBitmapForUseWithDisplayMode(AGS3::AGS::Shared::Bitmap*, bool)+0x2cc (scummvm:arm64+0x100ff5aa4)
    #8 0x105499f78 in AGS3::PrepareSpriteForUse(AGS3::AGS::Shared::Bitmap*, bool)+0x28 (scummvm:arm64+0x100ff5f78)

0x0001383366bb is located 3 bytes to the right of 40-byte region [0x000138336690,0x0001383366b8)
allocated by thread T0 here:
    #0 0x122ea3074 in wrap_calloc+0x9c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3f074)
    #1 0x11214c4cc in Graphics::Surface::create(short, short, Graphics::PixelFormat const&) surface.cpp:79
    #2 0x1120c5eb8 in Graphics::ManagedSurface::create(short, short, Graphics::PixelFormat const&) managed_surface.cpp:153
    #3 0x1120c4908 in Graphics::ManagedSurface::ManagedSurface(int, int, Graphics::PixelFormat const&) managed_surface.cpp:60
    #4 0x1052965fc in AGS3::Surface::Surface(int, int, Graphics::PixelFormat const&) surface.h:332
    #5 0x10529594c in AGS3::Surface::Surface(int, int, Graphics::PixelFormat const&) surface.h:332
    #6 0x10529581c in AGS3::create_bitmap_ex(int, int, int) surface.cpp:368

Same screen, do not press 'h', finish your character selection, and start a game

=================================================================
==98321==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x000140a392bb at pc 0x00010243ef5c bp 0x00016efe3d60 sp 0x00016efe3d58
READ of size 16 at 0x000140a392bb thread T0
    #0 0x10243ef58 in void AGS3::DrawInnerImpl_NEON::drawPixelSIMD<4, 2>(unsigned char*, unsigned char const*, __simd128_uint32_t, __simd128_uint32_t, __simd128_uint32_t, __simd128_uint32_t, int, int, int, int, bool, bool, __simd128_uint32_t) surface_neon.cpp:399
    #1 0x10242505c in void AGS3::DrawInnerImpl_NEON::drawInner4BppWithConv<4, 2, false>(AGS3::BITMAP::DrawInnerArgs&) surface_neon.cpp:518
    #2 0x10241bbf4 in void AGS3::BITMAP::drawNEON<false>(AGS3::BITMAP::DrawInnerArgs&) surface_neon.cpp:929
    #3 0x101bf6914 in AGS3::BITMAP::draw(AGS3::BITMAP const*, Common::Rect const&, int, int, bool, bool, bool, int, int, int, int) surface.cpp:205
    #4 0x101be8448 in AGS3::blit(AGS3::BITMAP const*, AGS3::BITMAP*, int, int, int, int, int, int) gfx.cpp:106
    #5 0x101cc4504 in AGS3::AGS::Shared::Bitmap::CreateCopy(AGS3::AGS::Shared::Bitmap*, int) allegro_bitmap.cpp:105
    #6 0x101cc8be4 in AGS3::AGS::Shared::BitmapHelper::CreateBitmapCopy(AGS3::AGS::Shared::Bitmap*, int) bitmap.cpp:71
    #7 0x101dfdaa4 in AGS3::AdjustBitmapForUseWithDisplayMode(AGS3::AGS::Shared::Bitmap*, bool)+0x2cc (scummvm:arm64+0x100ff5aa4)
    #8 0x101dfdf78 in AGS3::PrepareSpriteForUse(AGS3::AGS::Shared::Bitmap*, bool)+0x28 (scummvm:arm64+0x100ff5f78)

0x000140a392bb is located 3 bytes to the right of 40-byte region [0x000140a39290,0x000140a392b8)
allocated by thread T0 here:
    #0 0x11f80f074 in wrap_calloc+0x9c (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x3f074)
    #1 0x10eab04cc in Graphics::Surface::create(short, short, Graphics::PixelFormat const&) surface.cpp:79
    #2 0x10ea29eb8 in Graphics::ManagedSurface::create(short, short, Graphics::PixelFormat const&) managed_surface.cpp:153
    #3 0x10ea28908 in Graphics::ManagedSurface::ManagedSurface(int, int, Graphics::PixelFormat const&) managed_surface.cpp:60
    #4 0x101bfa5fc in AGS3::Surface::Surface(int, int, Graphics::PixelFormat const&) surface.h:332
    #5 0x101bf994c in AGS3::Surface::Surface(int, int, Graphics::PixelFormat const&) surface.h:332
    #6 0x101bf981c in AGS3::create_bitmap_ex(int, int, int) surface.cpp:368

These two crashes are likely to be caused by the same bug given that the call stack is the same.

comment:9 by lotharsm, 12 months ago

Should we disable the CPU optimizations for the release? That's what I also have to do for the Win32 builds because I couldn't fix it on the toolchain side of things in time - I noticed this way too late on my end and now I'm afraid that playing around with the build system is too risky :/

comment:10 by tag2015, 12 months ago

Thanks to recent changes (not yet in-tree) it seems that the ASAN crashes are fixed, so maybe we can deploy the release with the optimizations enabled :)

comment:11 by lotharsm, 12 months ago

Thank you! Do we have a PR for the changes?

in reply to:  11 comment:12 by fracturehill, 12 months ago

Replying to lotharsm:

Thank you! Do we have a PR for the changes?

We do now! https://github.com/scummvm/scummvm/pull/5485 The SSE2/AVX2 paths have been tested (thanks tag2015!) and seem to work correctly. The NEON path, however, has not, and needs testing by someone with access to an ARM machine.

comment:13 by criezy, 11 months ago

With this latest pull request (which has now been merged) I have not encountered any issue running ScummVM on a mac M1 with neon optimisation enabled and ASAN. However I only tested about 10 minutes in each game.
Games tested:

  • Quest for Glory II
  • 5 days a stranger
  • Trilby's notes
  • The blackwell deception
  • The blackwell legacy
  • Shardlight
  • Strangeland
  • Gemini Rue
  • Lamplight City
  • Nelly Cootalot
  • If on a winter's night four travellers

comment:14 by lotharsm, 11 months ago

Thank you so much for your work!

This should also fix the problems I had with my "buggy" GCC on Win32, right? Just asking so I can turn the optimizations back on :-D

in reply to:  14 comment:15 by fracturehill, 11 months ago

Replying to lotharsm:

Thank you so much for your work!

This should also fix the problems I had with my "buggy" GCC on Win32, right? Just asking so I can turn the optimizations back on :-D

Frankly, I have no idea; I never got around to testing the Win32 builds before the optimizations were disabled. It's definitely worth a shot, though!

comment:16 by lotharsm, 11 months ago

If you want to give it a try, I have builds for Win32 with AVX/SSE2 enabled here:

https://scummvm.serra.me/scummvm-snapshot-2.8.0git-general-avx2-win32.7z

The builds are created hourly, this is just a symlink to the very latest version, including both the 32bit and 64bit version.

in reply to:  16 comment:17 by tag2015, 11 months ago

Replying to lotharsm:

If you want to give it a try, I have builds for Win32 with AVX/SSE2 enabled here:

https://scummvm.serra.me/scummvm-snapshot-2.8.0git-general-avx2-win32.7z

The builds are created hourly, this is just a symlink to the very latest version, including both the 32bit and 64bit version.

I tested quickly on Win10, the 32bit version seems to work normally. The 64bit version instead crashes immediately with any AGS game

comment:18 by lotharsm, 11 months ago

Alright, thank you very much!

So I think I'll keep the AVX builds disabled for this release...

in reply to:  18 comment:19 by fracturehill, 11 months ago

Replying to lotharsm:

Alright, thank you very much!

So I think I'll keep the AVX builds disabled for this release...

You're running your own build, correct? Your 64-bit build is (unfortunately) expected to crash at the moment due to a memory alignment issue caused by a GCC bug. The buildbot has already been patched to sidestep the issue, so those builds won't have any issues.

If you're willing to patch the toolchain for the 32-bit build, you should ask lephilousophe which msys2 patch ended up being used (iirc there were two). If not, disabling AVX2 is the way to go; however, the SSE2 path isn't affected by the same compiler bug, so that can stay on.

Ideally this whole thing should be fixed at code level at some point, but I don't think it's feasible in time for the release...

comment:20 by lotharsm, 11 months ago

Yep, the builds that are currently crashing are made by the same toolchain that is building the official Win32 release builds.

My hope was that the GCC bug was being fixed or worked around with the latest patches.

Unfortunately, updating the toolchain is not exactly easy and too risky to do before this release.

Thank you for confirming that only AVX2 is affected, so I'll disable that while leaving the SSE2 optimizations enabled.

Updating GCC is on my to-do for the time after the release.

comment:21 by tag2015, 10 months ago

Summary: AGS: In daily 2.8.0 builds many games are unplayable and/or affected by graphical glitches or crashesAGS: AVX2 optimizations are not enabled in Win 2.8.0 release builds
Note: See TracTickets for help on using tickets.