#8660 closed patch (outdated)
ARM: Inline assembly for saturated addition of sound
Reported by: | SF/robinwatts | Owned by: | SF/knakos |
---|---|---|---|
Priority: | normal | Component: | Audio |
Version: | Keywords: | ||
Cc: | Game: |
Description
The "clampedAdd" function is used to do saturated addition of sound sample values. There is a trick that can be used on ARM chips to do this operation that cannot easily be expressed in C.
This patch uses gcc's inline assembler to implement the trick.
Robin
Ticket imported from: #1717959. Ticket imported from: patches/765.
Attachments (1)
Change History (14)
by , 18 years ago
comment:1 by , 18 years ago
Owner: | set to |
---|
comment:2 by , 18 years ago
Summary: | WINCE Inline ARM assembly for saturated addition of sound → ARM: Inline assembly for saturated addition of sound |
---|
comment:3 by , 18 years ago
1) I'm dropping the 'WINCE' summary prefix as this applies to all arm ports.
2) The routine looks good, implementing the clamped addition by means of the overflow flag and conditional execution trickery. Note that the ST_SAMPLE_MAX and ST_SAMPLE_MIN are implied since all calculations take place in the 16 msbits of the regs. Aside from the code compactness, the main gain here is the non-conditional branching way this is coded, which results in no instruction pipeline stalls. The compiler won't implement such stuff this efficiently although I haven't check it to make sure.
So two questions arise: 1) Does this make so much a difference in speed of execution? I get that it's real-time etc, but do we really need it? Don't get me wrong, I get anything I can take and would incorporate such a routine immediately, but considering also question 2) Should we clutter the scummvm core with port-specific routines? This is not something I can answer and would expect a response from the team leaders before going through with it. Also to be taken into account is the fact that this code resides in a header file, which gets less in the way when browsing code.
comment:4 by , 18 years ago
1) Does it make that much difference to the speed of execution - well, staring at the assembler output, it looks to save a couple of cycles per sample. Normally, I'd reckon that was well worth having in a sound fill routine - but looking at how the rest of the routine is compiled by gcc, I'm not sure it'll actually make that much difference :(
2) Should we clutter the core with port specific routines - I understand your reservations here, and I agree with them in principle - nothing can be more nasty to read than a huge mass of ifdefs. If the scummvm team wish to refuse the patch for this reason I'd entirely understand.
Other parts of the system (such as scalers) do appear to have assembler versions, but these are mostly in separate files - maybe I should consider just ARM coding the complete sound fill routines?
comment:5 by , 18 years ago
I have ARM code versions of the Simple and Copy rate controllers working here. I intend to look at an ARM version of the Linear rate controller this weekend. So this patch is probably irrelavent now.
comment:6 by , 18 years ago
Deassigning myself until:
1) Robin submits another patch with more functions (in which case we can close this), or 2) Until we get a response from the team leaders about their intention include such code fragments in the core (see cluttering argument below)
comment:7 by , 18 years ago
Owner: | removed |
---|
comment:8 by , 18 years ago
Well, clobbering our core code with assembly code is not acceptable.
It could be addressed at least this way:
#if !defined(ARM) static inline void clampedAdd(int16& a, int b) { ... #else #include "my/arm/specific/code.cpp" #endif
At least this will be less distractive.
comment:9 by , 18 years ago
Status: | new → closed |
---|
comment:10 by , 18 years ago
I have raised a new patch (1721826) with fully ARM code optimised versions of the rate conversion/mixing code. These work on WinCE (for all the games I have tested them with), and should work on other ARM based platforms too.
Hopefully the way that patch is structured (all changes in new files) will be acceptable to your 'neatness' criteria.
I am therefore closing this patch. I trust this is OK?
Thanks,
Robin
comment:12 by , 18 years ago
Owner: | set to |
---|---|
Resolution: | → outdated |
comment:13 by , 6 years ago
Component: | → Audio |
---|
Inline assembly ARM version of clampedAdd