Audio SPU Presentation

Audio Rendering Parameter Generation on the SPU

A tale of porting a heavyweight function.

SoundEngine::Process()

Fills out all sound rendering parameters that are passed into the synthesis engine.

– E.g., volume, azimuth, elevation, reverb send levels, filter cutoff.

Rendering parameters are calculated from:

– listener position,

– listener direction,

– sound positions, and

– output from indirect audio.

Called once per sound, per frame.

Perfect candidate for parallelization.
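As a rough illustration of the per-sound work described above, the sketch below derives a volume, azimuth and elevation from a listener and a sound position. Vec3, Listener, RenderParams and ComputeRenderParams are hypothetical names, and the inverse-distance rolloff and the listener's forward/right/up basis are assumptions, not the engine's actual formulas.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical minimal types; the real engine uses its own math library.
struct Vec3 { float x, y, z; };

static Vec3  Sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Listener {
    Vec3 position;
    Vec3 forward, right, up;   // assumed to be an orthonormal basis
};

struct RenderParams {
    float volume;      // linear gain in (0, 1]
    float azimuth;     // radians, 0 = straight ahead, positive = to the right
    float elevation;   // radians, positive = above the listener
};

// Assumed rolloff: full volume inside refDistance, then refDistance / distance.
RenderParams ComputeRenderParams(const Listener& l, Vec3 soundPos, float refDistance)
{
    assert(refDistance > 0.0f);
    const Vec3 d = Sub(soundPos, l.position);

    // Express the offset in the listener's local frame.
    const float fx = Dot(d, l.right);
    const float fy = Dot(d, l.up);
    const float fz = Dot(d, l.forward);
    const float dist = std::sqrt(fx * fx + fy * fy + fz * fz);

    RenderParams p;
    p.volume    = (dist <= refDistance) ? 1.0f : refDistance / dist;
    p.azimuth   = std::atan2(fx, fz);
    p.elevation = std::atan2(fy, std::sqrt(fx * fx + fz * fz));
    return p;
}
```

Each call depends only on the listener and one sound, which is what makes the work easy to split into independent jobs.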

SPURS Jobs: A convenient fit

Appropriate for short bits of code that are executed many times on independent data.

SPURS will take care of scheduling and will play nicely with other SPURS workloads.

SPURS takes care of running multiple SoundEngineProcessJobs, which have been queued up in a job list, in parallel.

SPURS takes care of pipelining asynchronous I/O.

– While the current job is running, the output from the previous job is being DMA’d back to the PPU, and/or the input for the next job is being DMA’d to the SPU.
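The overlap described above can be pictured as a double-buffered loop. The sketch below is a plain C++ simulation, not SPURS code: std::memcpy stands in for DMA, everything runs sequentially, and JobInput, JobOutput, ProcessJob and RunPipelined are hypothetical names. On real hardware the copies marked "DMA" would proceed asynchronously while the compute step runs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct JobInput  { float distance; };
struct JobOutput { float volume; };

// Stand-in for the per-sound work done on the SPU.
static JobOutput ProcessJob(const JobInput& in)
{
    return { 1.0f / (1.0f + in.distance) };
}

void RunPipelined(const std::vector<JobInput>& mainIn, std::vector<JobOutput>& mainOut)
{
    assert(mainIn.size() == mainOut.size());
    const std::size_t n = mainIn.size();
    if (n == 0) return;

    JobInput  localIn[2];    // two "local store" input buffers
    JobOutput localOut[2];   // two "local store" output buffers

    // Prefetch the first job's input ("DMA in").
    std::memcpy(&localIn[0], &mainIn[0], sizeof(JobInput));

    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t cur = i & 1, nxt = cur ^ 1;

        // Fetch the next job's input into the other buffer ("DMA in").
        if (i + 1 < n)
            std::memcpy(&localIn[nxt], &mainIn[i + 1], sizeof(JobInput));

        // Compute on the current buffer.
        localOut[cur] = ProcessJob(localIn[cur]);

        // Write the result back to main memory ("DMA out").
        std::memcpy(&mainOut[i], &localOut[cur], sizeof(JobOutput));
    }
}
```

With SPURS jobs this buffering is handled by the job manager, so the job code itself only sees its input and output buffers.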

Determining Inputs

Listener position, direction.

Sound position.

Static sound rendering parameters.

– Defined for each individual sound in <sound_bank>.csv

– Encapsulated in “Spatial Info” struct

Filtered render state.

– Output data from indirect audio, smoothed over time.

Reverb preset for sound’s position in space.

Is the sound visible to the listener?

– Causes direct sound to be filtered.

Is the sound in the same reverb region as the listener?

– Causes reverberated sound to be filtered.

Determining Outputs

SoundParams struct.

– Passed into SCREAM to tell NextSynth how to render the sound.

– Defines vol, pan, pitch, SCREAM registers, various filters, send levels, special FX, etc.

Reverberation accumulation.

– There are many sounds, but only 6 reverb units.

– Reverb units “accumulate” directional gain from each currently playing sound.
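One way to picture the accumulation step is sketched below. The layout is an assumption: each reverb unit keeps a running total gain and a gain-weighted direction sum, and each sound contributes to one unit. ReverbAccum, SoundReverbSend and Accumulate are hypothetical names; only the count of 6 units comes from the text above.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t kNumReverbUnits = 6;   // from the text: only 6 reverb units

// Hypothetical accumulator: total send gain plus a gain-weighted direction sum.
struct ReverbAccum {
    float gain;
    float dirX, dirY, dirZ;
};

// Hypothetical per-sound contribution to one reverb unit.
struct SoundReverbSend {
    std::size_t unit;        // which reverb unit this sound feeds
    float gain;              // send level for this sound
    float dirX, dirY, dirZ;  // unit vector from the listener towards the sound
};

// Adds every sound's contribution to the unit it feeds.
void Accumulate(ReverbAccum (&units)[kNumReverbUnits],
                const SoundReverbSend* sends, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i) {
        const SoundReverbSend& s = sends[i];
        assert(s.unit < kNumReverbUnits);
        ReverbAccum& u = units[s.unit];
        u.gain += s.gain;
        u.dirX += s.gain * s.dirX;
        u.dirY += s.gain * s.dirY;
        u.dirZ += s.gain * s.dirZ;
    }
}
```

Unlike SoundParams, this output is shared between sounds. If jobs run in parallel, one option is for each job to fill its own partial accumulators, which are summed afterwards; the slides do not say how the original handles this.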

Movin’ stuff around.

Pointers are no longer valid after data has been copied to the SPU!

Classes with virtual functions will not work without some v-table patching trickery. They are best avoided.

SPUs like data in flat arrays that are aligned to 16-byte boundaries, so they can be easily copied.

Simplify complex classes into structs for data that is copied to/from the SPU.

Use structures of arrays where appropriate for vectorization.
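The rules above can be enforced at compile time. The following is a minimal sketch in standard C++11; SpuSoundInput and IsDmaFriendly are hypothetical names, and the field layout is invented for illustration. It shows a payload with no pointers or virtual functions, 16-byte alignment, and a size that is a multiple of 16 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical SPU-friendly payload: no pointers, no virtuals, 16-byte aligned.
struct alignas(16) SpuSoundInput {
    float         position[4];      // xyz plus one float of padding
    float         staticParams[4];  // illustrative static rendering parameters
    std::uint32_t soundIndex;       // an index instead of a pointer
    std::uint32_t pad[3];           // keeps the size a multiple of 16 bytes
};

static_assert(std::is_trivially_copyable<SpuSoundInput>::value,
              "must be copyable byte-for-byte");
static_assert(!std::is_polymorphic<SpuSoundInput>::value,
              "no virtual functions, so no v-table pointer to patch");
static_assert(alignof(SpuSoundInput) == 16, "must be 16-byte aligned");
static_assert(sizeof(SpuSoundInput) % 16 == 0, "size must be a multiple of 16");

// Returns true if both the address and the byte count are multiples of 16.
bool IsDmaFriendly(const void* p, std::size_t bytes)
{
    assert(p != nullptr);
    return (reinterpret_cast<std::uintptr_t>(p) % 16 == 0) && (bytes % 16 == 0);
}
```

Replacing a pointer with an index into a known array is one common way to keep a reference meaningful on both sides of the copy.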

/*
 * SoundEmitter (after: struct of arrays)
 */
class SoundEmitter : public SoundEmitterBase
{
public:
    struct SoundEntryVector
    {
        SoundEntryVector(uint size);
        ~SoundEntryVector();

        SoundInstancePtr AddSound( … );
        void Erase(unsigned int i);

        // One array per field; entry i of every array describes the same sound.
        SoundInstancePtr* mSoundsArray;
        SoundParams*      mParamsArray;
        float*            mElapsedTimesArray;
        float*            mTriggerTimesArray;
        SpatialInfo*      mSpatialInfoArray;
        const SoundDef**  mSoundDefsArray;

        unsigned int size;
        unsigned int maxSize;
    };

    SoundEntryVector mSoundEntries;
    …
};

/*
 * SoundEmitter (before: list of entries)
 */
class SoundEmitter : public SoundEmitterBase
{
public:
    struct SoundEntry : public AudioPtrBase<SoundEntry>::type
    {
        typedef AudioPtr<SoundEntry>::type Ptr;

        SoundEntry();
        ~SoundEntry();

        SoundInstancePtr mSound;
        PlayParams       mParams;
        float            mElapsedTime;
        float            mTriggerTime;
    };

    typedef List<SoundEntry::Ptr> SoundEntryList;
    SoundEntryList mSoundEntries;
    …
};

A SoundEmitter now keeps track of the sounds that it is emitting in a struct of arrays instead of a list. This allows the SoundParams to be DMA’d directly into the array.
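To make the benefit concrete, here is a minimal struct-of-arrays sketch. It is not the SoundEntryVector implementation: SoundTable, Params and WriteBackParams are hypothetical, the capacity is fixed for brevity, and the swap-with-last erase strategy is an assumption. std::memcpy stands in for the DMA write-back.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Params { float volume; };   // stand-in for SoundParams

// Hypothetical fixed-capacity struct of arrays: each field lives in its own
// contiguous array, so one field can be transferred for all sounds at once.
struct SoundTable {
    static const std::size_t kMax = 8;

    int         ids[kMax];
    Params      params[kMax];
    float       elapsed[kMax];
    std::size_t size;

    SoundTable() : size(0) {}

    // Appends a sound and returns its index.
    std::size_t Add(int id)
    {
        assert(size < kMax);
        ids[size] = id;
        params[size].volume = 0.0f;
        elapsed[size] = 0.0f;
        return size++;
    }

    // Assumed erase strategy: move the last entry into the hole so that
    // every array stays dense (entry order is not preserved).
    void Erase(std::size_t i)
    {
        assert(i < size);
        const std::size_t last = size - 1;
        ids[i]     = ids[last];
        params[i]  = params[last];
        elapsed[i] = elapsed[last];
        size = last;
    }

    // Stand-in for DMA-ing a batch of results straight into the params array.
    void WriteBackParams(const Params* results, std::size_t n)
    {
        assert(n <= size);
        std::memcpy(params, results, n * sizeof(Params));
    }
};
```

With a linked list of entries, the same write-back would need one copy per node; with parallel arrays it is a single contiguous transfer.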

class IIR : public AlignedObject
{
public:
    IIR();
    virtual ~IIR();

    float GetValue() const;
    void SetValue(float value);
    void SetConstant(float constant);
    virtual void AddSample(float newsample);

protected:
    float m_curval;
    float m_constant;
    float m_invConstant;
};

class TimeDependentIIR : public IIR
{
public:
    TimeDependentIIR();
    virtual ~TimeDependentIIR();

    void SetHalfLife(float halflife);
    float GetHalfLife() const;
    // Note: the signature differs from IIR::AddSample(float), so this
    // hides the base function rather than overriding it.
    virtual void AddSample(float sample, float dt);

protected:
    float m_halfLife;
};

class IIRArray : public AlignedObject
{
public:
    IIRArray();

    float GetValue(unsigned int idx) const;
    void GetValuesRange(u32 offs, float* out, u32 n) const;
    void SetValuesRange(u32 offs, const float* value, u32 n);
    void SetHalfLifes(const float* halflife);
    float GetHalfLife(unsigned int idx) const;
    void AddSamples(const float* newsamples, float deltatime);

protected:
    static const uint ARRAY_SIZE = 24;
    static const uint NUM_VECS = ARRAY_SIZE/4;

    union { vector float m_curval[NUM_VECS];      float m_curval_s[ARRAY_SIZE]; };
    union { vector float m_constant[NUM_VECS];    float m_constant_s[ARRAY_SIZE]; };
    union { vector float m_invConstant[NUM_VECS]; float m_invConstant_s[ARRAY_SIZE]; };
    union { vector float m_halfLife[NUM_VECS];    float m_halfLife_s[ARRAY_SIZE]; };
};

The IIR class gets flattened out and vectorized. It turns out there is only one type of IIR in use.
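The slides do not show the filter update itself, so the sketch below assumes a common half-life smoother: cur = c * cur + (1 - c) * sample, with c = 0.5^(dt / halfLife). FlatIIR and this AddSamples are hypothetical, written in portable scalar C++; the inner loop over four lanes stands in for a single "vector float" operation on the SPU.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

const std::size_t kArraySize = 24;   // matches ARRAY_SIZE in IIRArray

// Hypothetical flattened filter bank: one slot per filtered value.
struct FlatIIR {
    alignas(16) float cur[kArraySize];
    alignas(16) float halfLife[kArraySize];
};

// Assumed update: cur = c * cur + (1 - c) * sample, with c = 0.5^(dt / halfLife).
void AddSamples(FlatIIR& f, const float* samples, float dt)
{
    assert(dt >= 0.0f);
    // Walk the array four floats at a time, one step per 4-wide vector.
    for (std::size_t v = 0; v < kArraySize; v += 4) {
        for (std::size_t lane = 0; lane < 4; ++lane) {
            const std::size_t i = v + lane;
            assert(f.halfLife[i] > 0.0f);
            const float c = std::pow(0.5f, dt / f.halfLife[i]);
            f.cur[i] = c * f.cur[i] + (1.0f - c) * samples[i];
        }
    }
}
```

Because every slot uses the same update, all filters can be advanced in one pass with no virtual calls.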

class FilteredRenderState : public AlignedObject, public NonCopyable
{
    …
private:
    SMath::Vector m_unfilteredIndirectDirection;
    SMath::Point  m_unfilteredIndirectPosition;

    float m_unfilteredIndirectDistance;
    TimeDependentIIR m_filteredIndirectDistance;

    float m_unfilteredDirectDistance;
    TimeDependentIIR m_filteredDirectDistance;

    float m_unfilteredDirectFocus;
    TimeDependentIIR m_filteredDirectFocus;

    float m_unfilteredIndirectFocus;
    TimeDependentIIR m_filteredIndirectFocus;

    TimeDependentIIR m_filteredIndirectDirectionX;
    TimeDependentIIR m_filteredIndirectDirectionY;
    TimeDependentIIR m_filteredIndirectDirectionZ;

    TimeDependentIIR m_filteredIndirectPositionX;
    TimeDependentIIR m_filteredIndirectPositionY;
    TimeDependentIIR m_filteredIndirectPositionZ;

    float m_unfilteredDirectOcclusionLevel;
    TimeDependentIIR m_filteredDirectOcclusionLevel;

    float m_unfilteredIndirectOcclusionLevel;
    TimeDependentIIR m_filteredIndirectOcclusionLevel;

    float m_unfilteredIndirectObstructionLevel;
    TimeDependentIIR m_filteredIndirectObstructionLevel;

    float m_unfilteredDirectObstructionLevel;
    TimeDependentIIR m_filteredDirectObstructionLevel;

    float m_unfilteredSourceReverbGains[NUM_REVERB_GAINS];
    TimeDependentIIR m_filteredSourceReverbGains[NUM_REVERB_GAINS];
};

class FilteredRenderState : public AlignedObject, public NonCopyable
{
public:
    enum
    {
        DirectDistanceIdx,
        DirectFocusIdx,
        DirectOcclusionLevelIdx,
        DirectObstructionLevelIdx,
        IndirectDistanceIdx,
        IndirectFocusIdx,
        IndirectOcclusionLevelIdx,
        IndirectObstructionLevelIdx,
        IndirectDirection0Idx,
        …
        IndirectPosition0Idx,
        …
        SourceReverbGain0Idx,
        …
        NUM_INDICIES
    };

    …

    float    m_unfilteredValues[NUM_INDICIES];
    IIRArray m_filteredValues;
};

There…

…much better!

Comparison of PPU/SPU methods

• 3-player game.

• After data structure reorganization.

• SoundEngine::PreUpdate() with PPU Process() call (PreUpdate calls Process directly):

– Min 1.27 ms

– Max 5.51 ms

– Avg 2.959 ms

• SoundEngine::PreUpdate() with SPU Process() call (PreUpdate queues up SPU jobs):

– Min 529.2 µs

– Max 3.18 ms

– Avg 1.16 ms
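To put the two sets of timings side by side, the ratios can be computed directly. The helper below is a trivial hypothetical (Speedup is not from the slides); it only divides the PPU-path time by the SPU-path time for the same statistic, with both values in milliseconds.

```cpp
#include <cassert>

// Ratio of PreUpdate time with the PPU path to the time with the SPU path.
// A value above 1 means the SPU path is faster.
double Speedup(double ppuMs, double spuMs)
{
    assert(spuMs > 0.0);
    return ppuMs / spuMs;
}
```

Using the figures above: Speedup(2.959, 1.16) is about 2.55 for the average, Speedup(1.27, 0.5292) is about 2.40 for the minimum, and Speedup(5.51, 3.18) is about 1.73 for the maximum.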
