STx 3.9 Documentation

ADASEG - RMS based automatic segmentation

Automatic segmentation using adaptive background amplitude estimation.

The segmentation algorithm works as follows. The input spectrum is truncated to the selected frequency range (fmin, fmax) and, if aw=1, the A-weighting function is applied. Then the long-time RMS value PL and the short-time RMS value PS are computed from the input spectra (see note 1). A segment (event) is detected if the short-time RMS PS is higher than PL+lsignal for a duration of at least tmin seconds. The segment time values and some parameters are stored in the output table.
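The detection rule can be illustrated with a minimal Python sketch. It is not the SPAtom's implementation: it assumes the PS and PL tracks (in dB, one value per frame) have already been computed, and the function name `detect_segments` and its return format are hypothetical.

```python
import numpy as np

def detect_segments(ps, pl, dt, lsignal=10.0, tmin=3.0):
    """Return (begin, end) times in seconds of runs where ps > pl + lsignal
    that last at least tmin seconds. ps and pl are levels in dB per frame;
    pl may also be a single value. dt is the frame hop size in seconds."""
    above = np.asarray(ps, dtype=float) > (np.asarray(pl, dtype=float) + lsignal)
    segments, start = [], None
    for i, flag in enumerate(above):
        if flag and start is None:
            start = i                      # a candidate segment begins
        elif not flag and start is not None:
            if (i - start) * dt >= tmin:   # keep only long enough runs
                segments.append((start * dt, i * dt))
            start = None
    # close a run that is still open at the end of the track
    if start is not None and (len(above) - start) * dt >= tmin:
        segments.append((start * dt, len(above) * dt))
    return segments

ps = [0] * 4 + [20] * 8 + [0] * 4 + [20] * 2 + [0] * 2
print(detect_segments(ps, 0.0, dt=0.5))
```

With a hop size of 0.5 s, the first run of eight frames lasts 4 s and is kept, while the second run of two frames lasts 1 s and is discarded because it is shorter than tmin.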

Notes

  1. Measurement method for PL and PS: The RMS values computed from the preprocessed spectra are delayed and buffered for the specified time tla/tsa. The value PL/PS is set to the level where plrms/psrms percent of the delayed RMS values are greater than PL/PS.
  2. Signal regions with a short-time RMS PS that is higher than PL+lpause are not included in the long-time level measurement.
  3. During the long-time measurement, a long-time spectrum amplitude AL is also computed. For each input spectrum included in the long-time measurement, the amplitude aL is set to the spectral amplitude where plamp percent of the spectrum amplitudes are greater than aL. The amplitude AL is the running average of all aL over the time tla. This value can be used, for example, as an offset level (floor) for the computation of centroid and spread of the center-segments.
  4. The center-segment is the part of the segment where the RMS level is higher than p01-oamax. The center-segment boundaries (tbc and tec) are detected by scanning the RMS track backward and forward, starting at the maximum RMS value inside the segment.
  5. The RMS percentage levels pXX are computed for each channel over the whole RMS track of the segment. The value pXX is the RMS level where XX percent of the RMS values are higher than pXX.
  6. All computations of RMS values are performed on the squared energy track and then converted to dB.
  7. Stereo signal processing is performed if the input A2 is connected. For the segmentation algorithm both channels are mixed, but the center-segment detection and parameter extraction are performed for each channel.
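The percentage levels of notes 1 and 5 and the center-segment scan of note 4 can be sketched in Python as follows. This is an illustration under assumptions, not the SPAtom's code: the function names are hypothetical, the percentage level is approximated with NumPy's interpolating percentile, and the dB conversion assumes the usual 10*log10 of squared amplitude relative to aref squared.

```python
import numpy as np

def exceedance_level(values, percent):
    """Level that `percent` percent of the values exceed (notes 1 and 5).
    'p percent are higher' corresponds to the (100 - p)th percentile."""
    return float(np.percentile(np.asarray(values, dtype=float), 100.0 - percent))

def energy_to_db(energy, aref=20e-6):
    """Convert a squared amplitude (energy) to dB relative to aref (note 6)."""
    return 10.0 * np.log10(energy / aref ** 2)

def center_segment(rms_db, threshold_db):
    """Scan backward and forward from the RMS maximum while the level
    stays above threshold_db (note 4). Returns inclusive frame indices."""
    rms_db = np.asarray(rms_db, dtype=float)
    peak = int(np.argmax(rms_db))
    begin = peak
    while begin > 0 and rms_db[begin - 1] > threshold_db:
        begin -= 1
    end = peak
    while end < len(rms_db) - 1 and rms_db[end + 1] > threshold_db:
        end += 1
    return begin, end

rms_db = [50, 62, 70, 75, 71, 58, 66]       # RMS track of one segment in dB
p01 = exceedance_level(rms_db, 1)           # level exceeded by 1% of the frames
print(center_segment(rms_db, p01 - 10.0))   # oamax = 10 dB
```

Because the scan stops at the first frame at or below the threshold on each side of the maximum, a later frame that rises above the threshold again (the last value in the example) is not part of the center-segment.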

This SPAtom was developed for the NOIDESc project in 2006.

Usage:

ADASEG A1 [ A2 ] TABPAR TABDAT TABSEG

Inputs:

A1

The FFT amplitude spectrum (linear) of the 1st channel.

A2

The FFT amplitude spectrum (linear) of the 2nd channel.

TABPAR

The control table. This is an extended or parameter table with 1 numeric field and at least 2 rows.

Row/Column Index   Name      Description                                       Default Value
[0,0]              dt        frame hop size in seconds                         no default
[1,0]              df        FFT frequency resolution in Hz                    no default
[2,0]              aw        enable (1) or disable (0) spectral A-weighting    1
[3,0]              aref      reference amplitude (for dB conversions)          20e-6
[4,0]              fmin      lower boundary of analysis band in Hz             0
[5,0]              fmax      upper boundary of analysis band in Hz             8000
[6,0]              tla       long-time average time in seconds                 60
[7,0]              plrms     percentage for long-time RMS in %                 95
[8,0]              plamp     percentage for long-time amplitude in %           25
[9,0]              tsa       short-time average time in seconds                1
[10,0]             psrms     percentage for short-time RMS in %                95
[11,0]             lsignal   signal offset level in dB                         10
[12,0]             lpause    pause offset level in dB                          6
[13,0]             tmin      minimum segment duration in seconds               3
[14,0]             oamax     offset level for segment center in dB             10

The following conditions apply:

df > 0, 0 ≤ fmin < fmax < nA*df (nA = length of A1, A2)

dt > 0, tsa ≥ 20*dt, tla ≥ 10*tsa, tmin ≥ 2*tsa

1 ≤ plrms ≤ 99, 1 ≤ psrms ≤ 99

lpause ≥ 3, lsignal ≥ lpause, oamax ≥ 3
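A parameter set can be checked against these conditions before it is written to the control table. The following Python sketch reads the conditions as lower and upper bounds, which is an assumption, and takes the default values from the TABPAR table; `DEFAULTS`, `check_params` and `n_bins` (standing for nA) are hypothetical names.

```python
# Defaults copied from the TABPAR table; dt and df have no default.
DEFAULTS = {
    "aw": 1, "aref": 20e-6, "fmin": 0.0, "fmax": 8000.0,
    "tla": 60.0, "plrms": 95.0, "plamp": 25.0,
    "tsa": 1.0, "psrms": 95.0,
    "lsignal": 10.0, "lpause": 6.0, "tmin": 3.0, "oamax": 10.0,
}

def check_params(dt, df, n_bins, **overrides):
    """Merge overrides into the defaults and list every violated condition.
    n_bins is the spectrum length nA. Returns (parameters, violations)."""
    p = {**DEFAULTS, **overrides, "dt": dt, "df": df}
    rules = [
        ("df > 0", p["df"] > 0),
        ("0 <= fmin < fmax < nA*df", 0 <= p["fmin"] < p["fmax"] < n_bins * p["df"]),
        ("dt > 0", p["dt"] > 0),
        ("tsa >= 20*dt", p["tsa"] >= 20 * p["dt"]),
        ("tla >= 10*tsa", p["tla"] >= 10 * p["tsa"]),
        ("tmin >= 2*tsa", p["tmin"] >= 2 * p["tsa"]),
        ("1 <= plrms <= 99", 1 <= p["plrms"] <= 99),
        ("1 <= psrms <= 99", 1 <= p["psrms"] <= 99),
        ("lpause >= 3", p["lpause"] >= 3),
        ("lsignal >= lpause", p["lsignal"] >= p["lpause"]),
        ("oamax >= 3", p["oamax"] >= 3),
    ]
    return p, [name for name, ok in rules if not ok]

params, violations = check_params(dt=0.1, df=10.0, n_bins=1025)
print(violations)
```

In this example the default tsa of 1 s is too short for a hop size of 0.1 s, so the only reported violation is the tsa condition; raising tsa (and with it tla and tmin) or lowering dt resolves it.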

TABSEG

The segment output table. This is an extended or parameter table with 10 numeric fields and must be empty on initialization.

General segment parameters for the i-th segment:

Row/Column Index   Name   Description
[i,0]              tb     segment begin time in seconds
[i,1]              te     segment end time in seconds
[i,2]              tl     segment duration in seconds
[i,3]              pl     long-time RMS in dB
[i,4]              al     long-time spectral cut-off amplitude in dB

Center-segment parameters for channel 1 (always computed):

(for each channel where ch = 1..n)

Row/Column Index   Name    Description
[i,5]              tbc     begin time in seconds
[i,6]              tec     end time in seconds
[i,7]              tcl     duration in seconds
[i,8]              toc     offset to segment begin (tbc-tb)
[i,9]              p95     RMS level reached by 95% of the segment frames (dB). See note 5.
[i,10]             p05     RMS level reached by 5% of the segment frames (dB).
[i,11]             p01     RMS level reached by 1% of the segment frames (dB).
[i,12]             pamax   The average energy level of the center-segment (dB).
[i,13]             peq     The average energy level for the whole segment (dB).
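Once the segment table has been exported, a row can be mapped to named fields for further processing. The Python sketch below covers only the general and channel 1 columns listed above. `SegmentRow` and `decode_row` are hypothetical names, the center-segment end time is called `tec` as in note 4, and how the row reaches Python (for example via a text export) is left open.

```python
from typing import NamedTuple, Sequence

class SegmentRow(NamedTuple):
    # general segment parameters, columns 0..4
    tb: float
    te: float
    tl: float
    pl: float
    al: float
    # center-segment parameters for channel 1, columns 5..13
    tbc: float
    tec: float
    tcl: float
    toc: float
    p95: float
    p05: float
    p01: float
    pamax: float
    peq: float

def decode_row(row: Sequence[float]) -> SegmentRow:
    """Map the first 14 columns of one TABSEG row to named fields."""
    n = len(SegmentRow._fields)
    if len(row) < n:
        raise ValueError(f"expected at least {n} columns, got {len(row)}")
    return SegmentRow(*(float(v) for v in row[:n]))

row = decode_row([2.0, 6.0, 4.0, 48.0, 30.0, 3.0, 5.0, 2.0, 1.0,
                  55.0, 72.0, 74.0, 70.0, 66.0])
print(row.tbc - row.tb, row.toc)
```

The example values are made up; they are chosen so that toc equals tbc-tb, as the table states.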

Center-segment parameters for channel 2 (computed only if A2 is connected).

© 2009 The Austrian Academy of Sciences Acoustics Research Institute