
In order to start a recording session in STx, proceed as follows:

  • Select STx-Recorder Setup
  • Specify record settings (sampling frequency, # of channels etc.)
  • Select preferred recorder layout. Note: auto-run RTAnalyser is only available under Windows XP and Vista due to system limitations in Windows NT and 2000.
  • Specify the Temp-Directory for the recorder soundfile location (or use the default)
  • Start recording [image: record_start]
  • Stop recording and save the recorder sound file to the destination directory under the desired name (*.wav). The recorded soundfile appears in the STx sound file list.

Note: The default soundfile naming uses the actual date and time, such as rec-2008.06.05-12.55.37.wav. 
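The naming pattern in the example above can be reproduced outside STx. The following is a minimal sketch, assuming the pattern is `rec-` followed by year, month, day, hour, minute and second separated by dots; the function name `default_recording_name` is hypothetical and not part of STx.

```python
from datetime import datetime


def default_recording_name(when: datetime, prefix: str = "rec") -> str:
    """Build a name such as rec-2008.06.05-12.55.37.wav from a timestamp.

    Hypothetical helper: it only mimics the pattern shown in the note,
    it does not call into STx.
    """
    return f"{prefix}-{when:%Y.%m.%d-%H.%M.%S}.wav"


example_name = default_recording_name(datetime(2008, 6, 5, 12, 55, 37))
print(example_name)  # rec-2008.06.05-12.55.37.wav
```

Because the fields run from most to least significant, names built this way sort chronologically when sorted alphabetically.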

Note: externally recorded (i.e., already existing) sound files can be imported into STx by using any of the following methods:

  • From the Windows Explorer, drag the respective file(s) into the STx-workspace and drop them there.
  • Use the FindFile tool of STx to import them into the STx-DataSet.
  • Use the File→Soundfile→Open option from the pop up menu.

Note: The STx-Recorder automatically creates a soundfile segment named Signal.All in the soundfile segment list, with a start address of 0 samples and a duration equal to the length of the recorded signal in samples.


STx currently supports sampling frequencies from 500 Hz to 2 MHz at word lengths of 8, 16 and 24 Bits, provided that these formats are supported by the hardware and driver of your Windows sound subsystem. Up to 64 channels (48 kHz, 24 Bits) have been recorded simultaneously by means of an 8 x 8 microphone array. A binary word length of > 24 bits, higher sampling frequencies and alternative soundfile formats are available on request. For further hints about performing digitization see the Sections STx-Recorder Setup, I/O and General Soundfile Setup (the latter are only available in the STx documentation/help).
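To get a feeling for the data volume of such a multi-channel recording, the raw data rate can be estimated from sampling frequency, word length and channel count. This is a rough sketch that assumes uncompressed linear PCM with each sample packed into whole bytes (24 Bits = 3 bytes) and ignores file header overhead; `pcm_data_rate` is a hypothetical helper name.

```python
def pcm_data_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Return the raw PCM data rate in bytes per second.

    Assumes packed linear PCM, i.e. the word length is a multiple of 8 bits.
    """
    if bits_per_sample % 8 != 0:
        raise ValueError("word length must be a multiple of 8 bits")
    return sample_rate_hz * (bits_per_sample // 8) * channels


# The 64-channel, 48 kHz, 24 Bit array recording mentioned above.
array_rate = pcm_data_rate(48_000, 24, 64)
print(array_rate)  # 9216000 bytes per second
```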


  

Digitization Preliminaries

Digital Signal Processing is based on the acquisition and storage of data; for the most part these data consist of digitized analogue signals. Digitization should therefore be done very carefully, with the mindset of never having to do it twice. This mindset implies that several digitization parameters and data management conditions are laid down at the very beginning of each project.


| Name | Processing unit | Parameter range | Condition(s) | Note(s) |
|---|---|---|---|---|
| Sampling frequency | Analogue to digital converter (ADC) and digital to analogue converter (DAC), anti-aliasing filters | From DC to a value (Hz) considerably more than double the highest frequency component of the analogue signals to be digitized (system bandwidth) | As standard sound cards of personal computers don't guarantee accurate and stable sampling frequencies, calibration is advisable | Common sampling frequencies in audio are: 8000 / 11025 / 22050 / 32000 / 44100 / 48000 / 96000 / 192000 Hz |
| Quantization, Bit resolution, dynamic range | ADC/DAC; with uniform quantization the dynamic range (DR) for n Bits is DR(dB) = 20 log10(2^n) | Linear coding from 8 to 24 Bit, usually in steps of 8 Bit, and floating point | The dynamic range finally depends on the signal to noise ratio (SNR) of the available analogue preamplifier | A dynamic range of 120 dB corresponds to a voltage range such as from 10 µV to 10 V |
| Nominal level, alignment level | ADC and analogue pre-amplifier | In audio the nominal level is usually set to +4 dBu (1.228 VRMS) | A sine wave of 1 kHz is applied as reference | For sinusoids, RMS (root mean square) and peak values are related by Vpeak = 1.41 VRMS (peak to peak: Vpp = 2.83 VRMS) |
| Input level control | Peak Program Meter (PPM) and analogue pre-amplifier | Up to the absolute (peak) full scale value of the ADC | To avoid signal clipping, a peak headroom of at least 12 dB (peak level of -12 dB re full scale or lower) is commonly used | Important: in contrast to analogue recording devices, ADCs have no headroom available |
| Signal calibration | In audio: plug-on microphone calibrator | Calibrated analogue signal source, ~94 dB SPL | A sinusoidal tone of 1 kHz is applied as reference | Each recording should be associated with a proper calibration signal |

Table 1. Some commonly used digitization parameters for the transfer of analogue signals and caveats.
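The level figures in Table 1 can be checked with a few lines of arithmetic. The sketch below assumes the usual definition of 0 dBu as about 0.7746 VRMS (1 mW into 600 ohms) and the standard sine wave relations (peak = √2 · RMS, peak to peak = 2√2 · RMS); all function names are hypothetical.

```python
import math

# Assumption: 0 dBu corresponds to sqrt(0.6) V RMS, about 0.7746 V.
DBU_REFERENCE_VRMS = math.sqrt(0.6)


def dbu_to_vrms(level_dbu: float) -> float:
    """Convert a level in dBu to an RMS voltage in volts."""
    return DBU_REFERENCE_VRMS * 10 ** (level_dbu / 20)


def sine_peak_from_rms(v_rms: float) -> float:
    """Peak voltage of a sine wave with the given RMS voltage."""
    return math.sqrt(2) * v_rms


def sine_peak_to_peak_from_rms(v_rms: float) -> float:
    """Peak to peak voltage of a sine wave with the given RMS voltage."""
    return 2 * math.sqrt(2) * v_rms


def db_to_voltage_ratio(level_db: float) -> float:
    """Voltage ratio that corresponds to a level difference in dB."""
    return 10 ** (level_db / 20)


nominal_vrms = dbu_to_vrms(4.0)
print(f"+4 dBu = {nominal_vrms:.3f} VRMS")
print(f"sine peak at +4 dBu = {sine_peak_from_rms(nominal_vrms):.3f} V")
print(f"120 dB = voltage ratio of {db_to_voltage_ratio(120):.0f}")
```

A ratio of one million is exactly the span between 10 µV and 10 V quoted in the table.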

Hints for proper analog to digital conversion level control

As the STx recorder level control provides a true peak level reading, the analog input level should be set to between -12 dB and -20 dB on average in order to maintain enough headroom for signal peaks. For more information on standard input levels for analog to digital conversion, see the entry Commonly used Voltage and Audio Levels in the STx manual.
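A recorded signal can be checked against this recommendation after the fact. The following sketch computes the sample peak level relative to full scale for samples normalised to the range -1.0 to 1.0. Note the assumption: this is a plain sample peak, not the true peak reading of the STx recorder, and the function names are hypothetical.

```python
import math


def peak_level_dbfs(samples, full_scale: float = 1.0) -> float:
    """Return the sample peak level in dB relative to full scale."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak / full_scale)


def headroom_db(samples, full_scale: float = 1.0) -> float:
    """Return the remaining headroom below full scale in dB."""
    return -peak_level_dbfs(samples, full_scale)


# Test tone: 1 kHz sine at a quarter of full scale, one second at 48 kHz.
tone = [0.25 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
tone_level = peak_level_dbfs(tone)
print(f"peak level: {tone_level:.2f} dB re full scale")
print(f"headroom:   {headroom_db(tone):.2f} dB")
```

A peak at one quarter of full scale corresponds to roughly -12 dB, the upper end of the range recommended above.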

Principles of Digitization and Soundfiles

In order to digitise analogue signals properly - which turns out to be a lengthy process - a few preliminaries should be considered. Whenever an analogue signal is to be digitised, the process should be planned and executed with a mindset not to do so twice. This mindset implies:

  • Performing analogue to digital (A/D) conversion and digital to analogue (D/A) conversion at the highest sample rate appropriate to the nature and the information content of the originals (i.e. > 2 times the highest signal frequency component of the sound source to be captured).
  • Performing analogue to digital conversion at the highest resolution appropriate to the dynamic range and sound quality of the originals (i.e. 16 Bit = 96.33 dB, 24 Bit = 144.49 dB). This avoids re-transferring and re-handling the originals in the future.
  • Creating and storing a linear-coded master soundfile that can be used to produce derivative filtered and/or compressed or otherwise processed soundfiles, in order to serve a range of current and future needs.
  • Creating backup copies, on a stable medium, of all soundfiles that are created.
  • Creating meaningful metadata for soundfiles and associated documents, including (if required) cataloguing information according to a scheme that has been thought through ahead of time.
  • Monitoring the conversion and recopying data if necessary.
  • Outlining a migration strategy for transferring data to alternative sites, including the next generations of file servers.
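The two numeric rules in this list (sample rate above twice the highest signal frequency, and dynamic range per word length) can be sketched as follows. The dynamic range uses DR(dB) = 20 log10(2^n) for uniform quantization; the list of candidate rates is the set of common audio sampling frequencies, and both function names are hypothetical.

```python
import math

COMMON_SAMPLE_RATES_HZ = (8000, 11025, 22050, 32000, 44100, 48000, 96000, 192000)


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of uniform quantization with n bits."""
    return 20 * math.log10(2 ** bits)


def lowest_suitable_rate(highest_frequency_hz: float):
    """Return the lowest common rate above twice the highest signal frequency.

    Returns None if no rate in the list is high enough.
    """
    for rate in COMMON_SAMPLE_RATES_HZ:
        if rate > 2 * highest_frequency_hz:
            return rate
    return None


for bits in (8, 16, 24):
    print(f"{bits} Bit = {dynamic_range_db(bits):.2f} dB")

print(lowest_suitable_rate(20_000))  # 44100
```

In practice the usable dynamic range is lower than this theoretical figure, since it also depends on the noise of the analogue input stage.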

Additional Preliminaries

Before starting a series of recording sessions involving more than a couple of sound recordings, the user should plan a carefully considered soundfile naming convention and soundfile directory structure (work directory). The planning should include the outline of a migration strategy for transferring data to alternative sites, including the next generation of file servers. One practical approach is simply to include the date (yy/dd/hh/min/) and the application identification in the soundfile name(s). In this way, and by following the previously defined file directory structure, all backup and work copies carry all the information needed to reconstruct the original database in chronological order.

In addition, creating meaningful metadata for soundfiles (file annotations) and associated documents, including (if required) cataloguing information according to a scheme that has been thought through ahead of time, has proved highly advantageous. Although metadata are processed as dynamic information and can be updated or appended later on, standardisation is much easier to achieve if some annotation or other meaningful data description is available from the beginning.
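One possible shape of such a convention is sketched below. Everything in it is an assumption made for illustration: the directory layout (project / year / month), the field order in the file name, and the names `recording_path`, `fieldwork` and `mic-array` are hypothetical and not prescribed by STx.

```python
from datetime import datetime
from pathlib import PurePosixPath


def recording_path(root: str, project: str, app_id: str, when: datetime) -> PurePosixPath:
    """Build a path that carries the date and an application identifier.

    Hypothetical layout: <root>/<project>/<year>/<month>/<timestamp>_<app_id>.wav
    """
    folder = PurePosixPath(root) / project / f"{when:%Y}" / f"{when:%m}"
    name = f"{when:%Y%m%d-%H%M%S}_{app_id}.wav"
    return folder / name


example_path = recording_path(
    "/data/recordings", "fieldwork", "mic-array", datetime(2008, 6, 5, 12, 55, 37)
)
print(example_path)
# /data/recordings/fieldwork/2008/06/20080605-125537_mic-array.wav
```

Since the file name repeats the full date and time, a copy that is separated from its directory can still be placed back in chronological order.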


Handling of Large Binary Objects (LBO)

The digitization of continuous analogue signals produces digital streams of considerable duration and a significant number of large binary files within a project's database. As the capacities of storage devices have recently grown from the GigaByte range into mid TeraByte volumes, more and more attention has to be focussed on the management and structuring of large binary objects. Data access, search and retrieval, as well as consistent update strategies, have to be organised under new and challenging circumstances. Although recent developments in computer technology have considerably accelerated the re-organization and migration of large databases, new and elaborate concepts of data management are still urgently needed.

From practical experience, essence needs metadata, and one's metadata is the other's essence; both have to be coordinated thoroughly. New methods for automatically creating content-related annotations of signals have been introduced in recent years to meet the new requirements of handling excessively large data volumes and their content.

Disk and File Volume Management

The table below shows the maximum record length in 24-hour days that can be stored on a 1 TByte hard disk as a function of the sampling frequency (first row, in kHz; word length 16 bit) and the number of channels (first column). The usual file size limit under Windows is 2 GByte; STx accepts file sizes up to 8 GByte. Values marked with (*) have not been tested so far. For multi-channel recording an STx macro language script is available; contact the STx software team.
[Table image: RecDiskVolume1]
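The kind of figure shown in such a table can be estimated with simple arithmetic. This sketch assumes 1 TByte = 10^12 bytes, uncompressed 16 bit PCM (2 bytes per sample) and no file system or header overhead, so its results may differ slightly from the values in the original table; `max_record_days` is a hypothetical name.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def max_record_days(
    disk_bytes: float,
    sample_rate_hz: float,
    channels: int,
    bytes_per_sample: int = 2,
) -> float:
    """Return the maximum record length in 24-hour days for raw PCM data."""
    bytes_per_second = sample_rate_hz * channels * bytes_per_sample
    return disk_bytes / bytes_per_second / SECONDS_PER_DAY


ONE_TBYTE = 1e12  # assumption: decimal terabyte

for channels in (1, 2, 8, 64):
    days = max_record_days(ONE_TBYTE, 44_100, channels)
    print(f"{channels:2d} channel(s) at 44.1 kHz: {days:7.1f} days")
```

The record length scales inversely with both the sampling frequency and the number of channels, so doubling either one halves the available recording time.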


 
