CONCERTED ACTION ON MUSIC
INFORMATION IN LIBRARIES
HARMONICA
Proposal for Phase 2
WP 3: Networking and Digitisation
Deliverable 3.5
Version : 1
Date : Sept. 18th, 1997
Author : OEAW-WAD
Confidentiality : Public
Status : final
1. Analog media containing music information
Harmonica Phase 1 successfully identified the document carriers and formats
that hold music information. A questionnaire-based examination of the
relevant music information holdings produced the following main results:
- The majority of published and unpublished pictorial music information is
  stored as still images. Video is found in special library categories;
  CD-ROM use is still low, but it is an emerging format.
- Unpublished sound and historic recordings are stored on 78 rpm discs and
  on cylinders. Some collections hold more information on reel-to-reel
  tapes than on compact cassettes and DAT. Published sound documents are
  most frequently found on LPs and CDs.
- Considerable attention has to be given to the storage and preservation of
  analog carriers, especially to saving acetate discs, cylinders and some
  magnetic (reel-to-reel) tape products.
Harmonica Phase 2 should cover the following tasks:
- Issue of guidelines for the development of preservation and digitisation
  workplans, conservation treatments and collection rearrangements of analog
  media containing music information, such as:
  - still images
  - bound and unbound paper
  - 78 rpm discs
  - cylinders
  - acetate discs
  - magnetic tapes
  - others
2. Digitisation
Harmonica Phase 1 has identified a wide variety of sampling rates, file
formats and resolution levels in use, depending on the hardware and software
in place at the respective institutions. Standardisation of sound file
formats is under way in the form of "The Broadcast Wave Format" for audio,
drafted by the EBU (1996). This type of file is specified within the
Microsoft "Resource Interchange File Format" (RIFF) and is appropriate
for uncompressed sound. As the name of the draft already indicates, it is
intended as a broadcast format, not as a standard for the many sound studio
production formats, nor as a music sound archive format, whose requirements
may differ in many respects. Nor is it foreseen to replace the generally
accepted digital sound transmission format AES/EBU. Standardisation of
image file formats (moving and still images) is even less advanced. To some
extent, several graphic file formats are interchangeable with little or no
loss of picture quality, provided the data compression algorithms applied
are code regenerating (lossless).
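Because a Broadcast Wave file is a RIFF container, its structure can be
inspected by walking the list of chunks. The following Python sketch is
illustrative only and is not part of the EBU draft: the helper names
(iter_riff_chunks, build_demo_wave, describe) are hypothetical, and the demo
file is synthetic. It handles only the two chunks every uncompressed WAVE
file carries ("fmt " and "data"); any additional broadcast-specific chunks
would simply appear as further entries in the chunk list.

```python
import struct


def iter_riff_chunks(blob: bytes):
    """Yield (chunk_id, payload) pairs from a RIFF/WAVE byte string."""
    riff_id, riff_size, form_type = struct.unpack_from("<4sI4s", blob, 0)
    if riff_id != b"RIFF" or form_type != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    offset = 12
    end = min(len(blob), 8 + riff_size)
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", blob, offset)
        payload = blob[offset + 8 : offset + 8 + size]
        yield chunk_id.decode("ascii"), payload
        # RIFF chunks are padded to an even number of bytes.
        offset += 8 + size + (size & 1)


def build_demo_wave(samples: bytes, rate: int = 48000,
                    channels: int = 2, bits: int = 16) -> bytes:
    """Build a tiny uncompressed PCM WAVE file in memory (synthetic data)."""
    block_align = channels * bits // 8
    fmt = struct.pack("<HHIIHH", 1, channels, rate,
                      rate * block_align, block_align, bits)
    body = b"WAVE"
    for chunk_id, payload in ((b"fmt ", fmt), (b"data", samples)):
        body += chunk_id + struct.pack("<I", len(payload)) + payload
        if len(payload) & 1:
            body += b"\x00"  # pad byte, not counted in the chunk size
    return b"RIFF" + struct.pack("<I", len(body)) + body


def describe(blob: bytes) -> dict:
    """Return the chunk list plus the PCM parameters from the 'fmt ' chunk."""
    info = {"chunks": []}
    for chunk_id, payload in iter_riff_chunks(blob):
        info["chunks"].append(chunk_id)
        if chunk_id == "fmt ":
            tag, ch, rate, _, _, bits = struct.unpack_from("<HHIIHH", payload, 0)
            info.update(format_tag=tag, channels=ch,
                        sample_rate=rate, bits_per_sample=bits)
    return info


if __name__ == "__main__":
    print(describe(build_demo_wave(b"\x00" * 8)))
```

A format tag of 1 denotes uncompressed PCM, which is the case the paragraph
above describes as appropriate for this file type.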
Harmonica Phase 2 should provide:
- Principles for digital file format and naming conventions, for file systems,
  directory and folder structures in which files of digitised originals have
  to be placed. Development of a Document Type Definition (DTD) for materials
  such as:
  - bound and unbound paper
  - searchable text
  - still images
  - moving images
  - sound
  - others
Standardised filenames for digital images, text, sound and video have to be
assigned as part of the initial digitisation process. The arrangement of
directories and subdirectories has to follow the specifications of the
libraries, facilitating future access to images, sound and text. This task
should be completed well before digital conversion starts.
- Development of guidelines for Digital Conversion (Transfer) of various
  types of originals, including vulnerable and fragile documents of standard
  and large sheet formats, with or without special handling required.
  Specification of the gray scale or color source images to be scanned and
  of the appropriate resolutions. These procedures may be especially relevant
  for uncopyrighted sheet music from the 19th century. The task should also
  include requirements on scanning, sound and video reproduction equipment
  for the safe handling of originals: for example, books should be opened
  between 90 and 130 degrees only and should not be pressed flat against a
  glass surface. The handling of reel-to-reel magnetic tapes has to avoid
  extreme tape tension and strong contact with the reproducing heads.
  Special instructions on handling techniques for fragile items have to be
  developed as a primary concern in order to prevent damage to the original
  documents.
- Survey of scanning and reproduction equipment and procedures
- Survey of special approaches for the digitisation of recorded sound and
  moving images
- Development of work flow schemata
- Development of quality control requirements
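The naming and directory principles called for in the list above could be
captured in a single function that every digitisation workstation shares.
The Python sketch below is one possible scheme, not a proposal from Phase 1:
the media codes, file extensions, zero-padding widths and the function name
digitised_path are all hypothetical, and a real scheme would be fixed by the
participating libraries.

```python
from pathlib import PurePosixPath

# Hypothetical media codes and file extensions, for illustration only.
MEDIA_TYPES = {
    "image": ("img", "tif"),
    "text": ("txt", "sgm"),
    "sound": ("snd", "wav"),
    "video": ("vid", "mpg"),
}


def digitised_path(collection: str, item: int, media: str, part: int) -> PurePosixPath:
    """Return <collection>/<media code>/<item>/<file name> for one digitised part."""
    if media not in MEDIA_TYPES:
        raise ValueError(f"unknown media type: {media!r}")
    if item < 0 or part < 0:
        raise ValueError("item and part numbers must not be negative")
    code, extension = MEDIA_TYPES[media]
    item_dir = f"{item:06d}"
    # The file name repeats the directory information so that a file remains
    # identifiable even when it is copied out of its directory.
    file_name = f"{collection}_{item_dir}_{code}_{part:04d}.{extension}"
    return PurePosixPath(collection) / code / item_dir / file_name


if __name__ == "__main__":
    print(digitised_path("coll01", 42, "sound", 3))
```

Deriving both the directory and the file name from the same identifiers
means the convention can be checked mechanically before conversion starts.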
Scanning and digitisation logs should include all relevant data of the
transfer process (transfer protocol). Problems, irregularities, filter
characteristics, reproduction devices, scanning machines etc. should be
listed file by file or section by section. Anomalies of the digitisation
procedure and exceptions due to document characteristics (extremely reduced
signal-to-noise ratios, missing parts etc.) should also be available in
machine-readable form.
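A machine-readable transfer log of this kind could be as simple as one
structured record per file. The Python sketch below assumes JSON as the
serialisation and invents the field names (file_name, source_carrier,
device, filters, anomalies) purely for illustration; the proposal does not
prescribe a log format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class TransferLogEntry:
    """One log record per digitised file (hypothetical field names)."""
    file_name: str
    source_carrier: str
    device: str
    filters: list = field(default_factory=list)
    anomalies: list = field(default_factory=list)


def to_json_lines(entries) -> str:
    """Serialise the entries as one JSON object per line."""
    return "\n".join(json.dumps(asdict(entry), sort_keys=True) for entry in entries)


def files_with_anomalies(log_text: str) -> list:
    """Read the log back and list the files that need a second look."""
    records = [json.loads(line) for line in log_text.splitlines() if line]
    return [record["file_name"] for record in records if record["anomalies"]]


if __name__ == "__main__":
    log = to_json_lines([
        TransferLogEntry("item_0001.wav", "reel-to-reel tape", "tape machine A"),
        TransferLogEntry("item_0002.wav", "acetate disc", "turntable B",
                         filters=["rumble filter"],
                         anomalies=["missing section at start"]),
    ])
    print(files_with_anomalies(log))
```

Keeping anomalies as a list per file lets quality control select the
affected files without reading free-text notes.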
Quality control has to guarantee that the requirements for delivery
and accuracy of the digitisation procedure have been met. Quality control
covers sound and image quality, document integrity, completeness of the
transfer etc. When text conversion is applied, the required level of
accuracy (99%?) has to be stated.
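How an accuracy level for text conversion is measured has to be defined
along with the figure itself. The Python sketch below assumes one common
choice, character-level accuracy computed as 1 minus the edit distance
divided by the length of a manually verified reference text; this definition
and the function names are assumptions, not something the proposal fixes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def character_accuracy(reference: str, converted: str) -> float:
    """Share of reference characters reproduced correctly (assumed metric)."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return max(0.0, 1.0 - edit_distance(reference, converted) / len(reference))


def meets_target(reference: str, converted: str, target: float = 0.99) -> bool:
    """Check a converted text against a stated accuracy level."""
    return character_accuracy(reference, converted) >= target


if __name__ == "__main__":
    reference = "Allegro ma non troppo"
    converted = "Allegro ma non tropp0"  # one misread character
    print(round(character_accuracy(reference, converted), 4))
    print(meets_target(reference, converted))
```

On short strings a single error already falls below 99%, so in practice the
measure would be applied to sample pages rather than to individual lines.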
3. Metadata
Metadata may include the full encoding of converted texts, usually in
Standard Generalized Markup Language (SGML), covering structural elements,
highlighted text, video and sound sequences etc., as well as technical data
about the document type itself, its coding, size, usability and priority of
access. Metadata can provide highly structured information about the
technical characteristics of a document and/or, ideally, a complete
representation of its content (semantic retrieval). Automated indexing and
metadata generation for music sound and video documents is still a matter
of basic research.
Harmonica Phase 2 should focus on the following tasks:
- Development of guidelines for the generation and management of technical
  and content-related metadata or other descriptions of the data in the
  database (schemas).
- Semiautomatic metadata generation for text and images.
- Still unsolved: metadata for the musical content of sound files.
- Survey of methods of schema evolution or other methods applicable to
  massive metadata databases.
- Survey of browsing, navigation, access and retrieval methods with respect
  to large binary objects (sound, video).
4. Music sound, image and text archive systems
Automated backup systems and software are already available from several
sources. Platforms usually supported are Windows NT, NetWare, SunOS/Solaris,
AIX, HP-UX etc. Data backup utilities are frequently tailored for standalone
servers only, whereas multiple-server environments are likely to become
increasingly important in library applications.
- Identify industrial solutions suitable for the storage of continuous data
  such as musical sound and video, including:
  - on-line storage for real-time access (LAN)
  - backup systems for medium-size and terabyte archive applications
  - long-term off-line storage systems for large database access
  - high-availability large-scale archive systems
High-performance data storage products scale up to several petabytes using
thousands of robot-controlled magnetic tape cassettes. Harmonica Phase 2
should address the usability of scalable solutions for libraries.
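To judge which storage scale a library actually needs, the size of
uncompressed sound can be estimated directly from the sampling parameters:
bytes = duration x sampling rate x channels x bits per sample / 8. The
Python sketch below uses CD-quality stereo (44.1 kHz, 16 bit) purely as an
illustrative assumption; the function names are hypothetical and archive
transfers may well use other parameters.

```python
def pcm_bytes(seconds: int, sample_rate: int, bits_per_sample: int, channels: int) -> int:
    """Size in bytes of uncompressed PCM audio, excluding file headers."""
    return seconds * sample_rate * channels * bits_per_sample // 8


def whole_hours_per_capacity(capacity_bytes: int, sample_rate: int,
                             bits_per_sample: int, channels: int) -> int:
    """Number of complete hours of audio that fit into a given capacity."""
    return capacity_bytes // pcm_bytes(3600, sample_rate, bits_per_sample, channels)


if __name__ == "__main__":
    one_hour = pcm_bytes(3600, 44100, 16, 2)
    print(one_hour)  # 635040000 bytes, roughly 635 MB per hour
    # One terabyte taken as 10**12 bytes.
    print(whole_hours_per_capacity(10**12, 44100, 16, 2))  # 1574
```

Under these assumptions a collection of a few thousand hours of sound
already reaches the terabyte range before any images or video are added.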
5. Networking
Networking in libraries can be seen as a threefold issue. (1) Libraries
have to install their own infrastructure, typically a LAN with data
acquisition workstations, servers and local storage devices. (2) Since
libraries can use backup and storage capacities at remote computer centers,
they need high-speed data communication in a WAN-like environment. (3)
Distribution of binary documents via global networks, such as online
services and the Internet, involves tasks like catalogue search, listening
(or viewing) in advance, acquiring rights clearance, paying and downloading.
"Bringing the Search to the Net" involves the transition from document
search to concept search via networks. Harmonica Phase 2 should address
technical aspects of network installation as well as the development of new
communication interfaces using Java applets downloadable via the WWW. As one
example, Java/CORBA (Common Object Request Broker Architecture) offers
greater flexibility in implementing the user interface, automatic
maintainability, easy server configuration and platform-independent client
handling.
Acoustics Research Laboratory of the
Austrian Academy of Sciences
Liebiggasse 5, A-1010 Vienna, Austria;
Tel. +43-1-4277-29500
Page maintainer: Werner A. Deutsch. Last update: 1997/09/08