Text
Representation: ASCII, ISO Character Sets, Marked-up Text, Structured Text, Hypertext
Operations: Character Operations, String Operations, Editing, Formatting, Pattern-matching & searching, Sorting, Compression, Encryption, Language-specific operations
ASCII
  7-bit code, 128 values in the ASCII character set
  use of the 8th bit in text editors/word processors creates incompatibility
ISO character sets
  extend ASCII to support non-English text
  ISO Latin provides support for accented characters (à, ö, ø, etc.)
  ISO sets include Chinese, Japanese, Korean & Arabic
UNICODE
  16-bit format, 65,536 (2^16) different symbols
Text - Representation
Marked-up text nroff, troff LaTEX SGML
HTML HyTime XML, XSL, XLL
Structured Text structure of text represented in data structure, usually tree-
based ODA, structure embedded in byte-stream with content
Hypertext non-linear graph or “web” structure : nodes and links currently subject of intensive ISO standards activity
Character operations basic data type with assigned value permits direct character comparison (a<b)
String operations comparison concatenation substring extraction and manipulation
Editing perhaps the most familiar set of operations on text cut/copy/paste strings v. blocks, dependent on document structure
Text - Operations
Formatting interactive or non-interactive (WYSIWYG v. LaTEX) formatted output
bitmap page description language (Postscript, PDF)
font management typeface point size (1 point = 1/72 of an inch) TrueType fonts : geometric description + kerning
Pattern-matching and Searching search and replace wildcards regular expressions for large bodies of text, or text databases, use of inverted
indices, hashing techniques and clustering.
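The inverted index mentioned above can be illustrated with a minimal sketch. It assumes whitespace-separated words and case-insensitive matching; the function names `build_inverted_index` and `search` are hypothetical and chosen only for this example.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    1: "text can be compressed",
    2: "text can be encrypted",
    3: "images can be compressed",
}
index = build_inverted_index(docs)
print(sorted(search(index, "text compressed")))
```

A lookup touches only the entries for the query words, which is why this structure suits large bodies of text better than scanning every document.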
Sorting numerous varieties of sort, all of them extensively studied in
basic programming sort complexity is a major factor in data handling
performance
Compression
  ASCII uses 7 bits per character, though most word-processors actually use the 8th bit to use up a byte per character
  Information theory estimates 1-2 bits per character to be sufficient for natural language text
  This redundancy can be removed by encoding :
    Huffman : varies the number of bits used to represent characters, shortest codes for highest frequency characters
    Lempel-Ziv : identifies repeating strings and replaces them by pointers to a table
  Both techniques compress English text at a ratio of between 2:1 and 3:1
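As a rough illustration of the Huffman idea (frequent characters get short codes), the sketch below computes only the code length for each character, not the bit patterns themselves. The helper names are hypothetical, and Lempel-Ziv is not shown.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {character: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct character still needs one bit per occurrence.
        return {ch: 1 for ch in freq}
    # Heap entries are (weight, tie_breaker, {char: depth}); the unique
    # tie_breaker ensures the dictionaries are never compared.
    heap = [(w, i, {ch: 0}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every character one level deeper.
        merged = {ch: depth + 1 for ch, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def compressed_bits(text):
    """Total number of bits needed to encode text with its own Huffman code."""
    lengths = huffman_code_lengths(text)
    return sum(lengths[ch] for ch in text)


sample = "aaaabbc"
print(huffman_code_lengths(sample), compressed_bits(sample), "bits vs", 8 * len(sample))
```

The most frequent character in the sample receives a one-bit code, so the encoded text needs far fewer bits than one byte per character.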
Encryption text encryption is widely used in electronic mail and networked
information systems most widely-used techniques :
DES RSA public-key PGP
subject of major controversy : key escrow systems Clipper chip “strong” encryption now being legally outlawed in a number of
countries
Language-specific operations spell-checking parsing and grammar checking style analysis
Image
Representation: Colour Model, Alpha Channels, Number of Channels, Channel Depth, Interlacing, Indexing, Pixel Aspect Ratio, Compression
Operations: Editing, Point operations, Filtering, Compositing, Geometric transformations, Conversion
Image - Representation
Colour Model 2 main types
colour production on output device theory of human colour perception
CIE colour space international standard used to calibrate other
colour models developed in 1931, as CIE XYZ, based on
tristimulus theory of colour specification
RGB numeric triple specifying red, green and blue intensities convenient for video display drivers since numbers can be easily
mapped to voltages for RGB guns in colour CRTs
HSB Hue - dominant colour of sample, angular value varying from red to
green to blue at 120° intervals Saturation - the purity of the colour, i.e. how far it is from gray Brightness - how light or dark the colour is
CMYK displays emit light, so produce colours by adding red, green and blue
intensities paper reflects light, so to produce a colour on paper one uses inks
that subtract all colours other than the one desired printers use inks corresponding to the subtractive primaries,
cyan, magenta and yellow (complements of RGB)
additionally, since inks are not pure, a special black ink is used to give better blacks and grays
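The relationship between the additive and subtractive primaries can be sketched as a naive conversion. This assumes ideal inks and channel values in the range 0 to 1; real printing uses device-specific colour profiles, and the function name `rgb_to_cmyk` is only illustrative.

```python
def rgb_to_cmyk(r, g, b):
    """Convert RGB in 0..1 to CMYK in 0..1 using the simple complement rule."""
    # Cyan, magenta and yellow are the complements of red, green and blue.
    c, m, y = 1.0 - r, 1.0 - g, 1.0 - b
    # The common component of the three inks is moved to the black ink.
    k = min(c, m, y)
    if k == 1.0:
        return 0.0, 0.0, 0.0, 1.0  # pure black: only the black ink is used
    scale = 1.0 - k
    return (c - k) / scale, (m - k) / scale, (y - k) / scale, k


print(rgb_to_cmyk(1.0, 0.0, 0.0))  # red is printed with magenta and yellow
print(rgb_to_cmyk(0.5, 0.5, 0.5))  # mid gray uses black ink only
```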
YUV colour model used in the television industry also YIQ, YCbCr, and YPbPr Y represents luminance, effectively the black-and-white portion
of a video signal UV are colour difference signals, form the colour portion of a
video signal, and are called chrominance or chroma YUV makes efficient use of bandwidth as the human eye has
greater sensitivity to changes in luminance than chrominance, so bandwidth can be better utilised by allocating more to luminance and less to chrominance
Alpha Channels images may have one or more alpha channels defining regions
of full or partial transparency
can be used to store selections and to create masks and blends
Number of channels the number of pieces of information associated with each pixel usually the dimensionality of the colour model plus the number of
alpha channels
Channel depth number of bits-per-pixel used to encode the channel values commonly 1,2,4 or 8 bits, less commonly 5,6,12 or 16bits in a multiple channel image, different channels can have different
depths
Interlacing storage layout of a multiple channel image could separate channel
values (all R values, followed by all G, followed by all B) or could use interlacing (all RGB for pixel 1, all RGB for pixel 2.........)
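The two storage layouts described above can be sketched with plain lists. The function names are hypothetical; the second layout, which the slides call interlacing, is also commonly referred to as interleaving.

```python
def to_interleaved(r, g, b):
    """[R...], [G...], [B...] -> [R0, G0, B0, R1, G1, B1, ...]"""
    out = []
    for pixel in zip(r, g, b):
        out.extend(pixel)
    return out


def to_planar(data, channels=3):
    """[R0, G0, B0, R1, ...] -> one list per channel (all R, all G, all B)."""
    return [data[c::channels] for c in range(channels)]


reds, greens, blues = [1, 2], [3, 4], [5, 6]
packed = to_interleaved(reds, greens, blues)
print(packed)
print(to_planar(packed))
```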
Indexing pixel colours can be represented by an index in a colour map or a
colour lookup table (CLUT)
Pixel aspect ratio ratio of pixel width to height square pixels are simple to process, but some displays and scanners
work with rectangular pixels if the pixel aspect ratios of an image and a display differ the image
will appear stretched or squeezed
Compression a page-sized 24-bit colour image produced by a scanner at 300dpi
takes up about 20 Mbytes many image formats compress pixel data, using run-length coding,
LZW, predictive coding and transform coding many image formats : JPEG, GIF, TIFF, BMP most widely used
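Of the techniques listed, run-length coding is the simplest to illustrate. The sketch below stores each run of identical pixel values as a (count, value) pair; it is an illustration only and does not follow the run-length variant of any particular image format.

```python
def rle_encode(pixels):
    """Collapse runs of equal values into (count, value) pairs."""
    runs = []
    for value in pixels:
        if runs and runs[-1][1] == value:
            runs[-1][0] += 1
        else:
            runs.append([1, value])
    return [tuple(run) for run in runs]


def rle_decode(runs):
    """Expand (count, value) pairs back into the original pixel sequence."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


row = [0, 0, 0, 255, 255, 0]
encoded = rle_encode(row)
print(encoded)
print(rle_decode(encoded) == row)
```

The scheme only pays off when the image contains long runs, which is why it suits flat-coloured graphics better than scanned photographs.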
These operations can operate directly on pixel data or on higher-level features such as edges, surfaces and volumes
Operations on higher-level features fall into the domain of image analysis and understanding and will not be considered here
Editing changing individual pixels for image touch-up, forms the basis
of airbrushing and texturing cutting, copying and pasting are supported for groups of pixels,
from simple shape manipulation through to more complex foreground and background masking and blending
Image - Operations
Point operations consist of applying a function to every pixel in an image
only uses the pixel's current value; neighbouring pixels cannot be used
Thresholding a pixel is set to 1 or 0 depending on whether it is above or below
a threshold value - creates binary images which are often used as masks when compositing
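Thresholding as a point operation can be sketched as follows. The image is assumed to be a list of rows of gray values, and a pixel exactly equal to the threshold is mapped to 0, which is an arbitrary choice made for this example.

```python
def apply_point_operation(image, func):
    """Apply func to every pixel independently of its neighbours."""
    return [[func(pixel) for pixel in row] for row in image]


def threshold(image, level):
    """Produce a binary image: 1 where the pixel is above level, else 0."""
    return apply_point_operation(image, lambda p: 1 if p > level else 0)


gray = [[10, 200],
        [128, 129]]
print(threshold(gray, 128))
```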
Colour Correction modifying the image to increase or reduce contrast, brightness,
gamma effects, or to strengthen or weaken particular colours
Filtering like point operations, operate on every pixel in an image, but
use values of neighbouring pixels as well used to blur, sharpen or distort images, producing a variety
of special effects
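A blur is a typical neighbourhood filter. The sketch below averages each pixel with its neighbours in a 3x3 window; at the image borders only the neighbours that exist are averaged, which is one of several possible border policies and is an assumption of this example.

```python
def box_blur(image):
    """Replace each pixel by the mean of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    result = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    # Skip neighbours that fall outside the image.
                    if 0 <= ny < height and 0 <= nx < width:
                        total += image[ny][nx]
                        count += 1
            result[y][x] = total / count
    return result


spot = [[0, 0, 0],
        [0, 9, 0],
        [0, 0, 0]]
for row in box_blur(spot):
    print(row)
```

Unlike a point operation, the result for a pixel here cannot be computed from that pixel's value alone.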
Compositing the combining of two or more images to produce a new image generally done by specifying mathematical relationships between
the images
Geometric Transformations basic transformations involve displacing, rotating, mirroring or
scaling an image more advanced transformations involve skewing and warping
images
Conversions conversions between image formats are commonplace and a
number of public-domain, shareware and commercial tools exist to support these
other forms of conversion include compression and decompression, changing colour models, and changing image depth and resolution
Graphics
Representation: Geometric Models, Solid Models, Physically-based Models, Empirical Models, Drawing Models, External formats for Models
Operations: Primitive Editing, Structural Editing, Shading, Mapping, Lighting, Viewing, Rendering
The central notion of graphics, as opposed to image data, is in the rendering of graphical data to produce an image. A graphics type or model is therefore the combination of a data type plus a rendering operation
Graphics Representation Please note - object in graphics modelling usually refers to
an element of the scene being modelled, unless you are using object-oriented graphics programming
Geometric Models consist of 2D and/or 3D geometric primitives 2D primitives include lines, rectangles, ellipses plus more
general polygons and curves 3D primitives include the above plus surfaces of various forms.
Curves and curved surfaces described by parameterised polynomials
Graphics - Representation
primitives are first described in local or object co-ordinates, then arranged in groups in a common world co-ordinate system by applying modelling transformations
transformations include rotation, translation and scaling primitives can be used to build structural hierarchies, allowing
each structure thus created to be broken down into lower-level structures and primitives (i.e. blueprinting)
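A modelling transformation can be sketched in 2D as scaling, rotation and translation applied to points given in local coordinates. The order used here (scale, then rotate about the local origin, then translate) and the function name are assumptions of this example, not something the slides prescribe.

```python
import math


def model_to_world(points, scale=1.0, angle_deg=0.0, translate=(0.0, 0.0)):
    """Scale, then rotate about the origin, then translate each (x, y) point."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    tx, ty = translate
    world = []
    for x, y in points:
        x, y = x * scale, y * scale
        world.append((x * cos_a - y * sin_a + tx,
                      x * sin_a + y * cos_a + ty))
    return world


# A unit square defined once in local coordinates and placed in the world.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(model_to_world(square, scale=2.0, angle_deg=90.0, translate=(5.0, 5.0)))
```

The same local description can be reused many times with different transformations, which is what makes structural hierarchies practical.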
Several standard device-independent graphics libraries are based on geometric modelling
  GKS (Graphical Kernel System, ISO)
  PHIGS (Programmer's Hierarchical Interactive Graphics System, ISO) - see also PHIGS+ and PEX
  OpenGL - portable version of the Silicon Graphics library
Solid Models Constructive Solid Geometry (CSG) : solid objects are combined
using the set operators union, intersection and difference.
Surfaces of revolution : a solid is formed by rotating a 2D curve about an axis in 3D space - lathing
Extrusion : a 2D outline is extended in 3D space along an arbitrary path
Using the above techniques will produce models much faster than building them up from geometric primitives, but rendering them will be expensive
Physically-based Models realistic images can be produced by modelling the forces,
stresses and strains on objects when one deformable object hits another, the resulting shape
change can be numerically determined from their physical properties
Empirical Models complex natural phenomena (clouds, waves, fire, etc.) are difficult
to describe realistically using geometric or solid modelling
while physically based models are possible, they may be computationally expensive or intractable
the alternative is to develop models based on observation rather than physical laws, such models do not embody the underlying physical processes that cause these phenomena but they do produce realistic images
fractals, probabilistic graph grammars (used for branching plant structures) and particle systems(used for fires and explosions) are examples of empirical models
Drawing Models describing an object in terms of drawing or painting actions the description can be seen as a sequence of commands to an
imaginary drawing device - Postscript, LOGO turtle graphics
External formats for Models need for export/import formats between graphics packages CGM & CAD are OK. Postscript and RIB are render-only
Primitive editing specifying and modifying the parameters associated with the model
primitives e.g. specify the type of a primitive and the vertex coordinates and
surface normals
Structural editing creating and modifying collections of primitives establish spatial relationships between members of collections
Shading the modelling techniques described so far have provided the means
to specify the shape of objects, but shading provides further information for the image in describing the interaction of light with the object. This interaction is described in terms of the colour of an object, how it reflects light and if it transmits light
Graphics - Operations
several general-purpose methods exist to describe shading, most initially describe the surface of the object using meshes of small, polygonal surface patches
flat shading - each patch is given a constant colour Gouraud shading - colour information is interpolated across a patch Phong shading - surface normal information is interpolated across a
patch Ray tracing & Radiosity - physical models of light behaviour are used
to calculate colour information for each patch, giving highly realistic results
for photorealistic images extremely flexible shading is required, tools such as RenderMan actually provide programmable shaders which can be attached to objects, simulating different light effects and surface normals.
Mapping techniques for enhancing the visual appearance of objects
Texture mapping an image, the texture map, is applied to a surface requires a mapping from 3D surface coordinates to 2D image
coordinates, so given a point on the surface the image is sampled and the resulting value used to colour the surface at that point
shaders can also provide solid textures, where the texture is obtained from 3D rather than 2D space, and procedural textures, where the texture is calculated rather than sampled
Bump mapping
  as texture mapping, but used to perturb the surface normal vector rather than the colour
  used to describe minor surface changes such as scratches or scrapes
Displacement mapping local modifications to the position of a surface produces ridges or grooves
Environment mapping also known as reflection mapping, used to handle limited forms of
reflection more primitive technique than ray-tracing
Shadow mapping similar to environment mapping in that it provides a primitive
lighting effect without the expense of ray-tracing produces shadows
Lighting within a model, in addition to the graphics objects, there are
lights to illuminate the scene. There are various forms of light source, each of which can be parametrically specified
ambient light - background lighting, comes from all directions with equal intensity
point lights - come from specific points in space, intensity governed by inverse square law
directional lights - located at infinity in some direction, intensity is constant
spot lights - illuminating a cone-shaped volume
Viewing to produce an image of a 3D model we require a transformation
which projects 3D world coordinates onto 2D image coordinates transformation applied to viewing volume, that part of the
model that appears in the image view specification consists of selecting the projection
transformation, usually from parallel or perspective projections although camera attributes can be specified in some renderers, and the view volume
Rendering rendering converts a model, including shading, lighting and
viewing information, into an image software allows selection and fine-tuning of control parameters
output resolution - the width and height of the output image in pixels, and the pixel depth
rendering time - quick and low-quality v. slow and high resolution
Digital Video
Representation: Analog formats sampled, Sampling rate, Sample size and quantisation, Data rate, Frame rate, Compression, Support for interactivity, Scalability
Operations: Storage, Retrieval, Synchronisation, Editing, Mixing, Conversion
Analog formats sampled Digital video frames can be obtained in two ways :
Synthesis - usually by a computer program Sampling - of an analog video signal. Since analog video
comes in various different flavours, according to frame rate, scan rate, composite v component, sampling rate and size vary.
Digital Video - Representation
Sampling rate the value of the sampling rate determines the storage
requirement and data transfer rate the lower limit for the frequency at which to sample in order to
faithfully reproduce the signal, the Nyquist rate, is twice the highest frequency within the signal
video processing is simplified if each frame and each scan line give rise to the same number of samples, requiring the sampling frequency to be an integer multiple of the scan rate
Sample size and quantisation sample size is the number of bits used to represent sample
values quantisation refers to the mapping from the continuous range of
the analog signal to discrete sample values choice of sample size is based on :
signal to noise ratio of sampled signal sensitivity of medium used to display frames
sensitivity of the human eye digital video commonly uses linear quantisation, where
quantisation levels are evenly distributed over the analog range (as opposed to logarithmic quantisation)
Data rate high data rate formats can be reduced to lower data rates by a
combination of : compression reducing horizontal and vertical resolution reducing the frame rate
for example : start with broadcast quality digital video at 10Mbytes/s divide the horizontal and vertical resolutions by 2, giving VHS quality
resolution divide the frame rate by 2 compress at a rate of 10:1 data rate becomes 1Mbit/s, suitable for use on LANs and on optical
storage devices (i.e. CD-ROM)
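The arithmetic of the example above can be checked with a short sketch. It assumes 1 Mbyte = 1,000,000 bytes and that dividing both the horizontal and the vertical resolution by 2 reduces the pixel count by a factor of 4; the function name is hypothetical.

```python
def reduced_data_rate(bytes_per_second, resolution_divisor,
                      frame_rate_divisor, compression_ratio):
    """Data rate in bytes/s after reducing resolution, frame rate and compressing."""
    # Dividing width and height by d leaves 1/d^2 of the pixels per frame.
    pixel_factor = resolution_divisor ** 2
    return bytes_per_second / pixel_factor / frame_rate_divisor / compression_ratio


rate_bytes = reduced_data_rate(10_000_000, 2, 2, 10)
rate_bits = rate_bytes * 8
print(rate_bytes, "bytes/s =", rate_bits / 1_000_000, "Mbit/s")
```

The combined reduction is 4 x 2 x 10 = 80, so 10 Mbytes/s becomes 0.125 Mbytes/s, which is the 1 Mbit/s quoted.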
Frame rate 25 or 30 fps equates to analog frame rate, or full-motion video at 10-15 fps motion is less accurately depicted and the image
flickers, but the data rate is much reduced
Compression we have already considered compression techniques, in digital
video we can compare methods by three factors : Lossy v. lossless Real-time compression - trade-off between symmetric models and
asymmetric models with real-time decompression Interframe (relative) v. Intraframe (absolute) compression (i.e.
MPEG-1 v. Motion JPEG)
Support for interactivity random access to frames differential rate and reverse playback cut and paste capability
Scalability scalable video allows control over video quality, we can identify
2 forms : Transmit scalability - encoded data rate is chosen at compression
time from a range of rates, governed by transmission and processing constraints and/or storage capacity. Currently in use for low rate digital video
Receive scalability - decoded data rate is chosen at decompression time to match playback requirements. Attractive concept but not yet available in current video coding standards
current approaches to low rate digital video include :
  DVI (Digital Video Interactive) - two forms, Production Level Video (PLV) and Real-Time Video (RTV). PLV is only really intended for playback, since its compression is not real-time; RTV produces poorer quality but supports real-time compression. Both use interframe compression to achieve rates of 1 Mbit/s, but require costly hardware.
  MPEG-1 - 1 Mbit/s
  MPEG-2 - broadcast quality video at rates between 2 and 15 Mbit/s
  MPEG-4 - low data rate video
  MPEG-7 - metadata standard for video representation
  Motion JPEG
  px64 (CCITT H.261) - intended for video applications using ISDN (Integrated Services Digital Network). Known as px64 since it produces rates that are multiples of ISDN's 64 kbit/s B channel rate. Uses similar techniques to MPEG but, since compression and decompression must be real-time, quality tends to be poorer.
  H.263 - based on H.261, but offers 2.5 times greater compression, uses MPEG-1 and MPEG-2 techniques.
Storage to record or playback digital video in real-time, the storage system
must be capable of sustaining data transfer at the video data rate 4 main forms of storage for digital video are :
Magnetic tape - at present only magnetic tape can provide the very high capacity storage required for digital video at practical costs (1 hour of CCIR 601 4:2:2 uses 72 Gbytes, while 1 hour of digital HDTV requires nearly 1 Tbyte)
Special purpose magnetic storage systems - useful for short durations of high data rate digital video, can be connected direct to external equipment and are thus useful for capture and editing (see diagram)
Video memory boards - specialist boards with large amounts of semiconductor memory (several hundred Mbytes or more), capable of storing short durations of uncompressed digital video, useful for capture and editing.
Digital Video - Operations
General purpose magnetic and optical storage systems - most low data rate video representations (MPEG, etc.) were designed to support the use of conventional storage media for real-time video playback. Problem is size of storage, even using MPEG-1 13 minutes of video will fill a 100Mbyte disk.
Retrieval uses frame addressing, as in analog video, but there are some
problems : low data rate formats result in variable sized frames, so an index
giving frame offsets needs to be maintained to support random access
interframe compression techniques, e.g. MPEG, only code key frames independently; other frames are derived from these key frames. So random access requires first finding the nearest key frame and then using it to decode the desired frame, again using the index but enhancing it with key frame locations
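Random access through a key frame index can be sketched as below. The example assumes a simplified scheme in which every non-key frame depends only on the frames since the preceding key frame; real MPEG streams also contain bidirectionally predicted frames, which are ignored here. The function name is hypothetical.

```python
import bisect


def frames_to_decode(key_frames, target):
    """Given a sorted list of key frame numbers, return the key frame to start
    from and the list of frames that must be decoded to reach the target."""
    position = bisect.bisect_right(key_frames, target) - 1
    if position < 0:
        raise ValueError("no key frame at or before the target frame")
    start = key_frames[position]
    return start, list(range(start, target + 1))


key_frame_index = [0, 12, 24, 36]
print(frames_to_decode(key_frame_index, 15))
```

The further apart the key frames are, the better the compression but the more frames have to be decoded for a single random access.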
Synchronisation suffers same problems as analog video, so uses same
techniques digital video also has some additional techniques not available in
analog video, such as changing resolution to maintain frame rate
Editing 2 types :
tape-based - same procedures as with analog video, except no generation loss and the players are on the same machine
nonlinear - basically a clips-library, using cut and paste techniques to build a video sequence
Mixing real-time effects, such as tumbles, wipes and fades, are
calculated in the same way as for analog video, in fact for the majority of such effects whether the original source is analog or digital, the effects are digitised
non-real-time effects are only possible using digital video, and obviate the need for specialist equipment, being only dependent on the speed of the processor and the patience of the user, storage considerations can be overcome with the use of pointers and single frame editing
Conversion variety of formats demands conversion between formats real-time conversion requires specialist hardware compression/decompression within a single format also
requires specialist software/hardware
Digital Audio
Representation: Sampling frequency, Sample size and quantisation, Number of channels (tracks), Interleaving, Negative samples, Encoding
Operations: Storage, Retrieval, Editing, Effects and filtering, Conversion
Digital Audio Representation 2 main areas :
telecommunications entertainment (audio CD)
Produced by sampling a continuous signal generated by a sound source. An analog-to-digital converter (ADC) takes as input an electrical signal corresponding to the sound and converts it into a digital data stream. The reverse process, to generate the sound through an amplifier and speakers, involves a digital-to-analog converter (DAC)
Sampling frequency (rate) sampling theory shows that a signal can be reproduced without
error from a set of samples, providing the sampling frequency is at least twice the highest frequency present in the original signal
Digital Audio - Representation
telephone networks allocate a 3.4kHz bandwidth to voice-grade lines, thus a sampling rate of 8kHz is used for digital telecommunications
the human ear is sensitive to frequencies of up to about 20kHz, so to digitise any perceivable sound a sampling rate of over 40kHz is required
Sample size and quantisation during sampling, the continuously varying amplitude of the
analog signal is approximated by digital values, this introduces a quantisation error, being the difference between the actual amplitude and the digital approximation
quantisation error is apparent when the signal is reconverted to analog form as distortion, a loss in audio quality
quantisation error can be reduced by increasing the sample size, as allowing more bits per sample will improve the accuracy of the approximation
quantisation refers to breaking the continuous range of the analog signal into a number of unique digital intervals, based on one of a number of schemes :
linear quantisation - uses equally spaced intervals, so if the sample size is 3 bits and the maximum signal variation is 5.0 then the quantisation interval would be 0.625 units of signal amplitude
nonlinear quantisation (especially logarithmic quantisation) - uses non-equally spaced intervals, lower amplitude intervals are more closely spaced than higher amplitude, results in greater sensitivity to lower amplitude sound where the human ear is most sensitive
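The linear quantisation figures above can be reproduced with a small sketch. It assumes the signal range is split into 2^n equal intervals for an n-bit sample; the function names and the choice of -2.5 as the lowest amplitude are illustrative only. Nonlinear quantisation is not shown.

```python
def quantisation_interval(signal_range, sample_bits):
    """Width of one interval when the range is split into 2**bits equal steps."""
    return signal_range / (2 ** sample_bits)


def quantise(value, minimum, signal_range, sample_bits):
    """Map an analog value to the index of the interval it falls into."""
    levels = 2 ** sample_bits
    step = signal_range / levels
    level = int((value - minimum) / step)
    # Clamp so that the top of the range maps to the highest level.
    return min(max(level, 0), levels - 1)


print(quantisation_interval(5.0, 3))      # 0.625
print(quantise(0.0, -2.5, 5.0, 3))        # a mid-range value
```

Adding one bit to the sample size halves the interval, which is why larger samples reduce the quantisation error.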
Number of channels (tracks) speech quality audio is mono (1 track) stereo audio requires 2 tracks some consumer audio equipment uses 4 tracks (quadraphonic) professional audio equipment uses 16, 32 or more
Interleaving a multi-channel audio value can be encoded by interleaving
channel samples or by providing separate streams for each channel
the advantage of interleaving is in synchronisation, and it also offers some benefits in storage and transmission
the disadvantages of interleaving are that it can be wasteful of space or bandwidth if not all channels are needed, it freezes the synchronisation between channels thus preventing temporal shifts, and it may not allow variation in the number of channels
Negative samples the voltages found in analog audio signals alternate between
positive and negative values negative values can be encoded successfully for processing in
twos complement, ones complement or sign-magnitude representation
Encoding encoding audio data reduces storage and transmission costs,
and compressed audio also provides better quality when compared to uncompressed audio at the same data rate
2 commonly-used methods : PCM (Pulse Code Modulation) - uses the fact that a
digital signal can be formed from a series of pulses. PCM values are simply sequences of uncompressed samples, so they provide a reference format for comparison with more complex coding methods
ADPCM (Adaptive Differential Pulse Code Modulation) - reduces PCM data rate by encoding the differences between samples. ADPCM is widely used and is associated with some encoding standards, such as CCITT G.721.
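The idea of encoding differences between samples can be shown with plain delta coding. This is deliberately not ADPCM: it has no adaptive step size and no reduced bit width, and it is unrelated to the G.721 bit stream. It only illustrates why differences are cheaper to store than absolute PCM values.

```python
def delta_encode(samples):
    """Replace each PCM sample by its difference from the previous sample."""
    previous = 0
    deltas = []
    for sample in samples:
        deltas.append(sample - previous)
        previous = sample
    return deltas


def delta_decode(deltas):
    """Rebuild the PCM samples by accumulating the differences."""
    samples, current = [], 0
    for delta in deltas:
        current += delta
        samples.append(current)
    return samples


pcm = [100, 102, 105, 103]
print(delta_encode(pcm))
print(delta_decode(delta_encode(pcm)) == pcm)
```

Because neighbouring audio samples are usually close in value, the differences are small numbers that need fewer bits than the samples themselves.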
Storage it is possible to record digital audio, even at the data rates of the
high quality formats, on general purpose magnetic storage theoretically, a magnetic disk with a sustainable transfer rate of
5 Mbytes per second could playback 50 channels of CD-quality digital audio. In practice this would not be possible without a highly optimised layout, but one or two channels are easily within the reach of small computer systems
since an hour of stereo digital audio, at the CD data rate, requires over half a Gigabyte of storage, tertiary storage in the form of DAT tapes, CD discs or optical disks is normally adopted, with the information being mounted onto the system manually or through a jukebox
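The storage figures above can be checked with a rough calculation. It assumes the usual CD audio parameters of 44.1 kHz sampling and 16-bit samples, which are not stated in the slides, and treats 1 Mbyte as 1,000,000 bytes.

```python
def pcm_data_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample // 8 * channels


cd_mono = pcm_data_rate(44_100, 16, 1)
cd_stereo = pcm_data_rate(44_100, 16, 2)
hour_of_stereo = cd_stereo * 3600
channels_at_5_mbytes = 5_000_000 // cd_mono

print(cd_stereo, "bytes/s for stereo")
print(hour_of_stereo / 1_000_000, "Mbytes per hour of stereo")
print(channels_at_5_mbytes, "mono channels fit in 5 Mbytes/s")
```

Under these assumptions an hour of stereo needs roughly 635 Mbytes, and a 5 Mbytes/s disk could in theory sustain a little over 50 channels, in line with the figures quoted.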
Retrieval need to support random access and ensure continuous flow of
data to DAC
Digital Audio - Operations
portions of audio sequences, segments, are identified by their starting time and duration; these can be located by mapping the starting time to a segment address, which the file system then maps to a physical address on disk
where there is no direct mapping to enable segment location by time code, an index of segments must be separately maintained
continuous flow of data is easy to maintain with a dedicated storage system, but requires careful control where storage is scheduled for a number of such tasks
Editing as with digital video, 2 types :
tape-based disk-based
to avoid audible clicks when inserting one sample into another, cross-fades are used, where the amplitudes of the original segment and the inserted segment are added and scaled about the insertion point
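A cross-fade can be sketched as a weighted sum of the two segments over the overlap region. The linear ramp used here is an assumption; other fade curves are possible, and the function name is hypothetical.

```python
def cross_fade(outgoing, incoming):
    """Linearly fade from the outgoing samples to the incoming samples."""
    if len(outgoing) != len(incoming):
        raise ValueError("both segments must cover the same overlap length")
    n = len(outgoing)
    mixed = []
    for i in range(n):
        # Weight runs from 0.0 (all outgoing) to 1.0 (all incoming).
        weight = i / (n - 1) if n > 1 else 1.0
        mixed.append((1.0 - weight) * outgoing[i] + weight * incoming[i])
    return mixed


print(cross_fade([10, 10, 10], [0, 0, 0]))
```

Because the two weights always sum to 1, the amplitude changes gradually across the insertion point instead of jumping, which is what removes the click.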
digital audio also supports non-destructive editing, where the segments of data are accessed through a data structure known as a play-list, which essentially contains a set of pointers to the data and details on ordering and other forms of edit to be performed on the data when it is joined
Effects and filtering digital filtering techniques permit a number of effects on audio :
Delay Equalisation & Normalisation Noise reduction & Time compression and expansion Pitch shifting Stereoisation Acoustic environments
Conversion one format to another (uncompressing ADPCM->PCM) altering encoding parameters (i.e. resampling at lower frequency)
Music
Representation: Operational v. Symbolic, MIDI, SMDL
Operations: Playback & Synthesis, Timing, Editing & Composition
The existence of powerful, low-cost digital signal processors means that many computers can now record, generate and process music.
Music is also widely used in multimedia applications, so we require a media type for music to focus on the computer's musical capabilities.
Representation of Music Operational v. Symbolic
operational representations specify exact timings for music and physical descriptions of the sounds to be produced
symbolic representations use descriptive symbolism to describe the form of the music and allow great freedom in the interpretation
both types are described as structural representations, since instead of representing music by audio samples there is information about the internal structure of the music
Music - Representation
To illustrate the structural representations, we can consider two : MIDI - a widely used protocol allowing the connection of computers
and musical equipment, an operational representation SMDL - a proposal for a standard structure for documents containing
musical information, having both operational and symbolic aspects
MIDI the Musical Instrument Digital Interface was developed in the
early ‘80s by musical equipment makers Devices :
electronic keyboards and synthesisers drum machines sequencers (to record and play back MIDI messages) music<->film and music<->video synchronisation equipment
Music - RepresentationMusic - Representation
Connection ports:
  MIDI OUT - allows a device to send MIDI messages it has produced to other MIDI devices
  MIDI IN - receives MIDI messages from other MIDI devices
  MIDI THRU - repeats received messages, permitting daisy-chaining of MIDI devices
MIDI devices process MIDI messages differently, according to their function or to the sound palette used by the device; hence different synthesisers can produce different sounds when supplied with the same MIDI messages
MIDI Concepts:
  Channel - a MIDI connection has 16 message channels; devices can be set to respond to all channels or only to specific channels
  Key number - notes are identified by key number; 128 key numbers are available, compared with the 88 keys of a standard keyboard
  Controller - 128 different controllers are available under the MIDI protocol, though not all are currently defined; changing the value of a controller typically alters sound production
Music - Representation
Patch/program - an audio palette is called a program or patch; a synthesiser capable of having a number of patches active at the same time is called multi-timbral
Polyphony - the ability of a synthesiser to play many notes at a time
Song - a recorded or preprogrammed MIDI sequence
Timing clock - a MIDI sequencer timestamps messages using a timebase measured in parts per quarter note (PPQ). Typical timebase values are 24, 96 and 480 PPQ. To convert the timebase into actual time you use the tempo, measured in beats per minute (BPM), where we assume that one beat is equal to a quarter note. Thus at a tempo of 180 BPM a quarter note lasts 60/180 = 1/3 s, and with a timebase of 96 PPQ one tick lasts 1/3 x 1/96 s = 3.47 ms
MIDI synchronisation - MIDI devices can be set to internal synch or external synch; when set to internal synch a device is known as a master and produces a timing clock message on its MIDI OUT at 24 PPQ, which slave devices use for external synch
MTC - MIDI Time Code is used to synchronise MIDI with film or video, e.g. to trigger sound effects or musical sequences
Music - Representation
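The timebase arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes, as the slide does, that one beat equals one quarter note.

```python
def tick_seconds(tempo_bpm: float, ppq: int) -> float:
    """Duration of one timebase tick in seconds (one beat = one quarter note)."""
    seconds_per_quarter = 60.0 / tempo_bpm
    return seconds_per_quarter / ppq


def ticks_to_seconds(ticks: int, tempo_bpm: float, ppq: int) -> float:
    """Convert a sequencer timestamp in ticks into elapsed seconds."""
    return ticks * tick_seconds(tempo_bpm, ppq)


if __name__ == "__main__":
    # The typical timebases quoted on the slide, at the slide's tempo of 180 BPM.
    for ppq in (24, 96, 480):
        print(f"{ppq:3d} PPQ at 180 BPM: {tick_seconds(180, ppq) * 1000:.2f} ms per tick")
```

Halving the tempo doubles the tick duration, which is why a sequencer can change playback speed without rewriting any timestamps.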
MIDI Protocol:
  based on an 8-bit code for messages; each message consists of a single command byte and possibly one or more data bytes (see table)
  Channel voice messages (8c-Ec, where c is the hexadecimal channel digit) - determine the actual notes played, speed of hit and release, and the values of controllers
  Channel mode messages (Bc, with controllers 121-127) - select the mode of a synthesiser: responding to one channel or all channels, each channel separately voiced or all voices used for one channel
  System messages (F0-FF) - general system functions: timing clock, MIDI time code messages, system reset, start device, stop device, etc.
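As a rough sketch of how the command byte ranges above separate the three message groups, the Python below classifies a raw message. The function name and the returned dictionary layout are hypothetical; the message names are the usual MIDI 1.0 names for each command value, and the controller range 121-127 is taken from the slide.

```python
# Upper four bits of the command byte -> channel voice message name (8c-Ec).
CHANNEL_VOICE = {
    0x8: "note off",
    0x9: "note on",
    0xA: "polyphonic key pressure",
    0xB: "control change",
    0xC: "program change",
    0xD: "channel pressure",
    0xE: "pitch bend",
}


def classify(message: bytes) -> dict:
    """Split a raw MIDI message into kind, channel and data bytes (illustrative only)."""
    status = message[0]
    if status < 0x80:
        raise ValueError("first byte is a data byte, not a command byte")
    if status >= 0xF0:
        # F0-FF: system messages carry no channel number.
        return {"kind": "system", "status": status}
    command = status >> 4            # upper nibble selects the message type
    channel = (status & 0x0F) + 1    # lower nibble 0-15, reported here as channel 1-16
    data = list(message[1:])
    # Bc with controller 121-127 is treated as a channel mode message, as on the slide.
    is_mode = command == 0xB and bool(data) and 121 <= data[0] <= 127
    return {
        "kind": "channel mode" if is_mode else "channel voice",
        "name": CHANNEL_VOICE[command],
        "channel": channel,
        "data": data,
    }


if __name__ == "__main__":
    print(classify(bytes([0x90, 60, 100])))  # note on, key 60, on channel 1
    print(classify(bytes([0xB2, 123, 0])))   # controller 123 on channel 3
    print(classify(bytes([0xF8])))           # a system message
```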
Limitations of MIDI:
  operates at 31250 bps, which allows roughly 500 notes per second; this may not be enough for complex pieces
  limited number of channels, lack of device addressing and other flaws make configuring large MIDI networks difficult
  device dependence of MIDI data
Music - Representation
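One plausible way to arrive at the figure of about 500 notes per second is sketched below. The slide does not give the derivation, so two assumptions are made here: each byte occupies 10 bits on the wire (8 data bits plus a start and a stop bit), and each note costs a 3-byte note on message plus a 3-byte note off message.

```python
BITS_PER_SECOND = 31250
BITS_PER_BYTE_ON_WIRE = 10  # assumption: 8 data bits + 1 start bit + 1 stop bit
BYTES_PER_NOTE = 6          # assumption: 3-byte note on + 3-byte note off


def notes_per_second(bps: int = BITS_PER_SECOND,
                     bits_per_byte: int = BITS_PER_BYTE_ON_WIRE,
                     bytes_per_note: int = BYTES_PER_NOTE) -> float:
    """Upper bound on complete notes per second over one MIDI connection."""
    bytes_per_second = bps / bits_per_byte
    return bytes_per_second / bytes_per_note


if __name__ == "__main__":
    print(f"about {notes_per_second():.0f} notes per second")
```

Under these assumptions the limit is shared by all 16 channels, so a dense multi-instrument piece can reach it well before any single part sounds busy.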
SMDL
  the Standard Music Description Language was developed by the MIPS committee of ANSI
  SMDL encompasses the representation of music for electronic dissemination and production by software, the representation of scores and musical examples in printed documents, and the representation of musical annotation and attributes used for musical analysis or by music databases
  SMDL is defined as an SGML DTD, based on a document type called musical works or works. Each work has 4 hierarchically structured sections:
    core section - musical events, such as note sequences, which form the work
    gestural section - performances of the core, which may differ in interpretation
    visual section - displays the core in printed form, including formatting and lyrics
    analytical section - allows a number of theoretical analyses of the core, its score and performances to be included in the work
Music - Representation
In considering music representation, we can recognise several advantages over audio:
  music representation will be more compact than audio
  it is portable and can be synthesised with the fidelity and complexity appropriate to the output devices used
  while digital audio suffers from inherent noise, musical representations are noise free
  many operations can be performed on music that would be infeasible or require extensive processing on audio
Playback & Synthesis
  during audio playback, the listener has limited influence over the musical aspects of the performance, beyond changing the volume or processing the audio in some way. If music is produced by synthesis from a structural representation, the listener can independently change pitch and tempo, increase or decrease individual instruments' volumes, or change the sounds they produce
  musical representations offer greater potential for interactivity than audio
Timing
  structural representation makes the timing of musical events explicit
  the ability to modify tempo makes it possible to alter the timing of groups of musical events and adjust the synchronisation of those events with other events (film, video, etc.)
Editing & Composition
  basic editing allows the user to modify primitive events and notes
  more complex editing operations operate on musical aggregates (chords, bars, etc.) to permit phrase-repetition, melody replacement and other such functions
  composition software simplifies the task of generating and combining or rearranging tracks, and prints the score
Music - Operations
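To show why these operations are cheap on a structural representation, here is a minimal Python sketch. The Note record and all function names are hypothetical and are not taken from MIDI or SMDL; the point is only that pitch, timing and repetition are separate fields that can be edited independently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    start_tick: int      # position in timebase ticks
    duration_ticks: int
    key: int             # key number, 0-127
    track: str           # hypothetical instrument label


def transpose(notes, semitones):
    """Change pitch without touching timing."""
    shifted = [replace(n, key=n.key + semitones) for n in notes]
    if any(not 0 <= n.key <= 127 for n in shifted):
        raise ValueError("transposition leaves the 0-127 key range")
    return shifted


def repeat_phrase(notes, times, phrase_ticks):
    """Phrase repetition: copy a group of notes at successive offsets."""
    return [replace(n, start_tick=n.start_tick + i * phrase_ticks)
            for i in range(times) for n in notes]


def start_seconds(note, tempo_bpm, ppq):
    """Tempo only matters when ticks are turned into real time."""
    return note.start_tick * 60.0 / (tempo_bpm * ppq)


if __name__ == "__main__":
    phrase = [Note(0, 96, 60, "piano"), Note(96, 96, 64, "piano")]
    print(transpose(phrase, 2))
    print(repeat_phrase(phrase, 2, 192))
    print(start_seconds(phrase[1], 120, 96), start_seconds(phrase[1], 60, 96))
```

The same edits on sampled audio would need pitch shifting and time stretching, which is the contrast the slide draws.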
Representation: Cel models, Scene-based models, Event-based models, Key frames, Articulated objects & hierarchical models, Scripting & procedural models, Physically-based & empirical models
Operations: Graphics operations, Motion & parameter control, Rendering, Playback
Animation
Separating animation and video follows the same track we took in separating image and graphic, based on modelling.
Animation types provide models which are rendered to produce video.
Animation is distinct from graphic in that it is time-dependent but, as in the image<->video relationship, sampling an animation model at a particular time will result in a graphics model, which can be rendered to produce an image.
Animation Representation
Cel models
  early animators drew on transparent celluloid sheets or cels; different sheets contained different parts of the scene, which was assembled by overlaying the sheets
  in computer animation, cels are digital images with a transparency channel
Animation - Representation
scenes are rendered by drawing the cels back to front, with movement being added by changing the position of cels from one frame to the next
a cel model is therefore a set of images, their back to front order, and their relative position and orientation in each frame
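A cel model as described here can be sketched with tiny character grids standing in for images. This is an illustration under simplifying assumptions: None marks a transparent pixel, orientation is ignored, and the function name and argument layout are invented for the example.

```python
def render_frame(width, height, cels, positions, background="."):
    """Draw cels back to front into one frame.

    cels: list of 2D pixel grids in back-to-front order; None is transparent.
    positions: one (x, y) offset per cel for this frame.
    """
    frame = [[background] * width for _ in range(height)]
    for cel, (offset_x, offset_y) in zip(cels, positions):
        for y, row in enumerate(cel):
            for x, pixel in enumerate(row):
                fx, fy = offset_x + x, offset_y + y
                # Later (nearer) cels overwrite earlier ones wherever they are opaque.
                if pixel is not None and 0 <= fx < width and 0 <= fy < height:
                    frame[fy][fx] = pixel
    return ["".join(row) for row in frame]


if __name__ == "__main__":
    ground = [["#", "#", "#", "#"]]
    ball = [["o"]]
    cels = [ground, ball]  # back to front
    # Movement comes only from changing the ball's position between frames.
    for positions in ([(0, 1), (0, 0)], [(0, 1), (1, 0)], [(0, 1), (2, 0)]):
        print("\n".join(render_frame(4, 2, cels, positions)))
        print()
```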
Scene-based models
  simply a sequence of graphics models, each representing a complete scene
  highly redundant and do not support continuity of activities
Event-based models
  express the difference between successive scenes as events that transform one scene to the next
  still discrete rather than continuous, but permit the management of scenes by input devices (e.g. mouse, tablet, etc.) rather than each scene having to be entered manually
Animation - Representation
Key frames
  in essence, the animator models the beginning and end frames of a sequence and lets the computer calculate the others by interpolation
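A minimal sketch of key-frame in-betweening follows. The slide does not say which interpolation is used, so linear interpolation is assumed here, and a frame is represented as a plain dictionary of named parameters (a hypothetical layout chosen for the example).

```python
def interpolate(start, end, n_frames):
    """Return n_frames frames from start to end inclusive, linearly interpolated."""
    if n_frames < 2:
        raise ValueError("need at least the two key frames")
    frames = []
    for i in range(n_frames):
        t = i / (n_frames - 1)  # 0.0 at the first key frame, 1.0 at the last
        frames.append({name: start[name] + t * (end[name] - start[name])
                       for name in start})
    return frames


if __name__ == "__main__":
    first_key = {"x": 0.0, "y": 10.0}
    last_key = {"x": 8.0, "y": 2.0}
    for frame in interpolate(first_key, last_key, 5):
        print(frame)
```

Real systems usually let the animator pick smoother curves than a straight line, but the division of labour is the same: the animator supplies the key frames and the computer fills in the rest.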
Articulated objects & hierarchical models
  attempt to overcome the problems of key frames by developing articulated objects: jointed assemblies where the configuration and movement of sub-parts are constrained
  ensure proper relative positioning and constraint maintenance during interpolation (will not allow solid objects to pass through other solid objects)
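The idea of constrained sub-parts can be illustrated with a two-segment arm in 2D. This sketch is an assumption-laden toy: the segment lengths, the elbow limits and all names are invented, and it covers only joint limits and fixed segment lengths, not collision between solid objects. Interpolating the joint angles, rather than the positions of the hand and elbow, keeps every in-between frame a valid pose.

```python
import math


def clamp(value, low, high):
    return max(low, min(high, value))


def arm_points(shoulder_deg, elbow_deg, upper_len=2.0, lower_len=1.0,
               elbow_limits=(0.0, 150.0)):
    """Positions of elbow and hand for a shoulder fixed at the origin."""
    elbow_deg = clamp(elbow_deg, *elbow_limits)  # the joint cannot bend past its limits
    shoulder = math.radians(shoulder_deg)
    forearm = shoulder + math.radians(elbow_deg)  # child angle is relative to its parent
    elbow = (upper_len * math.cos(shoulder), upper_len * math.sin(shoulder))
    hand = (elbow[0] + lower_len * math.cos(forearm),
            elbow[1] + lower_len * math.sin(forearm))
    return elbow, hand


def interpolate_pose(pose_a, pose_b, t):
    """Linear interpolation of joint angles between two key poses."""
    return tuple(a + t * (b - a) for a, b in zip(pose_a, pose_b))


if __name__ == "__main__":
    for i in range(5):
        shoulder_deg, elbow_deg = interpolate_pose((0.0, 0.0), (90.0, 120.0), i / 4)
        print(arm_points(shoulder_deg, elbow_deg))
```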
Scripting and procedural models
  current state-of-the-art animation modelling systems have tools allowing the animator to specify key frames, preview sequences in real time and control the interpolation of model parameters
  an additional feature of many such systems is a scripting language
Animation - Representation
  scripting languages offer the animator the opportunity to express sequences in concise form, which is particularly useful for repetitive and structured motion, and also provide high-level operations intended specifically for animation
Physically-based models & empirical models
  this approach is used to produce sequences depicting evolving physical systems
  a mathematical model of the system is derived from physical principles or empirical data and the model is then solved, numerically or through simulation, at a sequence of time points, each one resulting in a single frame for the sequence
Animation - Representation
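As a small illustration of a physically-based model, the sketch below solves a falling, bouncing ball numerically and records one height per frame. The choice of system, the simple fixed-step integration, the frame rate of 25 per second and the restitution factor are all assumptions made for this example.

```python
def simulate_fall(height_m, n_frames, fps=25.0, g=9.81, restitution=0.8):
    """Return the ball's height at each frame time, one value per frame."""
    dt = 1.0 / fps
    y, v = height_m, 0.0
    frames = []
    for _ in range(n_frames):
        frames.append(y)          # the state at this time point becomes one frame
        v -= g * dt               # gravity changes the velocity
        y += v * dt               # the velocity changes the position
        if y < 0.0:               # hit the ground: stop at the floor and bounce
            y = 0.0
            v = -v * restitution
    return frames


if __name__ == "__main__":
    for frame_number, y in enumerate(simulate_fall(2.0, 10)):
        print(f"frame {frame_number}: height {y:.3f} m")
```

The animator never positions the ball by hand; changing a physical parameter such as the restitution factor changes every frame of the sequence.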
Graphics operations
  since animation models are graphics models extended in time, all the graphics operations we have already covered are applicable here
Motion and parameter control
  since the essential difference between graphics and animation operations is the addition of the temporal dimension, graphics objects become animations through the assignment of complex trajectories or behaviours over time
  commercial 3D animation systems provide modelling tools and animation tools; the modelling tools produce 3D graphic models and the animation tools add temporal transformations to these objects
Rendering
  2 basic forms:
    real-time - the model is rendered as frames are displayed; 10+ frames per second are required to avoid jerkiness, so this is only appropriate for simple models or with special hardware
    non-real-time - frames are pre-rendered, taking as long as necessary to do so; this provides higher visual quality and consistency of frame rate
Animation - Operations
Playback
  non-real-time rendering offers the same operational possibilities in playback as digital video: control over rate and direction
  real-time rendering is much more interactive and modifiable: objects can be added and removed, lights turned on and off, the viewpoint changed, and so on
Animation - Operations