A System for Processing Digitally Sampled Sound
My main hobby is writing music using my computer and synthesizer. The computer is at the centre of the system, controlling the synthesizer and recording and playing back sound. The computer records sounds digitally, and the resulting stream of numbers can be manipulated to alter the characteristics of the sound.
I currently use several programs to process sounds, but I find that none of them are flexible or powerful enough. Therefore for this project I will create a system for processing digitally sampled sounds.
In a music studio different pieces of equipment like synthesizers and effects processors are self-contained units which can be connected together in different ways with cables carrying audio signals in electrical form. This method of producing music has been refined over the past fifty years and has been shown to be flexible, powerful and successful.
However, hardware is expensive, and separate units often duplicate functions. All early equipment was analogue, based on the properties of electronic components. Nearly all new effects processors perform their actions mathematically on sounds in the digital domain, and therefore contain analogue-to-digital and digital-to-analogue converters (ADCs and DACs). The signal is often converted many times between analogue and digital forms on its path through the studio, which is both inefficient in terms of hardware cost and undesirable in terms of signal quality. A small minority of effects processors do have digital inputs and outputs, but the small market implies a high price.
In the 1980s a new type of musical instrument became affordable. The sampler is essentially a digital record and replay system, potentially much more powerful and universal than traditional synthesizers, and although the first samplers only had enough memory for one or two seconds of record time, they became popular for the ease with which sounds can be sampled and processed. Some popular computers could soon play back samples.
In a trend initially separate from the digitalisation of effects processors, the ownership and power of personal computers have risen dramatically. The development of the musical instrument digital interface (MIDI) standard, originally created so that notes played on one keyboard could trigger sounds on another synthesizer, made computerised recording and editing of performances possible, and computers began to be found in music studios. At first the computers were used only for sequencing (recording and editing control data), but soon sampling and hard disk recording became popular.
At this point the final vision became clear. It would eventually be possible to have nearly all of a recording studio in one box, bar the microphones and loudspeakers. Instead of connecting hardware units with wires, audio data would pass through processing software, entirely digital and therefore with no intermediate signal degradation and interference.
Sound waves are continuous changes in air pressure, which microphones convert to varying voltages, but computers can only deal with discrete numbers. The continuous input signal is converted to a stream of numbers by sampling and quantisation. Sampling records the value of the signal at specific time instants, and quantisation converts these continuous values into numbers with finite precision. Once sampled, the sound data can be manipulated in many ways, for example changing the relative amplitude and phase of frequency components (filtering) or reducing changes in volume (compression).
Much of signal processing is based on rigorous mathematical foundations, which provide efficient algorithms for carrying out modifications and explain how the data is being modified. As this information is rather technical, it will be considered in an appendix.
I use several existing application programs to process sound samples. Most of these have very similar user interfaces, and therefore share both the advantages and the disadvantages of that design.
Typically a large part of the screen is occupied with a graphical waveform display. Dragging with the mouse in this section of the display marks a range on which following operations are to be performed. Controls also exist to zoom in to look more closely at a particular portion of sampled data. The use of a graphical display allows the user to identify different parts of a long sound, for example bass and snare strikes in a drum part or different syllables in speech.
Below the waveform display are the most commonly used functions, such as controlling the display and playing back and recording sounds. Common range operations are also available: the usual cut, copy and paste found in almost all application software. Because these controls are readily accessible, simple editing of sounds is quick and easy.
Less commonly used functions tend to be hidden away in sub menus, and as each operation simply replaces the marked range with the processed version, it can be hard to combine different effects or to perform the same operations on many different sounds. A separate problem is that the parameters of the various effects are often obscure, with values not related to the real-world properties of the sound.
v6.00o © Teijo Kinnunen and Ray Burt-Frost (1995.11.11)
OctaMED was developed from MED, a program originally created so that programmers could write music for computer games. The program became much more advanced; the “Octa” part of the name comes from its ability to play eight recorded sounds at once through only four hardware sound channels. In addition to the tracker editor (for writing music), OctaMED has a sample editor, so the sounds used in compositions can be modified without having to use other programs.
The sample editor is of the type described above, with various windows and menus.
The interface is satisfactory for simple modifications, but the units used are not directly related to the sound. For example, the echo rate is the number of samples between echoes, so to create a specific time delay one has to perform calculations with the sample rate. The filter window has two parameters, distance is the period of the frequency to be filtered (so again the sample rate is involved if one wishes to filter a certain frequency) and averaging determines the strength of filtering. The filter window also provides access to the boost command.
OctaMED SoundStudio features control through ARexx, a system of passing textual commands between applications. This would enable batch processing scripts to be created.
Unlike many audio applications for the Amiga range of computers, OctaMED has a user interface consistent with the rest of the operating system. This is desirable as it is easier for a user to use a familiar interface style than to have to learn different symbols.
v1.0e © The Blue Ribbon Soundworks Ltd. (1991, 1992)
Bars and Pipes is a MIDI sequencer, designed to record, edit and replay data from keyboards and synthesizers. This aspect is not relevant to signal processing; however, Bars and Pipes has a powerful system of tools controlled by a graphical user interface. What makes Bars and Pipes special is its system of “pipes” and “pipe tools”. The flow of data from input to standard track storage to output is represented analogously to the flow of liquid through pipes, and processing blocks can be inserted to modify or reroute the data.
Tools are dragged from the toolbox window to the pipe using the mouse. Tools are shown as symbols, but the [?] icon at the start of the toolbox gives a list by name. A tool in place can be moved left or right in the pipe or deleted. Tools with more than one output can be connected to tools with more than one input. Double clicking on a tool calls up its parameter window, from which the tool can be controlled.
Tools present in this distribution of Bars and Pipes (in order from left to right in the toolbox window shown above) include: Branch, Counterpoint, Echo, Invert, Keyboard Split, Merge, Modulator, MIDI In, MIDI Out, Quantize, Transpose, Triad, Flip, Loop, UnQuantize, Phrase Shaper, Sforzando, Subdivider, Spare Keys, Accompany B, Articulator, Doctor of Velocity, Easy Off, Elbow, Feedback In, Feedback Out, Harmony Generator, Note Filter, Plug, Reverse, Stop!, and Velocity Splitter.
There is developer information available, so new pipe tools can be created. The system is designed so that the main program does not need to be recompiled; the tools are separate object code files. C is used, with clever coding of object-oriented techniques, presumably because C was the standard operating system language and C++ was not widely available.
SOund eXchange v6.11 for Amiga
Created and maintained by Lance Norskog (firstname.lastname@example.org), Amiga port by David Champion (email@example.com).
“SOX is intended as the Swiss army knife of sound processing tools. It doesn’t do anything very well, but sooner or later it comes in very handy.”
SOX originated under the UNIX operating system as a universal sound file format translator. UNIX is a text-based OS, and AmiSOX is correspondingly command line driven, although a separate graphical user interface is available.
Only one effect may be applied at a time; for multiple effects, a pipe may be used.
Scripts are useful for hiding options, for example “X2Y file” contains the command for converting file.X to file.Y.
Command line syntax:
sox [ options ] [ format ] infile [ format ] outfile [ effect [ fxopts ] ]
Options include volume change relative to 1. Format specifiers define either recognized types (sample files with header data) or raw data (file contains no information about data so it must be given).
copy (no effect)
rate (resample at new rate (given by output format) by linear interpolation)
avg (reduce number of channels by averaging)
reverse (reverse entire sample)
echo [ delay volume ]+ (simple echo, delays given in seconds, volumes given relative to 1)
vibro speed [ depth ] (volume tremolo, speed given in cycles per second (less than 30), depth given as fraction of full modulation (less than 1))
lowp frequency (gentle low pass filter at frequency given in cycles per second)
highp frequency (gentle high pass filter at frequency given in cycles per second)
band [ -n ] freq [ width ] (band pass filter between f - w and f + w, frequency and width given in cycles per second, the manual states that “the default mode is oriented to pitched signals, the alternative -n (noise) mode is for unpitched sounds; noise is introduced in the shape of the filter”)
stat (list statistics about input file, no output file is generated)
Previously on alt.sources, now on comp.sources.misc. This is the C source library for sox containing the protocols for components (like file handlers and effects) to interact, and effects algorithms. Full developer material is included so new formats and effects can be implemented, made easier by skeleton drivers. Internal data is signed 32-bit integer.
v1.2 © Stephan Klein (1995.05.25)
This is a basic graphical interface to AmiSOX. SOXGUI is an easier way of specifying the command line for AmiSOX, using the mouse to select options. SOXGUI then executes AmiSOX with the command line it generates. SOXGUI is very simple, but effective. The interface is self explanatory, the large “AmiSOX!!!” button starts AmiSOX.
v3.0 © Steve Tibbett
This utility program saves the screen image to a file when a key combination is pressed.
v3.25 © Electronic Arts (1985, 1990)
This is a graphics program which I used to crop the images saved by ScreenX.
v1.01 © Marco Negri (1996.04.01)
This is a text editor which I used to type much of the project.
The system needs to function in a similar way to real music studios, that is, processing units are connected with paths for sound and control data.
There should be a variety of processing units available, some modelling complete audio effects devices (for example an echo unit) and others acting as simpler building blocks (for example a delay unit).
The system needs to read and write sound sample files in IFF 8SVX and RIFF WAVE formats. These are the types of files I use most frequently, and other formats can be converted easily using SOX or some other such program.
The system needs to be portable, that is, easily adapted to different computers and operating systems. The system will be created for the Acorn Archimedes computer running the RiscOS operating system, but I create music on the Amiga computer running AmigaOS. As most of the implementation will be mathematical, portability will only be an issue for certain parts of the system.
The system needs to be controlled by textual commands, with the possibility of batch script files containing many commands to ease repetitive operations. The grammar of the commands should be as simple and general as possible, whereas the vocabulary needs to be able to expand to include new types of processing unit.
The system is unlikely to require any particular software to run, as the interface is textual. Textual interaction with the user is supported within C++, so the software may simply need to be recompiled for a new operating system.
Hardware requirements are unlikely to be specific. The system will not support real time processing because this requires specific hardware, so speed is not absolutely critical. However, the faster a computer is the more complex routings and effects it will be able to perform without the user waiting a long time. The script facility will enable users of slow computers to leave the computer performing complex processing while they do something else.
There are three main elements of the system, these are the command line interface, the effect algorithms, and the management kernel to link them together.
Effects processing is the aim of the system, so this section will be discussed first.
There are two main strategies for processing sampled data, each with variations.
Simple programs, like OctaMED’s sample editor, store the entire sound in memory, along with a spare memory buffer. Each effect works in its own way, reading directly from one data array and writing to the other.
More refined programs, such as AmiSOX, can handle files larger than available memory by splitting them into smaller blocks. This forces the effects to work in a broadly time-ordered manner, starting at the start of the sound and working through to the end. AmiSOX developer material can be found in the appendix.
These two “ad hoc” methods are rather inflexible. To use more than one effect, multiple runs of the programs are required. A pipe allows serial chains to be created relatively easily but parallel routing is more awkward.
However, what is impossible with this method is feedback between separate effects (although feedback within an individual effect is possible). This is quite limiting. For example, a chorus effect (consisting of several copies of the original sound superimposed at varying pitches) can be turned into a more dramatic flanger effect with the simple addition of a feedback loop. Feedback in this way requires that only one sample is processed at once. Without the flexible routing that single sample operation allows, for example, the flanger effect would have to be rewritten from the chorus effect, requiring much more work.
Each effect processes only one sample before passing it on, the vast majority of effects require information from more than one input sample to generate their output, and each effect is likely to have parameters that can be modified. The most sensible solution is therefore to have objects that contain both the data outlined and the effect algorithm code, and that can be linked together in many different ways.
There are still two alternatives within this method. Either the input sources (sound files, for example) push their data into the system, or outputs pull data from the system. The latter may possibly be better suited to real-time operation, but the former is conceptually closer to the real world, and may also be easier to implement.
Each effect object has things in common, for example it can have inputs and outputs along which sound data passes, and it can have parameters that affect the way the inputs are modified to form the outputs. Therefore there needs to be a class effect, a base class for all of the different effects which incorporates all of these facilities.
As new effect classes can be developed, there needs to be a way of identifying the various inputs, outputs and parameters of different effects. Text is sensible, as it is a natural method of communication, but it is slow to compare strings. Therefore, text should be used to obtain a more efficient identification code, for example a small integer.
Effects need to be linked together, but each input can be connected to only one output (which in turn has only one input linked to it). Therefore each effect needs to identify the other effects it is linked to in both directions, so that if a different effect links to it, the original link can be found and removed. A link needs to pass the output from one effect to another effect’s input, identified by the id code described above.
As it is useful to make new effects from existing effects, inheritance is an appropriate mechanism. This means that certain methods of the effect class must be virtual so that they can be redefined. The three most important things that an effect does are to get input, process it, and send output, so all of these must be redefinable to allow for the addition of new inputs, outputs and parameters. The parent class’s function can be called within the redefined version, so that the parent’s behaviour is still present.
Sound data is passed through the system one sample at a time, to allow feedback. Some effect objects are designated as sources, so for each sample the process method of each of these is called in turn. The process method of each effect (unless it is a sink, having no outputs) calls its output method, which calls the input method of the destination effect. The input method checks whether all of the inputs have been filled for this sample time; if they have, it calls the process method for that effect object. In this way, data passes through the system from sources to sinks.
The effect’s input method needs to store the input sample within its data space, as well as recording that the input has been set this sample. To allow for inheritance, the input method is called with the input id and the sample. If the id is not recognised then the parent’s input method should be called to deal with it.
The process method takes the input samples, then modifies them and calls the output method. The process method can be a complete replacement of the parent’s process method, but it may also call the parent’s method if it is simply adding some extra functionality.
The output method takes an id and a sample, and if the id is not recognised then the parent method is called. The output method finds the corresponding output link, allowing the data to be passed to the next effect.
Sink effects have no outputs; for example, writing the sample data to a file would not be counted as an output here because it is not an output to another effect.
After processing the inputs need to be cleared before the next sample. To allow for feedback, however, the inputs must be cleared before the output method is called, in case this causes input to be given to the effect in question. It is impossible for an infinite loop to occur with feedback like this, because an effect getting feedback must have at least one input not in the loop, otherwise the loop could not be started. The output can be fed back to some of the inputs, but the others will not be filled until the next sample.
The individual effects algorithms are in an appendix.
effect::process()
    generate outputs, perhaps calling super::process(), and store in data space
    call clearinputs(), which calls super::clearinputs()
    call sendoutputs(), which calls super::sendoutputs()
        call output(), perhaps calling super::output()
            call destination.input()
                if recognise input id
                    store sample data in data space
                    if destination.inputready()
                        call destination.process()
                else
                    call destination.super::input()
The command line interface needs to be simple, so that it is easy to learn how to use, yet powerful enough to perform all useful operations. Simplicity can be achieved through use of similar syntax for different commands.
There are only a few lexical elements, which are the names of the commands (“new”, “delete”, “link”, “set”, “run”), and values for them: identifier strings for effects and classes, including “.” to separate parts from objects, numbers (floating point), and character strings (for example filenames).
The commands can be expressed as a grammar, there are no control structures so there is no recursion to complicate matters.
The language tools LEX and YACC can be used to create efficient lexical analysers and grammar parsers from high level definitions, this saves effort and the resulting table driven programs are very efficient.
The commands and effect class and object names should be case insensitive, but case should be preserved, as some operating systems have case sensitive filenames.
command ::= new                     ; create new object
          | delete                  ; delete an object
          | link                    ; link an output to an input
          | set                     ; set an object parameter
          | run                     ; start processing

new ::= "new" new_type new_name
new_type ::= string                 ; the type of object to be created
new_name ::= string                 ; the name to give the object

delete ::= "delete" delete_name
delete_name ::= string              ; the object to be deleted

link ::= "link" link_source link_out link_dest link_in
link_source ::= string              ; the source object
link_out ::= "." string             ; a named output
           | .                      ; the main output
link_dest ::= string                ; the destination object
link_in ::= "." string              ; a named input
          | .                       ; the main input

set ::= "set" set_name set_param set_value
set_name ::= string                 ; the object
set_param ::= "." string            ; parameter name
set_value ::= number                ; a number
            | string "." string     ; an object with part specifier
            | "'" chars "'"         ; a character string

run ::= "run"
The kernel has to link the command line interface to the effects. The command line interface calls kernel functions corresponding to the commands, with values converted to their correct form (for example, numbers converted to float values). The kernel finds the classes and effect objects corresponding to the textual identifiers, and calls the object methods that perform the command.
This separation of command line interface and kernel allows error checking to be simplified greatly. The command line interface has to deal with user input, which may be incorrect. However, the kernel has only correct data to deal with, so error checking is redundant and can be removed when the system has been thoroughly tested. This is especially important for the effects processing section, because code here is executed very frequently.
Note that as some of the commands are the same as C++ keywords, the actual name of the corresponding functions must be different in the implementation (for example, use new_() instead of new()).
kernel::new(string type, string name)
    find node of type in class list
        if node not found then error, no such effect type
    if find node of name in effect list
        then error, already exists
    call node::new(effect list, name)
        create a new effect object
        create a new effect node
        link the effect node into the effect list

kernel::delete(string name)
    find node of name in effect list
        if node not found then error, no such effect
    delete effect node
        remove node from list
    delete effect object
        remove effect from processing network
        frees resources, close files and so on

kernel::link(string sname, string soutput, string dname, string dinput)
    find node of sname in effect list
        if node not found then error, no such effect
    find node of dname in effect list
        if node not found then error, no such effect
    find id of soutput
        if output not found then error, no such output
    find id of dinput
        if input not found then error, no such input
    create a new link object containing the objects and ids
    set sname's soutput to the link object
        delete existing link from the source
    set dname's dinput to the link object
        delete existing link to the destination

kernel::set(string name, string param, value val)
    find node of name in effect list
        if node not found then error, no such effect
    find id of parameter
        if parameter not found then error, no such parameter
    set parameter
        recalculate affected variables in effect object

kernel::run()
    while there is data left to process
        for all nodes in effect list
            if node is an input effect
                call process() of effect
Inheritance leads to a hierarchy of different effect classes, each of which is ultimately derived from the effect base class.
effect                      ; base class
+-- in0out1                 ; effects having only one output, "main"
|   +-- readfile            ; read from a file
|   |   +-- read_8SVX
|   |   +-- read_WAV
|   |   \-- ...
|   +-- constant            ; constant output, but set by parameter
|   +-- oscillator          ; generate a waveform, with parameters like "frequency"
|   |   +-- osc_sine
|   |   \-- ...
|   \-- ...
+-- in1out0                 ; effects having only one input, "main"
|   +-- writefile           ; write to a file
|   |   +-- write_8SVX
|   |   +-- write_WAV
|   |   \-- ...
|   +-- toparam             ; sets a parameter of an object when the input data changes
|   \-- ...
+-- in1out1                 ; effects having one input and one output, both "main"
|   +-- feedback            ; initialises a feedback loop
|   +-- delay               ; delays the input by a certain time
|   \-- ...
+-- in0outs                 ;
+-- insout0                 ;
+-- insouts                 ; stereo versions, with "left" and "right" instead of "main"
\-- ...
Several low-level data types are needed by the implementation.
Lists are needed to store the various effect objects and classes. Doubly linked lists can be manipulated easily, only a few functions are needed (addnode, removenode, findnode).
Circular buffers are needed by many effects, to store previous input samples. A circular buffer consists of an array with two pointers, one for writing and one for reading. These are incremented simultaneously, maintaining a constant offset between them. This allows a certain amount of previous data to be stored without having to copy the entire buffer each time.
Changing the length during use should change the read pointer, for two reasons. Firstly, it is better to have a “jump” in the data now, predictably, rather than at some unpredictable point in the future. Secondly, it is desirable that any length, not just an integer number of samples, can be used, with linear interpolation between array elements. The write pointer must be an integer to write into a particular array element, so changing it instead would limit length changes to integer steps.
During the summer, before finally deciding on this project, I created a simple system for processing sounds, using QBasic. This system had severe limits on the size of files it could process, but it allowed effects algorithms to be tested. The system showed that it was feasible to develop a sound processing application.
At this point I decided to use the programming language E on the Amiga computer, which I had successfully used for some other applications. I considered using BOOPSI (Basic Object Oriented Programming System for Intuition), however the advantages conveyed by using this (classes shared between applications, new classes can be created at run time) were outweighed by the disadvantages (one function has to deal with all methods), so I decided to use the inbuilt features of E.
The use of sample data to control parameters is implemented awkwardly. I realised that whether a parameter would be set before or after the controlled effect processed the sample at that time interval depended on the order in which the effect objects were created. A fix was added to ensure that parameters are always set after processing; ensuring they were set before processing would have required every effect object to have a priority, and the whole sample routing strategy would have had to change.
The three sections of the system need to be tested in different ways.
The command line interface has to deal with user input, which may be incorrect. Therefore the command line interface needs to be tested thoroughly, to ensure that incorrect data is not passed to the other parts of the system.
Each command needs to be tested, with and without valid arguments. Random input should also be tested, to make sure that the interface is resilient. The tests should provoke every error response.
Once the interface has been tested for resilience, it needs to be tested for ease of use and functionality.
The kernel is always given data in the correct format, so the method of testing is different from that used for the command line interface. Here the testing consists of verifying that the kernel functions as it is supposed to, by checking that the result of each operation is correct. Compiler macros can be used to remove this extra testing code from the final program, as it is not necessary after testing.
The common features of all of the effect classes can be tested together, such as linking together and transferring sample data, as the classes are largely similar. However, many of the functions implemented in each class are so small that they can be easily verified to be correct without inserting special testing code.
The effects processing classes need to be tested with real input, so the quality of the results can be judged. Testing of the code is only necessary for the more complicated effects classes, like the z-plane filters. The speed of processing needs to be tested, both for simple and complicated effects.
The system performs as specified, except for a few minor details.
C++ was indicated as the language for implementation, but C++ is a strongly typed language, and there were too many problems in trying to implement the dynamic linking required by the system. This led me to abandon it, and use E, a programming language similar to Pascal, but with object oriented features. Currently E is only implemented for the Amiga range of computers, so porting the system to other platforms would be difficult.
The command line interface required some changes. To enable the use of the operating system function ReadArgs(), which provides powerful command line argument parsing support, the use of “.” to separate effect object names and their inputs, outputs and parameters was dropped.
Commands (for example list) were added to the command line interface, to make using the system easier. These are documented in the user guide. The ability to set global parameters (such as how many samples to process) was needed, so this was incorporated into the set command.
The system is powerful enough to do just about any sound processing, but the command line interface can be awkward to use. The main problem is in keeping track of which effect objects have been created and what links there are between them, and the only way to show this is a graphical interface, which was ruled out as being too complex to implement.
As indicated above, a graphical interface would increase the ease of use of the system. Bars and Pipes, considered in the analysis section, has a graphical interface, but this can be awkward because the “pipe tools” are placed in rigid lines. Free placement of effects is essential. A variety of methods are appropriate for the various commands, for example “new” could allow the user to select the type from a popup menu, parameters could be set in a window opened by double-clicking on the effect object’s icon, and the effect objects could be linked by dragging with the mouse held down from a region of one effect’s icon to a region of another, representing the inputs and outputs.
The system could be altered internally to cope transparently with multichannel sample data. At present each stereo effect has to have its left and right connections linked separately, which is inconvenient.
If sufficiently fast computer hardware is available, new effect types could permit real-time processing of external input. This would require specific drivers for different computer operating systems, and there would have to be a way of checking that the computer was fast enough to cope with the input, because otherwise it may not generate the output samples before the next input arrives. Effects could be added to utilise extra hardware such as signal processing chips on sound cards.
More effect classes can be added easily to the system, but at present they are part of the main program. A plugin system, whereby new effects can be added without recompilation, would allow users and other developers to create their own effects. It is feasible that the system could become a small part of a large music composition, editing and recording application.
More complicated effects can be built from existing simple ones, to the extent that an entire synthesizer could be simulated within the computer, built from various oscillators, envelopes and filters.
The application requires AmigaOS v2.04 or greater.
To install the application, double click on the install icon. The installer asks you in which directory you want to install the application, and then copies all necessary files to that location.
To start the application, double click on the application icon. A console window opens, in which you give commands. To exit the application, click the close gadget of the window with the mouse, or hold the control key and type "".
The application is centred around effect objects, a concept similar to the separate effects units found in an ordinary music studio. The commands create and manipulate effect objects. To process sounds, you create effect objects to read the sound from disk, process the sound, and write the new sound to disk. Then you instruct the application to perform the processing.
This section describes all of the commands available.
new effecttype name
Create a new effect object of type effecttype. All the effect objects you create have to be given a name, so that you can refer to them later. The new effect object is initialised with default settings depending on the type.
You will be shown an error message if there is no effect type with the name effecttype you specified, or if there is already an effect object with the name you specified (you can’t have more than one effect object with the same name).
For an overview of which effect types are available see the effect reference.
Delete an effect object you created earlier. You use this command when you no longer need an effect object and want to get rid of it, freeing the memory it requires.
You will be shown an error message if there is no effect object with the name you specified.
link source.output destination.input
Link an output of one effect object to an input of another. You use this command like you would connect cables between different effect units in a music studio, only here you don’t have to scrabble behind racks of equipment.
You will be shown an error message if the source or destination effect objects do not exist, or if there is no output or input with the name you gave in the source or destination object.
To find out which inputs and outputs the different effect types have see the effect reference.
Some links can cause problems. For example, you can’t link an output of an effect object to one of its own inputs, even via other effect objects. This is because an effect object needs all of its inputs to generate its output; if it needs its own output as an input, it gets stuck before it can start.
Feedback (having output loop back as an input) can be very useful, so a special effect object type called “feedback” is available. Simply create a new feedback object and link it into the feedback loop at some point. Usually the best place to put it is just before the feedback is returned to the first effect object in the loop.
set name.parameter value
Set a parameter of an effect object. Many effect objects have parameters you can change to alter the sound of the effect. For example, the “decay” parameter of an echo object changes how quickly the echoes die away. Different parameters take different values. Most need you to type in a number, but some require special keywords, and some require a character string (for example a filename, like “MySounds:Voices/BigChoir.8svx”, including the quotes (")).
Numbers should be entered normally. You can enter both integers (whole numbers like 5 or -7) and real numbers (like 3.5 or -.01). For very large or small numbers you can use standard form (also called scientific notation), in which the letter “e” (or “E”) represents “multiplying by ten to the power of”, for example -1e3 is equal to -1000, and 3.5e-4 is equal to 0.00035.
You will be shown an error message if there is no effect object with the name you gave, or that effect object doesn’t have the parameter you specified, or you gave a value that wasn’t of the correct type.
To find out which parameters the different effect types have see the effect reference.
run
Process sounds through the network of effect objects you have set up. The processing stops when there is no more input from sources (like reading sound data from disk) and all of the outputs (like writing sound data to disk) have become quiet (so that the “tails” of echoes are not cut off too quickly).
You will be shown an error message if the effect objects are linked together incorrectly, for example if there is a feedback loop without a feedback effect object in it, or if there are some inputs or outputs that are not connected to anything. Other things that can go wrong include not being able to open sound files to read from (for example the file doesn’t exist) or write to (for example the disk is write protected).
You can add comments to the commands you type in, so that you can more easily remember what you have done. There are two types of comment. If you type "//" (without the quotes), everything until the end of the line is ignored by the application. For longer comments, anything between "/*" and "*/" is ignored. You can "nest" layers of these, so "/* my /* nested */ comment */" is allowed, but there must be an equal number of "/*" and "*/", otherwise you will be told about the error.
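The nesting rule can be sketched as a simple scanner. Python is used here purely for illustration (SLab itself is not written in Python), and this sketch only handles comments within a single line:

```python
def strip_comments(line):
    """Remove "//" and nestable "/* */" comments from one command line."""
    out = []
    depth = 0
    i = 0
    while i < len(line):
        pair = line[i:i + 2]
        if depth == 0 and pair == "//":
            break                       # rest of the line is a comment
        if pair == "/*":
            depth += 1                  # one level deeper
            i += 2
        elif pair == "*/":
            depth -= 1
            if depth < 0:
                raise ValueError("unmatched */")
            i += 2
        elif depth == 0:
            out.append(line[i])         # ordinary command text
            i += 1
        else:
            i += 1                      # inside a comment, skip
    if depth != 0:
        raise ValueError("unmatched /*")
    return "".join(out)
```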
This section describes all of the effects available.
The output is the sum of all the inputs.
in1 in2 … up to the inputs parameter
inputs the number of inputs to add together
A band pass filter effect, that allows frequencies within a certain range to pass and blocks those outside the band.
lowfreq low cutoff frequency in Hertz (defaults to 250)
highfreq high cutoff frequency in Hertz (defaults to 2000)
A band reject filter effect, that blocks frequencies within a certain range and allows those outside the band to pass.
lowfreq low cutoff frequency in Hertz (defaults to 250)
highfreq high cutoff frequency in Hertz (defaults to 2000)
This is a dynamic range compression and expansion effect. If the control input is above the threshold level, then the output is scaled according to the ratio. If the ratio is less than 1, differences between amplitudes are reduced (the signal is compressed). If the ratio is greater than 1, differences in level are exaggerated (the signal is expanded).
The time parameter controls the level detection. If the time is too short then low frequency signals can cause “pumping”. A long time can result in rapid changes in level not being affected.
main the signal to be manipulated
sidechain the control signal, if this is not linked then the main input is used to modify itself
time the time over which to calculate the average level, in seconds (defaults to 0.05)
threshold the cutoff level (defaults to 0.5)
ratio the compression ratio (defaults to 1)
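As a sketch of how such a compander might behave (the actual SLab algorithm is not reproduced here; this assumes a simple moving-average level detector and a power-law gain rule):

```python
def compand(main, side, rate, time=0.05, threshold=0.5, ratio=1.0):
    """Scale the main signal so that, above the threshold, level
    differences in the control signal are raised to the power `ratio`
    (ratio < 1 compresses, ratio > 1 expands)."""
    n = max(1, int(time * rate))        # averaging window in samples
    window = []
    out = []
    for x, c in zip(main, side):
        window.append(abs(c))
        if len(window) > n:
            window.pop(0)
        level = sum(window) / len(window)
        if level > threshold:
            # output level becomes threshold * (level/threshold) ** ratio
            gain = (level / threshold) ** (ratio - 1.0)
        else:
            gain = 1.0
        out.append(x * gain)
    return out
```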
The output is the input delayed by the delay time. The output is zero until the delay time has passed. Changing the delay time by large amounts during processing can result in “glitches”, the output jumping suddenly from one value to another. Slow changes can result in the pitch being altered, as the output passes more quickly or more slowly than the input.
delay the delay time in seconds (defaults to 0.1)
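The behaviour described can be sketched as follows (illustrative Python, not the SLab source). Fractional delay times fall between two samples and are linearly interpolated:

```python
def delay_effect(x, delay, rate):
    """Delay the sample list x by `delay` seconds; the output is zero
    until the delay time has passed."""
    d = delay * rate                    # delay in (fractional) samples
    i0 = int(d)
    frac = d - i0
    out = []
    for n in range(len(x)):
        a = x[n - i0] if n - i0 >= 0 else 0.0           # newer neighbour
        b = x[n - i0 - 1] if n - i0 - 1 >= 0 else 0.0   # older neighbour
        out.append(a * (1.0 - frac) + b * frac)
    return out
```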
An echo effect. Each echo is quieter by the decay factor (which should be less than 1), and they are separated by the delay time.
decay how much quieter successive echoes are (defaults to 0.5)
delay the time between echoes in seconds (defaults to 0.25)
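The effect can be sketched as a single feedback recurrence (illustrative Python; the `tail` parameter, which keeps the output running so the echoes are not cut off, is an assumption of this sketch):

```python
def echo(x, decay=0.5, delay=0.25, rate=22050, tail=4):
    """y[n] = x[n] + decay * y[n - D], where D is the delay in samples.
    `tail` extra delay periods are appended for the dying echoes."""
    D = int(delay * rate)
    y = [0.0] * (len(x) + tail * D)
    for n in range(len(y)):
        dry = x[n] if n < len(x) else 0.0
        fed_back = y[n - D] if n >= D else 0.0
        y[n] = dry + decay * fed_back
    return y
```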
The output is the volume envelope (average signal level) of the input. This can be used to control effects according to the signal level.
The time parameter controls the level detection. If the time is too short then the output will contain low frequencies from the input. A long time can result in sudden changes not being followed.
The output envelope is delayed by half of the time parameter, relative to the input sound.
time the time over which to calculate the average level, in seconds (defaults to 0.05)
This effect should be used instead of one delay in a feedback loop. See delay and feedback.
delay the delay time in seconds (defaults to 0.1)
A feedback effect must be present in any feedback loop. The output is the same as the input, delayed by one sampling period (the shortest possible time). For accurate delay times, the effect fbdelay should be used in place of one delay in the feedback loop.
If the average sidechain input level is below the threshold parameter then the output is scaled to zero, otherwise the output is the main input. Gating is useful for removing background noise during gaps in the main signal.
The time parameter controls the level detection. If the time is too short then low frequency signals can cause the gate to open and close in time, resulting in “pumping”. A long time can result in short quiet sections not being masked.
The output is delayed by half of the time parameter, relative to the input.
main the signal to be manipulated
sidechain the control signal, if this is not linked then the main input is used as the control
time the time over which to calculate the average level, in seconds (defaults to 0.05)
threshold the cutoff level (defaults to 0.5)
This is a half-wave rectifier effect. When the input is positive, the output is the same as the input, otherwise the output is zero, so the parts of the waveform below the axis are “cut off”. This leads to an increase in frequencies one octave above the fundamental, although the results are not as pronounced as full rectification (see rectify).
A high pass filter effect, that allows high frequencies to pass but blocks low frequencies. The freq parameter indicates the cutoff frequency, below which lower frequencies are reduced.
freq cutoff frequency in Hertz (defaults to 2000)
The output is the input inverted, so peaks in the waveform become troughs and vice versa.
If the average level is above the threshold, the amplitude is scaled down to the threshold level (similar to a compressor (see compand), but more severe).
The time parameter controls the level detection. If the time is too short then low frequency signals can cause “pumping”. A long time can result in sudden loud sections not being reduced in level.
The purpose of a limiter in music recording is to prevent the signal from exceeding a certain level, so that the recording device doesn’t overload and distort. See the various write effects for details.
The output is delayed by half of the time parameter, relative to the input.
time the time over which to calculate the average level, in seconds (defaults to 0.05)
threshold the maximum level (defaults to 1)
A low pass filter effect, that allows low frequencies to pass but blocks higher frequencies. The freq parameter indicates the cutoff frequency, above which higher frequencies are reduced.
freq cutoff frequency in Hertz (defaults to 250)
The output is all of the inputs multiplied together. This can be used to change the volume of sounds (if one input is slowly varying) or add new frequencies (for sounds of similar pitch).
in1 in2 … up to the inputs parameter
inputs the number of inputs to multiply together
A pitch shifter changes the pitch of a sound without changing the speed. The ratio parameter sets how much to change the pitch by (for example 2 will raise the pitch by one octave). The effect works by recording short sections and repeating them more quickly or more slowly; the freq parameter controls the frequency at which the sound is repeated. For best results when pitching up, the freq parameter should be close to the fundamental frequency of the sound, but lower when pitching down.
ratio the pitch change factor (defaults to 1)
freq the shifting frequency in Hertz (defaults to 256)
Reads sound from a sample file of format SLab, IFF 8SVX, or RIFF WAVE (respectively). The output has a maximum amplitude of 1 for every format other than SLab’s own, which may contain any value.
By default SLab files are normalised when they are read in. This means that the sound is scaled so that the maximum amplitude is 1, which is what is wanted for normal sounds, but probably not for control signals, for which normalisation may be turned off.
file the file to read from
normalise (SLab format only) set to “yes” or “no” (defaults to “yes”)
This is a full-wave rectifier effect. The output is the absolute value of the input, so parts of the waveform below the axis are folded over. This leads to an increase in frequencies one octave above the fundamental.
This effect reverses sound in time. It has to store the incoming sound before it can play it backwards, so the time parameter indicates how much to store. The first output appears after the time parameter has elapsed, at which point the first section of the input is output in reverse, followed by later sections, each in reverse.
To reverse an entire sound, simply set the time to longer than the sound. Interesting effects can be obtained using very short times (for example, 0.002).
time the reverse time in seconds (defaults to 0.25)
The input is sent unaltered to all the outputs. This is often used to combine effects in parallel.
out1 out2 … up to the outputs parameter
outputs the number of outputs to send to
This effect absorbs all of the input, not passing it on, until it rises above the threshold. After that, the output is equal to the input. This is useful for preventing output files starting with a period of silence.
As this effect doesn’t send input for a time, it should be used with caution. Unpredictable results will occur if the output of one vox effect is linked (directly or indirectly) to an effect that has an input not linked to the same vox effect. It is recommended that this effect is used directly before the final output.
threshold the cutoff level (defaults to 0.001)
This effect changes the width of a stereo image. If the size of the width parameter is greater than 1, left and right seem further apart, otherwise they seem closer. Negative width parameters swap left and right.
width the width of the stereo image (defaults to 1)
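One common way of implementing a width control like this (not necessarily the method SLab uses) is mid/side processing: the sum of the channels is kept and the difference is scaled by the width:

```python
def widen(left, right, width=1.0):
    """Scale the stereo difference (side) signal by `width`;
    |width| > 1 widens, |width| < 1 narrows, and negative widths
    swap left and right."""
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0             # common (mono) part
        side = (l - r) / 2.0 * width    # stereo difference, scaled
        out_l.append(mid + side)
        out_r.append(mid - side)
    return out_l, out_r
```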
Write the input to a sample file of format SLab, IFF 8SVX, or RIFF WAVE (respectively). SLab’s own format is the only one that doesn’t clip the signal. The others distort for input signals with an amplitude greater than 1, so a limiter may be necessary (see limit).
Any already existing file will be overwritten, so make sure there is no file with the same name before starting.
file the file to create
This is a z-plane filter effect. The z-transform is a mathematical technique that allows filters to be made to a precise design; however, this can be complicated. Some preset filters have already been set up (see lowpass, bandpass, highpass, bandreject) so that they can be used more easily. Essentially, frequencies are represented as going around a semicircle, and poles and zeros are placed within the semicircle. Poles make frequencies near them louder and those further away quieter, and zeros make frequencies near them quieter and those further away louder.
The details of designing filters will not be gone into here; for more information consult a good book on the topic (for example, “An Introduction to the Analysis and Processing of Signals” by P. A. Lynn, 1973-89). Make sure that no poles have a radius greater than 1, and that for each pole or zero with a frequency not equal to zero or half of the sampling rate there is another with the same radius but negative frequency.
This effect is very powerful, but you do need to know what you are doing to be able to use it properly.
poles the number of poles in the filter
zeros the number of zeros in the filter
pole1r pole2r … zero1r zero2r … the radius of the poles and zeros
pole1f pole2f … zero1f zero2f … the frequencies of the poles and zeros
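The description above can be sketched by expanding the poles and zeros into polynomial coefficients and running the resulting recursion (illustrative Python; SLab’s own implementation is not shown). As noted above, every pole or zero at a non-zero frequency needs a partner at the negative frequency, which is what makes the coefficients real:

```python
import cmath

def poly_from_roots(roots):
    """Expand the product of (1 - r * z^-1) factors into coefficients."""
    coeffs = [1 + 0j]
    for r in roots:
        new = [0j] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i] += c
            new[i + 1] -= c * r
        coeffs = new
    return coeffs

def zfilter(x, poles, zeros, rate):
    """Filter x given poles and zeros as (radius, frequency-in-Hz) pairs."""
    to_root = lambda radius, f: radius * cmath.exp(2j * cmath.pi * f / rate)
    a = poly_from_roots([to_root(r, f) for r, f in poles])   # denominator
    b = poly_from_roots([to_root(r, f) for r, f in zeros])   # numerator
    y = []
    for n in range(len(x)):
        acc = sum(b[i].real * x[n - i] for i in range(len(b)) if n - i >= 0)
        acc -= sum(a[i].real * y[n - i] for i in range(1, len(a)) if n - i >= 0)
        y.append(acc)
    return y
```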
This section is a step by step guide to using the application. In the examples, lines you need to type in are shown after the “>>” prompt; everything else is output from the application.
As a first tutorial, we will add some echo to a sound that you have on disk.
First we need to get the sound from the disk. Here we will use the file “MySounds:Funky/OrchStab.8svx”; you will need to use one of your own files. As this is an IFF 8SVX file (indicated by the extension .8svx or .iff), we will need an 8SVX reader:
>> new read_8svx reader
>> set reader.file "MySounds:Funky/OrchStab.8svx"
Now we need to decide where to put the echoed sound. We will use “MySounds:Funky/OrchStab_Echo.8svx”; again, you should choose your own name for the new sound file. We will write the file as an IFF 8SVX, although you can choose a different format if you want to:
>> new write_8svx writer
>> set writer.file "MySounds:Funky/OrchStab_Echo.8svx"
We now have a reader and a writer, time to put the echo in between. We will have a fairly long echo time of one and a half seconds, but which dies away relatively quickly (by having the decay close to zero):
>> new echo echo
>> set echo.delay 1.5
>> set echo.decay 0.25
Note how you can have an effect object with the same name as an effect type. The computer doesn’t get confused, although with more complicated processing than this simple echo you might confuse yourself!
With all of the effect objects set up, now we have to link them together:
>> link reader.main echo.main
>> link echo.main writer.main
Now all of the setting up is done, we can process the sound:

>> run
All being well, a new file will be created containing the echoed sound. You will be informed of any problems, for example if there is not enough space on the disk for the new sound file.
Although there is a built in echo effect, here we will show how to make your own echo effect out of simpler building blocks.
An echo effect is quite simple. Even so, we will need six effect objects for one echo effect, as you can see from the diagram. First we create the objects we need, the names start with e_ so that we know that they are all part of one echo effect:
>> new add e_add
>> set e_add.inputs 2
>> new split e_split
>> set e_split.outputs 2
>> new delay e_delay
>> new mul e_scale
>> set e_scale.inputs 2
>> new feedback e_fb
>> new constant e_decay
Then we link them together:
>> link e_add.main e_split.main
>> link e_split.out2 e_delay.main
>> link e_delay.main e_scale.in1
>> link e_decay.main e_scale.in2
>> link e_scale.main e_fb.main
>> link e_fb.main e_add.in2
Now that the effect objects making up the echo are linked together, we can set the echo parameters (the decay value should be between 1 and -1, otherwise the echo would make the sound get louder and louder):
>> set e_decay.value .5
>> set e_delay.delay .33333
This gives echoes, each half the volume of the previous one, about three times per second.
Now that our echo is set up, we can link it to a reader and a writer to process a sound. This is described in detail in an earlier tutorial. You need to link to e_add.in1 and from e_split.out1.
Here the real power of the application begins to show itself. We are going to control some effects with other effects, to create a very unusual sound.
The diagram explains what we are going to do; only a few notes are given as comments in the listing below. You can type the comments in, as they do not affect the results.
Warning: the output file will be nearly 700 kB in size, so make sure there is enough space before you run.
new read_8svx reader                 // Set up sample source
set reader.file "Tutorial3/Input.8svx"
new add readmix                      // Repeat sample every 0.5s
set readmix.inputs 2
new split readsplit
set readsplit.outputs 2
new feedback readfb
new fbdelay readdelay
set readdelay.delay 0.5
link reader.main readmix.in1
link readmix.main readsplit.main
link readsplit.out2 readdelay.main
link readdelay.main readfb.main
link readfb.main readmix.in2
new rampup volume                    // Set up volume oscillator
set volume.freq 0.5
new constant volscale                // Scale to between 0.5 and 1
set volscale.value 0.25              // (1 - 0.5) / 2
new constant volshift
set volshift.value 0.75              // (1 + 0.5) / 2
new mul volmul
set volmul.inputs 2
new add voladd
set voladd.inputs 2
link volume.main volmul.in1
link volscale.main volmul.in2
link volmul.main voladd.in1
link volshift.main voladd.in2
new mul changevol                    // Modulate volume
set changevol.inputs 2
link readsplit.out1 changevol.in1
link voladd.main changevol.in2
new zfilter filter                   // Set up filter
set filter.poles 4
set filter.zeros 4
set filter.pole1r 0.95               // Poles just outside zeros
set filter.pole2r 0.95               // give isolated peaks
set filter.pole3r 0.95
set filter.pole4r 0.95
set filter.zero1r 0.9
set filter.zero2r 0.9
set filter.zero3r 0.9
set filter.zero4r 0.9
link changevol.main filter.main
new sine freq                        // Set up frequency oscillator
set freq.freq 0.25
new constant frqscale                // Scale to between 700 and 2100
set frqscale.value 700               // (2100 - 700) / 2
new constant frqshift
set frqshift.value 1400              // (2100 + 700) / 2
new mul frqmul
set frqmul.inputs 2
new add frqadd
set frqadd.inputs 2
link freq.main frqmul.in1
link frqscale.main frqmul.in2
link frqmul.main frqadd.in1
link frqshift.main frqadd.in2
new toparam ctrl1                    // Set up filter control
new toparam ctrl2
new toparam ctrl3
new toparam ctrl4
new toparam ctrl5
new toparam ctrl6
new toparam ctrl7
new toparam ctrl8
set ctrl1.to filter
set ctrl2.to filter
set ctrl3.to filter
set ctrl4.to filter
set ctrl5.to filter
set ctrl6.to filter
set ctrl7.to filter
set ctrl8.to filter
set ctrl1.param pole1f
set ctrl2.param pole2f
set ctrl3.param pole3f
set ctrl4.param pole4f
set ctrl5.param zero1f
set ctrl6.param zero2f
set ctrl7.param zero3f
set ctrl8.param zero4f
new split ctrlsplit1                 // Set up filter control routing
set ctrlsplit1.outputs 4
new split ctrlsplit2
set ctrlsplit2.outputs 3
new split ctrlsplit3
set ctrlsplit3.outputs 2
new split ctrlsplit4
set ctrlsplit4.outputs 2
new mul ctrlmul
set ctrlmul.inputs 2
new constant ctrlband
set ctrlband.value 1.5
link ctrlband.main ctrlmul.in1
new invert ctrlinv1
new invert ctrlinv2
link frqadd.main ctrlsplit1.main     // Link filter routing
link ctrlsplit1.out1 ctrl1.main
link ctrlsplit1.out2 ctrl5.main
link ctrlsplit1.out3 ctrlinv1.main
link ctrlinv1.main ctrlsplit3.main
link ctrlsplit3.out1 ctrl2.main
link ctrlsplit3.out2 ctrl6.main
link ctrlsplit1.out4 ctrlmul.in2
link ctrlmul.main ctrlsplit2.main
link ctrlsplit2.out1 ctrl3.main
link ctrlsplit2.out2 ctrl7.main
link ctrlsplit2.out3 ctrlinv2.main
link ctrlinv2.main ctrlsplit4.main
link ctrlsplit4.out1 ctrl4.main
link ctrlsplit4.out2 ctrl8.main
new writewav writer                  // Set up sample output
set writer.file "Tutorial3/Output.wav"
link filter.main writer.main
set ..runtime 8                      // Run for 8 seconds
run
The system is laborious to use. The command line interface is long winded, even though all of the commands are necessary. A graphical user interface would make many operations possible with one or two mouse clicks rather than a line of text. A second advantage is that all linkages would be visible, so you would not have to remember what names had been given to each effect object or what links had already been made.
The scripting facility hinted at in the analysis section has not been implemented. However, there are two workarounds.
Firstly, the input and output handles for the SLab command can be redirected using the system Shell:
SLab <mycommands.txt >NIL:
The redirected input file should contain commands as they would be entered into the console window. The quit command is not strictly necessary at the end of the input file, as SLab quits when an EOF is read from the input.
Secondly, commands can be pasted into the console window using the standard system key (Right Amiga - V), copied from any source (for example a text editor). Multiple commands can be pasted simultaneously, however as all of the lines are entered at once the system cannot display the prompts for each command until the input stops, at which point all of the prompts are displayed on one line. This looks unaesthetic but has no effect on the correct working of the system. This method was used during testing.
The scripting facility need not be implemented within SLab; instead, there could be an ARexx port. ARexx is a simple interpreted language, and applications can create their own (named) ARexx port. Using the ARexx ADDRESS command, any command line not recognised by ARexx as an ARexx command is passed to this port.
A mechanism is in place at the effect level for a get command to read back the current values of parameters; however, this has not been implemented in the kernel or the command line interface.
Simple effects, such as rectify, can be implemented very easily. However, the command line interface parts of the effects (those that convert strings to id codes) show a large amount of repetition of simple code. To make implementing effect classes easier, the inputs, outputs and parameters could be stored in a table containing the name, id code and various properties (such as the offset of the link structure in the effect object for inputs, the type of parameters, and whether linking or setting this parameter requires recalculation). This table could be used by generic methods of the effect base class; any classes that require more complicated arrangements (for example zfilter or the multiple input or output classes) could keep the current arrangement.
A useful side effect of this table driven method is that it would be simple to list all of the inputs, outputs and parameters of a given effect class or object. With the addition of textual descriptions this could also become an online help system.
The quality of the output is very high, especially the filter effects. When compared to simple moving average filters (as found in OctaMED), the sound is much clearer. OctaMED’s filters tend to make the sound seem muffled. The lack of clipping ensured by floating point implementation makes it easier to combine effects; in OctaMED’s sample editor some effects (like echo) can lead to volume increase and clipping so the volume must be reduced first.
The echo effect (see testing, first echo) exhibits a slight loss of high frequencies in the echoes; this is due to the linear interpolation used in the delay effect when the delay time is not an integer number of samples. Natural echoes from soft or irregular surfaces tend to exhibit loss of high frequencies, so this property may even be useful.
The mathematically correct interpolation requires summing the function y = sin(at)/(at) for every sample (past and future); this function has a peak at the sample in question and is zero at all other sample points. However, an implementation of this interpolation would be slow, and it is not really necessary. Alternatively, a switch parameter added to the delay effect could ensure that the delay time is adjusted to be an integer number of samples.
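A sketch of this interpolation, with the sum truncated to a window of nearby samples rather than every past and future sample (illustrative Python; the `taps` window size is an assumption of this sketch):

```python
import math

def sinc_interpolate(x, t, taps=16):
    """Estimate the value of sample list x at fractional position t
    (in samples) by summing sin(pi*u)/(pi*u) kernels centred on the
    nearby samples."""
    total = 0.0
    centre = int(round(t))
    for n in range(max(0, centre - taps), min(len(x), centre + taps + 1)):
        u = t - n
        if u == 0:
            w = 1.0                     # the kernel's peak
        else:
            w = math.sin(math.pi * u) / (math.pi * u)
        total += x[n] * w
    return total
```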
The sound processing is very slow, but this is due to obsolete hardware being used (the CPU is a 7 MHz Motorola 68000, with 16 bit integer multiplication taking 70 clock cycles). Modern computers are easily a thousand times faster at floating point maths, so the system would be much more useable. On the current hardware, simple processing of one second of sound (at 22050 Hz sampling rate) takes about one minute, rising to over five minutes if parameters are continually changed that require recalculation (for example filter frequencies). Speed could be increased by assembly language optimisation of critical sections, but there are unlikely to be large gains as there are few loops in the code.
The command line interface is currently case sensitive. The functions that need to be changed are in the source code file string.e, and should be made to use the system standard utility.library functions. There is little checking on names, with the result that an effect object can end up with the name "" (or more dangerously “.”), confusing the user and the system.
The run command doesn’t check for the end of sounds. Currently it runs for a fixed number of samples (22050); longer waveforms can be processed by using the run command several times in succession. This can be fixed by having effect.issink() and effect.isdone() methods, with run finishing when all sinks are done. Alternatively (for use when the sound will never cease) the set command can be expanded to include global parameters, with “.” as the effect name for consistency with the list command. Global parameters could include rate (global sample rate, perhaps defaulting to the maximum sample rate in use), runtime (time to run for), or runsamples (number of samples to run for).
The effect linkage checking enters an infinite loop if effects are linked in a loop without a feedback effect; the recursion leads to a stack overflow which could crash the operating system. This could be fixed by having every effect check for loops, which would require the effect.check() method to be split into two methods: one defined by the effect base class to prevent loops, and one defined in each derived class to do the checking. The latter would be called by the loop prevention method. This fix would require that feedback effects are last in the loop, so that they can stop the checking (otherwise effects in the loop after the feedback effect would not be checked).
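The check described could be sketched like this (illustrative Python with hypothetical names; the real system is written in E and works on effect objects, not dictionaries):

```python
def check_links(effects):
    """effects maps each object name to its type and the names feeding
    its inputs; raise if a loop contains no feedback effect."""
    def visit(name, in_progress):
        if name in in_progress:
            raise ValueError("loop without a feedback effect at " + name)
        if effects[name]["type"] == "feedback":
            return                      # feedback objects break the loop
        for source in effects[name]["inputs"]:
            visit(source, in_progress | {name})
    for name in effects:
        visit(name, set())
```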
A simpler fix would be to check free stack space, if this is very low it is due to either extremely long chains of effects or the recursive loop described above. However, this method would give only a vague error message, that there was a loop somewhere in the effects linkage.
File handling leaves much to be desired. Error reporting is poor, no errors are reported if files cannot be opened, and read / write errors currently exit the system ungracefully (as does running out of memory). The files are opened too early (when the name is set) and closed too late (when the name is set to something else or the effect is deleted (including exiting the system)).
The write effects do not check whether the file exists, and as a side effect the reset command causes the file that has been written to be erased. A workaround is to use a command like “set mywriter file NIL:” before using the reset command.
The Fourier transform is derived from the Fourier series, a method of representing periodic functions by infinite series of sine and cosine functions. The series is extended to aperiodic functions by having a continuous (rather than discrete) frequency spectrum, expressed more concisely using complex exponentials:

F(\omega) = \int_{-\infty}^{\infty} f(t) \, e^{-j\omega t} \, dt
The Laplace transform is an extension of the Fourier transform, which is valid for more functions. The Laplace transform uses a complex frequency variable s = \sigma + j\omega:

F(s) = \int_{0}^{\infty} f(t) \, e^{-st} \, dt
The Fourier and Laplace transforms apply to continuous functions, but sampled data signals are made up of many discrete points. A signal is represented by a sum of impulse functions, separated by the sampling time interval T. The Laplace transform of such a sum is easily found, and by setting z = e^{sT}, the z transform can be derived:

F(z) = \sum_{n=0}^{\infty} f(nT) \, z^{-n}
A linear system can be represented by a transfer function, because the transform of the output is the transform of the input multiplied by the transfer function. The response of a system to a unit impulse is its impulse response, the transform of which is the transfer function.
Multiplication of transforms is the same as convolution in the time domain. Convolution for sampled data means that each sample is replaced by a scaled copy of the impulse response, and the output is the sum of all of them. For continuous functions this is expressed as an integral:

y(t) = \int_{-\infty}^{\infty} x(\tau) \, h(t - \tau) \, d\tau
Many systems can be represented as a set of poles and zeros in the s or z plane, specifying the transfer function, from which the frequency and phase response characteristics of the system can be determined. Conversely, a filter can be designed by placing poles and zeros to create a desired characteristic.
The z domain transfer function can be used to generate a simple recursion formula for processing sampled input: the formula gives the current output in terms of the current and previous inputs and the previous outputs. The entire signal can be shifted in time to minimize delay (if so desired) or to make the filter realisable (so that effect comes later in time than cause); this is equivalent to adding extra poles or zeros at the origin of the z-plane, which does not affect the frequency response.
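As an illustration of turning a pole-zero placement into a recursion formula (an assumed implementation, not the project's zfilter or notch code): placing zeros on the unit circle at angle θ = 2πf/rate removes frequency f, and poles just inside at radius r narrow the notch, giving H(z) = (1 - 2cosθ z⁻¹ + z⁻²) / (1 - 2r·cosθ z⁻¹ + r² z⁻²). Cross-multiplying and taking inverse transforms gives the recursion applied below:

```cpp
// Notch filter recursion derived from a z-plane pole/zero pair.
// Zeros at e^(+-j*theta) on the unit circle remove that frequency;
// poles at radius r (assumed 0.95 by default) narrow the notch.
// Sketch only -- not the project's actual filter code.
#include <cmath>
#include <cstddef>
#include <vector>

std::vector<double> notch(const std::vector<double>& x,
                          double frequency, double rate, double r = 0.95)
{
    const double pi = 3.14159265358979323846;
    double c = 2.0 * std::cos(2.0 * pi * frequency / rate);
    std::vector<double> y(x.size(), 0.0);
    for (std::size_t n = 0; n < x.size(); ++n) {
        double x1 = n >= 1 ? x[n - 1] : 0.0, x2 = n >= 2 ? x[n - 2] : 0.0;
        double y1 = n >= 1 ? y[n - 1] : 0.0, y2 = n >= 2 ? y[n - 2] : 0.0;
        // current output from current/previous inputs and previous outputs
        y[n] = x[n] - c * x1 + x2 + r * c * y1 - r * r * y2;
    }
    return y;
}
```

Fed a sine wave at exactly the notch frequency, the output decays towards zero once the transient has died away, which is the steady-state behaviour described in the notch filter test.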
The IFF (Interchange File Format) standard defines a generic file structure, built around chunks. Different types of file (sound, graphics etc) define new chunks.
ULONG  chunk id (usually a character string)
ULONG  data length
...    data
Every IFF file is a FORM chunk, containing other chunks:
ULONG  "FORM"
ULONG  length
ULONG  type
...    chunk list
For 8SVX sound files, the FORM type field is “8SVX”, and then a VHDR and a BODY chunk are required (in this order). All IFF files may contain other chunks, but these can be skipped using the length field.
ULONG  "VHDR"
ULONG  length
ULONG  samples in high octave 1-shot part
ULONG  sample start offset of high octave repeat part
ULONG  samples per cycle in high octave repeat (0 = no repeat)
UWORD  samples per second
UBYTE  number of octaves
UBYTE  compression: 0 = none, 1 = Fibonacci-delta encoding
ULONG  volume (65536 maps to 1.0)
ULONG  "BODY"
ULONG  length
SBYTEs sample data
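A reader can walk this chunk structure generically. The following sketch (hypothetical helpers, not the project's read8svx) shows the two details that matter in practice: ULONGs are stored most significant byte first, and chunks of odd length are followed by a pad byte:

```cpp
// Minimal IFF chunk walker (sketch, not the project's reader).
#include <cstdio>
#include <cstring>

unsigned long read_ulong(FILE* f)           // big-endian 32-bit word
{
    unsigned char b[4];
    fread(b, 1, 4, f);
    return ((unsigned long)b[0] << 24) | ((unsigned long)b[1] << 16)
         | ((unsigned long)b[2] << 8)  |  (unsigned long)b[3];
}

// Returns 1 with the file positioned at the chunk's data if a chunk
// with the given id is found inside the FORM, 0 otherwise.  Unknown
// chunks are skipped using their length field (plus pad byte if odd).
int find_chunk(FILE* f, const char* id, unsigned long* length)
{
    char name[5] = "";
    fread(name, 1, 4, f);                   // outer chunk id: "FORM"
    if (strncmp(name, "FORM", 4) != 0) return 0;
    read_ulong(f);                          // FORM length (ignored here)
    fread(name, 1, 4, f);                   // FORM type, e.g. "8SVX"
    while (fread(name, 1, 4, f) == 4) {
        *length = read_ulong(f);
        if (strncmp(name, id, 4) == 0) return 1;
        fseek(f, (long)(*length + (*length & 1)), SEEK_CUR);
    }
    return 0;
}
```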
The RIFF (Resource Interchange File Format) standard is Microsoft’s own duplication of the IFF file structure. The main difference is that the data words are stored lsb first.
Every RIFF file is a RIFF chunk, containing other chunks:
ULONG  "RIFF"
ULONG  length
ULONG  type
...    chunk list
For WAVE sound files, the RIFF type field is “WAVE”, and then a fmt and a data chunk are required (in this order). All RIFF files may contain other chunks, but these can be skipped using the length field.
ULONG  "fmt "
ULONG  length
UWORD  encoding (1 = PCM)
UWORD  number of channels
ULONG  sampling rate
ULONG  bandwidth (= rate * channels * [bits/8])
UWORD  block align (= channels * [bits/8])
// encoding specific data, here encoding = 1
UWORD  bits per sample
ULONG  "data"
ULONG  length
// data format for encoding = 1
// bits = 1 to 8:  UBYTE (least significant bits 0)
// bits > 8:       signed integer of least number of bytes required
//                 (for example 3 bytes for 20 bit) (least significant bits 0)
...    sample data, channels interleaved
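The layout of the two required chunks can be made concrete with a small writer. This is an assumed helper for the common 16-bit PCM case (writewave is not yet implemented in the system), mainly to show the little-endian word order that distinguishes RIFF from IFF:

```cpp
// Minimal 16-bit PCM WAVE header writer (sketch, not the project's
// writewave).  RIFF words are stored least significant byte first.
#include <cstdio>

void write_ulong_le(FILE* f, unsigned long v)
{
    unsigned char b[4] = { (unsigned char)(v),       (unsigned char)(v >> 8),
                           (unsigned char)(v >> 16), (unsigned char)(v >> 24) };
    fwrite(b, 1, 4, f);
}

void write_uword_le(FILE* f, unsigned v)
{
    unsigned char b[2] = { (unsigned char)(v), (unsigned char)(v >> 8) };
    fwrite(b, 1, 2, f);
}

void write_wave_header(FILE* f, unsigned long rate,
                       unsigned channels, unsigned long samples)
{
    unsigned long data_len = samples * channels * 2;  // 16 bit = 2 bytes
    fwrite("RIFF", 1, 4, f);
    write_ulong_le(f, 36 + data_len);       // bytes remaining after this word
    fwrite("WAVE", 1, 4, f);
    fwrite("fmt ", 1, 4, f);
    write_ulong_le(f, 16);                  // fmt chunk length for PCM
    write_uword_le(f, 1);                   // encoding: 1 = PCM
    write_uword_le(f, channels);
    write_ulong_le(f, rate);
    write_ulong_le(f, rate * channels * 2); // bandwidth (bytes per second)
    write_uword_le(f, channels * 2);        // block align
    write_uword_le(f, 16);                  // bits per sample
    fwrite("data", 1, 4, f);
    write_ulong_le(f, data_len);            // sample data follows
}
```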
RIFF WAVE files are quite complicated, and can include cue points (with text labels and comments), silent sections, data compression, play lists, and even embedded files (images to be displayed at cue points, for example).
ULONG  "SLab"
ULONG  length
ULONG  "Info"
ULONG  length = 12
FLOAT  rate (def = 44100.0)
FLOAT  bias (def = 0.0)   \ for normalisation
FLOAT  ampl (def = 1.0)   / when reading in
ULONG  "Data"
ULONG  length = number of samples
FLOATs sample data
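A writer for this layout is short. The sketch below is an assumed helper, not the project's writeslab: it writes words and floats in the platform's native byte order (the original big-endian machine is assumed), and it assumes the outer length counts the bytes that follow it, by analogy with IFF. Note the quirk that the Data chunk's length is a sample count, not a byte count:

```cpp
// SLab file writer sketch (hypothetical, not the project's writeslab).
#include <cstdint>
#include <cstdio>

void write_slab(FILE* f, const float* data, uint32_t samples, float rate)
{
    float bias = 0.0f, ampl = 1.0f;            // defaults from the layout
    uint32_t info_len = 12;
    uint32_t length = 8 + 12 + 8 + samples * (uint32_t)sizeof(float);
    fwrite("SLab", 1, 4, f);
    fwrite(&length, 4, 1, f);                  // bytes that follow (assumed)
    fwrite("Info", 1, 4, f);
    fwrite(&info_len, 4, 1, f);
    fwrite(&rate, 4, 1, f);
    fwrite(&bias, 4, 1, f);
    fwrite(&ampl, 4, 1, f);
    fwrite("Data", 1, 4, f);
    fwrite(&samples, 4, 1, f);                 // a sample count, not bytes
    fwrite(data, sizeof(float), samples, f);
}
```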
() = base classes not for direct use
 -  = not yet implemented

(effect)        base class
  (container)   container base class
    echo            input + output "main", params "decay" + "delay"
    - pitchshift
    notch           input + output "main", params "depth" + "frequency"
    - reverse
    - widen         parameter "width": -1 = swap, 0 = mono, 1 = same
  (in0out1)     one output "main"
    constant        output = param "value"
    (osc)           oscillator, params "rate", "frequency", "amplitude", "phase"
      pulse             pulse, param "width"
      ramp              ramp (up; down <=> negative amplitude)
      sine              sine wave
      triangle          triangular wave
    (read)          read from param "file", with sample rate "rate" (def = rate stored in file)
      read8svx          file type = IFF 8SVX
      readslab          file type = SLab, param "normalize" = "on", "off"
      - readwave        file type = RIFF WAVE (16 bit)
    whitenoise      white noise
  (in1out0)     one input "main"
    print           write to screen
    toparam         converts data to parameters, param "to"
    (objpart)
      (write)         write to param "file"
        write8svx         file type = IFF 8SVX
        writeslab         file type = SLab
        - writewave       file type = RIFF WAVE (16 bit)
  (in1out1)     one input "main", one output "main"
    amp             amplify by param "gain"
    copy            output = input
    delay           output(t) = input(t - param "delay")
    fbdelay         delay fixed for feedback loops
    - envfollow     envelope follower, parameter "time"
    (- envchain)    optional input "sidechain", parameter "threshold"
      - compand         compressor / expander
      - gate            gate
      - limit           limiter (ducker when used with sidechain ?)
    feedback        allow feedback loops
    (filter)        simple filter, param "frequency"
      (bandfilter)      param "bandwidth"
        bandpass            band pass filter
        bandreject          band reject filter
      highpass          high pass filter
      lowpass           low pass filter
    halfrectify     output = if input > 0 then input else 0
    invert          output = - input
    rectify         output = abs(input)
    - vox           output = input, after input > param "threshold"
    zfilter         params "poles", "zeros", "poleXr", "poleXf", "zeroXr", "zeroXf"
  (in1outm)     one input "main", param "outputs" "out1" etc
    split           outputX = input
    - multidelay    outputX = input(t - delayX)
  (inmout1)     param "inputs" "in1" etc, one output "main"
    add             output = in1 + in2 + ...
    mul             output = in1 * in2 * ...
Most of the effects were tested as follows. The test input structure was the same for all, with the effect specific commands inserted at “…”. For each effect, the audio cassette contains the original and the modified sound, twice each.
new read8svx r
set r file Test/Beat.8svx
set r rate 22050
new write8svx w
set w file Test/Beat.<effect>.8svx
new <effect> f
...
link r main f main
link f main w main
run
quit

Effect specific commands:

amp:
  set f gain 0.5
bandpass:
  set f frequency 1000
  set f bandwidth 500
echo 1:
  set f delay 0.39
  set f decay 0.5
echo 2:
  set f delay 0.01
  set f decay 0.9
halfrectify:
  (none)
highpass:
  set f frequency 1000
lowpass:
  set f frequency 300
rectify:
  (none)
Oscillators were tested differently. The audio cassette contains short sections of output from each of the oscillators with default parameters.
new <effect> o
set o rate 22050
...
new write8svx w
set w file Test/Osc.<effect>.8svx
link o main w main
run
quit

Oscillators tested: sine, ramp, pulse, pulse (with "set o width 0.2"), triangle, whitenoise.
The notch filter was tested as follows. A short section of the output is shown: the fundamental frequency has been removed from the otherwise square wave. Other features of the output are phase changes (leading to asymmetry of the waveform) and the time taken to reach the steady state response (the start differs slightly from the remainder of the wave). The diagram is a screenshot from OctaMED’s sample editor, showing 256 samples.
new pulse r
set r frequency 1000
set r amplitude 0.5
new write8svx w
set w file Test/Notch.8svx
new notch f
set f depth 0.95
set f frequency 1000
link r main f main
link f main w main
run
quit
The matrix and vector classes are taken from LEDA with slight modifications.
"In the fall of 1988, we started a project (called LEDA for Library of Efficient Datatypes and Algorithms) to build a small, but growing library of data types and algorithms in a form which allows them to be used by non-experts. We hope that the system will narrow the gap between algorithms research, teaching, and implementation.
LEDA is available by anonymous ftp from:
ftp.cs.uni-sb.de (22.214.171.124) /pub/LEDA
ftp.maths.warwick.ac.uk (126.96.36.199) /pub/sources/c++
The distribution contains all sources, installation instructions, a technical report, and the LEDA user manual. LEDA is not in the public domain, but can be used freely for research and teaching. A commercial license is available from the author."
The developer CD contains documentation on operating system functions and the IFF standards, among other things.
There are several useful sites on the Internet concerned with sound processing. A description of various effects processors can be found at:
http://www.eden.com/~keen/effxfaq/fxtaxon.htm
? http://www.hut.fi/Misc/Electronics/dsp.html
by P. A. Lynn, (c) 1973-89. This book describes how to use the Laplace and z transforms to process signals, including using z-plane filters and recursion formulae to filter sampled sound.