IMPLEMENTATION OF SYNCHRONOUS SAMPLE RATE CONVERTER USING MODULAR AUDIO PROCESSING SYSTEM

Piotr Nykiel, Bartosz Bielawski
Warsaw University of Technology, Institute of Radioelectronics
ul. Nowowiejska 15/19, PL-00-665 Warsaw, POLAND
P.Nykiel@ire.pw.edu.pl, B.Bielawski@stud.elka.pw.edu.pl

Abstract: The paper describes a simple yet powerful audio processing library for personal computers and demonstrates how it can be used to create a synchronous sample rate converter. In the first part the Modular Audio Processing System (MAPS) is presented. Its two distinguishing features are integer-only calculations, which assure high processing quality, and a modular design, which enables rapid system development. The second part of the paper presents possible uses of MAPS and a sample implementation of a synchronous sample rate converter utilizing the library and the Qt Toolkit for GUI creation. The software details as well as the general system architecture are presented.

1 Introduction

Operations performed while mastering audio material may be divided into two categories: interactive and batch actions. Interactive operations must be supervised by an operator who monitors and adjusts parameters in realtime. Perceiving sound is subjective, so no computer can handle this human task. On the other hand, there are operations that do not require immediate attention; these are called batch jobs. In most cases an operator chooses what should be done, sets all parameters and starts processing. Everything else is done automatically; the operator only has to watch for errors and wait for the task to end.

Modular Audio Processing System (MAPS for short) was designed as a robust library for batch processing. Diagrams and graphs are a very convenient way of showing principles of operation, as they decompose complicated systems into simple blocks. The same can be done to almost any audio processing system.
The design goal of MAPS was to provide those basic blocks and to enable the user to create complex systems simply by linking them.

MAPS is capable of running on almost any modern PC with the Microsoft Windows XP/Vista or Linux operating system installed. The library is written in plain C++, which makes it portable and robust. There were two main reasons why C++ was chosen: it is compiled to native machine code and it is object-oriented. The first feature ensures high performance [1]. Not so long ago only dedicated digital signal processors were capable of filtering a signal in realtime. Nowadays Intel x86 CPUs with integrated x87 FPUs are cheap and powerful, and despite being general-purpose CISC processors they can handle DSP as well. The object-oriented approach allows programmers to use general interfaces and encapsulation. The library provides a set of blocks that share a common interface and hide the details of their operation behind it. The user is presented with only a few methods that set parameters, allow block interconnection and do the processing.

MAPS is unusual in that it uses fixed-point arithmetic for signal processing. Floating-point numbers, which are commonly used in other audio programs, have some serious drawbacks. They are described in the next section.

2 Fixed-point vs. floating-point

Integers in a computer's memory are usually stored in two's complement code, which is a slight modification of natural binary code. An n-bit number stored in two's complement code has a value of:

$$x = -d_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} d_i \cdot 2^i$$

where $d_i$ is the i-th digit of the number; the right-most digit has an index of 0. Fixed-point numbers are stored the same way. The only difference is the placement of the radix point, which is not behind the last digit but somewhere inside the word. Let's consider an n-bit number with an m-bit fractional part.
Its value is as follows:

$$x = \frac{-d_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} d_i \cdot 2^i}{2^m}$$

Floating-point numbers are stored using an exponential formula:

$$x = (-1)^S \cdot F \cdot 2^{E-B} \qquad (1)$$

where S is the sign of the number, F is the fractional part (the mantissa, written M in the formulas below), E is the exponent and B is a constant that enables negative exponents. S, F and E are stored as nonnegative integers. The lengths and positions of these fields in the IEEE 754 single-precision format are shown in table 1 [2].

Field      Length (bits)   Position (most significant bit, counted from 1)
Sign       1               32
Exponent   8               31
Mantissa   23              23

Table 1: Structure of IEEE754 single precision floating-point number

To decide which numbers are better suited for audio processing, further analysis has to be conducted. The two most common operations used in digital signal processing are addition and multiplication, which are often combined into the Multiply And Accumulate (MAC) instruction of digital signal processors.

The case of fixed-point numbers is simple. Addition is exact as long as the operation does not cause overflow. It is common to use a result word longer than the operand words, thus preventing overflow. The result of multiplying two n-bit numbers is 2n bits long and has to be stored in a word of this length. To restore the initial length the result has to be shortened; in most cases rounding is used:

$$\tilde{x} = \left\lfloor x + \tfrac{1}{2}\,\mathrm{LSB} \right\rfloor$$

It is possible to minimize the rounding error while calculating a convolution. The solution is similar to the one used in digital signal processors: a temporary variable is used that has at least twice as many bits as the input operands. This way rounding happens only once per output sample.

Operations on floating-point numbers are more complicated. Addition can be explained using the symbols from equation 1 (assume x > y; signs and the constant B are omitted for brevity):

$$x + y = (M_1 + M_2 \cdot 2^{E_2 - E_1}) \cdot 2^{E_1}$$

The fractional parts may be added ($M_1 + M_2$) only after being brought to the same exponent, the bigger of the two ($E_1$). This is equivalent to shifting the smaller number's mantissa right and may lead to a loss of precision whenever the exponents do not match.

Multiplication of two floating-point numbers can be described by the following equation:

$$x \cdot y = S_1 \cdot S_2 \cdot M_1 \cdot M_2 \cdot 2^{E_1 + E_2}$$

where $S_1$ and $S_2$ here denote the signs (+1 or -1) of the operands. Please note that $M_1 \cdot M_2$ is a fixed-point multiplication, so rounding is involved. It should also be mentioned that the exponential format used to store floating-point numbers makes the quantization step non-uniform. This leads to the quantization error being correlated with the input signal and to modulation-type interference.

Floating-point numbers seem to be the best choice for scientific calculations, where a wide range of magnitudes is used. Audio processing is not such a case: the dynamic range of the signal is well known. Fixed-point arithmetic allows designers to control signal processing with the accuracy of a single bit, and that is why it was chosen for MAPS.

3 MAPS architecture

The MAPS library was designed to be modular, yet easy to use. All entities used in signal processing were divided into two categories: buffers and filters.

Buffers are blocks used for storing data, which can be an audio signal as well as a filter's coefficients. Buffers are partially aware of their contents: besides the samples, they carry the sample rate, the number of channels and the bit depth. Internally, data is stored as 32-bit integers, but all system blocks support using the most significant bits (up to 8) as the integer part. Buffers automatically manage the memory for their contents and can load and save data from/to text files.

Filter is not really a block, but rather an interface for specific ones. The general interface depicted in figure 1 allows software to use the same functions to perform common tasks on all objects derived from filter.

Figure 1: Common filter interface

It was assumed that each block creates and controls only its output buffers, while input data is acquired using pointers to other buffers. This enables a buffer to be a source of signal for any number of other blocks. The most important functions are:

set_{input|coeffs|dither}_buffer(): sets the corresponding buffer and does the error checking. The input version configures the block to work with the given buffer.
Some blocks do not implement all of these functions (e.g. delayline does not support setting a coeffs buffer).

get_{output|random}_buffer(): returns a pointer to the corresponding buffer. If the parameter is invalid or the block does not support this feature, NULL is returned. Not all of these functions are implemented in all blocks.

{set|get}_out_bits(): sets or gets the number of significant bits in the output buffer(s).

flush(): flushes all delay lines and buffers in the block and sets internal parameters to their default values.

process(): processes data; the kind of processing depends heavily on the type of the block.

The MAPS library currently supports 9 types of blocks; their names and functions are presented in figure 2.

Figure 2: Division of objects by their function

Below, a short description of all blocks is presented.

FIR filter (fir_filter) is a block that convolves the input signal x[n] with the coefficients h[n]. This block has one input buffer and one output buffer, supports signal dithering and generates pseudorandom data for the dither block.

PolyFIR filter (polyfir_filter) extends fir_filter by enabling polyphase filtering. Coefficients for all subfilters are provided by the coefficient buffer. The user must also set the L (interpolation) and M (decimation) factors using set_filter_ratio(). Please note that polyfir_filter may upsample as well as downsample the signal.

Delayline (delayline) delays input signals by N samples. N must be set using set_delay(). Delayline supports dithering, but cannot be a source of random data for the dither generator.

Multiplexer (mux) interleaves two or more buffers into one buffer. It is commonly used just before the output file block to prepare data for being written to a file. The number of inputs is set in the constructor (mux()).

Demultiplexer (demux) separates one buffer into two or more buffers. It is commonly used to separate the channels of the input file buffer. The number of outputs is set in the constructor (demux()).

Adder (adder) mixes down at least two buffers into one buffer. Before addition the samples are weighted; channel weights are set using set_input_amp(). Adder supports dithering and can be a source of random data for the dither generator.

Input file (infile) is used to read data from audio files into infile's output buffer. This block uses no input buffer. There are three important infile functions: open(), close() and eof(). This block wraps functions provided by the libsndfile library [3].

Output file (outfile) is used to write data from a buffer into an audio file. The file format may be specified. There are three important functions: open(), close() and set_format(). outfile autodetects the output sample rate and bit depth. This block wraps functions provided by the libsndfile library [3].

Dither generator (dither) is a block designed to provide a dither signal for other blocks. Its presence is optional.

Those nine blocks can be combined to create many different signal processing lines. A sample configuration of a stereo-to-mono converter is presented in figure 3.

Figure 3: Sample stereo-to-mono converter

Providing only blocks for building a system configured at compile time is not a flexible solution. Building a processing line from those components is easier, but still requires some programming skills. That is why MAPS provides a class for managing buffers and filters. The class filter_chain can be described as an intelligent container with dynamically created content. Besides storing objects, it is capable of loading a signal processing line from a text configuration file. The configuration file syntax is a subset of the syntax supported by the libconfig library [4]. Each configuration file consists of several sections. One of them is the "config" section, which describes parameters of the whole processing line. The other sections describe blocks or buffers, one section each. The properties of a specific section depend on its type; only two things are common: the name and the type specifier.
Upon loading such a file, the filter_chain object parses the input, checks for errors (e.g. missing buffers) and creates the dynamic structure described in the file. The class also provides methods for easy access to the default input and output files, and wraps the calls to the blocks' process() functions in its own process() method. A simple program using MAPS is presented in listing 1. Please note that error checking has been omitted.

Listing 1: Sample program using MAPS library

    #include <cstdio>
    #include <iostream>
    #include "filter_chain.h"
    using namespace std;

    filter_chain f;

    int main(int argc, char* argv[])
    {
        // check number of arguments
        if (argc != 4) {
            printf("\nUsage: maps filter.fltr infile.wav outfile.wav\n");
            return 0;
        }

        // load processing line
        f.load_filter(argv[1]);

        // get pointers, open files, reconnect
        infile* inf = f.get_default_infile();
        outfile* outf = f.get_default_outfile();
        inf->open(argv[2]);
        f.reconnect();
        outf->open(argv[3]);

        // check if connected ok
        if (!f.check_all())
            return 1;

        // process data while not end of file
        do
            f.process();
        while (!inf->eof());

        printf("Done :)\n");
        return 0;
    }

The configuration file implementing the processing line shown in figure 3 (a stereo-to-mono converter) is given in listing 2.

Listing 2: Configuration file implementing stereo-to-mono converter

    config : {  // config for whole processing line
      description = "stereo to mono converter";
      default_in = "in";
      default_out = "out";
      in_channels = 2;
    };

    in : {
      type = "infile";
    };

    dmx : {
      type = "demux";
      input = "in";
      outputs = 2;
    };

    adder : {
      type = "adder";
      inputs = ["dmx:0", "dmx:1"];
      amps = [0x3FFFFFFF, 0x3FFFFFFF];  // 1/2
      bits = 16;
    };

    out : {
      type = "outfile";
      input = "adder";
    };

4 Implementation of SSRC

The examples shown above are only a small part of MAPS's capabilities, as it was created mainly for synchronous sample rate conversion. This process can be done in several ways. The simplest case is a single-step decimator or interpolator, which can be used only if the conversion ratio is an integer. If the sample rate ratio is rational, a cascade of an interpolator and a decimator can be used. This method is not optimal, as the second stage must process audio data at a rate increased L times. The best solution is to use the polyfir_filter class, which can handle any rational (including integer) ratio using the polyphase filter approach.

Polyphase filtering is the result of optimizing the standard interpolator-decimator cascade [5]. Combining the two stages places two low-pass filters between the up-sampler and the down-sampler; one of them is redundant. Moreover, up-sampling leads to processing many zero-valued samples, while down-sampling discards some of the samples that have just been calculated. With a little effort, a structure that omits the multiplications by zero and skips the unneeded output samples can be developed. This kind of structure is implemented in MAPS's polyfir_filter. A simple case of a polyphase interpolator is presented in figure 4.

Figure 4: Simple polyphase interpolator

MAPS can be used to implement another commonly used technique: cascading filters in order to decrease computational complexity. This is achieved by relaxing the transition band requirements for a single stage, thus reducing the total filter length. MAPS may therefore be used in such cases, but the configuration file must be modified accordingly. A sample configuration file describing a 44.1 kHz to 96 kHz converter using two cascaded polyphase filters is presented in listing 3.

Listing 3: Two-stage 44.1 kHz to 96 kHz stereo sample rate converter

    # written by: B. Bielawski, P. Nykiel
    config : {
      description = "44k to 96k stereo filter!";
      in_samplerate = 44100; in_channels = 2;
      default_in = "in"; default_out = "out";
    };

    in : { type = "infile"; };

    dmx : { type = "demux"; input = "in"; outputs = 2; };

    x2_b : {  // coeffs for stage 1
      type = "buffer"; filename = "coeffs/44-88.dat";
    };

    pf_1l : {  // fs x 2, input demux:0
      type = "polyfir"; input = "dmx:0"; coeffs = "x2_b";
      filters = 2; increment = 1;
    };

    pf_1r : {  // fs x 2, input demux:1
      type = "polyfir"; input = "dmx:1"; coeffs = "x2_b";
      filters = 2; increment = 1;
    };

    pf_b : {  // coeffs for stage 2
      type = "buffer"; filename = "coeffs/88-96.dat";
    };

    pf_2l : {  // fs x 160 / 147
      type = "polyfir"; input = "pf_1l"; coeffs = "pf_b";
      filters = 160; increment = 147;
    };

    pf_2r : {  // fs x 160 / 147
      type = "polyfir"; input = "pf_1r"; coeffs = "pf_b";
      filters = 160; increment = 147;
    };

    mx : { type = "mux"; inputs = ["pf_2l", "pf_2r"]; };

    out : { type = "outfile"; input = "mx"; };

    rt : {  // coeffs for dither gen.
      type = "buffer"; filename = "coeffs/rottab3.txt";
    };

    dl : {  // dither generator left
      type = "dither"; input = "pf_l";
      outputs = ["dl_l", "x2_l", "pf_l"];
      coeffs = "rt";
    };

    dr : {  // dither generator right
      type = "dither"; input = "pf_r";
      outputs = ["dl_r", "x2_r", "pf_r"];
      coeffs = "rt";
    };

5 iSRC – MAPS frontend

The MAPS library has been created as a flexible engine. Its modular design enabled the authors to design a simple graphical user interface without dealing with the internals of the library. The GUI uses MAPS, but as long as the API is stable, the two can be developed separately [6].

iSRC has been written in C++ and uses the Qt Toolkit [7]. Qt is a framework delivering a cross-platform abstraction layer for many common programming tasks such as GUI creation, configuration storage and thread management.

The main aim whilst developing iSRC was to create an easy interface hiding the multiple options offered by MAPS. The GUI is as simple as possible; it was designed to be clear and easy to understand even for a beginner. The end user is presented with only a limited set of choices regarding frequencies, bit depth and output filename. The application has only one window, which consists of five regions. They are marked in figure 5.

Figure 5: iSRC – frontend to the MAPS library

The functions of the regions are as follows:

Input file selection is used to browse and select files to be converted. A context menu enables the user to add all files in the current directory or to go directly to another directory using the standard system dialog.

Selected files panel lists the files selected by the user, with the full path included.

Conversion options is the most complicated part. The "Conversion filter" drop-down list is used to select the input and output frequency and the number of channels. The "Output bits" spinbox lets the user choose the bit depth of the resulting file. The "-6dB" checkbox enables and disables the overflow countermeasure, which attenuates the input signal. If "overwrite files" is not selected, the program will not overwrite already existing files. "Output dir" is the place where converted files are stored. "Name format" is the field from which the output filename is generated. Tokens are supported (e.g. "%F" for the output frequency and "%n" for the file number). The slider at the bottom selects the program's priority, trading conversion speed against system responsiveness.

Control buttons start, stop or pause the conversion. Please note that "stop" aborts the current process.
Progress and other information region is used to inform the user about the state of the conversion. All types of messages appear here.

The usage is simple and intuitive. The user first selects the files to be converted, then chooses the conversion settings. Pressing "start" begins the conversion, which can be stopped temporarily by pressing "pause" or aborted by pressing "stop".

6 Conclusion

The MAPS library emerged from the need for a precise and fast synchronous sample rate converter. The flexibility of the design allowed the authors to use it in other DSP tasks. Custom processing lines can be created using simple configuration files.

Further development of both MAPS and iSRC is planned for the near future. The schedule includes adding plugin support, more blocks (e.g. compander, noise gate, tone generator) and a new GUI. Creating a new GUI for MAPS is an additional task and will not be the top priority, although some design assumptions have been made. The planned features are:

• processing line configuration files (*.fltr) support,
• interactive creation of processing lines using graphical representation of blocks and links,
• realtime alteration of block parameters,
• support for various signal sources.

It is hoped that MAPS and iSRC will soon become a well-known brand and will be used by thousands of audio engineers worldwide.

References

[1] B. Stroustrup, The C++ Programming Language, 3rd edition, Addison-Wesley, 2000.
[2] IEEE, IEEE Standard for Floating-Point Arithmetic (IEEE 754), 2008.
[3] E. de Castro Lopo, libsndfile Manual, Homepage: http://www.mega-nerd.com/libsndfile/
[4] M. A. Lindner, libconfig Manual, Homepage: http://www.hyperrealm.com/libconfig/
[5] R. E. Crochiere and L. R. Rabiner, Interpolation and Decimation of Digital Signals - A Tutorial Review, Proceedings of the IEEE, vol. 69, 1981.
[6] B. Bielawski, Synchronous Sample Rate Converter, M.Sc. thesis, not yet published.
[7] Nokia, Qt 4.5 Manual, Homepage: http://doc.qtsoftware.com/4.5

This document has been typeset with LaTeX.