2011 IEEE Workshop on Applications of Signal Processing to Audio... October 16-19, 2011, New Paltz, NY

2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
October 16-19, 2011, New Paltz, NY
ARBITRARY SAMPLE RATE CONVERSION WITH RESAMPLING FILTERS OPTIMIZED
FOR COMBINATION WITH OVERSAMPLING
Andreas Franck
Fraunhofer IDMT
Ehrenbergstraße 31
98693 Ilmenau, Germany
fnk@idmt.fraunhofer.de
ABSTRACT
Arbitrary sample rate conversion (ASRC) is used in many applications of DSP. ASRC algorithms based on integer-ratio oversampling and continuous-time resampling filters enable good resampling quality for wideband signals.
A previous publication introduced an overall optimization
scheme for structures based on oversampling and Lagrange interpolation to design the oversampling component such that the design
error of the overall frequency response is minimized with respect to
a selectable error norm. However, the achievable quality strongly
depends on the continuous-time resampling filters. Lagrange interpolators show severe deficiencies when used in this role.
The present paper proposes a design objective for continuous-time resampling filters that are specifically adapted for use with
oversampling and the overall optimization scheme. ASRC systems
utilizing these so-called optimized image band attenuation (OIB)
resampling functions achieve significant quality improvements over
existing approaches. Performance analyses show that this class of
filters enables implementations with reduced complexity for a wide
range of design specifications.
Index Terms— Sample rate conversion, Delay filters, Acoustic
signal processing
1. INTRODUCTION
Arbitrary sample rate conversion (ASRC) is useful in many applications of audio signal processing, for instance for converting audio
signals between different standard sample rates [1], digital audio
effects [2], or sound reproduction systems such as wave field synthesis [3].
ASRC can be considered as a generalization of conventional,
rational-ratio sample rate conversion (SRC) techniques (e.g. [4])
that exhibits several advantages. First, it supports arbitrary conversion ratios R = Ti/To, where Ti and To denote the input and output sampling periods. Moreover, ASRC algorithms allow multiple
conversion ratios without requiring separate filter designs, and they
enable continuously time-varying ratios. For this reason, ASRC
techniques can be used to interface asynchronous systems with different sampling clocks [1].
This work has been supported by the DIOMEDES project, funded under
the European Commission ICT 7th Framework Programme.
978-1-4577-0693-6/11/$26.00 © 2011 IEEE
[Figure 1, block diagram: x[n] → ↑L → Hdig(e^{jω}) (integer-ratio SRC) → Hint(jω) (continuous-time resampling filter) → y[m]]
Figure 1: Signal flow of an ASRC structure based on integer oversampling and a continuous-time resampling filter
A convenient way to describe ASRC algorithms is the hybrid
analog/digital model [4, 5]. It models the conversion process by a
discrete-to-continuous conversion (D/C) followed by resampling at
the output sampling period To. In this way, the ASRC algorithm
is completely determined by the frequency response of the continuous-time anti-imaging filter of the C/D process, denoted Hc(jω).
The frequency response of the ideal anti-imaging filter is given by
$$\hat{H}_c(j\omega) = \begin{cases} T_i, & |\omega| \in X_p \\ 0, & |\omega| \in X_s \end{cases} . \tag{1}$$
Here, Xp = [0, ωc ] and Xs = [2π − ωc , ∞) represent the passband
and stopband intervals of the design specification, respectively. ωc
denotes the cutoff frequency of the input signal. For notational convenience, the angular frequency ω = 2πf Ti is normalized to the
input period Ti .
Two convenient measures for the resampling quality, the maximum passband error δp and the minimum stopband attenuation As ,
are directly obtained from the frequency response Hc (jω)
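$$\delta_p = \max_{\omega \in X_p} \left| H_c(j\omega) - \hat{H}_c(j\omega) \right| \tag{2}$$
$$A_s = -20 \log_{10} \max_{\omega \in X_s} \left| H_c(j\omega) - \hat{H}_c(j\omega) \right| . \tag{3}$$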
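As a minimal sketch of how the measures (2) and (3) might be evaluated numerically, the following Python snippet samples a frequency response on a dense grid. The function name `quality_measures`, the grid density, the finite upper stopband limit `omega_max`, and the toy response (linear interpolation with Ti = 1, whose response is sinc-squared shaped) are illustrative assumptions and not part of the paper.

```python
import numpy as np

def quality_measures(h_c, omega_c, t_i=1.0, omega_max=20 * np.pi, num=20001):
    """Evaluate eqs. (2) and (3) on a dense grid (assumed discretization)."""
    w_pass = np.linspace(0.0, omega_c, num)                   # X_p = [0, w_c]
    w_stop = np.linspace(2 * np.pi - omega_c, omega_max, num)  # X_s, truncated
    delta_p = np.max(np.abs(h_c(w_pass) - t_i))  # ideal response is T_i in X_p
    stop_err = np.max(np.abs(h_c(w_stop)))       # ideal response is 0 in X_s
    a_s = -20.0 * np.log10(stop_err)
    return delta_p, a_s

# Toy response: linear interpolation, H_c(jw) = sinc^2(w / (2*pi)) for T_i = 1.
# np.sinc(x) is sin(pi*x) / (pi*x).
linear = lambda w: np.sinc(w / (2 * np.pi)) ** 2
delta_p, a_s = quality_measures(linear, omega_c=0.85 * np.pi)
print(f"delta_p = {delta_p:.3f}, A_s = {a_s:.1f} dB")
```

For this deliberately poor toy filter, both measures are dominated by the response at the band edges, which illustrates why wideband signals call for the more elaborate structures discussed next.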
Algorithms for ASRC fall into three general classes: Methods based
on numerical interpolation techniques such as Lagrange or spline
interpolation, piecewise polynomial resampling filters such as the
modified Farrow structure [5], and methods based on oversampling
and a continuous-time resampling function (e.g. [1, 6, 7]).
2. STRUCTURES BASED ON OVERSAMPLING AND
CONTINUOUS-TIME RESAMPLING FILTERS
The widespread use of structures based on integer-ratio oversampling and fixed continuous-time resampling filters is justified by
several reasons. First, it enables efficient conversion of wideband
signals. Second, it allows the multitude of well-established design
techniques and implementations for rational-factor SRC to be used.
[Figure 2, plot panels: (a) Passband detail, (b) Stopband detail, (c) Frequency response. Axes: magnitude |Hc(jω)| over normalized angular frequency ω. Curves: Hc(jω) Lagrange, Hc(jω) Lagrange (opt), Hint(jω/L) Lagrange.]
Figure 2: Frequency response of an Oversampling+Lagrange structure, overall optimization scheme compared to conventional design. Parameters L = 3, Nint = 5, Ndig = 159, ωc = 0.85π, L∞ design. X′p and φ′ denote the passband and transition band images of Hdig(e^{jω}).
The signal flow of this structure is depicted in figure 1. The
oversampling component consists of a sample rate expander followed by a discrete-time anti-imaging filter, or prefilter, Hdig(e^{jω}).
Although the continuous-time resampling filter Hint(jω) conceptually operates on continuous-time signals, it is generally implemented as a discrete-time filtering operation that evaluates the
signal value only at the requested output instants.
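The evaluation at the requested output instants can be sketched as follows. For each output sample, the snippet computes the basepoint index on the oversampled grid and the fractional position within that sampling interval. The function name `output_instants` and the convention that time is measured in units of Ti/L are assumptions made for illustration.

```python
import math

def output_instants(num_out, ratio, L):
    """For each output sample m, return (basepoint index, fractional position mu)
    on the oversampled grid with period T_i / L, where ratio = R = T_i / T_o."""
    result = []
    for m in range(num_out):
        t = m * L / ratio          # output instant m * T_o in units of T_i / L
        base = math.floor(t)       # index of the oversampled sample at or before t
        result.append((base, t - base))
    return result

# Example: converting 44.1 kHz to 48 kHz (R = 48000 / 44100) with L = 3
for base, mu in output_instants(4, 48000 / 44100, 3):
    print(base, round(mu, 4))
```

The resampling filter then only needs the oversampled samples around `base` and the value `mu`, so no continuous-time signal is ever formed explicitly.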
In conventional approaches, these two components are designed
independently. In most cases, Hdig(e^{jω}) is designed as a lowpass
filter according to design specifications for integer-ratio SRC systems, e.g. [4]. Likewise, Hint(jω) typically forms a simple resampling filter, such as a low-order Lagrange interpolator.
However, this independent design has two major drawbacks.
First, the resulting ASRC structures are not optimal with respect to
a given error norm such as the (weighted) least squares L2 or the
minimax L∞ norm. Moreover, it prohibits the inclusion of additional time- or frequency-domain conditions.
2.1. Overall Optimization of the Discrete-Time Prefilter
To overcome these deficiencies, an overall optimization scheme for
the discrete-time prefilter Hdig(e^{jω}) has been proposed in [8]. Although this approach uses Lagrange interpolators for Hint(jω), the
general idea is readily applicable to other resampling filters.
The overall continuous-time frequency response of the system
can be expressed as
$$H_c(j\omega) = \frac{1}{L}\, H_{\mathrm{dig}}\!\left(e^{j\omega/L}\right) H_{\mathrm{int}}\!\left(j\frac{\omega}{L}\right) . \tag{4}$$
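A small numerical sketch of (4) is given below. The function name `overall_response`, the three-tap prefilter, and the use of a linear interpolator (sinc-squared response) as Hint are assumptions chosen only to keep the example short. The prefilter is evaluated as a zero-phase response, that is, with its taps centred around time zero.

```python
import numpy as np

def overall_response(h_dig, L, h_int, omega):
    """Eq. (4): H_c(jw) = (1/L) * H_dig(e^{jw/L}) * H_int(jw/L)."""
    w = np.asarray(omega, dtype=float) / L                # frequency at the oversampled rate
    n = np.arange(len(h_dig)) - (len(h_dig) - 1) / 2.0    # centred tap positions
    h_dig_resp = np.real(np.exp(-1j * np.outer(w, n)) @ h_dig)  # real for symmetric taps
    return h_dig_resp * h_int(w) / L

# Toy system: L = 2, symmetric prefilter with DC gain L, linear interpolator
h_dig = np.array([0.5, 1.0, 0.5])
linear = lambda w: np.sinc(w / (2 * np.pi)) ** 2
resp = overall_response(h_dig, 2, linear, np.array([0.0, 2 * np.pi]))
print(resp)   # unity gain at w = 0, a value close to zero at the first image w = 2*pi
```

The division of ω by L reflects that both components operate at the oversampled rate, while ω itself remains normalized to the input period.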
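As the prefilter Hdig(e^{jω}) is advantageously implemented as a linear-phase FIR filter, e.g. [4], its frequency response is given by
$$H_{\mathrm{dig}}(e^{j\omega}) = \sum_{n=0}^{N'_{\mathrm{dig}}} b[n]\, \mathrm{trig}(n, \omega) \quad \text{with} \quad N'_{\mathrm{dig}} = \left\lfloor \frac{N_{\mathrm{dig}} - 1}{2} \right\rfloor , \tag{5}$$
$$\mathrm{trig}(n, \omega) = \begin{cases} 1, & N_{\mathrm{dig}} \text{ even},\ n = 0 \\ 2\cos(n\omega), & N_{\mathrm{dig}} \text{ even},\ n > 0 \\ 2\cos\!\left(\left(n + \tfrac{1}{2}\right)\omega\right), & N_{\mathrm{dig}} \text{ odd} \end{cases} . \tag{6}$$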
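The basis functions of (6) can be checked against a direct evaluation of a symmetric FIR filter. The sketch below is an illustration only: the helper names `trig` and `h_dig_zero_phase` and the four-tap example are assumptions, and Ndig is interpreted here as the filter order, which the source does not state explicitly. Only the odd case is exercised.

```python
import numpy as np

def trig(n, omega, n_dig):
    """Basis function of eq. (6); n_dig is treated as the filter order (assumption)."""
    omega = np.asarray(omega, dtype=float)
    if n_dig % 2 == 0:
        return np.ones_like(omega) if n == 0 else 2.0 * np.cos(n * omega)
    return 2.0 * np.cos((n + 0.5) * omega)

def h_dig_zero_phase(b, omega, n_dig):
    """Eq. (5): sum of b[n] * trig(n, omega) over the stored coefficient half b."""
    return sum(b_n * trig(n, omega, n_dig) for n, b_n in enumerate(b))

# Order-3 example: taps [b1, b0, b0, b1] are described by b = [b0, b1]
b = np.array([0.4, 0.1])
taps = np.array([0.1, 0.4, 0.4, 0.1])
w = np.linspace(0.0, np.pi, 7)
k = np.arange(4) - 1.5                                   # centred tap positions
direct = np.real(np.exp(-1j * np.outer(w, k)) @ taps)    # direct zero-phase response
print(np.allclose(h_dig_zero_phase(b, w, 3), direct))
```

Because the response is linear in b[n], any design criterion that is convex in the frequency response remains convex in the coefficients.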
Thus, the overall frequency response can be stated as a linear combination of basis functions Go (n, ω)
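$$H_c(j\omega) = \sum_{n=0}^{N'_{\mathrm{dig}}} b[n]\, G_o(n, \omega) \quad \text{with} \tag{7}$$
$$G_o(n, \omega) = \frac{1}{L}\, \mathrm{trig}\!\left(n, \frac{\omega}{L}\right) H_{\mathrm{int}}\!\left(j\frac{\omega}{L}\right) . \tag{8}$$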
In this way, the design of the filter coefficients b[n] can be stated as
an optimization problem with respect to a norm Lp
$$\underset{\{b[n]\}}{\text{minimize}} \; \left\| \sum_{n=0}^{N'_{\mathrm{dig}}} b[n]\, G_o(n, \omega) - \hat{H}_c(j\omega) \right\|_p . \tag{9}$$
For widely-used error norms such as the L2 or the L∞ norm, (9)
can be efficiently solved as a convex optimization problem. For
example, a method operating on a discretized representation of the
approximation region X = Xp ∪ Xs is described in [8].
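For the L2 norm, a discretized version of (9) reduces to a linear least-squares problem. The following sketch is not the method of [8]. It assumes a uniform frequency grid, a stopband truncated at 8πL, an odd prefilter order Ndig = 31, Ti = 1, and a linear interpolator (sinc-squared response) as a stand-in for Hint. All variable names are illustrative.

```python
import numpy as np

L, n_dig, omega_c = 3, 31, 0.85 * np.pi       # n_dig odd: order-31 prefilter (assumption)
n_coef = (n_dig - 1) // 2 + 1                  # coefficient indices n = 0 .. N'_dig

def h_int(w):
    """Stand-in resampling filter: linear interpolation (Lagrange, order 1)."""
    return np.sinc(w / (2 * np.pi)) ** 2

# Discretized approximation region X = X_p U X_s on a uniform grid
w_all = np.linspace(0.0, 8 * np.pi * L, 12001)
in_pass = w_all <= omega_c
in_stop = w_all >= 2 * np.pi - omega_c
keep = in_pass | in_stop
w = w_all[keep]
target = in_pass[keep].astype(float)           # ideal response: 1 in X_p, 0 in X_s

# Basis functions G_o(n, w) of eq. (8) for odd N_dig, one column per b[n]
n = np.arange(n_coef)
G = 2.0 * np.cos(np.outer(w / L, n + 0.5)) * h_int(w / L)[:, None] / L

b, *_ = np.linalg.lstsq(G, target, rcond=None)  # unweighted L2 solution of eq. (9)
err = G @ b - target
delta_p = np.max(np.abs(err[target == 1.0]))
a_s = -20.0 * np.log10(np.max(np.abs(err[target == 0.0])))
print(f"delta_p = {delta_p:.2e}, A_s = {a_s:.1f} dB")
```

A minimax (L∞) design would replace the least-squares step by a linear or second-order cone program over the same basis matrix `G`.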
A design example for this overall optimization scheme is shown
in figure 2. In this example, the passband error δp is reduced to
48.3 % of the value obtained with a conventionally designed prefilter, while the
minimum stopband attenuation is increased by 4.0 dB. This performance improvement has two causes. In the passband
region, the magnitude roll-off of the Lagrange interpolator is compensated by the design method, yielding an approximately equiripple behavior. In the stopband region, the improvement results from
the shaping of the frequency response in the image regions φ′ of the
transition band φ = (ωc/L, 2π − ωc/L) of the discrete-time prefilter.
2.2. Alternative Continuous-Time Resampling Filters
However, the above design example also reveals shortcomings of
Lagrange interpolation when used in combination with oversampling. On the one hand, the limited attenuation of Hint(jω) in the
passband images of Hdig(e^{jω}) (denoted X′p in figure 2) effectively
limits the achievable stopband attenuation. On the other hand, the
relatively flat passband response of Lagrange interpolators is possibly not required in this application, since most passband errors are
readily compensated in the design of the prefilter.
For these reasons, the use of alternative resampling filters, such
as B-spline basis functions and O-MOMS functions (Optimal maximal-order interpolation of minimal support) [9], appears promising.
[Figure 3, plot panels: (a) Passband detail, (b) Stopband detail, (c) Frequency response. Axes: magnitude |Hc(jω)| over normalized angular frequency ω. Curves: Hc(jω) O-MOMS, Hc(jω) OIB, Hint(jω/L) O-MOMS, Hint(jω/L) OIB.]
Figure 3: Frequency response of an Oversampling+OIB structure, comparison to Oversampling+O-MOMS design. Parameters L = 3, Nint = 5, Ndig = 159, ωc = 0.85π, L∞ design.
[Figure 4, plot: magnitude |Hint(jω/L)| in dB over ω for the Lagrange, B-spline, O-MOMS, and OIB resampling functions. The frequencies 2πL − ωc and 2πL + ωc are marked on the ω axis.]
Figure 4: Comparison of different resampling functions Hint(jω).
Parameters Nint = 5, L = 3, ωc = 0.85π. Hatched area represents
the first passband image of Hdig(e^{jω}).
Spline basis functions are widely used in DSP, for instance in
image processing [10]. Compared to Lagrange interpolation, they
provide superior image attenuation with an asymptotic rate of decay
proportional to ω^{−Nint−1} at the expense of a more severe passband
roll-off. O-MOMS functions aim at minimizing the L2 approximation error for a given length of the resampling filter. In resampling
applications, they effectively decrease the magnitude of the first signal image while reducing the asymptotic rate of decay. For more
information, the reader is referred to [9].
The use of these resampling functions in combination with the
proposed overall optimization scheme is investigated in [11]. For
the design example of figure 2, the use of B-spline and O-MOMS
resampling filters increases the minimum stopband attenuation by
27.6 dB and 38.8 dB, respectively. The passband errors are decreased by the same ratios.
3. OPTIMIZED IMAGE BAND ATTENUATION DESIGN
Notwithstanding the improvements gained by the resampling filters
considered in the preceding section, none of these are specifically
adapted to ASRC systems incorporating oversampling. It is therefore worthwhile to consider design specifications for Hint(jω) that
take the characteristics of this structure into account.
Symmetric piecewise polynomials, e.g. [5], form a suitable
class of functions to model such resampling filters. They include
Lagrange interpolators, B-splines, and O-MOMS functions as subsets and are efficiently implemented by the modified Farrow structure. Their frequency response takes the form
$$H_{\mathrm{int}}(j\omega) = \sum_{m=0}^{M} \sum_{n=0}^{N'_{\mathrm{int}}} b_{mn}\, G(m, n, \omega) , \tag{10}$$
where bmn form the elements of a coefficient matrix and
G(m, n, ω) are basis functions as defined, for instance, in [5].
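To illustrate how a piecewise polynomial resampling filter is evaluated from a coefficient matrix, the sketch below uses Lagrange interpolation, the simplest member of this class. Note the assumptions: it implements the conventional Farrow form with the intersample position mu in [0, 1), not the modified Farrow structure or the basis functions of [5], and the helper names `farrow_matrix` and `farrow_interpolate` are hypothetical.

```python
import numpy as np

def farrow_matrix(order):
    """Coefficient matrix C[m, k] of a Lagrange interpolator in Farrow form."""
    nodes = np.arange(order + 1) - (order - 1) // 2   # e.g. -1, 0, 1, 2 for order 3
    vander = np.vander(nodes, increasing=True)        # vander[k, m] = nodes[k] ** m
    return np.linalg.inv(vander)                      # rows: powers of mu, columns: taps

def farrow_interpolate(x, n, mu, C):
    """Evaluate the signal at time n + mu (0 <= mu < 1) from samples around index n."""
    order = C.shape[0] - 1
    start = n - (order - 1) // 2
    taps = x[start : start + order + 1]
    poly = C @ taps                    # one fixed FIR subfilter output per power of mu
    return np.polyval(poly[::-1], mu)  # Horner evaluation of the polynomial in mu

# A cubic Lagrange interpolator reproduces cubic polynomials exactly
t = np.arange(10.0)
x = t ** 3 - 2.0 * t + 1.0
C = farrow_matrix(3)
y = farrow_interpolate(x, 4, 0.3, C)
print(y, 4.3 ** 3 - 2.0 * 4.3 + 1.0)
```

Replacing the matrix `C` by optimized coefficients changes the frequency response of the filter without changing the cost of evaluating it, which is what makes this filter class attractive as a design space.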
As argued above, the attenuation of Hint(jω) in the image regions
of the passband of Hdig(e^{jω}), X′s = ⋃_{k=1}^{∞} [2πk − ωc/L, 2πk + ωc/L],
is of paramount importance for the performance of the complete system. In contrast, passband deviations of Hint(jω) can
be rectified within certain limits by an appropriately designed prefilter. However, design specifications purely based on the stopband
behavior typically lead to degenerate solutions, or require a large
passband gain of Hdig(e^{jω}), which in turn deteriorates the overall performance. A relatively loose specification for the maximum
passband error, e.g. δp = 0.5, proves to be a sensible choice. Thus,
a suitable design specification reads
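$$\underset{\{b_{mn}\}}{\text{minimize}} \; \max_{\omega \in X'_s} \left| \sum_{m=0}^{M} \sum_{n=0}^{N'_{\mathrm{int}}} b_{mn}\, G(m, n, \omega) \right| \tag{11}$$
$$\text{subject to} \quad \left| \sum_{m=0}^{M} \sum_{n=0}^{N'_{\mathrm{int}}} b_{mn}\, G(m, n, \omega) - 1 \right| \le \delta_p , \quad 0 \le \omega \le \frac{\omega_c}{L} .$$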
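The objective and the constraint of (11) can be evaluated for any candidate resampling filter. The sketch below does so for B-spline basis functions, whose frequency response has the well-known closed form sinc^(N+1). It does not perform the OIB optimization itself. The function names, the grid density, and the truncation of the image regions to the first 20 images are assumptions.

```python
import numpy as np

def image_band_attenuation(h_int, L, omega_c, k_max=20, num=2001):
    """Objective of (11) in dB: worst-case |H_int| over the passband images
    [2*pi*k - w_c/L, 2*pi*k + w_c/L], truncated to k = 1 .. k_max."""
    worst = 0.0
    for k in range(1, k_max + 1):
        w = np.linspace(2 * np.pi * k - omega_c / L, 2 * np.pi * k + omega_c / L, num)
        worst = max(worst, np.max(np.abs(h_int(w))))
    return -20.0 * np.log10(worst)

def passband_deviation(h_int, L, omega_c, num=2001):
    """Left-hand side of the constraint in (11), maximized over 0 <= w <= w_c/L."""
    w = np.linspace(0.0, omega_c / L, num)
    return np.max(np.abs(h_int(w) - 1.0))

def bspline(order):
    """Frequency response of a centred B-spline of the given polynomial order."""
    return lambda w: np.sinc(w / (2 * np.pi)) ** (order + 1)

L, omega_c = 3, 0.85 * np.pi
for order in (1, 5):
    att = image_band_attenuation(bspline(order), L, omega_c)
    dev = passband_deviation(bspline(order), L, omega_c)
    print(f"order {order}: image band attenuation {att:.1f} dB, passband deviation {dev:.3f}")
```

An OIB design would search the coefficients bmn for the largest image band attenuation while keeping the passband deviation below the chosen δp.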
In figure 4, a function designed according to an OIB specification is compared to the other resampling filters considered in this
paper. It is observed that the minimum stopband attenuation in the
passband image region is considerably higher than for Lagrange, B-spline, and O-MOMS functions. On the other hand, the passband
roll-off is comparable to that of B-spline and O-MOMS filters.
The design example of section 2 is repeated with an OIB resampling filter and a discrete-time prefilter designed with the overall optimization scheme. The resulting frequency response is shown
in figure 3. Compared to the O-MOMS design, the minimum stopband attenuation is further increased by about 17.6 dB. Again, the
passband error is reduced by the same amount due to the uniform
error weighting utilized. The performance improvements gained by
the resampling filters considered in this paper are summarized in
table 1.

Method                     δp             δp/δp*     As          As − A*s
Lagrange (conventional)    1.37 · 10^−3   −          59.6 dB     −
Lagrange (optimized)       6.61 · 10^−4   48.29 %    63.6 dB     4.0 dB
B-spline basis function    2.75 · 10^−5   2.01 %     91.2 dB     31.6 dB
O-MOMS                     7.64 · 10^−6   0.56 %     102.4 dB    42.8 dB
OIB                        1.01 · 10^−6   0.07 %     120.0 dB    60.4 dB

Table 1: Performance comparison for ASRC designs based on oversampling. Parameters L = 3, Nint = 5, Ndig = 159, ωc = 0.85π.
Improvements δp/δp* and As − A*s with respect to conventional
Oversampling+Lagrange design
4. PERFORMANCE COMPARISON
The above design example shows that the OIB design enables significant quality improvements for fixed, suitably chosen design parameters L, Ndig, and Nint. However, a criterion more relevant to
practical application is the minimal computational complexity required to reach a prescribed quality requirement.
In figure 5, the minimum number of instructions to compute one
output sample is displayed as a function of the achieved stopband
attenuation As. In this example, a cutoff frequency of ωc = 0.85π
and a conversion ratio close to unity R ≈ 1 are assumed. For
a large number of design parameter variations (M = 1, . . . , 7,
N = 1, . . . , 72 for the modified Farrow structure and L = 1, . . . , 6,
Ndig = 1, 5, . . . , 305, Nint = 1, . . . , 7 for structures comprising
oversampling), the attained quality and the required computational
effort are evaluated. The designs that meet a prescribed performance goal are selected for several stopband attenuation levels.
[Figure 5, plot: minimum number of instructions per output sample (vertical axis, 100 to 700) over the minimum stopband attenuation As (horizontal axis, 40 dB to 120 dB). Curves: Farrow structure, Oversampling+Lagrange, Oversampling+B-spline, Oversampling+O-MOMS, Oversampling+OIB.]
Figure 5: Minimum number of instructions to compute one
output sample for minimum stopband attenuation levels As =
40, 45, . . . , 120. Parameters ωc = 0.85π, conversion ratio R ≈ 1.
It is observed that the modified Farrow structure and Oversampling+Lagrange designs exhibit similar complexity for all considered quality levels. In contrast, the designs using oversampling and
B-spline or O-MOMS basis functions require significantly fewer instructions. The OIB design gains a further significant reduction of
the complexity. For example, the modified Farrow structure and
the Oversampling+Lagrange structure require 433 and 461 instructions, respectively, to compute one output sample for As = 100 dB.
In contrast, the OIB design reduces the instruction count to 238.
5. CONCLUSIONS
In this paper, ASRC structures based on integer-ratio oversampling
and continuous-time resampling functions have been considered.
It has been demonstrated that the performance is significantly improved compared to existing approaches by using resampling functions specifically adapted to this structure. The proposed optimized
image band attenuation (OIB) filters are conveniently combined
with an overall optimization scheme that incorporates the characteristics of the resampling filter into the design of the oversampling
component. In this way, considerable complexity reductions are
achieved for a wide range of design specifications.
6. REFERENCES
[1] R. Lagadec, D. Pelloni, and D. Weiss, “A 2-channel, 16-bit
digital sampling frequency converter for professional digital
audio,” in Proc. IEEE Int. Conf. on Acoustics, Speech, and
Signal Processing, vol. 7, May 1982, pp. 93–96.
[2] J. O. Smith, S. Serafin, J. Abel, and D. Berners, “Doppler simulation and the Leslie,” in Proc. 5th Int. Conf. on Digital Audio Effects (DAFx-02), Sept. 2002, pp. 13–20.
[3] A. Franck, K. Brandenburg, and U. Richter, “Efficient delay
interpolation for wave field synthesis,” in AES 125th Convention, Oct. 2008.
[4] R. E. Crochiere and L. R. Rabiner, Multirate Digital Signal
Processing. Englewood Cliffs, NJ: Prentice Hall, Inc., 1983.
[5] J. Vesma and T. Saramäki, “Polynomial-based interpolation
filters - Part I: Filter synthesis,” Circuits Systems Signal Process., vol. 26, no. 2, pp. 115–146, Apr. 2007.
[6] T. A. Ramstad, “Digital methods for conversion between arbitrary sampling frequencies,” IEEE Trans. Acoust., Speech,
Signal Processing, vol. 32, no. 3, pp. 577–591, June 1984.
[7] G. Evangelista, “Design of digital systems for arbitrary sampling rate conversion,” Signal Processing, vol. 83, no. 2, pp.
377–387, Feb. 2003.
[8] A. Franck and K. Brandenburg, “An overall optimization
method for arbitrary sample rate converters based on integer
rate SRC and Lagrange interpolation,” in Proc. IEEE Workshop Applications Signal Processing to Audio and Acoustics
WASPAA’09, Oct. 2009.
[9] T. Blu, P. Thévenaz, and M. Unser, “MOMS: Maximal-order
interpolation of minimal support,” IEEE Trans. Image Processing, vol. 10, no. 7, pp. 1069–1080, July 2001.
[10] M. Unser, “Splines - A perfect fit for signal and image processing,” IEEE Signal Processing Mag., vol. 16, no. 6, pp.
22–38, Nov. 1999.
[11] A. Franck, “Performance evaluation of algorithms for arbitrary sample rate conversion,” in AES 131st Conference, New
York, NY, Oct. 2011.