On optimal threshold and structure in threshold system based detector

Signal Processing 92 (2012) 170–178
Contents lists available at ScienceDirect
Signal Processing
journal homepage: www.elsevier.com/locate/sigpro
On optimal threshold and structure in threshold system
based detector
Gencheng Guo, Mrinal Mandal Department of Electrical and Computer Engineering, 2nd Floor, ECERF Building, University of Alberta, Edmonton, AB, Canada T6G 2V4
a r t i c l e in f o
abstract
Article history:
Received 3 May 2010
Received in revised form
27 June 2011
Accepted 30 June 2011
Available online 6 July 2011
Threshold systems are widely used in nonlinear signal detection. The optimal threshold
and the optimal structure are crucial to obtain good performance. In this paper, we
propose an efficient method to estimate optimal threshold in a threshold system based
detector (TD). In addition, we investigate the structural issue of TD and propose
a technique to determine the optimal structure. An optimal threshold detector is
then implemented based on the optimal structure with the optimal threshold(s).
Experiments show that the proposed optimal TD can be fairly compared to the optimal
detection with much easier implementation and much less computational complexity.
& 2011 Elsevier B.V. All rights reserved.
Keywords:
Signal detection
Optimal threshold detector
Optimal threshold
Optimal structure
1. Introduction
The problems of detecting a known signal in additive
noise with known distribution have well-known solutions
[1]. Matched filters provide optimal detection in the Gaussian noise. For non-Gaussian noise, the optimal detection
performance can be achieved only with a nonlinear system.
Given a weak signal in non-Gaussian noise with known
probability density function (pdf), a locally optimal (LO)
nonlinear detector can be derived and is considered to be
asymptotically optimum [1]. However, accurate estimation
of the noise pdf is difficult to obtain in many applications
because the noise may vary with time or from one application to another. Furthermore, even though the pdf is
available, some optimal detectors have high implementation complexity. Therefore, more robust, suboptimal detectors provide a more practical choice [2]. Threshold system
(TS) based detector, henceforth referred to as threshold
detector (TD), is widely used in many applications as a
Corresponding author. Tel.: þ1 780 492 0294;
fax: þ1 780 492 1811.
E-mail addresses: gencheng@ece.ualberta.ca (G. Guo),
mandal@ece.ualberta.ca (M. Mandal).
URL: http://www.ece.ualberta.ca/~mandal/ (M. Mandal).
0165-1684/$ - see front matter & 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.sigpro.2011.06.016
suboptimal detector [3–6]. These detectors have several
advantages: high speed, low resource requirement, and high
detection efficiency. Chapeau-Blondeau [3] demonstrated
that TD can improve the detection performance, in case of
non-Gaussian noise, compared to linear detector.
Consider the binary hypotheses detection problem
described as follows:
(
H0 : x½n ¼ w½n
n ¼ 0,1, . . . ,N1
H1 : x½n ¼ s½n þ w½n
n ¼ 0,1, . . . ,N1
,
ð1Þ
where w[n] is an independent and identically distributed
(i.i.d.) noise and s[n] is a DC signal. Note that here, for
simplicity, we derive the basic results (optimal threshold
and optimal structure) for a DC signal in an i.i.d. noise
with known pdf. We will explain how to deal with a
general signal s[n] (non-DC signal, such as a sinusoidal
signal) in noise with unknown pdf in Section 4. The
schematic of a TD typically used [3,7–9] is shown in Fig. 1.
The output of the threshold system (TS) can be calculated using either of the following two equations:
(
y½n ¼
1
x½n Z t
0
x½n o t
ð2Þ
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
171
calculated. The decision H1 is made if LðzÞ 4 g, where
LðzÞ ¼ fZ ðz; H1 Þ=fZ ðz; H0 Þ is the LRT (likelihood ratio test)
statistic on Z, fZ ðz; Hi Þ, i ¼ 0,1 are the pdf’s of Z under the
two hypotheses, g is calculated using the following
equation for a given PFA ¼ a:
Z
PFA ¼
fZ ðz; H0 Þ dz ¼ a:
ð4Þ
z:LðzÞ 4 g
With the above g, PD is calculated by
Z
PD ¼
fZ ðz; H1 Þ dz:
Fig. 1. The schematic of the threshold detector.
or
(
y½n ¼
ð5Þ
z:LðzÞ 4 g
0
x½n Z t
1
x½n o t
,
ð3Þ
according to different situations, where t is a threshold
P
of the TS. A test statistic Z ¼ z ¼ ð1=NÞ N1
n ¼ 0 y½n is then
calculated. We use Z (upper case) to represent the random
variable (RV) and z (lower case) to represent a scalar value
calculated from a given observation x½n, n ¼ 0,1, . . . ,N1.
The decision H1 is made if z 4 g0 .
Note that g0 and t are two different thresholds, where g0
is the threshold always used in binary detection while t is
the threshold of TS used to construct TD. The performance
of the TD is sensitive to the TS threshold t, and therefore t
needs to be calculated properly. However, the optimality
issues about TD have not been analyzed systematically. The
calculation of the optimal threshold is computationally
intensive. In addition, to the author’s knowledge, no analysis
on the optimal structure has been available in the literature.
The optimal structure refers to a specific type of TS that
leads to optimal detectability for a given noise pdf.
In this paper, we address these two limitations. First,
we propose a faster method to calculate the optimal threshold. This is performed by using a novel detectability
indicator, which monotonically increases with the probability of detection (PD). Secondly, based on the indicator,
we determine an optimal TS structure of the detector. We
show that the optimal TD is implemented based on the
optimal structure with the optimal threshold(s). Here, the
optimal TD is the one that has maximum PD for a given PFA
(probability of false alarm) using the test shown in Fig. 1.
Experimental results demonstrate that the performance
obtained from the optimal TD can be very close to that
obtained from the LO detector, and can be much better than
the performance obtained from the linear matched filter, for
non-Gaussian noise with heavy pdf tails.
The paper is organized as follows. In Section 2, we
propose a new indicator of detectability. We then calculate
the optimal threshold and determine the optimal structure
in Section 3. In Section 4, we address some practical issues
when employing the proposed TD. In Section 5, several
examples demonstrate the optimal design and the detection
performance, which is followed by conclusions in Section 6.
2. An indicator of the detectability PD
Typically, the pair ðPD ,PFA Þ is used for the performance
measurement of a detector. For the given test in Fig. 1 and
an observation x½n, n 2 ½0,N1, the test statistic z can be
In this section, we first derive fZ ðz; Hi Þ, i ¼ 0,1 and
hence PD can be calculated. We then show that the PD
monotonically increases with a novel indicator, which can
be used instead of PD.
2.1. Estimation of fZ ðz; H0 Þ and fZ ðz; H1 Þ
Let fX ðx; H0 Þ and fX ðx; H1 Þ be the pdf’s of one sample in
x½n, n 2 ½0,N1 under the conditions of H0 and H1, respectively. Given a threshold t for the TS described in Eq. (2) or
(3), Y ¼ y½n is a random variable (RV) taking only two
values, 0 or 1. Hence, Y can be considered as an output of a
success (1)/failure (0) experiment known as the Bernoulli
trial. The probability Pr ðY ¼ 0; H0 Þ refers to the probability
of Y¼0 under the condition of H0. The probabilities Pr
ðY ¼ 1; H0 Þ, Pr ðY ¼ 0; H1 Þ and Pr ðY ¼ 1; H1 Þ are defined in
similar manner, and are calculated as follows:
Z t
Pr ðY ¼ 0; H0 Þ ¼
fX ðx; H0 Þ dx ¼ q0 ,
ð6Þ
1
Pr ðY ¼ 1; H0 Þ ¼
Pr ðY ¼ 0; H1 Þ ¼
Z
1
t
Z t
fX ðx; H0 Þ dx ¼ p0 ,
ð7Þ
fX ðx; H1 Þ dx ¼ q1 ,
ð8Þ
fX ðx; H1 Þ dx ¼ p1 ,
ð9Þ
1
Pr ðY ¼ 1; H1 Þ ¼
Z
1
t
where p0 þ q0 ¼ 1 and p1 þ q1 ¼ 1. We observe that p0 and
p1 are the probability of Y¼1 under the conditions of H0 and
H1, respectively. It can be verified that if the DC signal A 4 0
(fX ðx; H1 Þ is a right shift of fX ðx; H0 ÞÞ and the TS based on
Eq. (2) is employed, p1 4p0 holds true. For A o0, we can
choose the TS based on Eq. (3) to make p1 4 p0 hold true.
Henceforth, we only consider the case p1 4 p0 and the TS
expressed by Eq. (2), which is the case for A Z 0. While for
A o0, using the TS expressed by Eq. (3), we will have the
same results.
Consider an observation x½n, n ¼ 0,1, . . . ,N1 under
H0. Let m 2 ½0,N be the number of successes when x[n] is
applied to the TS. The discrete probability distribution of
the number of successes is a binomial distribution, and
the probability mass function (pmf) of the number m is
N m Nm
fM ðmÞ ¼ ðm
Þp0 q0 . We then have
! 1
N
1 NX
m
NNz
y½n ¼ ; H0 ¼
:
ð10Þ
fZ z ¼
pNz
0 q0
Nn¼0
N
Nz
Because fZ ðz; Hi Þ, i ¼ 0,1 are discrete (N and Nz are both
non-negative integers, z ¼ m=N, m ¼ 0,1, . . . ,NÞ, the ROC
172
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
(receiver operating characteristic) is a staircase form
(see Fig. 4(b)) and the step size (height and width) of
the staircase depends on N. If N is large, the size of the
staircase is small and the demonstration could be more
precise. To eliminate the discreteness effect of N, we use
the continuous form of Eq. (10) for theoretical demonstration, which is shown as follows:
fZ ðz; H0 Þ ¼
N GðN þ 1Þ
pNz qNNz ,
GðNz þ 1ÞGðNNz þ1Þ 0 0
ð11Þ
where N, Nz 2 R, 0 r Nz rN and GðÞ is the Gamma function defined as
Z 1
GðnÞ ¼
t n1 et dt:
0
Note that the factor N in the numerator of Eq. (11) is a
scale factor to make the integral be 1. Similarly, we obtain
fZ ðz; H1 Þ ¼
N GðN þ 1Þ
pNz qNNz :
GðNz þ 1ÞGðNNz þ 1Þ 1 1
ð12Þ
Using Eqs. (11) and (12), we calculate L(Z) as follows:
Nz NNz
m
p1
q1
¼
L z¼
:
ð13Þ
N
p0
q0
Because we have assumed p1 =p0 4 1, we also have
q1 =q0 o1. Thus, both ðp1 =p0 ÞNz and ðq1 =q0 ÞNNz monotonically increase with respect to z. Therefore, L(z) is a
monotonically increasing function and the decision H1 is
made if z 4 g0 , where g0 is calculated using the following
equation (for a given PFA ¼ aÞ:
Z
fZ ðz; H0 Þ dz ¼ a:
ð14Þ
PFA ¼
z 4 g0
With the above g0 , PD can be calculated as follows:
Z
fZ ðz; H1 Þ dz:
PD ¼
ð15Þ
z 4 g0
Note that we use integral instead of summation in Eqs. (14)
and (15) because we consider fZ ðz; Hi Þ, i ¼ 0,1 as continuous
in theoretical derivation. Eqs. (14) and (15) conform to the
schematic shown in Fig. 1, i.e., we decide H1 provided that
z 4 g0 . Next, we will show that Dp ¼ p1 p0 can be an
indicator of PD, where p0 ,p1 are shown in Eqs. (7) and (9),
respectively.
2.2. A detectability indicator Dp
Given fX ðx; H0 Þ, fX ðx; H1 Þ, the optimal t can be calculated
by
topt ¼ argmaxPD :
t
For a given t, PD is calculated from the derived fZ ðz; H1 Þ
and fZ ðz; H0 Þ. Because the closed form of the integral of
fZ ðz; H1 Þ and fZ ðz; H0 Þ is not available, numerical method
needs to be employed. To reduce the computational
complexity and for other convenience, we propose an
alternative detectability indicator Dp ¼ p1 p0 , which is
shown in the following Lemma.
Lemma 1. If A 5 s and N,m,ðNmÞ b 1, for a given PFA , PD
can be approximated as a monotonically increasing function
of Dp, where Dp ¼ ðp1 p0 Þ, s is the standard deviation of the
noise.
Proof. In this proof, we first show that PD is a function of
p1 and p0. By defining Dp ¼ p1 p0 , we then express PD as a
function of Dp and p1. We then show that @PD =@p1 0 and
@PD =@Dp 40, therefore we can approximate PD as a
monotonically increasing function of Dp. This means that
we can use Dp as an indicator of PD.
The conditions A5 s and N,m,ðNmÞ b 1 are reasonable
for a TD to be used in DC detection. Signal is often assumed
weak compared to the noise strength [1]. The condition
N b 1 is generally true because A 5 s and the number of
samples in one observation should be larger for a good
detectability. If the conditions m b1 or ðNmÞ b 1 are not
true, the chosen t (the line x ¼ tÞ should be at very right tail
(m is close to 0) or very left tail (ðNmÞ is close to 0) of
fX ðx; Hi Þ, i ¼ 0,1 (see Eqs. (6)–(9)). Using this t, the TS will
convert almost all the samples x[n] (for both H0 and H1) to
0’s or 1’s, i.e., the information in x[n] to distinguish H1 from
H0 is lost, and hence the detectability will be poor. Therefore, we should chose a t that makes the output of the TS
satisfy m b1 and ðNmÞ b1.
If N,m,ðNmÞ b 1, using de Moivre–Laplace theorem,
fZ ðz; H0 Þ and fZ ðz; H1 Þ in Eqs. (11) and (12) can be approximated by the normal distribution
fZ ðz; H0 Þ N ðz; p0 ,p0 ð1p0 Þ=NÞ,
ð16Þ
fZ ðz; H1 Þ N ðz; p1 ,p1 ð1p1 Þ=NÞ,
ð17Þ
2
where N ðz; m, s Þ is normal distribution with mean m and
variance s2 , N is the number of the samples. Using
Eq. (16), Eq. (14) can be expressed as
!
g0 p0
Q pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ a,
ð18Þ
p0 ð1p0 Þ=N
where Q ðÞ is the complementary CDF (cumulative distribution function) of standard normal pdf. The g0 is
solved from Eq. (18), which is shown as follows:
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
g0 ¼ p0 þ Q 1 ðaÞ p0 ð1p0 Þ=N,
ð19Þ
where Q 1 is the inverse function of Q ðÞ. Substituting the
g0 into Eq. (15), we obtain
!
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi!
p0 p1 þQ 1 ðaÞ p0 ð1p0 Þ=N
g0 p1
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
:
PD ¼ Q pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ¼ Q
p1 ð1p1 Þ=N
p1 ð1p1 Þ=N
ð20Þ
Substituting p0 ¼ p1 Dp into Eq. (20) (since Dp ¼ p1 p0 Þ,
we obtain
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi!
Dp þ Q 1 ðaÞ ðp1 DpÞð1p1 þ DpÞ=N
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
PD ¼ Q
:
p1 ð1p1 Þ=N
Because Q ðÞ is a monotonically decreasing function, we
can only consider the part inside Q ðÞ function, which is
denoted as
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Dp þQ 1 ðaÞ ðp1 DpÞð1p1 þ DpÞ=N
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
:
ð21Þ
Y¼
p1 ð1p1 Þ=N
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
Claim (1) @Y=@p1 0. Because A 5 s, and fX ðx; H1 Þ ¼ fX
R1
ðxA; H0 Þ, we have Dp ¼ p1 p0 ¼ tA fX ðx; H0 Þ
R1
dx t fX ðx; H0 Þ dx Af X ðt; H0 Þ 0. Because Dp
0, it can be shown that t s, t=2ss=2t 0,
and thus @Y=@p1 0 (see Eq. (23)). Y is function of Dp and p1 as shown in Eq. (21). Due to
@Y=@p1 0, we can view Y as a function of
Dp only.
Claim (2) @Y=@Dp o 0. To show this, we recast Eq. (22) as
!
@Y
1
Q 1 ðaÞðp0 1=2Þ pffiffiffiffi
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
p
p
¼
N :
@Dp
p1 ð1p1 Þ
p0 ð1p0 Þ
ð24Þ
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
Because p1 ð1p1 Þ is positive, we only need to
decide the sign of the following equation:
!
Q 1 ðaÞðp0 1=2Þ pffiffiffiffi
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi N :
Cðp0 , a,NÞ ¼
ð25Þ
p0 ð1p0 Þ
The value of Cðp0 , a,NÞ is related to the variables a, p0 and N. Generally, it can be either
positive or negative because some factors in
Eq. (25) can be infinity, such as lima-1 Q 1 ðaÞ
¼ 1. However, considering reasonable values
for these variables, we can obtain a useful
result. First, since N,m,ðNmÞÞ b 1, p0 and p1
should not be very close to 0 or 1. Therefore,
we assume 0:2 o p0 o 0:8 that is generally
satisfied when topt is chosen (see the examples
in Section 5). Secondly, for any detector,
limPFA -0 PD ¼ 0 and limPFA -1 PD ¼ 1. Therefore,
we omit these extremes. Under these considerations, for 0:2 o p0 o 0:8, we have 0:75 o
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
ððp0 1=2Þ= p0 ð1p0 ÞÞ o 0:75. We further assume
0:05 o a o0:95, then 1:6449 o Q 1 ðaÞ o 1:6449.
The maximum value of the first factor in
Eq. (25) is about 1.24. If N Z2, the sign of
Cðp0 , a,NÞ is negative. As explained at the
beginning of this proof, the assumption
N,m,ðNmÞ b 1 is typically satisfied. Therefore
Cðp0 , a,NÞ will be negative.
From Claim (1) and Claim (2), we show that approximately Y is a monotonically decreasing function of Dp.
A justification is demonstrated in Fig. 2 using a practical
case. The subplot (a) shows the contour lines of the PD with
respect to p1 and p0. For every ðp1 ,p0 Þ, the PD is calculated
using Eqs. (14)–(17), with PFA ¼ 0:1, N¼100. When p1 ¼ p0 ,
the contour line PD ¼ PFA is the line p0 ¼ p1 . The other
contour lines are like straight lines parallel to the line
p0 ¼ p1 . Because the contour lines parallel to p0 ¼ p1 , the
points in one of these contour lines all have same Dp. Also
because the points are in same contour line, these points
have same PD. It means that same Dp leads to same PD, no
matter which p1 is used, which verify the claim (1). We also
observe that the PD monotonically increases from top isoline
PD ¼ 0:1 to the bottom isoline PD ¼ 0:9 with the increase of
Dp, which verifies the claim (2).
An enlarged plot for the region in the box (p0 ,p1 2
½0:2,0:5Þ of (a) is shown in (b). A possible function p1 ðp0 Þ
is added to (b) shown as the thick curve. If we move the
threshold t of the TS from the right to the left of the pdf’s
fX ðx,Hi Þ, i ¼ 0,1, we will have an increasing p1 and p0.
Sometimes the increasing p1 is faster than the increasing
PD=0.1
0.8
p0=p
1
0.6
p0
pffiffiffiffi
t
s
t
þ Dp t Dp N
Q 1 ðaÞ ð12p1 Þ
@Y
2s 2t
s
¼
,
ð23Þ
@p1
t2
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
where s ¼ ðp1 DpÞð1p1 þ DpÞ and t ¼ ðp1 Þð1p1 Þ.
In the following, we show that Claim (1) @Y=@p1 0,
which means that the changing p1 has little influence to Y
and therefore Y can be approximated as a function of
only Dp; and Claim (2) @Y=@Dp o 0, which means that Y
monotonically decreases with Dp.
Hence PD is a monotonically increasing function of
Dp because PD ¼ Q ðYÞ is a monotonically decreasing
function. &
0.4
PD=1 in this area
0.2
0.2
0.4
p1
0.6
0.8
0.5
D
0.45
p0=p
0.4
1
PD=0.1
p0
The differential of Y with respect to Dp and p1 can then be
obtained as follows:
pffiffiffiffi
@Y
1
ð1 þ 2p1 2DpÞ
N þQ 1 ðaÞ
,
ð22Þ
¼
t
2s
@Dp
173
0.35
0.3
B
C
0.25
0.2
0.2
0.25
0.3
0.35
p1
0.4
0.45
0.5
Fig. 2. A justification of Lemma 1. (a) The contour lines of PD with
respect to p1 and p0. (b) The enlarged region in the box as shown in (a)
with a ðp1 ,p0 Þ curve.
174
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
of p0, i.e., the increasing Dp 40, as shown in the curve from
point B to the point C, and sometimes the increasing Dp o 0
as shown in the curve from the point C to the point D. It is
observed that the PD monotonically increases with Dp.
Lemma 1 shows that PD is a monotonically increasing
function of Dp. Therefore, Dp can be used as an indicator
of PD. We note that the proof of Lemma 1 is based on some
approximations, and hence Lemma 1 is not rigorously
correct for all cases. For example, in Fig. 2(a), the isoline
PD ¼ 0:9 in the region p1 o0:2 is not a parallel line to
p0 ¼ p1 and Lemma 1 may not hold true. However, for the
general cases of signal detections (p0 and p1 are not close
to 0 or 1, and Dp is small), it works very well. Dp calculation
is much easier than PD. Next Dp indicator is employed to
calculate topt and determine optimal structure.
3. Optimal threshold calculation and optimal structure
determination
In this section, we propose techniques to calculate the
optimal threshold and determine the optimal structure
using the Dp indicator.
3.1. Calculation of the optimal threshold based on Dp
Dp indicator provides a much easier and faster way to
find topt , which is shown in the following lemma.
Lemma 2. The topt should be one of the x’s among the
intersections between fX ðx; H0 Þ and fX ðx; H1 Þ.
Proof. According to the Dp indicator, PD increases with
the increase of Dp. Therefore, topt should locate at one of
the local maximum points of DpðtÞ. Because DpðtÞ ¼
R1
p1 p0 ¼ t ðfX ðx; H1 ÞfX ðx; H0 ÞÞ dx, the local maximum
points of Dp should satisfy dDp=dt ¼ fX ðt; H0 ÞfX ðt; H1 Þ
¼ 0 because fX ð1; Hi Þ ¼ 0, i ¼ 0,1. Therefore, topt has to
be one of the x’s among intersections between fX ðx; H0 Þ
and fX ðx; H1 Þ, i.e., the x’s satisfy fX ðx; H1 Þ ¼ fX ðx; H0 Þ. &
Based on Lemma 2, we can derive the topt for some
specific cases, which is illustrated as follows.
Corollary 1. topt is the x such that fX ðx; H0 Þ¼ fX ðx; H1 Þ if
there is only one intersection between fX ðx; H0 Þ and fX ðx; H1 Þ.
In particular, if fX ðx; H0 Þ is unimodal and symmetric, we have
topt ¼ ðx0 þ x1 Þ=2, where x0 and x1 are the mode of fX ðx; H0 Þ
and the mode of fX ðx; H1 Þ, respectively.
From Lemma 1, we know the topt locates at the only
one intersection. The unimodal means that the probability
distribution has a single mode and the mode is a value at
which the pdf/pmf takes its maximum value. If the noise
pdf is unimodal and symmetric (like Gaussian), we can
easily know that the pdf fX ðx,H0 Þ is symmetric with
respect to x ¼ x0 , where x0 is the mode of fX ðx,H0 Þ. Because
fX ðx; H1 Þ ¼ fX ðxA; H0 Þ (right shifted), the only intersection
has to be x ¼ x0 þ A=2 ¼ ðx0 þ x1 Þ=2, where x1 is the mode
of fX ðx; H1 Þ.
Corollary 2. For multi-intersection case, for example, K
intersections, topt ¼ argmaxt ¼ xi Dp and xi , i ¼ 1, . . . ,K is
the x of the K intersections, i.e., fX ðxi ; H0 Þ ¼ fX ðxi ; H1 Þ.
From Lemma 1, PD increases with the increase of Dp.
Therefore, the t that yields to maximum Dp is topt .
It is worth noting on the computational complexity.
Using Dp, we can show that the computational complexity is significantly reduced compared to using PD directly.
For a given range and resolution of t, if using Dp, two
integrals, one integral on fX ðx; H0 Þ and one integral on
fX ðx; H1 Þ, are needed to calculate all the pi , i ¼ 0,1 for every
t to find the intersections located at xi , i ¼ 1, . . . ,K. We
still compare the Dpi , i ¼ 1, . . . ,K to find the maximum
one. The total computing cost is about two integrals. If
using PD, beside the two integrals for p0 and p1, it needs
two integrals for every t to calculate the PD for a given PFA .
Therefore, the computational complexity is 2 þ 2nðrange
=resolutionÞ integrals. Obviously, the computational complexity of calculating topt is significantly reduced if Dp
indicator is used. A running time to find the topt will be
presented in Section 5.1.
3.2. Structure of threshold detector
We calculated the topt in Section 3.1. However, in cases
where multiple local maximum (MLM) points exist in
DpðtÞ, optimal PD cannot be achieved using only one TS
expressed by Eq. (2). We propose a TS structure for better
performance.
Lemma 3. The optimal TS structure for MLM cases is the
combination of the multiple TS’s that are designed for the
partitions of data space divided by the local minimum Dp’s.
Proof. DpðtÞ has the significant values: ð0,B1max ,B1min , . . . ,
BKmax ,0Þ corresponding to the t’s ð1, t1max , t1min , . . . , tKmax ,
þ 1Þ, where Bimax represents the ith, i 2 ½1,K local maximum Dp and Bjmin represents the jth, j 2 ½1,K1 local
minimum Dp. Partition the data space X 2 R into K subsets
by the tjmin , denoted as Xk ,k 2 ½1,K. The K independent
TS’s are designed on every subset. In the Xk, the pdf’s of x
are given as
fX ðx; H0 ,Xk Þ ¼
fX ðx; H0 Þ
,
Pr ðx; H0 ,Xk Þ
fX ðx; H1 ,Xk Þ ¼
fX ðx; H1 Þ
,
Pr ðx; H1 ,Xk Þ
where Pr ðx; H0 ,Xk Þ and Pr ðx; H1 ,Xk Þ are the probability
of x locating in Xk under the conditions of H0 or H1,
respectively. The local maximum Dpk in Xk can be calculated using
Z
Dpk ¼
ðPr ðx; H1 ,Xk Þ fX ðx; H1 ,Xk Þ
x:x2Xk , tk o x
Pr ðx; H0 ,Xk Þ fX ðx; H0 ,Xk ÞÞ dx,
Z
ðfX ðx; H1 ÞfX ðx; H0 ÞÞ dx,
¼
ð26Þ
x:x2Xk , tk o x
where tk is the optimal t for the TS designed in Xk. From
Eq. (26), we know that tk is identical to the t corresponding to the local maximum point of Dp in Xk before x is
divided into K subsets. Therefore, the partition of X can be
performed based on the DpðtÞ and only one maximum
P
Dpk in Xk. Therefore, Dps ¼ Kk ¼ 1 Dpk is the maximum Dp
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
we can obtain for X 2 R. Also because the change in Dp
dominates the change of PD, the proposed TS structure
will gain the optimal performance because it obtains the
maximum Dp. &
Lemma 3 shows how to determine the optimal structure.
For a simple case that DpðtÞ has the significant values
ð0,B1max ,B1min ,B2max ,0Þ corresponding to the t’s: ð1, t1max ,
t1min , t2max , þ1Þ, the optimal structure is the combination of
TS1 and TS2, where TS1 is a TS (expressed by Eq. (2)) with
t1 ¼ t1max for X1 : fx o t1min g and TS2 is with t2 ¼ t2max for
X2 : fx 4 t1min g. An example for the structure determination
will be presented in Section 5.2.
We proposed the optimal threshold and the optimal
structure based on Lemma 1. We point out further comments for the condition N,m,ðNmÞ b 1. Though these
conditions are common in signal detection and with them
we can approximate the binomial distribution to a normal
distribution (in showing the proof of Lemma 1), we note
that it is not necessary for the implementation of the TD.
Consider that we have only one sample. Statistically, the
optimal design of the TD is still valid. This is because, if
the detection from the one sample is done many times, the
detection problem can be considered as a detection from
many samples and the TDs used for many samples and any
one sample are identical. Other practical issues about the
implementation are addressed below.
4. For non-DC signals and noises with unknown pdf’s
In Section 2, we proposed an indicator of the detectability of the TD and in Section 3, we derived the optimal
threshold and optimal structure. However, these results
are obtained assuming that the signal is DC and the noise
pdf is known. In practice, nevertheless, we often need to
deal with non-DC signals and the noises with unknown
pdf’s. In this section, we will address these practical issues
using the theory we have derived to implement the TD.
4.1. Non-DC signal
Now we want to determine whether a known signal
s[n] exists from an observation x[n], where s[n] is a known
non-DC signal. Though the non-DC signal could be any
deterministic signals, from now on we will use sinusoidal
signals in our justification and experiments because the
175
sinusoidal signal is widely used in signal detection [1,10].
This strategy can be easily extended from sinusoidal
signals to any other deterministic signals.
Consider one specific sample in the observation x[n],
n ¼ 0,1, . . . ,N1, denoted as x½t, t 2 ½0,N1. We have
x½t ¼ s½t þw½t. If we want to make a detection decision
only based on this one sample x[t], the detection problems
is a known DC signal s[t] embedded in noise w[t]. The
optimal design of the TD is valid for this sample, and we
obtain the maximum Dp. Therefore, for a specific sample
x[t], we use a TS (shown in Eq. (2) or Eq. (3)) denoted as
TS[t] and its output is denoted as zt. The overall detector
can be the sum of the zt ,t 2 ½0,N1 with weight js½tj, i.e.,
z¼
N
1
X
zt js½tj:
ð27Þ
t¼0
Note that we alternatively choose the TS using Eq. (2) or
Eq. (3) depending on s[n] is positive or negative and we
always have p1 4p0 . Therefore, zt always contributes
positively to H1 and hence the weight should be positive.
That is why js½tj is employed as the weight. Based on the
above discussion, the schematic of the TD for detecting a
known signal s[n] is shown in Fig. 3. The structure of the
TD is composed of a multiway switch, a TS array and a
replica-correlator.
The strategy can be used to detect any known deterministic signals. Detailed analysis (for example, performance analysis) will be presented in a future paper. The
detection performance of the TD in detecting a sinusoidal
signal in non-Gaussian noise is presented in Section 5.3.
4.2. Noise with unknown pdf
So far, the results we derived are based on known noise
pdf’s. The accurate pdf of the noise is difficult to obtain in
many applications and that is why we use TD as a suboptimal choice. In this section, we will demonstrate that
optimal design of the proposed TD for unknown noises is
much easier than the detectors that need full knowledge of
the noise, such as LO detector [1] and another TD [6]. We
classify the noises into two categories for the demonstration.
(1) The noises with unimodal and symmetric pdf, which
is called Type I noise in this paper. This type of noise
covers a wide range of applications and the assumption
does not appear to be overly restrictive for practical
Fig. 3. The schematic of the TD for known sinusoidal signals.
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
To conclude, for the Type I noises, the optimal design
of the proposed TD does not depend on the noise pdf’s.
For Type II noises, we only need the knowledge that
decides the intersections of the pdf’s fX ðx,Hi Þ, i ¼ 0,1,
which is much less than the full knowledge of the noise
pdf, easier to estimate and less sensitive to the inaccurate
estimation. We note that the proposed TD is not suitable
for all the noise. Only for the noises with heavy pdf tails,
the TD can have a performance very close to the optimal
one. The validation (if TD is suitable or not), detail
methodology and some other issues for the different
types of the noises will be addressed in a future paper.
5. Experimental results
Two examples are presented in this section to illustrate the optimal threshold, the structure of the TD for DC
signals and another one is used to demonstrate the
detection performance for a sinusoidal signal.
2
where c ¼ ½a þ ð1aÞb 1=2 , 0 o a o1, b 4 0. A DC signal
A ¼0.1 embedded in GM noise with a ¼ 0:9, b ¼ 5 and
s ¼ 1. According to the symmetry of the GM, there is only
one intersection between fX ðx; H0 Þ and fX ðx; H1 Þ, located at
0.05, and therefore topt ¼ 0:05. We show that the theoretical and simulation PD with respect to t for different PFA
in Fig. 4(a). Obviously, the optimal t ¼ 0:05 is justified by
the theoretical results.
Some comments about the simulation results are worth
mentioning here. The PD’s obtained from simulations are not
very close to the theoretical ones. That is because the pmf’s
fZ ðz; Hi Þ, i ¼ 0,1 obtained from simulations are discrete. It
can be verified that in this case the numbers of the non-zero
values for the two pmf’s fZ ðz; Hi Þ, i ¼ 0,1 are about 11 (see
the number of the staircases in the simulation ROC in (b)).
The pmf’s lead to the inaccurate PFA’s and PD’s. The error is
not slight because the size of staircase is large (only 11 steps
from 0 to 1). However, the simulation PD’s go up and down
1
Theoretical results
Probability of detection (PD)
applications [11]. For example, the generalized Gaussian
(GG) and unimodal Gaussian mixture (GM) [1] belong
to this large group. Note that GG (including Gaussian,
Laplacian, uniform, etc.) and unimodal GM are widely
used noise types in a variety of applications [1,6,12,13].
For the Type I noises, the optimal design of the TD is
simple. Because we have only one intersection between
fX ðx; H0 Þ and fX ðx; H1 Þ, the structure design is not necessary. Based on Corollary 1, the topt should be x0 þ A=2 for
a DC signal s½n ¼ A or x0 þ s½n=2 for a non-DC signal s[n],
where x0 is the mode of fX ðx; H0 Þ. Therefore, assuming
the noise pdf is symmetric and unimodal, the optimal
design of the TD does not depend on the noise pdf.
(2) Other noises. For the noises that do not belong to the
Type I noise, referred to as Type II noise, the solution
cannot be as straightforward as the one for Type I
noises. For example, in Section 5.2, we show a bimodal
Gaussian mixture (GM) noise (the pdf with two different modes). For these noises, we need to calculate the
topt and to determine the optimal structure. According
to the Lemmas 2 and 3, to design the TD, we only need
to know the intersections between fX ðx; H0 Þ and
fX ðx; H1 Þ. In other words, only the knowledge that is
useful to determine the intersections is necessary. The
required knowledge is much less compared to the full
knowledge of the noise pdf and is easier to estimate in
applications. For instance, for the GM (bimodal) noise
shown in Fig. 5, we only need to estimate the locations
(x) of the two modes of fX ðx; H0 Þ. This can be performed
based on the estimation theory. While for the LO
detector, the full knowledge of the pdf is needed.
Simulation results
PFA=0.3
0.8
0.6
0.4
PFA=0.1
PFA=0.2
0.2
−1
−0.5
0
τ
0.5
1
1
Probability of detection (PD)
176
0.8
0.6
0.4
LO detector
Threshold detector (theory)
Threshold detector (simulation)
Matched filter
0.2
0
0
0.2
0.4
0.6
0.8
1
Probability of false alarm (PFA)
5.1. Optimal threshold determination
We consider the GM (unimodal) noise [1,6], which is
defined as
"
!#
2 2
c
c x
1a
c 2 x2
fX ðx; H0 Þ ¼ pffiffiffiffiffiffi a exp exp 2
þ
,
b
2s2
s 2p
2b s2
ð28Þ
Fig. 4. Performance of the TD of a DC signal in the Gaussian mixture
noise. The DC signal is A ¼ 0.1. The GM is with a ¼ 0:9, b ¼ 5 and s2 ¼ 1.
The number of the samples in one observation is N ¼100. Theoretical
results are calculated based on Eqs. (11) and (12). For a given t and PFA,
simulation PD is calculated from fz ðz; H0 Þ and fz ðz; H1 Þ, which are the
histograms generated from 1000 simulations for (a) and 10 000 simulations for (b). (a) The PD with respect to t and PFA. (b) The ROCs from LO
detector, matched filter and TD.
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
(1) Using PD. Step I: we calculate all the ðp0 ,p1 Þ’s for all the
different t’s and for every pair ðp0 ,p1 Þ we have fZ
ðz; Hi Þ, i ¼ 0,1 as shown in Eqs. (11) and (12). Step II,
for every fZ ðz; Hi Þ, i ¼ 0,1, two additional integrals are
needed to find g0 and PD shown as Eqs. (14) and (15).
In step II, we will totally calculate 2 1001 ¼ 2002
numerical integrals to obtain 1001 PD’s and then choose
the maximum one.
(2) Using Dp. We only need to calculate the intersections
between fX ðx; H0 Þ and fX ðx; H1 Þ. Based on the same
resolution x ¼ 5 : 0:01 : 5, we compare fX ðx; H0 Þ and
fX ðx; H1 Þ to find the intersection(s), i.e., if ðfX ðxk ; H0 Þ
4 fX ðxk ; H1 ÞÞ and (fX ðxk þ 1 ; H0 Þ o fX ðxk þ 1 ; H1 ÞÞ or vice
versa, we can consider that the intersection is either
xk or xk þ 1 , where xk and xk þ 1 are the kth and (kþ1)th
components in x ¼ 5 : 0:01 : 5. We calculate fX ðx; H0 Þ
and fX ðx; H1 Þ for all x and therefore the computational
complexity is only as same as the step I when using
PD. The 2002 numerical integrals in step II are saved
here.
The GM (bimodal) is defined as
fx ðx; H0 Þ ¼ 12 N ðx; m, s2 Þ þ 12N ðx; m, s2 Þ:
fx ðx; H1 Þ ¼ 12 N ðx; m þ A, s2 Þ þ 12N ðx; m þ A, s2 Þ,
ð30Þ
where A is the DC signal. Fig. 5(a) shows fX ðx; H0 Þ and
fX ðx; H1 Þ, and the intersections locate at t1 , t2 and t3 ,
corresponding to the extrema of Dp.
The Dp ¼ p1 p0 is illustrated in Fig. 5(b). Two maximum
points B1max and B2max , locate at t1 and t2 . One minimum
point B1min locates at t3 . x is divided into two complementary
subset X1 and X2, where X1 : x r t3 and X2 : x 4 t3 . The MTS
is constructed using two TS’s (TS1 and TS2) represented by
PN1 1
Eq. (2), i.e., the test statistics are TS1 : z1 ¼ ð1=N1 Þ
n¼0
P
ðx½n 4 t1 Þ for x½n r t3 , n 2 ½0,N1 , TS2 : z2 ¼ ð1=N2 Þ nN2¼10
ðx½n 4 t2 Þ for x½n 4 t3 , n 2 ½0,N2 and the MTS is z ¼
z1 þ z2 . We can have Dp1 ¼ B1max B1min , Dp2 ¼ B2max and Dps
obtained from the TD with the MTS is as Dps ¼ Dp1 þ Dp2 .
Dps is the maximum Dp that can be obtained for this
specific mean-shift Gaussian mixture detection and therefore z is the optimal TD.
5.3. Sinusoidal signal detection
Fig. 6 illustrates the detection performance when the
TD shown in Fig. 3 is used to detect a sinusoidal signal in
GM (unimodal) noise. The TD is composed of TD½t, t 2
½0,N1 and TD[t] adapts topt ¼ s½t=2. We now show the
0.2
fx(x;H1)
fx(x;H0)
0.1
0
−5
τ
0 3
x
τ1
τ2
5
τ2
5
0.1
2
1
Δ p=p1−p0
Bmax
Comparing these two cases, it is obvious that the
computational complexity to obtain topt using Dp is much
less compared to the classical method using PD. For this
case, the average running time (Matlab) is about 0.189 s
using PD and 0.034 s using Dp in a computer with AMD
8450 triple-core processor, 2.1 GHz and 4 GB memory.
ð29Þ
A mean-shift detection problem leads to
fx(x)
around the theoretical ones and we consider that the simulation results are consistent with the theoretical results. To
alleviate the inaccuracy, we can use the linear interpolation
between the staircases of the pmf’s. Nevertheless, because
of the uncertainty of the estimated pmf’s, the interpolation
is not easy. A better way is to use the continuous pdf instead
of pmf for accurate demonstration, which is what we used
in this paper (see Eqs. (11) and (12)) and topt is accurately
verified from the theoretical ROCs.
Given optimal t ¼ 0:05, we can have the theoretical
ROC from Eq. (20) shown in Fig. 4(b) with the ROC from
Monte Carlo simulations. The upper bound of this detection is the performance of the LO detector. It is shown
that the TD with optimal threshold only has an average
uniform loss of 0.34 dB (4%). Compared to the matched filter, the TD has an average uniform improvement
of 1.19 dB (15%).
We now compare the computational complexity for
calculating topt from PD, and from Dp. Because the GM
(unimodal) is Type I noise (zero mean), we can obtain the
topt ¼ A=2 ¼ 0:05 without any calculation. To compare the
calculation time fairly, it is assumed that we need to
calculate the intersections given Eq. (28) and A¼0.1.
Assume that we tune t and x from 5 to 5 with a step
size 0.01 (t ¼ 5 : 0:01 : 5Þ, i.e., totally the number of t’s
or x’s is 1001.
177
Bmax
0.05
X2
X1
1
5.2. Optimal structure determination
In this section, we use a mean-shift bimodal Gaussian
mixture (GM) detection to show the optimal MTS structure.
0
−5
τ1
0τ3
τ
Bmin
Fig. 5. (a) fX ðx; H0 Þ and fX ðx; H1 Þ and (b) Dp ¼ p1 p0 in the mean-shift
Gaussian mixture detection with A ¼ 0.5, m ¼ 3 and s ¼ 1.
178
G. Guo, M. Mandal / Signal Processing 92 (2012) 170–178
assuming a DC signal in a noise with known pdf. We
present techniques to use the TD for any deterministic
signal in a noise with unknown pdf. Hence, the proposed
TD is good to be used in practice applications. Experimental results show the validation of the optimal threshold and the optimal structure. For non-Gaussian noise
with heavy pdf tails, the performance of the TD can
outperform the matched filter, and be very close to the
performance of the LO detector.
Probability of detection (PD)
1
0.8
0.6
0.4
0.2
0
Acknowledgment
LO detector
Proposed TD
Matched filter
0
0.2
0.4
0.6
0.8
1
Probability of false alarm (PFA)
Fig. 6. The comparison of the ROCs obtained from the Monte Carlo
simulations in detecting sinusoidal signal between the proposed TD and
other detectors, including the LO detector and the matched filter. The
GM noise is with a ¼ 0:9, b ¼ 5 and s2 ¼ 1. The sinusoidal signal
s½n ¼ 0:1 sinð0:02pnÞ,n 2 ½0,N1 and N ¼100 in one observation. ROCs
are obtained from 10 000 simulations.
performance comparison between the proposed TD and
the LO detector, the matched filter. It is shown that the
proposed TD only has an average uniform loss of 0.389 dB
(4.6%) compared to the LO detector, with the benefit of
not being sensitive to the parameters of the noise pdf.
Compared to the matched filter, the TD obtains a significant improvement, an average uniform improvement
of 1.283 dB (16%).
6. Conclusions
In this paper, we considered the optimality design of a
TD. From the pdf’s of the observation x[n] under two
hypothesis, fX ðx; H0 Þ and fX ðx; H1 Þ, the corresponding pdf’s
of the output of the test statistic is binomial distribution
such that the optimal TD can be constructed in the sense
of Neyman–Pearson criterion. We showed that the PD can
be substituted by an indicator, Dp ¼ p1 p0 (Lemma 1) to
find optimal performance. Using Dp, the optimal threshold topt can be easily calculated. At the same time, the TS
structure is addressed. With the help of Dp, a MTS is
proposed as the optimal structure. Using the optimal
structure with optimal threshold(s), we can implement
an optimal TD. The optimal design of the TD is conducted
The authors would like to thank the reviewers for their
comments that help improve the manuscript. Also the
authors thank Dr. Yindi Jing and Rongfei Fan for some
valuable discussions.
References
[1] S. Kay, Fundamentals of Statistical Signal Processing, Volume II:
Detection Theory, Prentice Hall PTR, 2008.
[2] J. Thomas, Nonparametric detection, Proceedings of the IEEE 58 (5)
(1970) 623–631.
[3] F. Chapeau-Blondeau, Nonlinear test statistic to improve signal
detection in non-Gaussian noise, Signal Processing Letters, IEEE 7
(7) (2000) 205–207, doi:10.1109/97.847369.
[4] M. Guerriero, S. Marano, V. Matta, P. Willett, Stochastic resonance
in sequential detectors, IEEE Transactions on Signal Processing 57
(1) (2009) 2–15, doi:10.1109/TSP.2008.2005087.
[5] H. Papadopoulos, G. Wornell, A. Oppenheim, Low-complexity
digital encoding strategies for wireless sensor networks, in: Proceedings of ICASSP, 1998.
[6] A. Saha, G. Anand, Design of detectors based on stochastic resonance, Signal Processing 83 (6) (2003) 1193–1212.
[7] S. Kay, Can detectability be improved by adding noise? IEEE Signal
Processing Letters 7 (1) (2000) 8–10, doi:101109/97.809511.
[8] S. Kay, J. Michels, H. Chen, P. Varshney, Reducing probability of
decision error using stochastic resonance, IEEE Signal Processing
Letters 13 (11) (2006) 695–698, doi:10.1109/LSP.2006.879455.
[9] H. Chen, P. Varshney, S. Kay, J. Michels, Theory of the stochastic
resonance effect in signal detection: part I—fixed detectors, IEEE
Transactions on Signal Processing 55 (7) (2007) 3171–3184.
[10] B.C. Levy, Principles of Signal Detection and Parameter Estimation,
first ed., Springer, 2008.
[11] S. Kay, D. Sengupta, Advances in Spectrum Analysis and Array
Processing, Recent Advances in Non-Gaussian Autoregressive Processes, Prentice-Hall Signal Processing Series, vol. 1, 1991.
[12] W. Machell, C.S. Penrod, G.E. Ellis, Statistical characteristics of
ocean acoustic noise processes, in: E.J. Wegman, S.C. Schwartz,
J.B. Thomas (Eds.), Topics in Non-Gaussian Signal Processing,
Springer, New York, 1989, pp. 29–57 (Chapter 3).
[13] D. Powell, G. Wilson, Class A modeling of ocean acoustic noise
processes, in: E.J. Wegman, S.C. Schwartz, J.B. Thomas (Eds.), Topics
in Non-Gaussian Signal Processing, Springer, New York, 1989,
pp. 17–28 (Chapter 2).