Blind Source Separation from Single Channel Audio Recording Using ICA Algorithms

Juan S. Calderón Piedras
Universidad Distrital F. J. C., Bogotá, Colombia
jscalderonp@correo.udistrital.edu.co

Álvaro D. Orjuela-Cañón
IEEE CIS – Colombia Chair, Bogotá, Colombia
dorjuela@ieee.org

David A. Sanabria Quiroga
Universidad Distrital F. J. C., Bogotá, Colombia
dasanabriaq@correo.udistrital.edu.co

Abstract: The FastICA method has been proposed for the blind identification and separation of signal components. This paper studies that method in order to measure its performance when separating real audio signals that share the same channel simultaneously. We propose an SCICA algorithm based on FastICA, which finds the mixing matrix and its inverse. From these matrices, representative bases are obtained which, after a clustering process, are used as filter impulse responses to discriminate the source signals. The parameters used in the source identification process are studied to improve the results.

Keywords: Independent component analysis, single channel ICA, blind source separation, power spectrum, filters, audio signals.

I. INTRODUCTION
Applications in audio engineering currently place heavy demands on digital signal processing. Speech segregation and recognition, automatic music transcription, music information systems, forensic audio and signal separation all need powerful techniques. Independent Component Analysis (ICA) is useful in such audio engineering tasks [1], [2], and Blind Source Separation (BSS) from mixtures has been treated extensively for some time [3]. For the applications cited above, the need for separation usually arises in single-channel mixes, and this problem cannot be solved with traditional ICA techniques. In this case it is necessary to develop new and complementary algorithms to separate the signals, known as Single Channel ICA (SCICA).

Alongside SCICA and the basic ICA model, a collection of perceptually motivated techniques, jointly called Computational Auditory Scene Analysis (CASA), is widely used [4]. The solutions in SCICA are generally limited because they require strong additional assumptions, and separation is often achieved only through fairly intensive computational procedures [5–7]. To overcome these limitations, information extracted from differences between the time-frequency (t-f) distributions of the sources is frequently used. Studies of this kind can be found in [8–10].

Although the SCICA method can solve the problem, it is necessary to know which SCICA parameters give the best results. The SCICA approach, as noted in [11], is a special case of multidimensional independent component analysis (MICA) [12]. In SCICA the input vector must be delayed N times, which yields multiple independent components (ICs) associated with a single independent source, so the ICs must be grouped in order to reconstruct the corresponding initial signals. As a restriction, the sources must have disjoint spectra, so that they can be discriminated by filters whose coefficients are provided by ICA.

The present work describes a methodology to separate two signals mixed in one channel. SCICA is used to find and separate the sources. The next section presents the characteristics of the signals used and details of the methodology implemented. A discussion and the conclusions extracted from the results are presented in Sections IV and V.

II. MATERIALS AND METHODS
This section describes the database and the methodology used to obtain the sources. We implemented a methodology, based on ICA and SCICA theory, to obtain the independent components from which the source signals are recovered.
A – Database
Two sets of signals were used to test the algorithm. The first set consisted of synthetic signals with uniform distribution, zero mean and unit variance. These signals have spectral and statistical characteristics that are well adapted to the method, that is: there is at most one Gaussian process; all the independent random processes are band-limited with disjoint spectral support; and no non-Gaussian process can be further decomposed into multiple independent or spectrally disjoint processes (maximal decomposition). The second set consists of a guitar and a male voice, which were acquired individually. A Shure SM-57 microphone was used with a sample rate of 16 kHz. Environmental noise was avoided while recording these signals.

B – Independent Component Analysis (ICA) and Single Channel Independent Component Analysis
Single-channel ICA builds on the ICA method, whose analysis starts from a search for independent components. The ICA model is constructed as follows:

$\mathbf{x} = \mathbf{A}\mathbf{s}$ (1)

where x corresponds to the mixed signals and s to the independent random vectors, or source signals. The main concept of ICA is finding A, the mixing matrix with which the initial random vectors have been mixed. From (2) it is possible to construct an immediate solution to the blind source separation problem:

$\mathbf{s} = \mathbf{W}\mathbf{x}$ (2)

where W is the inverse of A, that is, $\mathbf{A}^{-1} = \mathbf{W}$. ICA thus focuses on calculating the mixing matrix A by maximizing the non-Gaussianity of the estimated signals. The SCICA model can be seen as an extension of ICA, since it uses the matrices A and W to build the filters that separate the source signals. ICA requires the number of sources to equal the number of available mixture signals, whereas SCICA needs only one mixture of the two signals to be separated.
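As a small numerical illustration of (1) and (2), the following sketch mixes two short made-up sources with a known 2×2 matrix and recovers them with its inverse. The matrix values, the source samples and the helper names `matmul` and `inverse_2x2` are assumptions chosen for illustration only. In a blind setting A is unknown, and an algorithm such as FastICA has to estimate W from x alone, typically only up to the scaling and ordering of the components.

```python
# Toy illustration of the ICA model x = A s (1) and its inverse s = W x (2).
# All numeric values are invented; a real ICA run would have to estimate W.

def matmul(M, V):
    """Multiply matrix M (r x k) by matrix V (k x c), both lists of rows."""
    return [[sum(M[r][k] * V[k][c] for k in range(len(V)))
             for c in range(len(V[0]))]
            for r in range(len(M))]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

s = [[1.0, -1.0, 0.5, 0.0],   # source 1 (hypothetical samples)
     [0.2, 0.4, -0.6, 0.8]]   # source 2 (hypothetical samples)
A = [[2.0, 1.0],
     [1.0, 1.0]]              # assumed mixing matrix

x = matmul(A, s)              # observed mixtures, eq. (1)
W = inverse_2x2(A)            # unmixing matrix, W = A^-1
s_hat = matmul(W, x)          # recovered sources, eq. (2)
```

Because the exact inverse is used here, `s_hat` reproduces `s` up to floating-point rounding.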
The final SCICA method is a blind method that can be used with artificial or real audio signals recorded by any microphone. The input matrix for the method is constructed as shown in (3):

$\mathbf{X} = \begin{bmatrix} \mathbf{x}_{n} \\ \mathbf{x}_{n-1} \\ \vdots \\ \mathbf{x}_{n-N+1} \end{bmatrix}$ (3)

FastICA is one of the fastest and most robust ICA algorithms, in which principal component analysis and whitening are implemented as preprocessing [13]. Because of (3), it was necessary to find the right number of delays for the SCICA technique. This number was found experimentally, using different types of signals and studying the correlation between the source signals and the signals recovered for different numbers of delays.

SCICA needs an input matrix as in (3); any ICA algorithm can then be used to decompose it into independent sub-components. The difficulty is that ICA returns as many solutions as there are delays (N), because it operates on the mixed signal: the source signals, which are the independent components, are split into sub-components. The characteristics of each independent sub-component can be observed from the mixing matrix by computing the FFT of the columns of A, which gives distinct spectral shapes with which to discriminate between signals.

The K-means algorithm clusters waveforms that are similar to each other, so the components are grouped into subgroups $\gamma_p$. The number of clusters was determined using the Davies-Bouldin index, which assesses the quality of the clusters through the ratio of within-cluster to between-cluster distances [15]. The mixing matrix and its inverse (A and W respectively) are then reorganized according to the groups $\gamma_p$, where $a_i$ are the columns of A and $w_i$ are the rows of W.
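The delayed input matrix in (3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `delay_matrix` is hypothetical, and the edge handling (keeping only the time instants for which all N delayed samples exist, so the matrix has n − N + 1 columns) is an assumption, since the text does not say how the first samples are treated.

```python
# Build the delay (embedding) matrix of eq. (3) from a single-channel signal.
# Row k holds the signal delayed by k samples.

def delay_matrix(x, n_delays):
    """Return a list of n_delays rows; row k contains x[t - k] for every
    time t at which all n_delays delayed samples are available."""
    if not 1 <= n_delays <= len(x):
        raise ValueError("n_delays must be between 1 and len(x)")
    n_cols = len(x) - n_delays + 1
    # Column j corresponds to time index t = j + n_delays - 1.
    return [[x[j + n_delays - 1 - k] for j in range(n_cols)]
            for k in range(n_delays)]

mix = [0, 1, 2, 3, 4, 5]      # stand-in for the recorded mixture
X = delay_matrix(mix, 3)      # 3 delays -> 3 rows, 4 columns
for row in X:
    print(row)
```

Each row of `X` is then treated as one "channel" by the ICA algorithm, which is why ICA returns as many sub-components as there are delays.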
Finally, the separation-reconstruction filter for the initial sources is defined as:

$f_i = \frac{1}{N}\, a_i(-t) * w_i(t), \quad i \in \gamma_p$ (4)

where $*$ denotes convolution and $a_i(-t)$ is the time-reversed column $a_i$. The filter impulse response coefficients were obtained using (4), with $f_{\gamma_p} = [f_1, f_2, f_3, f_4]'$, which represents the filter responses for the 4 clusters created by the K-means algorithm. A diagram of the method is shown in Fig. 1, where n is the length of the signal in X, the input matrix to SCICA. Once the matrix X with N delays is available, dimensionality reduction and whitening are performed. These steps are not strictly necessary, but they form a very effective preprocessing block that shortens the search for the independent components, so their use is highly recommended when implementing SCICA. The reduced and whitened input matrix is then passed to ICA, which searches for the separate components through the matrices A and W. ICA measures the non-Gaussianity of the signals using kurtosis as the contrast function.

C – Implemented methodology
Fig. 1 describes the methodology implemented. First the signal is delayed and a dimensionality reduction is applied. Then an ICA algorithm is used to find N independent components, which are clustered by the K-means algorithm [14]. Finally, filters are built using the clusters found as impulse responses.

Fig. 1. Flow diagram of SCICA method.

Finally, three measures were used to validate the results: the correlation index, the signal-to-noise ratio and the mean square error. These evaluate the similarity between the source signals and those found by the SCICA method.

III. RESULTS
Fig. 2 shows two artificial signals; the input vector for experiment 1 is the mixture obtained by adding these two signals.

Fig. 2. SCICA Source Signals.

Fig. 3. Correlation index for S1/IC1 and S2/IC2.

One of the most important parameters of the SCICA method is the number of delays applied to the single mixed signal.
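Since the number of delays N also fixes the length of the filters in (4), a short sketch of that construction may help. It assumes that the filter of a cluster is the sum of the per-component filters of its members, as in the single-channel ICA formulation of [11]; the function names `convolve` and `cluster_filter`, the 2×2 matrices and the cluster assignments are invented for illustration (the experiments in this paper use N between 22 and 32 and clusters found by K-means).

```python
# Sketch of eq. (4): f = (1/N) * sum over i in a cluster of a_i(-t) * w_i(t),
# where a_i is column i of A, w_i is row i of W, and * is convolution.

def convolve(u, v):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[i + j] += ui * vj
    return out

def cluster_filter(A, W, cluster):
    """Impulse response (2N - 1 taps) for the components listed in cluster.
    Time reversal of a_i is realised causally, which adds N - 1 samples
    of delay to the filtered output."""
    n = len(A)
    taps = [0.0] * (2 * n - 1)
    for i in cluster:
        a_reversed = [A[k][i] for k in range(n)][::-1]  # a_i(-t)
        for m, value in enumerate(convolve(a_reversed, W[i])):
            taps[m] += value / n
    return taps

A = [[2.0, 1.0],
     [1.0, 1.0]]              # assumed mixing matrix (N = 2)
W = [[1.0, -1.0],
     [-1.0, 2.0]]             # its inverse
f_first = cluster_filter(A, W, [0])    # hypothetical cluster {0}
f_second = cluster_filter(A, W, [1])   # hypothetical cluster {1}
total = [p + q for p, q in zip(f_first, f_second)]
print(f_first, f_second, total)
```

With W equal to the inverse of A, the filters of all clusters add up to a unit impulse delayed by N − 1 samples, so the filtered outputs sum back to the (delayed) mixture. Each filter has 2N − 1 taps, which is one way to see why the number of delays controls the filter order.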
To define the number of delays, we swept this parameter and measured the correlation index between each recovered signal and its initial source; the results are shown in Fig. 3. The highest correlation values for the two signals were found in the range of 22 to 32 shifts.

Consider now that X has been delayed 25 times, and that s is the output of the ICA system and represents the independent components. The matrix W, the inverse of the mixing matrix A, has size [25×25], and x has size [25×n], with n equal to the number of samples of the source signals. The output of the system therefore has dimension s [25×n], corresponding to 25 possible solutions, or 25 independent sub-components, each with a length equal to that of the mixture vector.

A clustering method is needed to group those of the 25 signals that have a similar waveform; these characteristics can be observed in the time domain or in the frequency domain. For this we use the k-means algorithm with four clusters, supported by the Davies-Bouldin index shown in Fig. 5, whose lowest value in the range from 2 to 5 clusters occurs at $\gamma_p = 4$. The interval must be narrowed to a small number of clusters to avoid an excessive number of solutions that may contaminate the reconstruction of the ICs (independent components).

The key concept of SCICA is the decomposition into independent sub-components; it is therefore necessary to consider a source signal as the sum of several of the sub-components found.

Fig. 4(a) shows the impulse responses of the filters constructed by (4). Fig. 4(b) is the result of applying the filters to the mixture; as is evident, the source signals have been recovered.

Fig. 4. SCICA. (a) Impulse response of the 4 filters; (b) waveform of the mixture after being filtered.

In the second experiment we used the signals acquired from guitar and voice. These signals can in general be considered independent random vectors, which is an essential requirement for good performance of the method. As a second restriction, the source signals must have disjoint spectra. We ran SCICA to find the separation-reconstruction filters. The first signal corresponds to the IC1 component shown in Fig. 6(b)(2); to construct the second source signal it is necessary to add the components IC1, IC3 and IC4 shown in Fig. 6(b)(1, 3, 4).

Fig. 5. Davies-Bouldin index for Experiment 2.

Table 1 shows three measurement parameters used to observe the performance of the method. The correlation index is high, above 96% similarity between the resulting signal and the source signal. It is important to highlight that the SCICA method developed here filters the mixed signal, so if the frequency spectra of the source signals overlap, it will be impossible to separate them.

TABLE 1. SCICA results.

IV. DISCUSSION OF RESULTS
The filters constructed by the SCICA method have non-unity gain and non-zero phase; it is therefore necessary to normalize the signals before obtaining the measurements shown in Table 1. Normalization was done by the maximum value found in the signal. It was also necessary to delay the output signals in order to find the correct phase of each.

The number of delays applied to the initial mixture determines the order of the filter built by the method to discriminate the source signals; the range for the model proposed here is between 22 and 32 shifts. Another important performance parameter is the number of clusters; it was fixed at 4 groups, which allows good discrimination and classification of the separate sub-components. A good choice of the number of delays and clusters helps to build a filter with a spectral response matching that of the target signal.

Audio signals in general occupy a limited range of frequencies, between 20 Hz and 20 kHz, which requires the separation-reconstruction filters to have high orders. In SCICA the order of the filter is directly related to the number of delays applied to the initial mixture; therefore more than 20 delays are necessary to obtain good performance in the separation of real audio signals.

V. CONCLUSIONS
The SCICA technique is a solution to the blind source separation problem as long as the spectra of the signals to be separated do not overlap in frequency. It is thus a method that works for very specific situations within the wide variety of sounds that a human being is able to recognize.

The number of delays applied to the initial mixture and the number of clusters used to group the independent sub-components correctly are the most relevant parameters of the SCICA method. Here we found a suitable range for the discrimination of two signals from only one mixture.

The signals found by the SCICA method do not preserve the amplitude and phase of the initial signals. Depending on the application, the importance of these features must be considered, because no data are available that give information about the initial phase and amplitude.

Fig. 6.
Filter and resulting signals; (a) impulse response of the separation filters, (b) resulting signals after applying the mixture to the separation filters.

REFERENCES
[1] P. Comon, Independent component analysis, a new concept?, Signal Processing, 36(3):287-314, (1994).
[2] A. Hyvärinen and E. Oja, Independent Component Analysis: Algorithms and Applications, Neural Networks, 13(4-5):411-430, (2000).
[3] J.-F. Cardoso, Blind Signal Separation: Statistical Principles, Proceedings of the IEEE, 86(10):2009–2025, (1998).
[4] D. L. Wang, G. J. Brown, Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, IEEE Press/Wiley-Interscience, Hoboken, NJ, (2006).
[5] M. E. Davies, N. Mitianoudis, A simple mixture model for sparse overcomplete ICA, IEE Proc. VISP, 151(1):35-43, (2004).
[6] M. Girolami, A variational method for learning sparse and overcomplete representations, Neural Comput., 2517-2532, (2002).
[7] M. S. Lewicki, T. J. Sejnowski, Learning overcomplete representations, Neural Comput., 12:337-365, (2000).
[8] D. Mika, P. Kleczkowski, ICA-based Single Channel Audio Separation: New Bases and Measures of Distance, Archives of Acoustics, Vol. 36, No. 2, pp. 311-331, (2011).
[10] E. Vincent, H. Sawada, P. Bofill, S. Makino, and J. Rosca, First Stereo Audio Source Separation Evaluation Campaign: Data, Algorithms and Results, Independent Component Analysis and Signal Separation, Lecture Notes in Computer Science, Vol. 4666, pp. 552-559, (2007).
[11] M. E. Davies, C. J. James, Source separation using single channel ICA, Signal Processing, 87, (2007).
[12] C. J. James, O. Gibson, M. E. Davies, On the analysis of single versus multiple channels of electromagnetic brain signals, Artif. Intell. Med., 37(2):131-143, (2006).
[13] A. Hyvärinen, Fast and Robust Fixed-Point Algorithms for Independent Component Analysis, IEEE Transactions on Neural Networks, 10(3):626-634, (1999).
[14] A. Coates, A. Y. Ng, Learning feature representations with k-means, in G. Montavon, G. B. Orr, K.-R. Müller (eds.), Neural Networks: Tricks of the Trade, 2nd edn, Springer, (2012).
[15] D. L. Davies, D. W. Bouldin, A Cluster Separation Measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-1(2):224–227, (1979).