Our BSS methods also work very well for instantaneous mixtures of sound sources. The only condition is that the rank of the mixing matrix is at least equal to the number of sources. The same holds for image sources as well; this is a general requirement in BSS.
A typical sound source is a super-Gaussian signal, i.e. its kurtosis is positive, whereas a typical natural image is a sub-Gaussian signal, i.e. its kurtosis is negative. Thus, different activation functions have to be applied in our separation rules in these two cases.
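The sign of the kurtosis can be checked numerically. The following is a minimal sketch, assuming synthetic stand-ins rather than the actual sound or image data: a Laplacian sample plays the role of a super-Gaussian signal and a uniform sample the role of a sub-Gaussian one. The function name `excess_kurtosis` is chosen here for illustration only.

```python
import numpy as np

def excess_kurtosis(x):
    """Excess kurtosis E[(x-m)^4] / var^2 - 3; zero for a Gaussian signal."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2 - 3.0

rng = np.random.default_rng(0)
n = 100_000
laplace_like = rng.laplace(size=n)              # assumed stand-in: super-Gaussian
uniform_like = rng.uniform(-1.0, 1.0, size=n)   # assumed stand-in: sub-Gaussian

k_super = excess_kurtosis(laplace_like)   # theoretical value for Laplace: +3
k_sub = excess_kurtosis(uniform_like)     # theoretical value for uniform: -1.2
```

A positive result would suggest an activation function suited to super-Gaussian sources, a negative one a function suited to sub-Gaussian sources.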
To document the proper behavior of our methods, the results of two experiments on sound sources and their mixtures are shown on this page.
I used the 10 mono sound sources and their mixtures provided by Dr. Barak Pearlmutter (University of New Mexico in Albuquerque).
First, I extracted the ten mono sound sources from the file t10-mono.wav.
In the first experiment I mixed the 10 sound sources with an ill-conditioned mixing matrix A of size 10x10 (cond(A) = 1013). If you want to hear it, the sequence of my 10 mixed signals is given
1. Individually for each pair (output signal Yi, source Sj), with signal amplitudes scaled to <-1, 1>, the SNR[i,j] factor (signal-to-noise ratio) is calculated as
2. A combined error index qEI(P) for the whole separated output set relative to the source set. First, a matrix P = {a_ij} is created with a_ij = 1/sqrt(MSE[i,j]), and every row i of the matrix is normalized such that max_k(a_ik) = 1. The index qEI(P) is defined as the sum of all squared elements of this normalized matrix P.
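The construction of P in item 2 can be sketched as follows. This is a minimal illustration, assuming that MSE[i,j] is the mean squared difference between output i and source j after both are scaled to a peak amplitude of 1, and that signs and gains already agree; the helper names and the small `eps` guard are not from the original. Taken literally, the sum of squares cannot fall below the number of outputs, because every normalized row contains a 1. The values reported on this page are well below 1, so the index used here presumably includes an offset or scaling that the text does not state; the sketch computes only the sum as described.

```python
import numpy as np

def scale_to_unit(signals):
    """Scale each row so that its largest absolute amplitude is 1."""
    signals = np.asarray(signals, dtype=float)
    peak = np.max(np.abs(signals), axis=1, keepdims=True)
    return signals / peak

def mse_matrix(outputs, sources):
    """MSE[i, j] between scaled output i and scaled source j (assumed definition)."""
    y = scale_to_unit(outputs)[:, None, :]   # shape (n_out, 1, T)
    s = scale_to_unit(sources)[None, :, :]   # shape (1, n_src, T)
    return np.mean((y - s) ** 2, axis=2)

def q_ei(mse, eps=1e-12):
    """Return the sum of squared entries of the row-normalized matrix P, and P."""
    p = 1.0 / np.sqrt(mse + eps)             # a_ij = 1 / sqrt(MSE[i, j])
    p = p / p.max(axis=1, keepdims=True)     # max_k(a_ik) = 1 in every row
    return float(np.sum(p ** 2)), p

# Synthetic check: outputs are permuted sources plus a little noise.
rng = np.random.default_rng(1)
sources = rng.uniform(-1.0, 1.0, size=(3, 1000))
outputs = sources[[2, 0, 1]] + 0.01 * rng.standard_normal((3, 1000))

mse = mse_matrix(outputs, sources)
index, p = q_ei(mse)
best_match = p.argmax(axis=1)   # source that each output resembles most
```

For a well separated set, each row of P has one entry equal to 1 and all other entries close to 0, so `best_match` recovers the permutation between outputs and sources.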
I applied our two separation rules to the sound mixture obtained in this way.
As usual, the robust rule gives slightly better separation results.
For example, 5 sources were extracted with SNRs of 28.3 - 30.8 [dB]
and the remaining 5 sources with 16.3 - 22.2 [dB].
The combined error index qEI for this signal set was 0.038.
With the multi-layer local rule, all sources are properly
separated after 4 layers (SNR = 15 - 34 [dB]),
with an error index qEI = 0.063.
Fig. 2: Source 7, one of my mixtures, and the
corresponding output signal (separated source 7) of our method.
If you want to listen to the separation results, they are given
Each signal in the sequence is 55168 x 2 bytes long, with silent pauses of 9832 x 2 bytes between signals.
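Given these lengths, a concatenated sequence can be cut back into its individual signals. The following is a minimal sketch, assuming 16-bit samples (2 bytes each), a layout of signal, pause, signal, pause, ... that starts at sample 0, and a synthetic array in place of the real file; the function name `split_sequence` is hypothetical.

```python
import numpy as np

SIGNAL_LEN = 55168   # samples per signal (2 bytes each)
PAUSE_LEN = 9832     # samples of silence between signals

def split_sequence(samples, n_signals=10):
    """Cut a concatenated recording into its individual signals.

    Assumes the layout signal, pause, signal, pause, ... starting at sample 0.
    """
    samples = np.asarray(samples)
    stride = SIGNAL_LEN + PAUSE_LEN
    needed = (n_signals - 1) * stride + SIGNAL_LEN
    if samples.shape[0] < needed:
        raise ValueError(f"need at least {needed} samples, got {samples.shape[0]}")
    return np.stack([samples[k * stride : k * stride + SIGNAL_LEN]
                     for k in range(n_signals)])

# Synthetic stand-in for the 16-bit data of one sequence file.
sequence = np.zeros(10 * (SIGNAL_LEN + PAUSE_LEN), dtype=np.int16)
for k in range(10):
    start = k * (SIGNAL_LEN + PAUSE_LEN)
    sequence[start : start + SIGNAL_LEN] = k + 1   # mark signal k with value k + 1

signals = split_sequence(sequence)   # shape (10, 55168)
```

For a real file, the raw samples could be read with Python's standard `wave` module and converted with `np.frombuffer(..., dtype="<i2")` before calling `split_sequence`.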
In the second experiment I used the 10 mixtures that Dr. B. Pearlmutter
prepared from his 10 mono sound sources, in the file t3-mono.wav.
B. Pearlmutter also gives the results of processing this mixture
set by two algorithms:
- a basic ICA algorithm (Bell & Sejnowski), in the 10-signal file t5-mono.wav,
- and his cICA algorithm, in the 10-signal file t7-mono.wav.
The given ICA results are poor: only separated source 2,
with SNR = 20.1 [dB], and source 1, with 13.3 [dB], are of satisfactory
quality. The remaining 8 sources have SNRs
in the range 1.7 - 7.3 [dB]. The overall qEI index would be 1.667.
In contrast, the given cICA results are of very high quality.
The SNRs of 7 separated sources are within 20.7 - 42.1 [dB],
but those of the 3 remaining sources are only 7.8 - 14.5 [dB].
The combined qEI index would be 0.125.
The results of applying our two methods to the above mixture,
namely the local learning multi-layer rule, called the CKA rule
(Cichocki:Kasprzak:Amari:NOLTA95),
and the robust one-layer ACY rule
(Amari:Cichocki:Yang:NIPS95),
are both of very good quality.
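This page does not state the update equations of either rule. For orientation only, a one-layer natural-gradient rule of the kind published by Amari, Cichocki and Yang has the form W <- W + eta * (I - f(y) y^T) W. The following is a minimal batch sketch of that form, assuming f = tanh as the activation for super-Gaussian sources, two synthetic Laplacian sources in place of the sound files, and a hypothetical well-conditioned mixing matrix; the robust variant used in the experiments may differ.

```python
import numpy as np

def natural_gradient_ica(x, n_iter=300, lr=0.1):
    """Batch update W <- W + lr * (I - f(Y) Y^T / T) W with f = tanh (assumed)."""
    n, t = x.shape
    w = np.eye(n)
    for _ in range(n_iter):
        y = w @ x                                    # current output signals
        w = w + lr * (np.eye(n) - np.tanh(y) @ y.T / t) @ w
    return w

rng = np.random.default_rng(2)
t = 10_000
# Synthetic super-Gaussian sources (unit-variance Laplacian), not the sound files.
s = rng.laplace(scale=1.0 / np.sqrt(2.0), size=(2, t))
a = np.array([[1.0, 0.6],
              [0.4, 1.0]])          # hypothetical mixing matrix
x = a @ s                           # instantaneous mixture

w = natural_gradient_ica(x)
g = np.abs(w @ a)                   # global matrix: ideally a scaled permutation
g_sorted = np.sort(g, axis=1)
crosstalk = g_sorted[:, 0] / g_sorted[:, 1]   # residual interference per output
```

The global matrix W A is inspected instead of W alone because the outputs can only be recovered up to scaling and permutation; small `crosstalk` values indicate that each output is dominated by a single source.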
After 4 layers of local rule learning I obtained
separation results with index qEI = 0.080 - 0.110,
depending on the learning factors.
All 10 sources were always separated very well,
with SNRs within 13 - 35 [dB].
By using the robust rule I achieved, as expected,
a separated source set with an even better
index value, qEI = 0.066.
In this case the SNRs of 6 separated sources were within
24.9 - 30.0 [dB]
and the SNRs of the 4 remaining sources within 14.4 - 18.7 [dB].
Fig. 5: Source 6, one of Pearlmutter's mixtures, and the
corresponding output signal (separated source 6) of our method.
This separated 10-signal output set is available in the following files: