[pulseaudio-discuss] Testing echo cancellation on an armhf OMAP phone
Neil Jerram
2012-12-17 21:49:48 UTC
Hi pulseaudio folk. I've been following the list for a while, but this
is my first post...

I'm working with PulseAudio on the GTA04 phone, specifically trying to
use it to route the audio during a call, with echo cancellation.

Without the echo cancellation, the picture would be:

+----------+                                +--------------------+
| GSM chip |------ module-loopback -------->|earpiece (sink)     |
| sound    |                                |                    |
| card     |<------- module-loopback -------|microphone (source) |
+----------+                                +--------------------+

The earpiece and microphone belong to a single sound card, which is
different from the GSM chip sound card.

The GSM source and sink are named
alsa_input.platform-soc-audio.1.analog-mono and
alsa_output.platform-soc-audio.1.analog-mono. The earpiece is
alsa_output.platform-soc-audio.0.analog-stereo and the microphone is
alsa_input.platform-soc-audio.0.analog-stereo.

To add in echo cancellation, I load module-echo-cancel, and then start
up the loopbacks like this:

exec pactl load-module module-loopback \
    source=alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel \
    rate=8000 \
    sink=alsa_output.platform-soc-audio.1.analog-mono

exec pactl load-module module-loopback \
    source=alsa_input.platform-soc-audio.1.analog-mono \
    rate=8000 \
    sink=alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel

Does that all sound correct in theory?
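
For reference, the module-echo-cancel load that precedes those loopbacks
might look like the sketch below. It assumes the canceller is attached to
the earpiece and microphone of card 0 through the module's source_master
and sink_master arguments; the exact arguments used on the GTA04 are not
shown in this post, so treat them as an assumption. The command is only
printed, so it can be checked before being run against a live daemon.

```shell
#!/bin/sh
# Master devices, as named in this post (sound card 0).
MIC=alsa_input.platform-soc-audio.0.analog-stereo
EARPIECE=alsa_output.platform-soc-audio.0.analog-stereo

# Assumption: the canceller wraps the microphone and earpiece of card 0.
EC_ARGS="source_master=$MIC sink_master=$EARPIECE"

# Dry run: print the command. Remove "echo" to execute it.
echo pactl load-module module-echo-cancel $EC_ARGS
```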

Now, I'm not actually at the point of doing all that yet. First I'm
trying to test the echo cancellation. To do that, I:

- load module-echo-cancel

- do "paplay -d
alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
/media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
in one terminal

- do "parecord -d
alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
--file-format=wav > record1.wav" in another terminal

- speak into the microphone.

Then the idea is that I would play record1.wav back and see if it
contains an echo of the song.

However, I seem to be hitting various problems, which I suspect are all
to do with resampling.

- With the default resample method (speex-float-3), I don't get any
sound at the earpiece, except for intermittent crackling.

- I then tried speex-fixed-3. This gives recognisable song playback at
the earpiece, but with strange echo-like distortions - i.e. as though
short snatches of the song are being repeated.

- I then tried src-sinc-fastest, and found that PulseAudio exited as
soon as I loaded module-echo-cancel.

- I then tried src-linear. This gives good song playback, except for
occasional clicks and crackles.

The song is at 44.1 kHz, I think the sound card's default rate is 48
kHz, and it looks from the log as though module-echo-cancel causes the
song to be resampled to 32 kHz (and presumably then back to 48 kHz?).
Is that all expected, and is there any way of reducing this amount of
playback resampling?

Now - still with src-linear - if I try the parecord line at the same
time as the playback, the log goes crazy with umpteen rapid repeats of:

Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-source.c: Trying resume...
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Device hw:0 doesn't support 44100 Hz, changed to 48000 Hz.
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: ALSA period wakeups disabled
Dec 17 21:04:34 neo pulse.sh: W: [alsa-source] alsa-source.c: Resume failed, couldn't restore original sample settings.

and I get no content (apart from the WAV header) in the file that I'm
trying to record.

On the other hand, if I try the parecord on its own when not also
playing back, it works fine.

I'd very much appreciate any input on whether what I'm doing looks right
(which I'm not yet confident at all about) and on the observations of
things not working as I'd expect.

Many thanks,
Neil
Tanu Kaskinen
2012-12-18 04:58:25 UTC
Post by Neil Jerram
Hi pulseaudio folk. I've been following the list for a while, but this
is my first post...
I'm working with PulseAudio on the GTA04 phone, specifically trying to
use it to route the audio during a call, with echo cancellation.
+----------+                                +--------------------+
| GSM chip |------ module-loopback -------->|earpiece (sink)     |
| sound    |                                |                    |
| card     |<------- module-loopback -------|microphone (source) |
+----------+                                +--------------------+
The earpiece and microphone belong to a single sound card, which is
different from the GSM chip sound card.
The GSM source and sink are named
alsa_input.platform-soc-audio.1.analog-mono and
alsa_output.platform-soc-audio.1.analog-mono. The earpiece is
alsa_output.platform-soc-audio.0.analog-stereo and the microphone is
alsa_input.platform-soc-audio.0.analog-stereo.
To add in echo cancellation, I load module-echo-cancel, and then start
exec pactl load-module module-loopback \
source=alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel \
rate=8000 \
sink=alsa_output.platform-soc-audio.1.analog-mono
exec pactl load-module module-loopback \
source=alsa_input.platform-soc-audio.1.analog-mono \
rate=8000 \
sink=alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
Does that all sound correct in theory?
Yes, I think so.
Post by Neil Jerram
Now, I'm not actually at the point of doing all that yet. First I'm
- load module-echo-cancel
- do "paplay -d
alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
/media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
in one terminal
- do "parecord -d
alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
--file-format=wav > record1.wav" in another terminal
- speak into the microphone.
Then the idea is that I would play record1.wav back and see if it
contains an echo of the song.
However, I seem to be hitting various problems, which I suspect are all
to do with resampling.
- With the default resample method (speex-float-3), I don't get any
sound at the earpiece, except for intermittent crackling.
- I then tried speex-fixed-3. This gives recognisable song playback at
the earpiece, but with strange echo-like distortions - i.e. as though
short snatches of the song are being repeated.
- I then tried src-sinc-fastest, and found that PulseAudio exited as
soon as I loaded module-echo-cancel.
- I then tried src-linear. This gives good song playback, except for
occasional clicks and crackles.
The song is at 44.1 kHz, I think the sound card's default rate is 48
kHz, and it looks from the log as though module-echo-cancel causes the
song to be resampled to 32 kHz (and presumably then back to 48 kHz?).
Is that all expected, and is there any way of reducing this amount of
playback resampling?
If you haven't configured the sample rate of module-echo-cancel, then it
will default to 32 kHz (I don't know why), which indeed will cause
unnecessary resampling just as you described. If the hardware runs at 48
kHz, then I think it's best to pass "rate=48000" to module-echo-cancel.

I think it would make sense to modify module-echo-cancel to use the rate
of the microphone by default...
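
As a rough sketch of that suggestion, the rate can be passed as a module
argument when the canceller is loaded. The value 48000 is the one proposed
above for hardware running at 48 kHz; whether it suits a given canceller
is not established here. The command is only printed.

```shell
#!/bin/sh
# Rate suggested above for hardware that runs at 48 kHz.
EC_RATE=48000

# Dry run: print the command. Remove "echo" to execute it.
echo pactl load-module module-echo-cancel rate="$EC_RATE"
```
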
Post by Neil Jerram
Now - still with src-linear - if I try the parecord line at the same
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-source.c: Trying resume...
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Device hw:0 doesn't support 44100 Hz, changed to 48000 Hz.
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: ALSA period wakeups disabled
Dec 17 21:04:34 neo pulse.sh: W: [alsa-source] alsa-source.c: Resume failed, couldn't restore original sample settings.
Are only these five lines repeated? I don't understand why this would be
looping; maybe setting the log level to something more verbose would
reveal the reason.
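
One way to raise the log level of a running daemon, sketched under the
assumption that pacmd is available on the device and that level 4 (debug)
is wanted:

```shell
#!/bin/sh
# PulseAudio log levels run from 0 (error) to 4 (debug).
LOG_LEVEL=4

# Dry run: print the command. Remove "echo" to execute it.
echo pacmd set-log-level "$LOG_LEVEL"
```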

Anyway, looping or not, the reason why you can't get anything recorded
is that the source fails to resume from suspended state. If this happens
only when playback is happening at the same time, it suggests that
initially, when playback was not active, the source successfully opened
the device with 44100 sample rate, at which point the rate got locked in
pulseaudio (I think pulseaudio could be fixed to not do that). When
playback is active (presumably at 48 kHz), the hardware no longer
supports capturing at 44.1 kHz, so when pulseaudio tries to reopen the
device with the old rate, the resume fails.

You can fix this by setting the default sample rate to 48000.
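
A minimal sketch of that setting, written to a scratch file so it can be
reviewed before being copied into the real daemon.conf. The scratch path
is hypothetical, and the location of the real file (system-wide or
per-user) depends on the installation.

```shell
#!/bin/sh
# Scratch copy only; the real file is typically /etc/pulse/daemon.conf
# or a per-user daemon.conf (exact location is an assumption).
CONF="${TMPDIR:-/tmp}/daemon.conf.example"

cat > "$CONF" <<'EOF'
; Open devices at 48 kHz by default instead of 44.1 kHz.
default-sample-rate = 48000
EOF

# Show what was written.
cat "$CONF"
```

The daemon reads daemon.conf at startup, so it has to be restarted for
the new default to take effect.
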
--
Tanu
Arun Raghavan
2012-12-18 05:26:19 UTC
Post by Tanu Kaskinen
Post by Neil Jerram
Hi pulseaudio folk. I've been following the list for a while, but this
is my first post...
I'm working with PulseAudio on the GTA04 phone, specifically trying to
use it to route the audio during a call, with echo cancellation.
That's quite interesting!
Post by Tanu Kaskinen
Post by Neil Jerram
+----------+                                +--------------------+
| GSM chip |------ module-loopback -------->|earpiece (sink)     |
| sound    |                                |                    |
| card     |<------- module-loopback -------|microphone (source) |
+----------+                                +--------------------+
The earpiece and microphone belong to a single sound card, which is
different from the GSM chip sound card.
The GSM source and sink are named
alsa_input.platform-soc-audio.1.analog-mono and
alsa_output.platform-soc-audio.1.analog-mono. The earpiece is
alsa_output.platform-soc-audio.0.analog-stereo and the microphone is
alsa_input.platform-soc-audio.0.analog-stereo.
To add in echo cancellation, I load module-echo-cancel, and then start
exec pactl load-module module-loopback \
source=alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel \
rate=8000 \
sink=alsa_output.platform-soc-audio.1.analog-mono
exec pactl load-module module-loopback \
source=alsa_input.platform-soc-audio.1.analog-mono \
rate=8000 \
sink=alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
Does that all sound correct in theory?
Yes, I think so.
As Tanu says, yes it does.
Post by Tanu Kaskinen
Post by Neil Jerram
Now, I'm not actually at the point of doing all that yet. First I'm
- load module-echo-cancel
- do "paplay -d
alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
/media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
in one terminal
- do "parecord -d
alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
--file-format=wav > record1.wav" in another terminal
- speak into the microphone.
Then the idea is that I would play record1.wav back and see if it
contains an echo of the song.
However, I seem to be hitting various problems, which I suspect are all
to do with resampling.
- With the default resample method (speex-float-3), I don't get any
sound at the earpiece, except for intermittent crackling.
- I then tried speex-fixed-3. This gives recognisable song playback at
the earpiece, but with strange echo-like distortions - i.e. as though
short snatches of the song are being repeated.
- I then tried src-sinc-fastest, and found that PulseAudio exited as
soon as I loaded module-echo-cancel.
- I then tried src-linear. This gives good song playback, except for
occasional clicks and crackles.
The song is at 44.1 kHz, I think the sound card's default rate is 48
kHz, and it looks from the log as though module-echo-cancel causes the
song to be resampled to 32 kHz (and presumably then back to 48 kHz?).
Is that all expected, and is there any way of reducing this amount of
playback resampling?
You could try setting the resampler to 'ffmpeg', which is really
light-weight. speex-fixed-0 might be useful to test as well.
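
As a sketch, the resampler can be chosen on the daemon command line (or
with "resample-method = ..." in daemon.conf). The helper name below is
made up for illustration, and the commands are only printed. Running
"pulseaudio --dump-resample-methods" should list what a given build
actually supports.

```shell
#!/bin/sh
# Hypothetical helper: print the daemon command line for one resampler.
resample_cmd() {
    echo "pulseaudio --resample-method=$1"
}

# The two methods suggested above, in the order to try them.
for METHOD in ffmpeg speex-fixed-0; do
    resample_cmd "$METHOD"
done
```
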
Post by Tanu Kaskinen
If you haven't configured the sample rate of module-echo-cancel, then it
will default to 32 kHz (I don't know why), which indeed will cause
unnecessary resampling just as you described. If the hardware runs at 48
kHz, then I think it's best to pass "rate=48000" to module-echo-cancel.
I think it would make sense to modify module-echo-cancel to use the rate
of the microphone by default...
Different echo-cancellation algorithms work best at certain sample rates
(depending on the filters they embed). I've picked the highest viable
one for each canceller as the default, so setting something higher is
not a good idea.

What would make sense is to pick the sample rate that you're getting
from the GSM sound card, which it seems you're doing already
(rate=8000)?

Also, are you using the webrtc echo canceller or speex?

Cheers,
Arun
Neil Jerram
2012-12-19 08:06:17 UTC
Post by Arun Raghavan
Post by Tanu Kaskinen
Post by Neil Jerram
Hi pulseaudio folk. I've been following the list for a while, but this
is my first post...
I'm working with PulseAudio on the GTA04 phone, specifically trying to
use it to route the audio during a call, with echo cancellation.
That's quite interesting!
Thanks! It's very educational for me, too!
Post by Arun Raghavan
Post by Tanu Kaskinen
Post by Neil Jerram
The song is at 44.1 kHz, I think the sound card's default rate is 48
kHz, and it looks from the log as though module-echo-cancel causes the
song to be resampled to 32 kHz (and presumably then back to 48 kHz?).
Is that all expected, and is there any way of reducing this amount of
playback resampling?
You could try setting the resampler to 'ffmpeg', which is really
light-weight. speex-fixed-0 might be useful to test as well.
Thanks, I'll remember to try those settings.
Post by Arun Raghavan
Post by Tanu Kaskinen
If you haven't configured the sample rate of module-echo-cancel, then it
will default to 32 kHz (I don't know why), which indeed will cause
unnecessary resampling just as you described. If the hardware runs at 48
kHz, then I think it's best to pass "rate=48000" to module-echo-cancel.
I think it would make sense to modify module-echo-cancel to use the rate
of the microphone by default...
Different echo-cancellation algorithms work best at certain sample rates
(depending on the filters they embed). I've picked the highest viable
one for each canceller as the default, so setting something higher is
not a good idea.
What would make sense is to pick the sample rate that you're getting
from the GSM sound card, which it seems you're doing already
(rate=8000)?
Yes, I see that now, and have written/asked more about it in my other
replies.
Post by Arun Raghavan
Also, are you using the webrtc echo canceller or speex?
I've tried both. As far as I recall there was no significant difference
in the effect on the playback sound (through the ...echo-cancel sink)
that I heard. I think that makes sense, because distortions of the
playback sound are mostly due to resampling quality and load, not the
echo cancellation algorithm.

I haven't really reached looking at echo cancellation quality yet. What
would you recommend, for the best combination of quality and low CPU
use?

Thanks again,
Neil
Neil Jerram
2012-12-19 08:00:12 UTC
Post by Tanu Kaskinen
If you haven't configured the sample rate of module-echo-cancel, then it
will default to 32 kHz (I don't know why), which indeed will cause
unnecessary resampling just as you described. If the hardware runs at 48
kHz, then I think it's best to pass "rate=48000" to module-echo-cancel.
Thanks, I'll try that.
Post by Tanu Kaskinen
Post by Neil Jerram
Now - still with src-linear - if I try the parecord line at the same
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-source.c: Trying resume...
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Trying to disable ALSA period wakeups, using timers only
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: Device hw:0 doesn't support 44100 Hz, changed to 48000 Hz.
Dec 17 21:04:34 neo pulse.sh: I: [alsa-source] alsa-util.c: ALSA period wakeups disabled
Dec 17 21:04:34 neo pulse.sh: W: [alsa-source] alsa-source.c: Resume failed, couldn't restore original sample settings.
Are only these five lines repeated? I don't understand why this would be
looping, maybe setting the log level to more verbose would reveal the
reason.
Thanks; if I keep seeing this, despite the following help, I'll try to
get a better log.
Post by Tanu Kaskinen
Anyway, looping or not, the reason why you can't get anything recorded
is that the source fails to resume from suspended state. If this happens
only when playback is happening at the same time, it suggests that
initially, when playback was not active, the source successfully opened
the device with 44100 sample rate, at which point the rate got locked in
pulseaudio (I think pulseaudio could be fixed to not do that). When
playback is active (presumably at 48 kHz), the hardware doesn't anymore
support capturing at 44.1 kHz, so when pulseaudio tries to open the
device with the old rate, it doesn't work anymore.
You can fix this by setting the default sample rate to 48000.
I'm still a bit confused on the detail here, but I think I understand
the principle of what's happening now. Presumably there's something I
can find inside pacmd that will tell me what the current locked-in rate
is? I'll check for that, and also try changing default sample rate as
you suggest.
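
A sketch of how that check might be done: filter the source listing down
to each device's name and sample spec. The helper name is made up, and the
"name:" and "sample spec:" labels are an assumption about the listing
format.

```shell
#!/bin/sh
# Keep only the name and sample spec lines from a device listing on stdin.
# Assumption: the listing labels them "name:" and "sample spec:".
show_rates() {
    grep -E '^[[:space:]]*(name|sample spec):'
}

# Usage against a live daemon:
#   pacmd list-sources | show_rates
```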

Now, as I wrote in my reply just now to Arun, I realise that I really
want my in-call audio to run entirely at 8000. Does that mean that I
need to modify your advice above to:

- load-module module-echo-cancel rate=8000

- default-sample-rate = 8000

If I did that, should I then expect the microphone source to be detected
and used at 8000? (Currently it's initially detected at 44100.)
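
For concreteness, the sketch below prints the two settings next to the
file each one would normally go in. The helper name is made up and the
file names are the usual PulseAudio ones, assumed rather than confirmed
for this setup. It does not settle whether the hardware will actually
open at that rate.

```shell
#!/bin/sh
# Hypothetical helper: print both settings for a given in-call rate.
in_call_settings() {
    rate=$1
    echo "daemon.conf: default-sample-rate = $rate"
    echo "default.pa:  load-module module-echo-cancel rate=$rate"
}

in_call_settings 8000
```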

Many thanks,
Neil
Arun Raghavan
2012-12-18 05:30:00 UTC
On Mon, 2012-12-17 at 21:49 +0000, Neil Jerram wrote:
[...]
Post by Neil Jerram
- load module-echo-cancel
- do "paplay -d
alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
/media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
in one terminal
- do "parecord -d
alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
--file-format=wav > record1.wav" in another terminal
- speak into the microphone.
In general, to start with, you should pick a recording of voice rather
than music, since that's the sort of echo the cancellers are designed to
cancel. I've noticed varying degrees of success for music with speex
and much better success with the webrtc canceller, but starting with the
basics is better.

Also, if you're hitting trouble with double-resampling, you could
resample the file to what the canceller sink supports before doing your
test.
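
One way to do that pre-resampling, sketched with sox (a tool not otherwise
mentioned in this thread, so an assumption). The file names are
hypothetical, and 32000 Hz is the canceller default rate reported earlier
in the thread. The command is only printed.

```shell
#!/bin/sh
# Hypothetical file names; 32000 Hz is the canceller default rate
# reported earlier in the thread.
IN=voice.ogg
OUT=voice-32k.wav
RATE=32000

# Dry run: print the command. Remove "echo" to execute it (needs sox).
echo sox "$IN" -r "$RATE" "$OUT"
```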

Cheers,
Arun
Neil Jerram
2012-12-19 07:41:21 UTC
Post by Arun Raghavan
[...]
Post by Neil Jerram
- load module-echo-cancel
- do "paplay -d
alsa_output.platform-soc-audio.0.analog-stereo.echo-cancel
/media/card/Documents/audio/ogg/Do\ They\ Know\ It\'s\ Christmas.ogg"
in one terminal
- do "parecord -d
alsa_input.platform-soc-audio.0.analog-stereo.echo-cancel
--file-format=wav > record1.wav" in another terminal
- speak into the microphone.
In general, to start with, you should pick a recording of voice rather
than music since that's the sort of echo that is designed to be
cancelled. I've noticed varying degrees of success for music with speex
and much better success with the webrtc canceller, but starting with the
basics is better.
Good point, thanks, I'll do that. Also I realise now that I really want
the entire process of in-call audio routing to be running at 8000 only -
because that's all I need for voice, and because I presume that should
take less power than involving higher rates.

Overall, for this phone, I have two audio scenarios.

- In-call audio, which can/should all be handled at 8000.

- Media playback outside calls, which I think should be at 44.1 kHz for
best quality.

Is it possible for a single instance of PulseAudio to switch between
those scenarios? If not, I think I can pretty easily stop and restart
PulseAudio when the scenario changes. (I'm guessing from your and
Tanu's other replies to me that I might need to restart with different
default-sample-rate settings, to get the best outcome and performance
for my two scenarios.)

Thanks,
Neil
Tanu Kaskinen
2012-12-20 07:25:42 UTC