Discussion:
[pulseaudio-discuss] VLC, PulseAudio and large tlengths
David Henningsson
2011-08-19 18:02:59 UTC
I've spent most of the afternoon trying to figure out why VLC doesn't work
well with large tlengths. I seem to have found suboptimal behaviour on
both the PulseAudio and VLC sides.

What bothers me on the PulseAudio side is this call (in alsa-sink.c,
mmap_write):

pa_sink_render_into_full(u->sink, &chunk);

For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.

However, things are probably not as bad as they look. If a new packet
comes in from the client in time, I believe PulseAudio would rewind
the 70 ms of silence and write the new data, and no glitch would be
heard. So the worst part is actually the somewhat "false alarm" sent to
the client.

However, messing with PulseAudio's buffering mechanisms isn't giving me
warm and fuzzy feelings, at least not right before the 1.0 release :-D

So over to the VLC side. I started off with the current git head of VLC.

For the synchronisation, I believe the correct way is to do something like:
1) when the first packet arrives, note its timestamp (pts), and set a
system timer to trigger at that point in time (i.e. trigger in i->pts -
mdate() usecs)
2) the callback from the system timer would then uncork/trigger the stream.
At that point, PulseAudio's buffer will have been filled up by all the
other calls to Play that happened in between.

I did a quick hack myself: I didn't know how to do system timers in VLC,
so I made it check at every call to Play whether it was yet time to start
the stream. (And commented out the call to stream_resync.) That gave
good synchronisation as far as I could tell (being a layman at observing
synchronisation issues).

For the buffering attributes, I tried setting tlength to 500 ms (note:
AOUT_MAX_PREPARE_TIME is actually 2000 ms, not 500 ms as I originally
thought).
Given an initially filled buffer as suggested above, that did not
underrun. That was while playing back a local video file.
I set minreq to AOUT_MIN_PREPARE_TIME (40 ms), which is mostly plucked
out of thin air.

However, given the reasoning above, if you want to be certain to avoid
the false underrun alarms as outlined in the PulseAudio section, I
believe a minreq of AOUT_MIN_PREPARE_TIME and tlength of
AOUT_MIN_PREPARE_TIME * 4 = 160 ms should be a relatively safe setting.

Also remember to set the PA_STREAM_ADJUST_LATENCY flag.
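Put together, the suggested attributes might be requested like this (a sketch against the libpulse API; the stream and sample-spec setup are assumed to exist elsewhere, and prebuf = 0 reflects the manual triggering discussed above):

```c
#include <pulse/pulseaudio.h>

/* Sketch: request tlength = 160 ms (AOUT_MIN_PREPARE_TIME * 4) and
 * minreq = 40 ms, with adjust-latency enabled. prebuf = 0 because the
 * stream is triggered manually; (uint32_t) -1 means "server default". */
static int connect_playback(pa_stream *s, const pa_sample_spec *ss)
{
    pa_buffer_attr attr;

    attr.maxlength = (uint32_t) -1;
    attr.tlength   = pa_usec_to_bytes(160 * PA_USEC_PER_MSEC, ss);
    attr.prebuf    = 0;
    attr.minreq    = pa_usec_to_bytes(40 * PA_USEC_PER_MSEC, ss);
    attr.fragsize  = (uint32_t) -1;

    return pa_stream_connect_playback(s, NULL, &attr,
                                      PA_STREAM_ADJUST_LATENCY
                                      | PA_STREAM_START_CORKED,
                                      NULL, NULL);
}
```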

Hopefully this gives a little insight into the current problems with VLC
and PulseAudio!
--
David Henningsson, Canonical Ltd.
http://launchpad.net/~diwic
Pierre-Louis Bossart
2011-08-19 18:54:20 UTC
Post by David Henningsson
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.
What is the value of the prebuf field here? If playback only started once
the buffer contained 200 ms, you would not see any underrun or rewind?
-Pierre
Rémi Denis-Courmont
2011-08-19 20:34:40 UTC
Post by Pierre-Louis Bossart
Post by David Henningsson
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.
What is the value of the prebuf field here? If playback only started once
the buffer contained 200 ms, you would not see any underrun or rewind?
In current VLC, prebuf is 0. Trigger is manual.
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Rémi Denis-Courmont
2011-08-19 20:34:02 UTC
Hello,
Post by David Henningsson
I've spent most of the afternoon trying to figure out why VLC doesn't work
well with large tlengths. I seem to have found suboptimal behaviour on
both the PulseAudio and VLC sides.
Nice.
Post by David Henningsson
What bothers me on the PulseAudio side is this call (in alsa-sink.c,
pa_sink_render_into_full(u->sink, &chunk);
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.
However, things are probably not as bad as they look. If a new packet
comes in from the client in time, I believe PulseAudio would rewind
the 70 ms of silence and write the new data, and no glitch would be
heard. So the worst part is actually the somewhat "false alarm" sent to
the client.
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio stream...
this is not good if there was no actual under-run :-/
Post by David Henningsson
However messing with PulseAudio's buffering mechanisms isn't giving me
warm and fuzzy feelings, at least not right before the 1.0 release :-D
So over to the VLC side. I started off with the current git head of VLC.
1) when the first packet arrives, note its timestamp (pts), and set a
system timer to trigger at that point in time (i.e. trigger in i->pts -
mdate() usecs)
Yeah. That would certainly be better than the current zero padding of the
stream, especially when resuming from pause (i.e. PulseAudio uncorking).

But VLC does not have a mainloop. A timer is going to need a dedicated
thread. Or maybe libpulse can accept user timers in its threaded mainloop?
Post by David Henningsson
2) the callback from the system timer would then uncork/trigger the stream.
At that point, PulseAudio's buffer will have been filled up by all the
other calls to Play that happened in between.
I did a quick hack myself: I didn't know how to do system timers in VLC,
so I made it check at every call to Play whether it was yet time to start
the stream. (And commented out the call to stream_resync.) That gave
good synchronisation as far as I could tell (being a layman at observing
synchronisation issues).
vlc_timer_*() functions. But I'd rather use the Pulse mainloop if possible.
Post by David Henningsson
AOUT_MAX_PREPARE_TIME is actually 2000 ms, not 500 ms as I originally
thought).
Given an initially filled buffer as suggested above, that did not
underrun. That was while playing back a local video file.
I set minreq to AOUT_MIN_PREPARE_TIME (40 ms), which is mostly plucked
out of thin air.
However, given the reasoning above, if you want to be certain to avoid
the false underrun alarms as outlined in the PulseAudio section, I
believe a minreq of AOUT_MIN_PREPARE_TIME and tlength of
AOUT_MIN_PREPARE_TIME * 4 = 160 ms should be a relatively safe setting.
In my experience, a larger tlength caused more underruns. But maybe that's
because VLC is more likely to be continuously late?
Post by David Henningsson
Also remember to set the PA_STREAM_ADJUST_LATENCY flag.
It's not clear to me what this actually will do.
Post by David Henningsson
Hopefully this gives a little insight in the current problems with VLC
and PulseAudio!
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
David Henningsson
2011-08-20 07:33:18 UTC
Post by Rémi Denis-Courmont
Hello,
Post by David Henningsson
I've spent most of the afternoon trying to figure out why VLC doesn't work
well with large tlengths. I seem to have found suboptimal behaviour on
both the PulseAudio and VLC sides.
Nice.
Post by David Henningsson
What bothers me on the PulseAudio side is this call (in alsa-sink.c,
pa_sink_render_into_full(u->sink,&chunk);
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.
However, things are probably not as bad as they look. If a new packet
comes in from the client in time, I believe PulseAudio would rewind
the 70 ms of silence and write the new data, and no glitch would be
heard. So the worst part is actually the somewhat "false alarm" sent to
the client.
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio stream...
this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.

Fixing that is quite complex though. :-(
Post by Rémi Denis-Courmont
Post by David Henningsson
However messing with PulseAudio's buffering mechanisms isn't giving me
warm and fuzzy feelings, at least not right before the 1.0 release :-D
So over to the VLC side. I started off with the current git head of VLC.
1) when the first packet arrives, note its timestamp (pts), and set a
system timer to trigger at that point in time (i.e. trigger in i->pts -
mdate() usecs)
Yeah. That would certainly be better than the current zero padding of the
stream, especially when resuming from pause (i.e. PulseAudio uncorking).
But but, VLC does not have a mainloop. A timer is going to need a dedicated
thread. Or maybe libpulse can accept user timers in its threaded mainloop?
Yes, you can: use the
pa_threaded_mainloop_get_api(vlc_pa_mainloop)->time_new() [1] function
to start a timer.
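For illustration, a sketch of such a deferred trigger (libpulse API; the mainloop and stream objects are assumed to exist, and the function names here are made up):

```c
#include <pulse/pulseaudio.h>

/* One-shot timer callback: uncork and trigger the stream, then free
 * the event. Runs on the PulseAudio mainloop thread, lock held. */
static void start_cb(pa_mainloop_api *api, pa_time_event *e,
                     const struct timeval *tv, void *userdata)
{
    pa_stream *s = userdata;
    pa_operation *o;

    (void) tv;
    if ((o = pa_stream_cork(s, 0, NULL, NULL)))
        pa_operation_unref(o);
    if ((o = pa_stream_trigger(s, NULL, NULL)))  /* start despite prebuf */
        pa_operation_unref(o);
    api->time_free(e);
}

/* Schedule the stream start delay_us microseconds from now. */
static void schedule_start(pa_threaded_mainloop *ml, pa_stream *s,
                           pa_usec_t delay_us)
{
    struct timeval tv;
    pa_mainloop_api *api = pa_threaded_mainloop_get_api(ml);

    pa_threaded_mainloop_lock(ml);
    pa_gettimeofday(&tv);
    api->time_new(api, pa_timeval_add(&tv, delay_us), start_cb, s);
    pa_threaded_mainloop_unlock(ml);
}
```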
Post by Rémi Denis-Courmont
Post by David Henningsson
2) the callback from the system timer would then uncork/trigger the stream.
At that point, PulseAudio's buffer will have been filled up by all the
other calls to Play that happened in between.
I did a quick hack myself: I didn't know how to do system timers in VLC,
so I made it check at every call to Play whether it was yet time to start
the stream. (And commented out the call to stream_resync.) That gave
good synchronisation as far as I could tell (being a layman at observing
synchronisation issues).
vlc_timer_*() functions. But I'd rather use the Pulse mainloop if possible.
Post by David Henningsson
AOUT_MAX_PREPARE_TIME is actually 2000 ms, not 500 ms as I originally
thought).
Given an initially filled buffer as suggested above, that did not
underrun. That was while playing back a local video file.
I set minreq to AOUT_MIN_PREPARE_TIME (40 ms), which is mostly plucked
out of thin air.
However, given the reasoning above, if you want to be certain to avoid
the false underrun alarms as outlined in the PulseAudio section, I
believe a minreq of AOUT_MIN_PREPARE_TIME and tlength of
AOUT_MIN_PREPARE_TIME * 4 = 160 ms should be a relatively safe setting.
In my experience, a larger tlength caused more underruns. But maybe that's
because VLC is more likely to be continuously late?
I think this is mostly because of the "false alarm" problem on the
PulseAudio side. However, the problem is not linear, and the boundary
for where this starts to happen should be at tlength =
AOUT_MIN_PREPARE_TIME * 4 if minreq = AOUT_MIN_PREPARE_TIME.
Post by Rémi Denis-Courmont
Post by David Henningsson
Also remember to set the PA_STREAM_ADJUST_LATENCY flag.
It's not clear to me what this actually will do.
Without knowing all the details myself, there are differences in how the
buffering is done. The calculations above are based on setting the
PA_STREAM_ADJUST_LATENCY flag, which is also what most other
applications do these days. If the flag is not set, things are calculated
differently - the hw stuff is done in larger chunks, which means more
false underruns (or rather, the boundary for when this starts to happen
is lower) - and I *think* (not sure) that you're also more likely to be
affected by other streams (i.e. buffering behaviour changes when other
streams are played in parallel with yours).
Post by Rémi Denis-Courmont
Post by David Henningsson
Hopefully this gives a little insight in the current problems with VLC
and PulseAudio!
--
David Henningsson, Canonical Ltd.
http://launchpad.net/~diwic

[1]
http://www.freedesktop.org/software/pulseaudio/doxygen/structpa__mainloop__api.html
Tanu Kaskinen
2011-08-20 15:46:28 UTC
Post by David Henningsson
Post by Rémi Denis-Courmont
Hello,
Post by David Henningsson
I've spent most of the afternoon trying to figure out why VLC doesn't work
well with large tlengths. I seem to have found suboptimal behaviour on
both the PulseAudio and VLC sides.
Nice.
Post by David Henningsson
What bothers me on the PulseAudio side is this call (in alsa-sink.c,
pa_sink_render_into_full(u->sink,&chunk);
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust-latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client's buffer is filled to only
e.g. 130 ms, PulseAudio will take the 130 ms, the client will underrun,
and PulseAudio will hand out 70 ms of silence. Better behaviour would be
to write the 130 ms that are available, go to sleep until the 130 ms is
almost used up, and then see if more data has come in.
However, things are probably not as bad as they look. If a new packet
comes in from the client in time, I believe PulseAudio would rewind
the 70 ms of silence and write the new data, and no glitch would be
heard. So the worst part is actually the somewhat "false alarm" sent to
the client.
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio stream...
this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.
Why not? It sounds like you'd want to define "underrun" differently from
what it's currently defined as. Currently an underrun means that there
was not enough data in the stream buffer to satisfy the sink's request
when it wanted to fill its buffer. I'm not saying that the current
definition is the best possible, but I don't see anything obviously
wrong in it either.

If your explanation of the sink latency calculation is correct, then it
sounds like underruns with a reasonably high tlength (like 500 ms) and
reasonably low minreq (like 50 ms) should be rare. If VLC is having
constant underruns, that sounds like a problem at VLC's end. The sink
will never request more than 200 ms at a time. The worst case is if the
stream buffer contains 451 ms worth of audio (no request sent to VLC
yet), and the sink asks for the full 200 ms amount. After that the
buffer will contain 251 ms, and VLC will get a request to send 249 ms
worth of audio. VLC will at the very least have 251 ms margin to send
the data. That doesn't sound like a difficult target to achieve.

If VLC supported higher tlengths than 500 ms, it would be even
easier to avoid underruns. I would guess that the minimum margin of
251 ms isn't a coincidence - the reaction time given to clients is
probably never less than tlength / 2 (so maybe minreq doesn't even play
a significant role in avoiding underruns?).

If VLC assumes that an underrun message means silence/glitch, it's a bug
in VLC, at least until someone changes the definition that Pulseaudio
uses.

Also, it sounds like VLC is making things more complicated than necessary
if it doesn't use Pulseaudio's prebuffering feature but instead manually
corks the stream during prebuffering. But maybe there are valid reasons
for that.
--
Tanu
Rémi Denis-Courmont
2011-08-20 16:15:13 UTC
Post by Tanu Kaskinen
Post by David Henningsson
Post by Rémi Denis-Courmont
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio
stream... this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.
Why not? It sounds like you'd want to define "underrun" differently from
what it's currently defined as.
An audio underrun is a situation whereby the next sample is not available by
the time that it is needed. That is the One And Only definition.

Getting fewer samples than you would ideally wish for, but still enough to
work properly, is simply not an underrun.
Post by Tanu Kaskinen
Currently an underrun means that there was not enough data in the
stream buffer to satisfy the sink's request when it wanted to fill its
buffer. I'm not saying that the current definition is the best possible,
but I don't see anything obviously
wrong in it either.
Then I'm sorry for you. Go get yourself an English dictionary.
Post by Tanu Kaskinen
If VLC assumes that an underrun message means silence/glitch, it's a
bug in VLC,
Are you kidding me? Is this the level of hypocrisy that I should expect
when dealing with PulseAudio?
Post by Tanu Kaskinen
at least until someone changes the definition that
Pulseaudio uses.
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Tanu Kaskinen
2011-08-20 17:20:27 UTC
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Post by David Henningsson
Post by Rémi Denis-Courmont
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio
stream... this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.
Why not? It sounds like you'd want to define "underrun" differently from
what it's currently defined as.
An audio underrun is a situation whereby the next sample is not available by
the time that it is needed. That is the One And Only definition.
To me it seems like your definition is compatible with my definition. An
underrun message is sent when there's no audio in the stream buffer when
it's needed by the sink. There just happens to be a period of time after
the underrun when a glitch can still be avoided by rewriting the sink
buffer.
Post by Rémi Denis-Courmont
Getting fewer samples than you would ideally wish for, but still enough to
work properly, is simply not an underrun.
If you think it's against the English dictionary to use the term
"underrun" when the stream buffer runs empty, maybe we should use some
other term, and declare that Pulseaudio doesn't support underrun
reporting. But in any case, I don't see why reporting a
non-recoverable underrun would be significantly more important than
reporting a maybe-recoverable underrun. Both cases are unexpected during
normal operation, and the client is advised to consider increasing the
stream buffer size.
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Currently an underrun means that there was not enough data in the
stream buffer to satisfy the sink's request when it wanted to fill its
buffer. I'm not saying that the current definition is the best possible,
but I don't see anything obviously
wrong in it either.
Then I'm sorry for you. Go get yourself an English dictionary.
Post by Tanu Kaskinen
If VLC assumes that an underrun message means silence/glitch, it's a
bug in VLC,
Are you kidding me? Is this the level of hypocrisy that I should expect
when dealing with PulseAudio?
No, I was not kidding. I didn't think I'd be offensive either, but maybe
I came across as rude. Sorry about that. If the term "underrun" causes
you to make invalid assumptions about Pulseaudio's internal behavior, then
the term may be wrong (which you seem to claim), or the documentation
may be lacking.
--
Tanu
Rémi Denis-Courmont
2011-08-20 17:45:26 UTC
Post by Tanu Kaskinen
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Why not? It sounds like you'd want to define "underrun" differently
from what it's currently defined as.
An audio underrun is a situation whereby the next sample is not available
by the time that it is needed. That is the One And Only definition.
To me it seems like your definition is compatible with my definition. An
underrun message is sent when there's no audio in the stream buffer when
it's needed by the sink. There just happens to be a period of time after
the underrun when a glitch can still be avoided by rewriting the sink
buffer.
Sorry... You could maybe argue that an "underrun" is a general situation
whereby the buffer is lower than it should be because the fill speed is
slower than the consumption speed.

But the libpulse API uses the term "underflow". A buffer underflow is an
_empty_ buffer, not merely a less than optimally filled buffer.
Post by Tanu Kaskinen
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Currently an underrun means that there was not enough data in the
stream buffer to satisfy the sink's request when it wanted to fill its
buffer. I'm not saying that the current definition is the best possible,
but I don't see anything obviously
wrong in it either.
Then I'm sorry for you. Go get yourself an English dictionary.
Post by Tanu Kaskinen
If VLC assumes that an underrun message means silence/glitch, it's a
bug in VLC,
Are you kidding me? Is this the level of hypocrisy that I should
expect when dealing with PulseAudio?
No, I was not kidding. I didn't think I'd be offensive either, but maybe
I came across as rude. Sorry about that. If the term "underrun" causes
you to make invalid assumptions about Pulseaudio's internal behavior, then
the term may be wrong (which you seem to claim), or the documentation
may be lacking.
The only way that there could be no glitch is if PulseAudio did some
black magic to extrapolate the missing samples in the event of an
"underflow". I don't think this is what happens?

Anyway. Ignoring underflows is not really an option. When they do _really_
happen, e.g. due to serious scheduling problems, VLC has to resynchronize the
stream somehow. I don't see any solution other than down-sampling or padding,
or is there?

Now, if we assume there was an "underflow", I think it's sane to assume that
there will be a glitch. One glitch sounds bad, but it sounds better (IMHO)
than both a glitch and then temporary downsampling... That's the rationale.

It's also a lot less CPU intensive to not resample.
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Tanu Kaskinen
2011-08-20 18:46:16 UTC
Post by Rémi Denis-Courmont
Sorry... You could maybe argue that an "underrun" is a general situation
whereby the buffer is lower than it should be because the fill speed is
slower than the consumption speed.
But the libpulse API uses the term "underflow". A buffer underflow is an
_empty_ buffer, not merely a less than optimally filled buffer.
Well, when an underflow happens, the stream buffer really is empty. But
there are two buffers: the stream buffer and the sink buffer. As long as
the data that has been previously moved from the stream buffer to the
sink buffer doesn't run out too, glitches are avoidable. Writing to the
stream buffer during an underflow causes the written data to be
immediately moved to the sink buffer, continuing exactly where the
previously written data ended.
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
Currently an underrun means that there was not enough data in the
stream buffer to satisfy the sink's request when it wanted to fill its
buffer. I'm not saying that the current definition is the best possible,
but I don't see anything obviously
wrong in it either.
Then I'm sorry for you. Go get yourself an English dictionary.
Post by Tanu Kaskinen
If VLC assumes that an underrun message means silence/glitch, it's a
bug in VLC,
Are you kidding me? Is this the level of hypocrisy that I should
expect when dealing with PulseAudio?
No, I was not kidding. I didn't think I'd be offensive either, but maybe
I came across as rude. Sorry about that. If the term "underrun" causes
you to make invalid assumptions about Pulseaudio's internal behavior, then
the term may be wrong (which you seem to claim), or the documentation
may be lacking.
The only way that there could be no glitch is if PulseAudio did some
black magic to extrapolate the missing samples in the event of an
"underflow". I don't think this is what happens?
Definitely not. I think I explained above how the glitches are avoided
when underflows happen.
Post by Rémi Denis-Courmont
Anyway. Ignoring underflows is not really an option. When they do _really_
happen, e.g. due to serious scheduling problems, VLC has to resynchronize the
stream somehow. I don't see any solution other than down-sampling or padding,
or is there?
By resynchronizing I guess you mean maintaining A-V sync? I have never
been in contact with code that implements A-V sync, so I can't speak
very confidently here, but I think the idea is that you can at any time
query the current playback latency (fixed hardware latency + currently
buffered data) and use this information to schedule the video frames. In
case of a "real" underrun, the reported latency stays constant even
though you don't send any audio. In other words, the audio stream time
stops for the duration of the gap in the audio output. Since the audio
time stays stopped, your video frames get delayed accordingly, because
they are scheduled using the audio clock instead of the wall clock.

I'm not sure how downsampling is relevant here. Is the video being
synchronized to the wall clock instead of the audio clock and you need
to make the audio stream go faster to catch up with the video stream? If
that's the case, don't you run into trouble with the wall clock and the
audio clock drifting apart (the sound card most likely doesn't run
exactly at 48000 Hz even if it claims to do so)?
--
Tanu
Rémi Denis-Courmont
2011-08-20 20:31:09 UTC
Post by Tanu Kaskinen
Post by Rémi Denis-Courmont
Anyway. Ignoring underflows is not really an option. When they do
_really_ happen, e.g. due to serious scheduling problems, VLC has to
resynchronize the stream somehow. I don't see any solution other than
down-sampling or padding, or is there?
By resynchronizing I guess you mean maintaining A-V sync?
Mostly lip sync, yes.
Post by Tanu Kaskinen
(...) I think the idea is that you can at any time query the current
playback latency (fixed hardware latency + currently buffered data)
and use this information to schedule the video frames.
That would arguably be the best way to implement a video file player.
But the display vertical refresh is an alternative master clock. In the first
case, you may need to drop or duplicate frames. In the second case, you may
need to resample the audio signal.

Anyway, VLC is built with live playback in mind (it started as a DVB-IP
receiver after all). VLC uses the input signal as the master clock (or the
CPU monotonic clock by default). I believe gstreamer uses similar logic,
though I have not checked. In fact, that is the only practical option if the
receiver does not control the input pace.

So the audio can and does drift. This is compensated for through resampling.
Normally VLC would do it internally. Now PulseAudio is unique among VLC
audio outputs insofar as it resamples on VLC's behalf. David suggested
that a while ago.
Post by Tanu Kaskinen
I'm not sure how downsampling is relevant here. Is the video being
synchronized to the wall clock instead of the audio clock and you need
to make the audio stream go faster to catch up with the video stream?
Currently, VLC tolerates 40 ms advance and 60 ms delay, as per EBU
Recommendation 37. If a PulseAudio latency update indicates that playback does
not fall within that 100 ms sliding window, VLC changes the sample rate to try
to restore synchronization without a glitch.

It is thus essential that the stream gets triggered approximately on time,
whether that is initially, upon resuming from pause, or upon recovering from
underflow. Otherwise, resampling kicks in and you get to hear Doppler.
Post by Tanu Kaskinen
If
that's the case, don't you run into trouble with the wall clock and the
audio clock drifting apart (the sound card most likely doesn't run
exactly at 48000 Hz even if it claims to do so)?
Oh yeah. We do. But that's unavoidable in the general case.
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis
Tanu Kaskinen
2011-08-21 05:13:42 UTC
Post by Rémi Denis-Courmont
Post by Tanu Kaskinen
(...) I think the idea is that you can at any time query the current
playback latency (fixed hardware latency + currently buffered data)
and use this information to schedule the video frames.
That would arguably be the best way to implement a video file player.
But the display vertical refresh is an alternative master clock. In the first
case, you may need to drop or duplicate frames. In the second case, you may
need to resample the audio signal.
Anyway, VLC is built with live playback in mind (it started as a DVB-IP
receiver after all). VLC uses the input signal as the master clock (or the
CPU monotonic clock by default). I believe gstreamer uses similar logic,
though I have not checked. In fact, that is the only practical option if the
receiver does not control the input pace.
Right. Makes sense.
Post by Rémi Denis-Courmont
So the audio can and does drift. This is compensated for through resampling.
Normally VLC would do it internally. Now PulseAudio is unique among VLC
audio outputs insofar as it resamples on VLC's behalf. David suggested
that a while ago.
Post by Tanu Kaskinen
I'm not sure how downsampling is relevant here. Is the video being
synchronized to the wall clock instead of the audio clock and you need
to make the audio stream go faster to catch up with the video stream?
Currently, VLC tolerates 40 ms advance and 60 ms delay, as per EBU
Recommendation 37. If a PulseAudio latency update indicates that playback does
not fall within that 100 ms sliding window, VLC changes the sample rate to try
to restore synchronization without a glitch.
It is thus essential that the stream gets triggered approximately on time,
whether that is initially, upon resuming from pause, or upon recovering from
underflow. Otherwise, resampling kicks in and you get to hear Doppler.
The resampling shouldn't be audible if it speeds up the stream by at most
2% (or at least there are such comments in Pulseaudio source code where
similar adaptive resampling is done). I'd guess slight resampling would
be good for small drop-outs. For longer gaps the catch-up time might be
too long and the initial difference between video and audio too
noticeable, so dropping some audio would be better, to get it over with
quickly.

So, maybe the strategy would be just to monitor the timing reports from
PulseAudio and, if the audio starts to lag, depending on the delay either
resample slightly or drop audio.
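A minimal sketch of that policy, under assumed thresholds (the 60 ms tolerance matches the sync window mentioned earlier in the thread; the 500 ms drop cutoff is purely illustrative, and all names are invented):

```c
enum recovery_action {
    RECOVERY_NONE,      /* lag inside the sync tolerance: do nothing   */
    RECOVERY_RESAMPLE,  /* small lag: speed up playback slightly       */
    RECOVERY_DROP       /* large lag: skip audio to resync immediately */
};

#define SYNC_TOLERANCE_US  60000   /* illustrative, from the sync window */
#define DROP_THRESHOLD_US 500000   /* assumed point where resampling
                                      would take too long to catch up   */

enum recovery_action pick_recovery(long lag_us)
{
    if (lag_us <= SYNC_TOLERANCE_US)
        return RECOVERY_NONE;
    if (lag_us <= DROP_THRESHOLD_US)
        return RECOVERY_RESAMPLE;
    return RECOVERY_DROP;
}
```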

With this strategy I guess the problem is that if you drop audio at a
"random" time, a severe underrun will produce two glitches: first
the gap, followed by a short period of audio continuing from where it was
before the gap, and then a skip as the synchronization gets
fixed. If I've understood correctly, you'd like to implement the
underrun recovery so that audio is dropped immediately after the gap in
output, so there would be only one glitch. This might or might not be
somehow doable with stream underruns, but it's too complex for me to try
to come up with a solution. I don't think it's a very bad bug if the
recovery from a severe underrun isn't as smooth as possible.

In case of a sink underrun (a scheduling problem at PulseAudio's end: we
don't fill the hw buffer in time) you won't get any notification about
the gap anyway (beyond the timing info), so there's nothing you can do
but drop audio at a random time (or resample).
--
Tanu
Maarten Bosmans
2011-08-21 07:24:48 UTC
Permalink
Post by Tanu Kaskinen
Post by Rémi Denis-Courmont
It is thus essential that the stream gets triggered approximately on time,
whether that is initially, upon resuming from pause, or upon recovering from
underflow. Otherwise, resampling kicks in and you get to hear Doppler.
The resampling shouldn't be audible if it speeds up the stream by at most
2% (or at least there are such comments in the PulseAudio source code where
similar adaptive resampling is done). I'd guess slight resampling would
be good for small drop-outs. For longer gaps the catch-up time might be
too long and the initial difference between video and audio too
noticeable, so dropping some audio would be better to get done with it
quickly.
Actually, that is 2 per mille, and it refers to the jump in sample rate
that should be inaudible. The total difference from the original rate
that can't be detected really depends on things like the type of
source material, the absolute pitch of the listener, and whether the audio
is playing alone or together with something else. For PulseAudio I
limited the total deviation from the original rate to 25%, which is
certainly audible, but it should get there slowly enough that you
can't hear any Doppler-like effects.
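The stepping described here might look roughly like this (a hypothetical sketch; the function and parameter names are invented, and only the 2 per mille step and the 25% cap come from the message above):

```c
/* One adjustment step of an adaptive resampler: move the current rate
 * towards the target by at most 2 per mille of the base rate, and never
 * let the total deviation exceed 25% of the base rate. */
#define MAX_STEP_PER_MILLE 2
#define MAX_DEVIATION_PCT  25

unsigned int adjust_rate(unsigned int base, unsigned int current,
                         unsigned int target)
{
    unsigned int step = base * MAX_STEP_PER_MILLE / 1000;
    unsigned int lo = base - base * MAX_DEVIATION_PCT / 100;
    unsigned int hi = base + base * MAX_DEVIATION_PCT / 100;
    unsigned int next;

    if (target > current)
        next = current + (target - current > step ? step : target - current);
    else
        next = current - (current - target > step ? step : current - target);

    if (next < lo)
        next = lo;
    if (next > hi)
        next = hi;
    return next;
}
```

Repeated calls walk the rate towards the target a little at a time, which is what keeps the pitch change below the audibility threshold at each step.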

Maarten
David Henningsson
2011-08-21 11:41:59 UTC
Permalink
Post by Tanu Kaskinen
Post by David Henningsson
Post by Rémi Denis-Courmont
Hello,
Post by David Henningsson
I've spent most the afternoon trying to figure out why VLC doesn't work
well with a large tlengths. I seem to have found suboptimal behaviour on
both the PulseAudio and VLC sides.
Nice.
Post by David Henningsson
What bothers me on the PulseAudio side is this call (in alsa-sink.c,
mmap_write):
pa_sink_render_into_full(u->sink, &chunk);
For this example, assume tlength is 500 ms and minreq is 50 ms. In
adjust latency mode (which I understand is recommended for power
efficiency), this is configured to the client's tlength/2 - minreq = 200
ms. The problem here is that if the client is filled up to only e g 130
ms, PulseAudio will take the 130 ms, the client will underrun, and hand
out 70 ms of silence. A better behaviour would be to write the 130 ms
that are available, and go to sleep until the 130 ms is almost up and
see if more data has come in at that time.
However, things are probably not as bad as it looks. If a new package
comes in from the client in time, I believe PulseAudio would rewind back
the 70 ms of silence and write the new data, and no glitch will be
heard. So the worst thing is actually the somewhat "false alarm" sent to
the client.
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio stream...
this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.
Why not? It sounds like you'd want to define "underrun" differently from
what it's currently defined as. Currently an underrun means that there
was not enough data in the stream buffer to satisfy the sink's request
when it wanted to fill its buffer. I'm not saying that the current
definition is the best possible, but I don't see anything obviously
wrong in it either.
Seen from a libpulse user's perspective, I think it would make more
sense to report when there is an underrun in the sense that a glitch is
unavoidable, rather than the current handling.

That might be a redefinition compared to what PulseAudio currently means
by underflow/underrun, but I think such a change in definition would
be more useful, and also more in line with what people in general would
expect.

(Just to complicate things further, add to the equation that some sink
types do not support rewinding. But in such cases I think the proposed
definition is actually equivalent to the current definition.)
Post by Tanu Kaskinen
If your explanation of the sink latency calculation is correct, then it
sounds like underruns with a reasonably high tlength (like 500 ms) and
reasonably low minreq (like 50 ms) should be rare. If VLC is having
constant underruns, that sounds like a problem at VLC's end. The sink
will never request more than 200 ms at a time. The worst case is if the
stream buffer contains 451 ms worth of audio (no request sent to VLC
yet), and the sink asks for the full 200 ms amount. After that the
buffer will contain 251 ms, and VLC will get a request to send 249 ms
worth of audio. VLC will at the very least have 251 ms margin to send
the data. That doesn't sound like a difficult target to achieve.
There was a problem in that 251 ms of data was not supplied before
stream start, which led to underrun reports. This was what I suggested
improving in VLC.

However, from what I've been told about VLC, the latency (e.g. with live
streaming) is usually around 500 ms, but can drop down to 40 ms
occasionally, e.g. with late packet arrival from the network. And I was
hoping that PulseAudio could cope with that, without having to choose a
tlength of 40 ms (or 40 ms * 2, 3 or 4 depending on
PA_STREAM_ADJUST_LATENCY, minreq, etc).
Post by Tanu Kaskinen
If VLC would support higher tlengths than 500 ms, it would be even
easier to avoid underruns. I would guess that the minimum margin of
251 ms isn't a coincidence - the reaction time given to clients is
probably never less than tlength / 2 (so maybe minreq doesn't even play
a significant role in avoiding underruns?).
Yeah, usually minreq is low enough to be insignificant.
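The arithmetic from the quoted worst case can be checked directly (numbers from the thread: tlength 500 ms, minreq 50 ms, adjust-latency mode; the helper names are made up):

```c
#define TLENGTH_MS 500
#define MINREQ_MS   50

/* Max amount the sink pulls per fill in adjust-latency mode, as
 * discussed earlier in the thread: tlength/2 - minreq. */
int sink_request_ms(void)
{
    return TLENGTH_MS / 2 - MINREQ_MS;
}

/* Worst case from the thread: the buffer sits at 451 ms, just above the
 * level where no request is outstanding, and the sink pulls a full
 * request.  What remains is the margin VLC has to respond. */
int worst_case_margin_ms(void)
{
    int fill = TLENGTH_MS - MINREQ_MS + 1;  /* 451 ms */
    return fill - sink_request_ms();
}

/* Size of the request VLC then receives. */
int worst_case_request_ms(void)
{
    return TLENGTH_MS - worst_case_margin_ms();
}
```

This reproduces the 200 ms sink request, the 251 ms margin, and the 249 ms request mentioned above.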
Post by Tanu Kaskinen
If VLC assumes that an underrun message means silence/glitch, it's a bug
in VLC, at least until someone changes the definition that Pulseaudio
uses.
Or, seen from the other side, it would be a bug, or at least a missing
feature, in PulseAudio not to have underrun reporting in a way that's
useful to VLC. Anyway, that's not really relevant to me; finding
a scapegoat is not the point here. The point is that we both want happy
end users to enjoy all the nice features of VLC and PulseAudio together,
without glitches or other problems with their audio :-)
Post by Tanu Kaskinen
Also, it sounds like making things more complicated than necessary if VLC
doesn't use PulseAudio's prebuffering feature but manually corks the
stream during prebuffering. But maybe there are valid reasons for that.
Yes, A/V synchronisation (in combination with not having the audio clock
as master clock) is what requires the stream to start at a specific
point in time.
--
David Henningsson, Canonical Ltd.
http://launchpad.net/~diwic
Tanu Kaskinen
2011-08-22 05:08:34 UTC
Permalink
Post by David Henningsson
Post by Tanu Kaskinen
Why not? It sounds like you'd want to define "underrun" differently from
what it's currently defined as. Currently an underrun means that there
was not enough data in the stream buffer to satisfy the sink's request
when it wanted to fill its buffer. I'm not saying that the current
definition is the best possible, but I don't see anything obviously
wrong in it either.
Seen from a libpulse user's perspective, I think it would make more
sense to report when there is an underrun in the sense that a glitch is
unavoidable, rather than the current handling.
That might be a redefinition compared to what PulseAudio currently means
by underflow/underrun, but I think such a change in definition would
be more useful, and also more in line with what people in general would
expect.
What is the reason for the new definition to be more useful? Is it that
VLC would like to stop the stream when a glitch happens, throw away some
audio from exactly after the point where the glitch happened, and then
continue playing from a well-known point? So it's not enough to make the
underflow messages happen only when a real glitch happens; the stream
must also be automatically corked.

What about hw buffer underflows? Currently those are not reported at
all, even though I think VLC would like to use the same recovery logic
in those cases too. If stopping the stream is required in stream
underflow cases to properly handle the glitch, I guess the whole sink
needs to be stopped in case of an ALSA underflow? Hmmm... no, that
should not be required. Only the data of those streams that want to use
"the better way" of handling underflows needs to be rewound out of
the hw buffer.

To me the benefits of doing this don't sound very big, but I probably
won't oppose if someone wants to implement the needed changes.
Post by David Henningsson
Post by Tanu Kaskinen
If your explanation of the sink latency calculation is correct, then it
sounds like underruns with a reasonably high tlength (like 500 ms) and
reasonably low minreq (like 50 ms) should be rare. If VLC is having
constant underruns, that sounds like a problem at VLC's end. The sink
will never request more than 200 ms at a time. The worst case is if the
stream buffer contains 451 ms worth of audio (no request sent to VLC
yet), and the sink asks for the full 200 ms amount. After that the
buffer will contain 251 ms, and VLC will get a request to send 249 ms
worth of audio. VLC will at the very least have 251 ms margin to send
the data. That doesn't sound like a difficult target to achieve.
There was a problem in that 251 ms of data was not supplied before
stream start, which led to underrun reports. This was what I suggested
improving in VLC.
However, from what I've been told about VLC, the latency (e.g. with live
streaming) is usually around 500 ms, but can drop down to 40 ms
occasionally, e.g. with late packet arrival from the network. And I was
hoping that PulseAudio could cope with that, without having to choose a
tlength of 40 ms (or 40 ms * 2, 3 or 4 depending on
PA_STREAM_ADJUST_LATENCY, minreq, etc).
My suggestion is to just skip some audio when VLC notices the stream
lagging more than what is practical to correct with resampling, and
ignore the underflow messages if VLC doesn't want to increase the
tlength when underflows happen (maybe it's somehow important that live
streams have a latency of 500 ms at most). With the current implementation
I think the underflow messages are only useful for clients that react to
them by increasing the buffer size (and for ALSA applications that do
stream draining in a silly way).
--
Tanu
Rémi Denis-Courmont
2011-08-20 17:53:32 UTC
Permalink
Post by David Henningsson
Post by Rémi Denis-Courmont
VLC currently assumes that a PulseAudio under-run event implies a
silence/glitch. It uses it as an opportunity to resync the audio
stream... this is not good if there was no actual under-run :-/
Agreed. PulseAudio should not send the underrun message if there is a
possibility that the client can avoid the underrun by sending more data.
Fixing that is quite complex though. :-(
VLC could ignore underflow events. But I fear this will cause a Doppler
effect when VLC resorts to resampling to catch up instead.
--
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis