The Ogg encoder is slightly less optimal under this configuration. It
used to send shout data directly out of its ogg_page structures. Now,
in the interest of encapsulation, it copies the data from its ogg_page
structures into a buffer provided by the shout audio output plugin
(see audioOutput_shout_ogg.c, line 77.) I suspect the performance
impact is negligible.
[mk: this patch and its description was separated from Eric's patch
"Refactor and cleanup of shout Ogg and MP3 audio outputs"]
Begin dividing audioOutput_shout.c: move everything OGG Vorbis related
to audioOutput_shout_ogg.c. The header audioOutput_shout.h has to
keep its dependency on vorbis/vorbisenc.h, because it needs the vorbis
encoder types.
For this patch, we have to export several internal functions with
generic names to the ABI; these will be removed later when the encoder
plugin patches are merged.
Remove unused code which is in comments. Remove that comment about
"stolen code", since the plugin has changed much, and it isn't obvious
which parts are derived.
If the output device is already open, it may have modified
outAudioFormat; in this case, outAudioFormat is still valid, and does
not need an overwrite.
As long as the device isn't open, both attributes are not used. Since
they will both be initialized in audio_output_open(), we do not need
the initialization in audio_output_init().
Storing pointers to immutable audio_format structs isn't worth it,
because the struct itself isn't much larger than the pointer. Since
the shout plugin requires the user to configure a fixed audio format,
we can simply copy it in myShout_initDriver().
Save one allocation, since the whole audio_format struct is nearly the
same size as the pointer to it. Check audio_format_defined(af)
instead of af!=NULL.
free(NULL) isn't explicitly forbidden, but isn't exactly good style.
Check the rare case that the audio buffer isn't initialized yet in
closeAudioDevice(). In this case, we also don't have to call
flushAudioBuffer().
To make openAudioDevice() smaller and more readable, move code to a
static function. Also don't use realloc(), since the old value of the
buffer isn't needed anymore, saving a memcpy().
There are too many static variables in audio.c - organize all
properties of the audio buffer in a struct. The current audio format
is also a property of the buffer, since it describes the buffer's
data format.
audio_format_clear() sets an audio_format struct to an cleared
(undefined) state, which is both faster and smaller than memset(0).
audio_format_defined() checks if the audio_format struct actually has
a defined value (i.e. non-zero). Both can be used to avoid pointers
to audio_format, replacing the "NULL" value with an "undefined"
audio_format.
Since the caller chain doesn't care about the return value (except for
COMMAND_RETURN_KILL, COMMAND_RETURN_CLOSE), just return 0 if there is
nothing special. This saves one local variable initialization, and
one access to it.
Also remove one unreachable "return 1" from client_read().
Don't close the client within client_process_line(), return
COMMAND_RETURN_CLOSE instead. This is the signal for the caller chain
to actually close it. This makes dealing with the client pointer a
lot safer, since the caller always knows whether it is still valid.
The "!src" check in copyAudioFormat() used to hide bugs - one should
never pass NULL to it. There is one caller which might pass NULL, add
a check in this caller.
Instead of doing mempcy(), we can simply assign the structures, which
looks more natural.
The way we used non-blocking mode was HORRIBLE.
It was non-blocking to ALSA, but we end up blocking in a busy
loop that does absolutely NOTHING but retry. We don't check
for playback cancellation (like we do in decoders) or anything.
This is seriously broken and I can imagine it affects people on
fast CPUs more because we do asynchronous output buffering and
our ALSA device will always have data ready.
This is safer than the patch in
http://www.musicpd.org/mantis/view.php?id=1542
with multiple audio outputs enabled.
Sadly, I only noticed that patch/problem when I googled for
"snd_config_update_free_global"
Apparently snd_pcm_hw_params_can_resume() can return false even
though my hardware does in fact support resuming. So stop
carrying that value in the canResume flag and just try to resume
when we're in the suspended state; falling back to
snd_pcm_prepare only if resuming fails. libao does something
similar on resume, too.
While we're at it, use the E() macro which will enable us to
have better error reporting.
[mk: remove the E() macro stuff]
With a large music database, the linear string collection in
tagTracker.c becomes very slow. We implemented that in a
quick'n'dirty fashion when we removed tree.c, and now we rewrite it
using the fast hashed string set.
"struct strset" is a hashed string set: you can add strings to this
library, and it stores them as a set of unique strings. You can get
the size of the set, and you can enumerate through all values.
This will be used to replace the linear tagTracker library.
Instead of having to register each output plugin, store them
statically in an array. This eliminates the need for the List library
here, and saves some small allocations during startup.
Due to clumsy layout, the audio_format struct took 12 bytes. Move the
"channels" to the end, so it can be merged into the same 32 bit slot
as "bits", which reduces the struct size to 8 bytes.
print_playlist_result() had an assert(0) at the end, in case there was
an invalid result value. With NDEBUG, this resulted in a function not
returning a value - add a dummy "return -1" at the end to keep gcc
quiet.
Since all callers of song_id_exists() will map it to a song position
after the check, introduce a new function called song_id_to_position()
which performs both the check and the map lookup, including nice
assertions.
volatile provides absolutely no guarantee thread-safety in SMP
environments. volatile was designed to access memory locations
in peripheral hardware directly; not for SMP. If volatile is
needed to work properly on SMP, then it is only hiding subtle
bugs.
volatile only prevents the /compiler/ from making optimizations
when accessing variables. CPUs do their own optimizations at
runtime so it cannot guarantee registers of CPUs are flushed
to memory cache-coherent access on different CPUs.
Furthermore, the thread-communication via condition variables
between threads sharing audio formats already results in memory
barriers.
The tag pool is a shared global resource that is infrequently
modified. However, it can occasionally be modified by several
threads, especially by the metadata_pipe for streaming metadata
(both reading/writing).
The bulk tag_item pool is NOT locked as currently only the
update thread uses it.
Trying to read or remember
"tag->numOfItems * sizeof(*tag->items)"
requires too much thinking and mental effort on my part.
Also, favor "sizeof(struct mpd_tag)" over "sizeof(*tag->items)"
because the former is easier to read and follow, even though
the latter is easier to modify if the items member changes
to a different type.
The previous patch enabled these warnings. In Eric's branch, they
were worked around with a generic deconst_ptr() function. There are
several places where we can add "const" to pointers, and in others,
libraries want non-const strings. In the latter, convert string
literals to "static char[]" variables - this takes the same space, and
seems safer than deconsting a string literal.
All callers of fdprintf() have been converted to client_printf() or
fprintf(); it is time to remove this clumsy hack now. We can also
remove client_print() which took a file descriptor as parameter.
Now that we have removed all invocations of client_get_fd(), we can
safely remove this transitional function. All access to the file
descriptor is now hidden behind the interface declared in client.h.
The shared code in showPlaylist() isn't worth it, because we aim to
remove fdprintf(). Duplicate this small function, and enable stdio
buffering for saved playlists.
The function loadPlaylist() wants to report incremental errors to the
client, for this reason we cannot remove its protocol dependency right
now. Instead, make it use the client struct instead of the raw file
descriptor.
Don't pass the raw file descriptor around. This migration patch is
rather large, because all of the sources have inter dependencies - we
have to change all of them at the same time.
Pass the client struct to CommandHandlerFunction and
CommandListHandlerFunction. Most commands cannot take real advantage
of that yet, since most of them still work with the raw file
descriptor.
These two functions take a client struct instead of the file
descriptor. We will now begin passing the client struct around
instead of a raw file descriptor (which needed a linear lookup in the
client list to be useful).
This patch continues the work of the previous patch: don't pass a file
descriptor at all to traverseAllIn(). Since this fd was only used to
report "directory not found" errors, we can easily move that check to
the caller. This is a great relief, since it removes the dependency
on a client connection from a lot of enumeration functions.
Database traversal should be generic, and not bound to a client
connection. This is the first step: no file descriptor for the
callback functions forEachSong() and forEachDir(). If a callback
needs the file descriptor, it has to be passed in the void*data
pointer somehow; some callbacks might need a new struct for passing
more than one parameter. This might look a bit cumbersome right now,
but our goal is to have a clean API.
Continuing the effort of removing protocol specific calls from the
core libraries: let the command.c code call commandError() based on
PlaylistInfo's return value.
The playlist library shouldn't talk to the client if possible.
Introduce the "enum playlist_result" type which the caller
(i.e. command.c) may use to generate an error message.
Client's input values should be validated by the command
implementation, and the core libraries shouldn't talk to the client
directly if possible. Thus, setPlaylistRepeatStatus() and
setPlaylistRandomStatus() don't get the file descriptor, and cannot
fail (return void).
The function valid_playlist_name() checks the name, but it insists on
reporting an eventual error to the client. The new function
is_valid_playlist_name() is more generic: it just returns a boolean,
and does not care what the caller will use it for. The old function
valid_playlist_name() will be removed later.
Currently, when the tag cache is being serialized to hard disk, the
stdio buffer is flushed before every song, because tag_print.c
performs unbuffered writes on the raw file descriptor. Unfortunately,
the fdprintf() API allows buffered I/O only for a client connection by
looking up the client pointer owning the file descriptor - for stdio,
this is not possible. To re-enable proper stdio buffering, we have to
duplicate the tag_print.c code without fprintf() instead of our custom
fdprintf() hack. Add this duplicated code to tag_save.c.
Move everything which dumps song information (via tag_print.c) to a
separate source file. song_print.c gets code which writes song data
to the client; song_save.c is responsible for serializing songs from
the tag cache.
Based on client_puts(), client_printf() is the successor of
fdprintf(). As soon as all fdprintf() callers have been rewritten to
use client_printf(), we can remove fdprintf().
client_write() writes a buffer to the client and buffers it if
required. client_puts() does the same for a C string. The next patch
will add more tools which will replace fdprintf() later.
clearMpdTag could be called on a tag that was still in a
tag_begin_add transaction before tag_end_add is called. This
was causing free() to attempt to operate on bulk.items; which is
un-free()-able. Now instead we unmark the bulk.busy to avoid
committing the tags to the heap only to be immediately freed.
Additionally, we need to remember to call tag_end_add() when
a song is updated before we NULL song->tag to avoid tripping
an assertion the next time tag_begin_add() is called.
Since client->fd==-1 has become our "expired" flag, it may already be
-1 when client_close() is called. Don't assert that it is still
non-negative, and call client_set_expired() instead.
During the tag library refactoring, the shout plugin was disabled, and
I forgot about adapting it to the new API. Apply the same fixes to
the oggflac decoder plugin.
While parsing the tag cache, don't allocate the directory name from
the heap, but copy it into a buffer on the stack. This reduces heap
fragmentation by 1%.
If many tag_items are added at once while the tag cache is being
loaded, manage these items in a static fixed list, instead of
reallocating the list with every newly created item. This reduces
heap fragmentation.
Massif results again:
mk before: total 12,837,632; useful 10,626,383; extra 2,211,249
mk now: total 12,736,720; useful 10,626,383; extra 2,110,337
The "useful" value is the same since this patch only changes the way
we allocate the same amount of memory, but heap fragmentation was
reduced by 5%.
Try to detect if the string needs Latin1-UTF8 conversion, or
whitespace cleanup. If not, we don't need to allocate temporary
memory, leading to decreased heap fragmentation.
At several places, we create temporary copies of non-null-terminated
strings, just to use them in functions like validUtf8String(). We can
save this temporary allocation and avoid heap fragmentation if we
add a length parameter instead of expecting a null-terminated string.
Since the inline function cannot modify its caller's variables (which
is a good thing for code readability), the new string pointer is the
return value. The resulting binary should be the same as with the
macro.
The new source tag_pool.c manages a pool of reference counted tag_item
objects. This is used to merge tag items of the same type and value,
saving lots of memory. Formerly, only the value itself was pooled,
wasting memory for all the pointers and tag_item structs.
The following results were measured with massif. Started MPD on
amd64, typed "mpc", no song being played. My music database contains
35k tagged songs. The results are what massif reports as "peak".
0.13.2: total 14,131,392; useful 11,408,972; extra 2,722,420
eric: total 18,370,696; useful 15,648,182; extra 2,722,514
mk f34f694: total 15,833,952; useful 13,111,470; extra 2,722,482
mk now: total 12,837,632; useful 10,626,383; extra 2,211,249
This patch set saves 20% memory, and does a good job in reducing heap
fragmentation.
The value is stored in the same memory allocation as the tag_item
struct; this saves memory because we do not store the value pointer
anymore. Also remove the getTagItemString()/removeTagItemString()
dummies.
This patch makes MPD consume much more memory because string pooling
is disabled, but it prepares the next bunch of patches. Replace the
code in tagTracker.c with naive algorithms without the tree code. For
now, this should do; later we should find better algorithms,
especially for getNumberOfTagItems(), which has become wasteful with
temporary memory.
Unfortunately, the C standard postulates that the argument to free()
must be non-const. This does not makes sense, and virtually prevents
every pointer which must be freed at some time to be non-const. Use
the deconst hack (sorry for that) to allow us to free constant
pointers.
Instead of passing the pointer to the "expired" flag to
processListOfCommands(), this function should use the client API to
check this flag. We can now remove the "global_expired" hack
introduced recently.
Start exporting the client struct as an opaque struct. For now, pass
it only to processCommand() and processListOfCommands(), and provide a
function to extract the socket handle. Later, we will propagate the
pointer to all command implementations, and of course to
client_print() etc.
The old code tried to write a response to the client, without even
checking if it was already closed. Now that we have added more
assertions, these may fail... perform the "expired" check earlier.
Patch bdeb8e14 ("client: moved "expired" accesses into inline
function") was created under the wrong assumption that
processListOfCommands() could modify the expired flag, which is not
the case. Although "expired" is a non-const pointer,
processListOfCommands() just reads it, using it as the break condition
in a "while" loop. I will address this issue with a better overall
solution, but for now provide a pointer to a global "expired" flag.
client_defer_output() was modified so that it can create the
deferred_send list. With this patch, the assertion on
"deferred_send!=NULL" has become invalid. Remove it.
Previously, when select() failed, we assumed that there was an invalid
file descriptor in one of the client structs. Thus we tried select()
one by one. This is bogus, because we should never have invalid file
descriptors. Remove it, and make select() errors fatal.
All of the client's resources are freed in client_close(). It is
enough to set the "expired" flag, no need to duplicate lots of
destruction code again and again.
Due to the large buffers in the client struct, the static client array
eats several megabytes of RAM with a maximum of only 10 clients. Stop
this waste and allocate each client struct from the heap.
Second patch: rename the internal struct name. We will eventually
export this type as an opaque forward-declared struct later, so we
can pass a struct pointer instead of a file descriptor, which would
save us an expensive linear lookup.
I don't believe "interface" is a good name for something like
"connection by a client to MPD", let's call it "client". This is the
first patch in the series which changes the name, beginning with the
file name.
linux/list.h is a nice doubly linked list library - it is lightweight
and powerful at the same time. It will be useful later, when we begin
to allocate client structures dynamically. Import it, and strip out
all the stuff which we are not going to use.