Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First implementation of the visualization output plugin #1910

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

sp1ff
Copy link
Contributor

@sp1ff sp1ff commented Dec 10, 2023

NOTE TO REVIEWERS: this was discussed previously, but life got busy.

This commit contains the first implementation of a visualization output plugin. It servers a very simple network protocol, backed by a very simple audio analysis. I'm using it as my daily driver, but it needs review & hardening.

Update: Converting this to a draft while I work through the CI issues. Will squash all commits before marking "ready for review".

Update #2: CI passing, unit test suite considerably beefed-up. Using this build as my daily driver. @MaxKellermann ready for review.

@MaxKellermann
Copy link
Member

Hey, I'm happy you continued that effort. That is certainly a useful feature for many MPD users.
I hope I can take time for a proper review soon, but meanwhile, please check out your coding style:

  • indent with tabs! Your code tabs and spaces at random
  • return type should be on its own line as the length of C++ types can get out of hand and this would add indentation to all parameters
  • for multi-line comments, use multi-line comment syntax (/* ... */) and not muliple instances of the single-comment syntax (//)
  • MPD has meanwhile switched to SPDX copyright headers, please do the same for your new code
  • in some comments, you have non-ASCII emojis - I prefer to leave the code plain ASCII, except if there are good reasons to use Unicode, but those emojis don't look particularly useful
  • functions/method/structs should be CamelCase, same for source file names
  • if a function never throws, decorate it with noexcept
  • top-level definitions (e.g. structs) have pretty generic names in our code, but they may conflict with structs of the same name in other compilation units, conflicting each other's symbol namespaces; better use prefixed names, or even better: put everything in a C++ namespace

Comment on lines 52 to 44
for (auto p0 = clients.begin(), p1 = clients.end(); p0 != p1; ) {
auto p = p0++;
if ((*p)->IsClosed()) {
LogInfo(vis_server_domain, "Reaping closed client.");
clients.erase(p);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loops endlessly if at least one client isn't closed, doesn't it?

Copy link
Contributor Author

@sp1ff sp1ff Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so... the iterator p0 is unconditionally incremented on line 53, so the iteration will always make progress whether or not anyone is closed.

/// Clients have both a reference to the PCM cache as well as a
/// SoundAnalysis instance while the plugin is opened. We'll create new
/// clients with our present state
std::list<std::unique_ptr<VisualizationClient>> clients;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the point of wrapping the item type in unique_ptr? All this does is add a layer of indirection, doesn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I wrapped it because VisualizationClient, being a Buffered Socket is not copy constructable. Your comment made me realize I could emplace elements in the list rather than push them. I've removed the unique_ptr.

/// clients with our present state
std::list<std::unique_ptr<VisualizationClient>> clients;
/// invoked periodically to clean-up dead clients
CoarseTimerEvent reaper;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have a timer, why not remove dead clients right away?

Copy link
Contributor Author

@sp1ff sp1ff Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to remove dead clients right away, the server would need a means of notification. Other plugins use a "back reference" from client to server as this means. This, however, introduces a cyclic dependency (server owns clients, clients have a reference to server) between the two, which I prefer to avoid.
I wrote-up my thoughts in VisualizationOutputPlugin.cxx, line 415 (sorry, can't get GitHub to link directly to it).


using std::byte;

constexpr byte zero = byte{0};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This coding style looks clumsy. This is easier:

constexpr byte zero{0};

Why the redundant byte?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed-- changed.

*pout++ = h2;
*pout++ = h3;

*pout++ = sixteen;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't even get why you declare symbolic constants when all the name does is describe the integer value in numbers. Why not just:

*pout++ = std::byte{16};

Copy link
Contributor Author

@sp1ff sp1ff Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to avoid literals in code (well, other than their declaration). Perhaps I've adhered too literally to what was meant to be a general guideline. Changed.

FmtInfo(vis_server_domain, "VisualizationServer::OnAccept({})", gettid());

// Can we allow an additional client?
if (max_clients && clients.size() > max_clients) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off-by-one error? This allows max_clients+1 clients.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes-- fixed.

Comment on lines 79 to 68
/// ring buffer size/capacity, in bytes
std::size_t cb_ring;
/// this is the ring buffer
std::unique_ptr<uint8_t[]> ring;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MPD has class AllocatedArray which wraps these two fields nicely - it could be used to simplify your code.

Copy link
Contributor Author

@sp1ff sp1ff Dec 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick read over class AllocatedArray. Can you tell me what it offers above & beyond unique_ptr<uint8_t []>?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is mostly the same as unique_ptr, but also remembers the size of the array allocation. Thus, you have one field instead of two.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed.

Comment on lines 67 to 54
typedef std::chrono::steady_clock::duration Duration;
typedef std::chrono::microseconds Time;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get these two types. One is called "duration" and it represents a duration. The other is called "time" and it represents ..... a time? Huh?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IME "time" generally represents a singular moment in time; 07:23:00 December 17, 2023, for instance. A duration is the span of time between two such moments; 10 seconds, e.g.
Do you think I should rename one or both types?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C++-speak, what you call "time" is a "time point", and C++ has a dedicated type for that (called time_point). You however (ab)use the "duration" type to express a time point. But how can you express a time point in milliseconds? What's the reference time point? It's not documented in your code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhhh... I completely missed that. Thank you. Fixed.

/// # of bytes currently in the ring buffer (as distinct from capacity)
std::size_t cb;
/// Valid PCM data exists in buf[p0, p1)
ptrdiff_t p0, p1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a signed integer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No good reason; must have been an artifact of a first implementation. Changed, in any event.

@MaxKellermann
Copy link
Member

When you fix PR commits, please amend them instead of adding fixup commits. I don't want to merge known-bad commits; a PR is "good" when all individual commits are "good".

libfftw3_dep = dependency('fftw3f', version: '>= 3.3.10', required: get_option('fftw3'))
output_features.set('ENABLE_FFTW3', libfftw3_dep.found())
output_features.set('ENABLE_VISUALIZATION_OUTPUT', get_option('visualization'))
if get_option('visualization') and libfftw3_dep.found()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic seems somewhat wrong. The user asks Meson to enable visualization, but then Meson doesn't find libfftw and disables the plugin, even though the user's request could not be fulfilled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have two Meson options: one to enable "FFTW support" - but what does it mean, what does it do?
Second option is "Visualization output plugin", I understand that. So I disable it, but MPD still looks for libfftw - why?
The I enable "visualization", but this still depends on the "fftw" options where I can have contradicting values.
The simples solution would be to have one single option, let's say "visualization", which is used as "required" parameter for looking up libfftw. This would be less obscure and there wouldn't be a way to have contradicting options.

#include <algorithm>
#include <cstdint>

#include <netinet/in.h>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That header isn't portable, is it?
I suggest using MPD's util/ByteOrder.hxx header, which is portable and even faster!
It also has a PackedBE32 class which you can use to write big-endian 32 bit integers to a byte stream (without reverting to std::copy()).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh... I had no idea (I only work on Linux). Replaced... but I don't see how to use PackedBE{16,32} to copy the raw bytes into a buffer without recourse to std::copy (or its equivalent)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the Packed classes are just bytes, you can simply cast a pointer to PackedBE16* and access the object from there.

return std::copy(p, p + 2, pout);
}

/* Convert an IEEE 754 single-precision floating-point number to wire format;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a static_assert to ensure that this platform really uses IEEE754 (see https://en.cppreference.com/w/cpp/types/numeric_limits/is_iec559). All commonly used do, but I want this to be safe just in case somebody compiles MPD on some very exotic future CPU architecture.

template <typename OutIter>
OutIter
SerializeFloat(float f, OutIter pout) {
auto m = htonl(*(uint32_t*)&f);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cast is undefined behavior, isn't it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting one object type to another is undefined behavior in C++ (there are a few exceptions).

};

/**
* \brief Attempt to parse a CLIHLO message from the given buffer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is CLIHLO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added references to the message documentation.


size_t max_coeffs_idx = num_samples/2;

for (uint8_t c = 0; c < num_channels; ++c) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using uint8_t here instead of unsigned has no effect at best, and can add overhead and code bloat. It's a harmful micro-optimization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't intended as an optimization or any sort; just chose the type to match that of num_channels. Changed, but can you say more about how making c a uint8_t could add overhead & code bloat?

Comment on lines 55 to 56
static constexpr size_t DEF_LO_CUTOFF = 200;
static constexpr size_t DEF_HI_CUTOFF = 10000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does DEF_ mean and why is this a size_t when the non-static fields with similar names are float?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short for "default"-- re-named. They're integral because ConfigBlock offers no accessors for floating-point values.


public:
SoundAnalysisParameters() noexcept;
SoundAnalysisParameters(const ConfigBlock &config_block);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be explicit to avoid implicit unwanted conversions.

/* In both `out` & `pout`, the coefficients are laid out as:
* | coeffs for chan #0... | coeffs for chan #1... | ... |
* so outer loop will be on channel. */
for (uint8_t chan = 0; chan < num_channels; ++chan) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another bad micro-optimization by using uint8_t

}

Visualization::SoundInfoCache::SoundInfoCache(const AudioFormat &audio_format,
const Duration &buf_span):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad indent

@sp1ff sp1ff force-pushed the soundinfo branch 3 times, most recently from 173d4a8 to 1c7c4e4 Compare January 11, 2024 01:45
@sp1ff sp1ff force-pushed the soundinfo branch 2 times, most recently from 7f52ae9 to 119c0d2 Compare February 19, 2024 23:56
@sp1ff sp1ff marked this pull request as draft February 20, 2024 00:04
@sp1ff sp1ff force-pushed the soundinfo branch 4 times, most recently from 64d8629 to a25ae7f Compare March 5, 2024 21:27
@sp1ff sp1ff marked this pull request as ready for review March 5, 2024 21:39
@@ -40,7 +40,7 @@ jobs:
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
ninja-build \
quilt
quilt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??

#define THREAD_ID_FORMATTER_HXX

#include <fmt/format.h>
#include <sstream>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please please please no iostreams in MPD!
Why do we even need this whole header??

@@ -160,6 +160,28 @@ else
wasapi_dep = dependency('', required: false)
endif

libfftw3_dep = dependency('fftw3f', version: '>= 3.3.8', required: get_option('fftw3'))
output_features.set('ENABLE_FFTW3', libfftw3_dep.found())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This macro is never used

Copy link
Member

@MaxKellermann MaxKellermann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read only part of the code, but found so many things that were wrong or more complicated than they should be, I stopped reading further.

return std::copy(p, p + 2, pout);
}

static_assert(std::numeric_limits<float>::is_iec559);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not clear why this line exists - if you move it into the function that relies on this assertion, it would be clearer.

template <typename OutIter>
OutIter
SerializeU16(uint16_t n, OutIter pout) {
auto m = PackedBE16(n);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird coding style. PackedBE16 m{n} is simpler.

*/
template <typename OutIter>
OutIter
SerializeU16(uint16_t n, OutIter pout) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason you decided to make this a template instead of just passing a std::byte*?

SerializeU16(uint16_t n, OutIter pout) {
auto m = PackedBE16(n);
auto p = (std::byte*)(&m);
return std::copy(p, p + 2, pout);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using std::copy after creating a copy on the stack seems pretty clumsy. Now we have three copies of the value. Can we do better than that?

Comment on lines +47 to +48
auto r = PackedBE32(*(const uint32_t*)&(c[0]));
auto i = PackedBE32(*(const uint32_t*)&(c[1]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yet more undefined behavior here. Why are you not calling SerializeFloat here twice? Too simple?

/* FSM initial state; the socket has been established, but no
* communication has taken place; we are expecting a CLIHLO
* message to arrive (i.e. a READ/POLLIN notification) */
Init,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

enum choices should be UPPER_CASE


inline
typename std::chrono::microseconds::rep
NowTicks() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only used for your excessive debug prints, but those should be removed before I merge this. What's even the point of this function, why do you insert those time stamps in all log messages?

if (!clients.empty()) {
LogInfo(vis_server_domain, "Scheduling another reaping in 3 "
"seconds.");
reaper.Schedule(std::chrono::seconds(3));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This periodically calls ReapClients, all the time, while MPD runs! This is bad, because it causes unnecessary CPU wakeups. Never wake up the CPU unless you explicitly have a reason to do so.
I also don't understand the point of this whole method. Why does it exist? If a client is closed, why not delete it immediately?

{
state = HavePcmData{pcache };

for (auto p0 = clients.begin(), p1 = clients.end(); p0 != p1; ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This iteration happens in the OutputThread, but other methods happen in the i/o thread (via EventLoop), thus all methods in this class may crash all the time.

/// only valid when the plugin is open
struct HavePcmData {
// I wish C++ had a `not_null` class
std::shared_ptr<SoundInfoCache> pcache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a shared_ptr? What's the point of adding reference counting overhead here?

@sp1ff
Copy link
Contributor Author

sp1ff commented Mar 26, 2024

@MaxKellermann This isn't going well. Before I embark on another round of low-level changes (addressing "excessive logging", for instance), I think it might be beneficial to first verify that that basic architecture of the plugin is acceptable (not much use updating code that's going to need to be re-written, after all). I've documented that at some length here, where I hope you'll find the answers to some of your questions above.

In the interests of time, if you do want changes, it would help me if you could indicate what you'd like to see instead of what I've written. I realize you've done that in some cases, but the review so far contains many open-ended questions (such as "why?"). When reviewing others' code, I've gotten better results with concrete suggestions.

Finally, and in particular, could you tell me a bit about casting being undefined behavior? When, e.g. I cast a sixteen bit value to a uint16_t I expect to get the bit pattern. This has worked for me on multiple platforms, in both kernel & user mode, for nearly three decades now; that said, I'm always interesting in honing my craft. Could you say more?

@MaxKellermann
Copy link
Member

contains many open-ended questions (such as "why?"). When reviewing others' code, I've gotten better results with concrete suggestions.

When I see weird code, I want to know why you wrote it that way. I assume you have good reasons which are just outside of my grasp. So I ask "why".

Maybe you don't have good reasons, but only after I find that out I can make suggestions.

could you tell me a bit about casting being undefined behavior?

You mustn't cast an unaligned pointer to a pointer to a 16 bit integer (and then dereference it). This works on most CPUs, but may crash on others (with SIGBUS) - but for certain the C++ language does not allow it. Maybe you got away with that mistake for three decades, but that doesn't mean it's correct and doesn't mean it will never crash for anybody. Let's write the code to be correct, maybe more correct than it is necessary for the CPUs you happen to have been using for the last three decades, hoping that MPD will remain stable for everybody.

@MaxKellermann MaxKellermann added the waiting Waiting for more information from reporter label May 6, 2024
@MaxKellermann
Copy link
Member

Is this going anywhere?

This commit contains the first implementation of an output
plugin that streams sound analysis to clients (presumably
visualizers of some sort).
This patch will:

    - address the change to `OnSocketInput()` in which the argument
      was changed to a span
    - reduce logging considerably (which had been irritating me, and
      which prompted complaints on the PR)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting Waiting for more information from reporter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants