Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zephyr: sof-logger #4503

Merged
merged 6 commits into from
Jul 26, 2021
Merged

zephyr: sof-logger #4503

merged 6 commits into from
Jul 26, 2021

Conversation

marc-hb
Copy link
Collaborator

@marc-hb marc-hb commented Jul 16, 2021

Reviewing commits one by one is highly recommended.

This is a superset of #4452

EDIT: due to CML-specific suspend/resume issue #4558 this was partly reverted by #4578

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 16, 2021

This is "feature complete" however I tested it only with a complete revert of scheduler PR #4377, see discussion at the bottom of #4377

@marc-hb marc-hb changed the title Zephyr dma trace zephyr: DMA trace Jul 16, 2021
@lgirdwood
Copy link
Member

@marc-hb is DMA trace now using a DP/EDF (low priority thread) on Zephyr ? No revert of #4377 is needed then.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 16, 2021

@marc-hb is DMA trace now using a DP/EDF (low priority thread) on Zephyr ?

Not yet.

No revert of #4377 is needed then.

Hopefully not, need more testing.

@marc-hb marc-hb requested a review from lyakh July 17, 2021 00:54
Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this all that's needed to enable SOF DMA trace under Zephyr?

@@ -532,7 +532,7 @@ void trace_init(struct sof *sof)
dma_trace_init_early(sof);
}

void mtrace_dict_entry(uint32_t dict_entry_address, int n_args, ...)
void mtrace_dict_entry(bool atomic, uint32_t dict_entry_address, int n_args, ...)
Copy link
Collaborator

@lyakh lyakh Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

atomic sounds a bit confusing to me in this case. You want it to say, that it's already called with the lock held / from an atomic context, right? Maybe external_lock? But I see that it's following other similar functions.

Copy link
Collaborator Author

@marc-hb marc-hb Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name of this parameter and in fact it already caused confusion and one bug in the past that lasted for months! see #4246 . HOWEVER this is the exact same parameter name used everywhere else and I don't want to break consistency in this PR. I will add some doxygen.

src/include/sof/trace/trace.h Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this all that's needed to enable SOF DMA trace under Zephyr?

After many bug fixes and attempted alternatives, yes. Except for the untested #4377 scheduler change and fixing build failures but please try it and report any issue if you think this can be useful to you it, should be "good enough" already.

@@ -532,7 +532,7 @@ void trace_init(struct sof *sof)
dma_trace_init_early(sof);
}

void mtrace_dict_entry(uint32_t dict_entry_address, int n_args, ...)
void mtrace_dict_entry(bool atomic, uint32_t dict_entry_address, int n_args, ...)
Copy link
Collaborator Author

@marc-hb marc-hb Jul 19, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name of this parameter and in fact it already caused confusion and one bug in the past that lasted for months! see #4246 . HOWEVER this is the exact same parameter name used everywhere else and I don't want to break consistency in this PR. I will add some doxygen.

@marc-hb marc-hb requested a review from kv2019i July 19, 2021 16:03
Copy link
Collaborator

@kv2019i kv2019i left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @marc-hb

DECLARE_SOF_UUID("zll-schedule", zll_sched_uuid, 0x1547fe68, 0xde0c, 0x11eb,
0x84, 0x61, 0x31, 0x58, 0xa1, 0x29, 0x48, 0x53);

DECLARE_TR_CTX(ll_tr, SOF_UUID(zll_sched_uuid), LOG_LEVEL_INFO);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you, please, rename ll_tr to something else, maybe zll_tr.
In i.MX case, we don't want to use zephyr_ll since this is limited to TIMER_DOMAIN (see here) and we are using DMA_DOMAIN.
We want, for now, to keep using ll_schedule.

Therefore, when compiling this, for i.MX, I get: zephyr_ll.c:22: multiple definition of ll_tr;

Copy link
Collaborator Author

@marc-hb marc-hb Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to help but you're asking me to fix a problem with a configuration that is currently not possible in this branch, probably only on your branch. With this branch there is AFAIK no way to compile ll_schedule.c in Zephyr. After a fair amount of time spent realizing this I tried this naive:

--- a/zephyr/CMakeLists.txt
+++ b/zephyr/CMakeLists.txt
@@ -479,6 +479,7 @@ zephyr_library_sources(
 	${SOF_SRC_PATH}/schedule/schedule.c
 	${SOF_SRC_PATH}/schedule/dma_single_chan_domain.c
 	${SOF_SRC_PATH}/schedule/dma_multi_chan_domain.c
+	${SOF_SRC_PATH}/schedule/ll_schedule.c
 	${SOF_SRC_PATH}/schedule/zephyr_ll.c
 	${SOF_SRC_PATH}/schedule/zephyr.c

... but that does not work because:

sof/src/schedule/zephyr_ll.c:418: multiple definition of `zephyr_ll_task_init'; 
modules/sof/libmodules_sof.a(ll_schedule.c.obj):sof/src/schedule/ll_schedule.c:464: first defined here

sof/src/schedule/zephyr_ll.c:447: multiple definition of `zephyr_ll_scheduler_init';
 modules/sof/libmodules_sof.a(ll_schedule.c.obj):sof/src/schedule/ll_schedule.c:624: first defined here

In other words @lyakh in PR #4377 made zephyr_ll.c and ll_schedule.c somehow mutually exclusive even before this PR does with ll_tr.

Copy link
Collaborator Author

@marc-hb marc-hb Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot this: a simple rename and declaration of DECLARE_TR_CTX(Zll_tr,... in zephyr_ll.c is not possible because ll_tr is used in at least 5 different C files, some shared across both Zephyr and non-Zephyr.

Maybe I can figure out something else but I need to reproduce your problem first, so if you have a way to compile both zephyr_ll.c and ll_schedule.c together please submit it first.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iuliana-prodan @dbaluta so DMA domain is becoming problematic as it does not scale well to multiple low latency streams (it's simple and great when there is a smaller number of streams though) or when the codec is driving the clocks.
I think we need to improve the DMA driven domain as follows.

Today we have

  1. Zephyr LL gets its LL system tick from the timer. This runs in IRQ context and is non preemtable.
  2. LL system tick wakes up high priority LL thread.
  3. LL thread executes pipeline work.

We need to make one change to support DMA domain here and make it synchronous. Step 1 should also be able to be

  1. Zephyr LL gets its LL system tick from the DMA IRQ. This runs in IRQ context and is non preemtable.

This means we keep the DMA based time domain and we can also scale all LL streams to this time domain.

Fwiw, this is how things worked on OMAP4, the topology would tell FW which DMA channel was to be used as time domain. If the DMA channel stopped then FW had to then use the DMA channel from the next running stream. PR #4471 is allowing the trigger(start|stops) to be synchronised and this would mean that DMA IRQs could also be synchronised.

Copy link
Collaborator Author

@marc-hb marc-hb Jul 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iuliana-prodan can you please:

  • File a new issue about the longer term DMA domain discussion and solutions so @lgirdwood 's words of wisdom above do not get lost, buried deep down in this unrelated pull request
  • Share a (hopefully simple) draft PR showing how you compile zephyr_ll.c and ll_schedule.c together so I can reproduce the problem specific to your branch. It does not matter to me if that draft never gets merged, I just need a way to reproduce your problem if you would like me to solve it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgirdwood @marc-hb I've created a pull request with my fixes for i.MX with Zephyr -see #4524
Now we can run successfully playback and record scenarios with Zephyr.

@lgirdwood until we can extend ll_zephyr with DMA_IRQ (I've created a feature request for this - #4523) we'll be using ll_schedule. Hope it's ok with you.

@marc-hb using my pull request (#4524) and enabling dma-trace you'll run into zephyr_ll.c:22: multiple definition of ll_tr;
That's because in commit 1 I'm adding ll_schedule to be compiled for Zephyr and the zephyr_ll is added mandatory for all platforms - see here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in #4524 and #4525 it's not possible to compile-test any CONFIG_IMX yet but if I understood this issue correctly the latest version should be compatible with your branch, please test it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iuliana-prodan I commented in your PR - maybe you shouldn't include both ll_zephyr.c and ll_schedule.c in your builds?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @lyakh.
I've replied to you in my PR.
Please let me know if you agree with my proposed solution - from here, and I can make the changes and update that PR.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 20, 2021

Don't try this yet, I screwed up the va_list passing.

EDIT: fixed in v2, go for it.

v2:

  • Fixed broken va_list (sorry)
  • Fixes user space build like the fuzzer
  • Other minor review comments

I don't see how the aplay: pcm_write:2061: write error: Input/output error and src/audio/dai.c:949 WARN dai_copy(): Copy_bytes 0 + avail bytes 288 < period bytes 384, possible glitch in
https://sof-ci.01.org/sofpr/PR4503/build9708/devicetest/?model=BDW_WSB_RT286&testcase=multiple-pipeline-capture can have anything to do with this PR. Everything else is green.

@marc-hb marc-hb marked this pull request as ready for review July 21, 2021 09:01
@marc-hb marc-hb changed the title zephyr: DMA trace zephyr: sof-logger Jul 21, 2021
Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just the the ll_tr change and we are good.

Copy link
Collaborator Author

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just the the ll_tr change and we are good.

I still need to stop trace_work() from running every millisecond, see July 14th comments in #4377

DECLARE_SOF_UUID("zll-schedule", zll_sched_uuid, 0x1547fe68, 0xde0c, 0x11eb,
0x84, 0x61, 0x31, 0x58, 0xa1, 0x29, 0x48, 0x53);

DECLARE_TR_CTX(ll_tr, SOF_UUID(zll_sched_uuid), LOG_LEVEL_INFO);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As explained in #4524 and #4525 it's not possible to compile-test any CONFIG_IMX yet but if I understood this issue correctly the latest version should be compatible with your branch, please test it.

@marc-hb marc-hb marked this pull request as draft July 22, 2021 06:04
@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 22, 2021

I don't see how the alsabat failure in
https://sof-ci.01.org/sofpr/PR4503/build9719/devicetest/?model=CML_SKU0955_HDA&testcase=check-alsabat-headset-playback
could have anything to do with this PR.

On the other hand will need to fix:

/quickbuild/workspace1/24733/SOF_FW/test/cmocka/src/common_mocks.c:105: error: conflicting types for ‘trace_log_filtered’
/quickbuild/workspace1/24733/SOF_FW/src/include/sof/trace/trace.h:154: error: previous declaration of ‘trace_log_filtered’ was here

Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -544,9 +544,18 @@ void mtrace_dict_entry(uint32_t dict_entry_address, int n_args, ...)
dict_entry_address, tstamp);

va_start(ap, n_args);
for (i = 0; i < n_args; i++)
for (i = 0; i < n_args && i < _TRACE_EVENT_MAX_ARGUMENT_COUNT; i++)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i < MIN(n_args, _TRACE_EVENT_MAX_ARGUMENT_COUNT)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this but I have bad MIN() memories #4002

Amazingly this saves 12 bytes so I'm doing it. I would have bet the compiler was able to optimize this but apparently not.

Copy link
Collaborator Author

@marc-hb marc-hb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why the UTs would fail, so rerun just in case a CI issue.

Because I changed one function prototype.

Remembering how to compile unit tests was quite the ordeal so I fixed the docs: thesofproject/sof-docs#363

SOFCI TEST

Only Jenkins reads that.

@lgirdwood we wanted to switch trace do an EDF task

It seems to work as is but @kv2019i has been keeping an eye on performance so he may not be happy with trace_work() waking up every 1ms (instead of 500ms as requested) to do nothing most of the time.

@@ -544,9 +544,18 @@ void mtrace_dict_entry(uint32_t dict_entry_address, int n_args, ...)
dict_entry_address, tstamp);

va_start(ap, n_args);
for (i = 0; i < n_args; i++)
for (i = 0; i < n_args && i < _TRACE_EVENT_MAX_ARGUMENT_COUNT; i++)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered this but I have bad MIN() memories #4002

Amazingly this saves 12 bytes so I'm doing it. I would have bet the compiler was able to optimize this but apparently not.

@kv2019i
Copy link
Collaborator

kv2019i commented Jul 23, 2021

@marc-hb wrote:

@lyakh wrote:

@lgirdwood we wanted to switch trace do an EDF task
It seems to work as is but @kv2019i has been keeping an eye on performance so he may not be happy with trace_work() waking up every 1ms (instead of 500ms as requested) to do nothing most of the time.

No, I'm ok with going ahead and merging this in the ll trace. It's still fairly close to XTOS build behaviour where the DMA traces were flushed from the idle thread. Now we have tools to measure perf, so it's a simpler to also optimize the DMA trace code in incremental PRs.

@lgirdwood
Copy link
Member

@marc-hb wrote:

@lyakh wrote:

@lgirdwood we wanted to switch trace do an EDF task
It seems to work as is but @kv2019i has been keeping an eye on performance so he may not be happy with trace_work() waking up every 1ms (instead of 500ms as requested) to do nothing most of the time.

No, I'm ok with going ahead and merging this in the ll trace. It's still fairly close to XTOS build behaviour where the DMA traces were flushed from the idle thread. Now we have tools to measure perf, so it's a simpler to also optimize the DMA trace code in incremental PRs.

Yes, we can go ahead and merge today because its better than the trace we have today and we need this to help developers. Still draft status @marc-hb ?

That being said, the trace work needs to be an small preemptable thread (with minimal stack) that wakes up every 500ms or when the buffer > watermark in order to program the DMA. @marc-hb when will this part be ready ?

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 26, 2021

That being said, the trace work needs to be an small preemptable thread (with minimal stack) that wakes up every 500ms or when the buffer > watermark in order to program the DMA. @marc-hb when will this part be ready ?

After I understand something about schedulers.

What sort of regular timer should I use? edf_schedule.c silently ignores start and period.

PS: I realized #4428 is "busy looping" too.

@marc-hb
Copy link
Collaborator Author

marc-hb commented Jul 26, 2021

The ll_tr IMX situation is now resolved thanks to @iuliana-prodan's new ebf5f2ddebc, the latest force-push is just the small, corresponding adjustment.

So the only known issue left is trace_work "busy-looping" due to inconsistencies between Zephyr and XTOS schedulers (I found more). Does not look so bad now that I discovered that #4428 is busy looping too.

No, I'm ok with going ahead and merging this in the ll trace. It's still fairly close to XTOS build behaviour where the DMA traces were flushed from the idle thread. Now we have tools to measure perf, so it's a simpler to also optimize the DMA trace code in incremental PRs.

If you agree with this then I think this is ready to be merged. If not then please 'request changes' and please help and offer some concrete trace_work scheduling suggestions.

https://sof-ci.01.org/sofpr/PR4503/build9755/devicetest/ is all green except one unavailable system
https://sof-ci.01.org/sofpr/PR4503/build9758/devicetest/ same
https://sof-ci.01.org/sof-pr-viewer/#/build/PR4503/build6850972 same

PS: this PR should not have any significant functional change for XTOS-SOF; only for Zephyr
PPS: checkpatch warning 1 is unsolvable because fixing the spelling then complains about a commit subject mismatch. Checkpatch warning 2 is on a line that I didn't (and don't want to) change.

@marc-hb marc-hb marked this pull request as ready for review July 26, 2021 06:23
@marc-hb marc-hb marked this pull request as draft July 26, 2021 07:22
Commit a439ea9 ("zephyr: ll-schedule: switch over to a simplified
implementation") replaced ll_schedule.c with zephyr_ll.c when compiling
with Zephyr. So the struct tr_ctx named "ll_tr" is now missing. Declare
it in ll_schedule.c too.

No one noticed that ll_tr was missing because the DMA trace does not
work in Zephyr and the main branch yet.

Signed-off-by: Marc Herbert <[email protected]>
Because it's mostly working now and avoids macro nesting complexity;
there is already #ifdef CONFIG_DW_SPI #elif CONFIG_TRACE

Just for the record this is reverting a very tiny part of
commit cf8e35f ("zephyr: init: create a zephyr entry point in SOF.")

Signed-off-by: Marc Herbert <[email protected]>
Reverts commit 74cacc3 ("zephyr: ipc: dont enable DMA trace
transport.") modified by commit d728276 ("ipc3: don't declare unused
variables")

Signed-off-by: Marc Herbert <[email protected]>
Preparation to use mtrace_dict_entry() also the _log_message() level too
and not just for very early mtrace_printf() tracing.

Signed-off-by: Marc Herbert <[email protected]>
From deep down trace.c:va_tracelog() up to the _log_message() level.

Also rename va_tracelog() to the more specific dma_tracelog()

Preparation to support the DMA trace in Zephyr.

The only functional change in this commit is that DMA messages copied to
the shared memory are not de-duplicated any more (a.k.a "adaptive rate
limiting" or CONFIG_TRACE_FILTERING_ADAPTIVE). These are generally
supposed to be high level hence rare enough; otherwise there is probably
a "bigger problem".

Signed-off-by: Marc Herbert <[email protected]>
As of July 2021 we support (too) many tracing options and this
duplication is unfortunately the only way I found to support them all
while giving the compiler the opportunity to optimize away as many
strings as possible.

Supported configurations:

- Systems with limited memory and zero space for full strings, must use
  SOF dictionary only.
- Systems with enough space for all strings to be in memory.
- Anything in between

- Support to duplicate only important message to both the DMA and the
  mailbox (the default)
- CONFIG_TRACEM: supports duplicating ALL messages to both the DMA and
  the mailbox
- CONFIG_TRACEV: supports deleting verbose statements at compile time to
  save space
- CONFIG_TRACE: support turning off all traces at compile-time

- SOF dict trace de-duplication a.k.a. "adaptive filtering"

- Dynamic log levels per component

- Redirection to Zephyr's shared memory tracing that requires full strings.

Signed-off-by: Marc Herbert <[email protected]>
@marc-hb marc-hb marked this pull request as ready for review July 26, 2021 07:30
@marc-hb marc-hb dismissed iuliana-prodan’s stale review July 26, 2021 07:31

Change of plans, ll_scheduler.c and zephyr_ll.c are mutually exclusive again

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets merge and fixup any integration issues as we go. This PR is important as trace is needed for everyone.

@lgirdwood
Copy link
Member

The CI Zephyr DUT needs a reboot.

@lgirdwood lgirdwood merged commit 57ee04f into thesofproject:main Jul 26, 2021
@marc-hb marc-hb deleted the zephyr-dma-trace branch September 15, 2021 16:07
@marc-hb
Copy link
Collaborator Author

marc-hb commented Oct 13, 2021

Due to CML-specific suspend/resume issue #4558 this was partly reverted by #4578

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants