arch/xtensa: Add special register allocation generator #41676

andyross · 2022-01-10T01:12:15Z

Zephyr likes to use the various Xtensa scratch registers for its own purposes in several places. Unfortunately, owing to the configurability of the architecture, we have to use different registers for different platforms. This has been done so far with a collection of different tricks, some... less elegant than others.

Put it all in one place. This is a Python script that emits a "zsr.h" header with register assignments for all the existing users.

[This particular PR is mostly a wash in terms of complexity, removing some hacks and adding an equivalent amount of python and cmake. But I have plans for this: a similar trick could be played to integrate the long-dormant second-level interrupt handler generator (or better: a rewritten replacement) into the build, so we don't have to have a copy of the output for every platform. Similar tricks could be played in the link to remove all the boilerplate around the vector table sections (those offsets are all in core-isa.h). Lots of future here, I promise.]
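For readers who want the idea in concrete form, here is a minimal sketch of this kind of allocation. It is an illustration under stated assumptions, not the logic in gen_zsr.py: the pool order, the `candidate_registers` and `allocate` helpers, and the sample symbol values are all hypothetical. The only things taken from the toolchain headers are the macro names `XCHAL_NUM_MISC_REGS` and `XCHAL_NUM_INTLEVELS`, as I understand them from core-isa.h.

```python
def candidate_registers(syms):
    """Build an ordered pool of scratch registers this core might offer.

    Assumption: MISCn registers come first, then EXCSAVEn for levels
    1..XCHAL_NUM_INTLEVELS. The real script may order or filter differently.
    """
    pool = [f"MISC{i}" for i in range(int(syms.get("XCHAL_NUM_MISC_REGS", 0)))]
    levels = int(syms.get("XCHAL_NUM_INTLEVELS", 0))
    pool += [f"EXCSAVE{i}" for i in range(1, levels + 1)]
    return pool


def allocate(needs, pool):
    """Assign one register per named use, in order; fail loudly if short."""
    if len(needs) > len(pool):
        raise ValueError(f"need {len(needs)} registers, only {len(pool)} available")
    return dict(zip(needs, pool))


# Hypothetical symbol values for a core with two MISC registers.
example_syms = {"XCHAL_NUM_MISC_REGS": "2", "XCHAL_NUM_INTLEVELS": "4"}
assignment = allocate(["ALLOCA", "CPU", "FLUSH"], candidate_registers(example_syms))
print(assignment)
```

The point of keeping the pool and the list of needs separate is that a platform with fewer registers fails at build time instead of silently sharing one.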

andyross · 2022-01-10T01:15:12Z

(Tried to pick a representative set of reviewers from across the zephyr/xtensa community. Please add anyone I missed.)

marc-hb · 2022-01-10T16:51:22Z

arch/xtensa/core/gen_zsr.py

+with open(outfile, "w") as f:
+    f.write("/* Generated File, see gen_zsr.py */\n")
+    f.write("#ifndef ZEPHYR_ZSR_H\n")
+    f.write("#define ZEPHYR_ZSR_H\n")


Suggested change

f.write("#define ZEPHYR_ZSR_H\n")

f.write("""
/* Generated File, see gen_zsr.py */
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
""")

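One detail worth knowing about the triple-quoted form suggested above: a literal that starts on the line after the opening quotes begins with a newline, so the generated file would gain a leading blank line. The sketch below compares the two spellings against the original three writes; the function names are made up for the comparison.

```python
def prologue_as_suggested():
    # The literal starts with a newline because the text begins on the
    # line after the opening quotes.
    return """
/* Generated File, see gen_zsr.py */
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
"""


def prologue_no_leading_blank():
    # A backslash right after the opening quotes suppresses that newline.
    return """\
/* Generated File, see gen_zsr.py */
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
"""


# What the three separate f.write() calls in the diff produce.
three_writes = (
    "/* Generated File, see gen_zsr.py */\n"
    "#ifndef ZEPHYR_ZSR_H\n"
    "#define ZEPHYR_ZSR_H\n"
)
print(prologue_no_leading_blank() == three_writes)
```

A leading blank line is harmless in a C header, so this is a matter of taste rather than correctness.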
marc-hb · 2022-01-10T16:52:59Z

arch/xtensa/core/CMakeLists.txt

+set(CORE_ISA_IN ${CMAKE_BINARY_DIR}/zephyr/include/generated/core-isa-dM.c)
+file(WRITE ${CORE_ISA_IN} "#include <xtensa/config/core-isa.h>")
+add_custom_command(OUTPUT ${CORE_ISA_DM}
+  COMMAND ${CMAKE_C_COMPILER} -E -dM


@tejlmand is this the right level of toolchain abstraction? Just making sure. I mean maybe there is some ${CMAKE_CPP} or something.

It's more than just preprocessing, -dM expands all the recursive definitions and emits a line of output like "#define A xxxx" for every macro it sees. It's a gcc feature, obviously, but supported just fine on xt-xcc and xt-clang (I checked). So it really does have to be the compiler binary here.

I would expect something like ${CMAKE_CPP} to expand to something like gcc -E. Who wouldn't? :-)

grep found this in cmake/dts.cmake. It seems to be the only place where CPPFLAGS are used.

if(NOT DEFINED CMAKE_DTS_PREPROCESSOR)
  set(CMAKE_DTS_PREPROCESSOR ${CMAKE_C_COMPILER})
endif()
...
# Run the preprocessor on the DTS input files. We are leaving
# linemarker directives enabled on purpose. This tells dtlib where
# each line actually came from, which improves error reporting.
execute_process(
  COMMAND ${CMAKE_DTS_PREPROCESSOR}
  -x assembler-with-cpp
  -nostdinc
  ${DTS_ROOT_SYSTEM_INCLUDE_DIRS}
  ${DTC_INCLUDE_FLAG_FOR_DTS}  # include the DTS source and overlays
  ${NOSYSDEF_CFLAG}
  -D__DTS__
  ${DTS_EXTRA_CPPFLAGS}
  -E   # Stop after preprocessing
  -MD  # Generate a dependency file as a side-effect
  -MF ${DTS_DEPS}
  -o ${DTS_POST_CPP}
  ${ZEPHYR_BASE}/misc/empty_file.c

Most of this code is old though, may predate any fancy toolchain abstraction.
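As a rough illustration of the `-E -dM` invocation discussed in this thread, the sketch below builds the same command line from Python. The helper names are hypothetical, and `"gcc"` is only a placeholder for whichever compiler binary the build selects; nothing here is part of the PR.

```python
import subprocess


def dump_macros_command(compiler, source):
    """Return the argv for dumping every macro defined after preprocessing.

    -E stops after preprocessing; -dM asks for '#define NAME VALUE' lines
    instead of the preprocessed text.
    """
    return [compiler, "-E", "-dM", source]


def dump_macros(compiler, source):
    """Run the compiler and return its macro dump as text."""
    result = subprocess.run(
        dump_macros_command(compiler, source),
        check=True, capture_output=True, text=True,
    )
    return result.stdout


# Placeholder names: substitute the real toolchain binary and input file.
print(dump_macros_command("gcc", "core-isa-dM.c"))
```

Because both flags are handled by the compiler driver itself, the same argv shape should work with any driver that accepts gcc-style options, which is the property the thread relies on.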

marc-hb · 2022-01-10T16:54:12Z

arch/xtensa/core/CMakeLists.txt

+
+# Generates a list of device-specific scratch register choices
+set(ZSR_H ${CMAKE_BINARY_DIR}/zephyr/include/generated/zsr.h)
+add_custom_command(OUTPUT ${ZSR_H} DEPENDS ${CORE_ISA_DM}


Suggested change

add_custom_command(OUTPUT ${ZSR_H} DEPENDS ${CORE_ISA_DM}

add_custom_command(OUTPUT ${ZSR_H} MAIN_DEPENDENCY ${CORE_ISA_DM}

This is supposed to help some IDEs. Probably not important.

marc-hb · 2022-01-10T17:30:05Z

arch/xtensa/core/gen_zsr.py

+coreisa = sys.argv[1]
+outfile = sys.argv[2]
+
+syms = {}


Can you do the usual if __name__ == "__main__": dance for import+interactive debug?

Heh, uh... no? :) Just let this be my one bit of iconoclasm. I genuinely don't understand why python people think it's a good thing to indent an entire file by an extra tab stop completely needlessly and then put a noop function call at the end of the file. I mean, functional decomposition in a large script with lots of stuff going on? Sure. But this is a trivial straight-through logic kind of thing. We never did that in shell or perl, I don't see why the culture needs to change here. As this evolves, we can totally move it in that direction if needed. But for the logic as it stands, I just like it a lot better like this.

We never did that in shell or perl,

Here are a few technical, non-subjective reasons to do this in shell: thesofproject/sof-test#740

I genuinely don't understand why python people think it's a good thing to indent an entire file...

Most of the reasons above seem to apply to Python the same. I bet you can find more on the Internet.

I agree it does not matter for "small" scripts. The size threshold is of course much more subjective; I feel like this one is very slightly crossing the line. No big deal.

I genuinely don't understand why python people think it's a good thing to indent an entire file...

You just made me wonder why it is customary to have a first level of indentation in all functions in all programming languages. In most languages there is little or no code outside functions so the first indentation level adds no information after all.

Maybe it just subconsciously follows standard typographic usage https://en.wikipedia.org/wiki/Margin_(typography)#The_Digital_Page

Off topic sorry.
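For anyone following the thread who has not seen the pattern being debated, here is a minimal sketch of what the `__main__` guard buys. The `render` function is a stand-in for the script's straight-line logic and is not taken from gen_zsr.py; the argument handling is likewise an assumption.

```python
import sys


def render(lines):
    """Placeholder for the script's straight-line logic: count #define lines."""
    count = sum(1 for line in lines if line.startswith("#define "))
    return f"/* {count} macros seen */\n"


def main(argv):
    if len(argv) != 3:
        print(f"usage: {argv[0]} <macro dump> <outfile>", file=sys.stderr)
        return 2
    with open(argv[1]) as infile:
        text = render(infile)
    with open(argv[2], "w") as f:
        f.write(text)
    return 0


if __name__ == "__main__":
    # Runs only when executed as a script, not on "import".
    # A real script would typically pass this status to sys.exit().
    main(sys.argv)
```

With this layout, `import` from an interactive session gives access to `render` without touching `sys.argv` or the filesystem, which is the "import+interactive debug" use mentioned in the request.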

marc-hb · 2022-01-10T17:47:43Z

arch/xtensa/core/gen_zsr.py

+
+with open(coreisa) as infile:
+    for line in infile.readlines():
+        m = re.match(r"^#define ([^ ]+) ?(.*)", line.rstrip())


I think this fails to match multiple spaces or tabs:

#define XCHAL...

Not a problem thanks to some -dM guarantee(s)?

Yeah, the syntax is strict: exactly one space between "#define", the macro name, and its expansion. It's not like it's specified anywhere in a document though. Can't hurt to robustify.
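To make the whitespace concern concrete, the sketch below runs the pattern from the diff next to a more tolerant one. `TOLERANT` is a hypothetical alternative written for this comparison, not the pattern that was eventually merged.

```python
import re

STRICT = re.compile(r"^#define ([^ ]+) ?(.*)")       # pattern from the diff
TOLERANT = re.compile(r"^#define\s+(\S+)\s*(.*)")    # hypothetical alternative


def parse(pattern, line):
    """Return (name, value) for a #define line, or None if it does not match."""
    m = pattern.match(line.rstrip())
    return (m.group(1), m.group(2)) if m else None


for sample in ["#define A 1", "#define  A 1", "#define\tA\t1"]:
    print(repr(sample), parse(STRICT, sample), parse(TOLERANT, sample))
```

The strict pattern rejects the two-space and tab-separated lines outright, while the tolerant one yields the same name and value for all three.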

marc-hb · 2022-01-10T17:50:51Z

arch/xtensa/core/gen_zsr.py

+    for line in infile.readlines():
+        m = re.match(r"^#define ([^ ]+) ?(.*)", line.rstrip())
+        if m:
+            syms[m.group(1)] = 1 if m.group(2) == "" else m.group(2)


I find this quite cryptic sorry; when is group(2) empty and why? Could you add a couple comments or maybe even better: example(s)?

The preprocessor behavior of "#define FOO" without a value is to expand FOO to a literal 1. I don't know why -dM emits it like this, it seems like a wart to me. Will comment.

Hm... never mind. I went back and checked, and in fact all three toolchains do this correctly and emit "#define FOO 1" and not "#define FOO". Not sure how I got it into my head that they didn't. Simplified.

BTW Python regex have the useful \w+ and \S+ to match words, clearer and more robust than dealing with whitespace.

https://docs.python.org/3/library/re.html
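A sketch of how the `\w+` suggestion could look once the empty-value special case is gone. This is an assumption about the shape of the simplified loop, not the merged code; `parse_macro_dump` and the sample text are invented for the example.

```python
import re

# NAME must be followed by whitespace and then the expansion text.
DEFINE = re.compile(r"^#define\s+(\w+)\s+(.*)")


def parse_macro_dump(text):
    """Map object-like macro names to their expansion text.

    Lines that do not look like '#define NAME VALUE' are ignored. That
    includes function-like macros such as '#define F(x) x', because the
    '(' after the name is neither a word character nor whitespace.
    """
    syms = {}
    for line in text.splitlines():
        m = DEFINE.match(line.rstrip())
        if m:
            syms[m.group(1)] = m.group(2)
    return syms


sample = "#define FOO 1\n#define BAR (FOO + 2)\n#define F(x) x\n"
print(parse_macro_dump(sample))
```

Note that this pattern would also skip a bare `#define FOO` with no value. That is consistent with the observation above that the toolchains emit `#define FOO 1`, but it is a behavior change compared with the original `?(.*)` form.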

Zephyr likes to use the various Xtensa scratch registers for its own purposes in several places. Unfortunately, owing to the configurability of the architecture, we have to use different registers for different platforms. This has been done so far with a collection of different tricks, some... less elegant than others. Put it all in one place. This is a Python script that emits a "zsr.h" header with register assignments for all the existing users. Signed-off-by: Andy Ross <[email protected]>

Use the zsr.h assignments for the special register containing the current CPU pointer. Signed-off-by: Andy Ross <[email protected]>

This is actually Cadence-authored code, but its use of EXCSAVE1 as a sideband input to the exception handler is very much in the same family of tricks. Use ZSR assignments here too. Signed-off-by: Andy Ross <[email protected]>

The kernel coherence cache flush code was using a scratch register to mark the top of the stack. Likewise a good candidate for ZSR use. Signed-off-by: Andy Ross <[email protected]>

We had a similar sequence for interrupt return, where we were selecting (actually only for the benefit of qemu) the highest priority EPCn/EPSn registers for our RFI instruction. That works much better in Python than in the preprocessor. Signed-off-by: Andy Ross <[email protected]>
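To show why this kind of selection is easier in a script than in the preprocessor, here is a small sketch. The policy (take the highest level, skip the debug level, consider only levels 2 and up) is a hypothetical one chosen for illustration and may differ from what gen_zsr.py does; only the `ZSR_RFI_LEVEL`, `ZSR_EPC` and `ZSR_EPS` macro names come from the generated header.

```python
def pick_rfi_level(num_intlevels, debug_level=None):
    """Pick the highest interrupt level that is not the debug level.

    Hypothetical policy for illustration; assumes levels 2 and up are the
    ones with EPCn/EPSn pairs. Returns None if no level qualifies.
    """
    for level in range(num_intlevels, 1, -1):
        if level != debug_level:
            return level
    return None


def rfi_defines(level):
    """Render the three macros that name the chosen level and its registers."""
    return [
        f"# define ZSR_RFI_LEVEL {level}",
        f"# define ZSR_EPC EPC{level}",
        f"# define ZSR_EPS EPS{level}",
    ]


print("\n".join(rfi_defines(pick_rfi_level(num_intlevels=4, debug_level=4))))
```

A loop with an exclusion test is a few lines here, whereas the equivalent in the preprocessor needs a chain of `#if`/`#elif` cases, one per possible level.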

hongshui3000 · 2022-01-12T08:21:51Z

@andyross
I see some code in the PR that is not dealt with, such as the issue I mentioned.
#40974

andyross · 2022-01-12T16:26:25Z

@hongshui3000 indeed, this is just cleanup (though it does get the EPC/EPS usage off of the debug interrupt level, which I know was one of your concerns). I know you have an app architecture with non-Zephyr interrupt handling mixed with Zephyr interrupts. It's not that we refuse to support that, you just have to realize that it's "unsupported" architecturally and that none of us are likely to fix it for you. Please consider addressing the issues in an upstreamable way and submitting the fixes yourself.

hongshui3000 · 2022-01-13T03:02:50Z

@hongshui3000 indeed, this is just cleanup (though it does get the EPC/EPS usage off of the debug interrupt level, which I know was one of your concerns). I know you have an app architecture with non-Zephyr interrupt handling mixed with Zephyr interrupts. It's not that we refuse to support that, you just have to realize that it's "unsupported" architecturally and that none of us are likely to fix it for you. Please consider addressing the issues in an upstreamable way and submitting the fixes yourself.

Ok, I see

MaureenHelm · 2022-01-14T16:34:17Z

arch/xtensa/core/gen_zsr.py

+        f.write(f"# define ZSR_{need} {regs[i]}\n")
+        f.write(f"# define ZSR_{need}_STR \"{regs[i]}\"\n")


I was curious if any of the NXP platforms needed to be modified in addition to the Intel platforms, so I tried building this PR for nxp_adsp_imx8. The generated output looks like this:

/* Generated File, see gen_zsr.py */
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
# define ZSR_ALLOCA MISC0
# define ZSR_ALLOCA_STR "MISC0"
# define ZSR_CPU MISC1
# define ZSR_CPU_STR "MISC1"
# define ZSR_FLUSH EXCSAVE1
# define ZSR_FLUSH_STR "EXCSAVE1"
# define ZSR_EXTRA0 EXCSAVE2
# define ZSR_EXTRA1 EXCSAVE3
# define ZSR_EXTRA2 EXCSAVE4
# define ZSR_RFI_LEVEL 3
# define ZSR_EPC EPC3
# define ZSR_EPS EPS3
#endif

Is the space between # and define deliberate?

Mildly? I wanted to distinguish the macro definitions from the include guards for anyone reading the generated file, but I wasn't willing to dedicate lines in the script to emitting blanks, I guess?

And in theory this is only touching arch-layer code and everyone should pick it up seamlessly, but yeah: it's always possible there's downstream code somewhere already stepping on these specific register assignments that will need to be aware (or potentially integrated).

BTW GNU indent -ppi option does this. I find this type of indentation makes macros more readable (but OK: not for the whole file)
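For reference, a sketch of how output of this shape can be produced, including the `# define` spelling discussed in this thread. The `render_zsr_header` function and its dict argument are assumptions made for the example; the two definition lines follow the `f.write` calls shown in the diff.

```python
def render_zsr_header(assignments):
    """Render a zsr.h-style header from a {use: register} mapping.

    The '# define' spelling (space after '#') is valid preprocessor syntax
    and visually sets the definitions apart from the include guard.
    """
    lines = [
        "/* Generated File, see gen_zsr.py */",
        "#ifndef ZEPHYR_ZSR_H",
        "#define ZEPHYR_ZSR_H",
    ]
    for need, reg in assignments.items():
        lines.append(f"# define ZSR_{need} {reg}")
        lines.append(f'# define ZSR_{need}_STR "{reg}"')
    lines.append("#endif")
    return "\n".join(lines) + "\n"


print(render_zsr_header({"ALLOCA": "MISC0", "CPU": "MISC1"}), end="")
```

Emitting both a bare token and a quoted `_STR` variant lets C code use the name directly while inline assembly strings can concatenate the quoted form.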

andyross · 2022-01-20T16:34:36Z

Ping. Would be good to get this merged.

andyross requested review from dcpleung and nashif as code owners January 10, 2022 01:12

github-actions bot added labels area: API (Changes to public APIs) and area: Xtensa (Xtensa Architecture) Jan 10, 2022

andyross requested review from sylvioalves, lyakh and iuliana-prodan January 10, 2022 01:14

andyross force-pushed the xtensa-zsr branch from 1dd36b8 to 5b772b5 Compare January 10, 2022 04:00

zephyrbot added labels area: Kernel, area: Build System and platform: Intel ADSP (Intel Audio platforms) Jan 10, 2022

zephyrbot requested review from ceolin, kv2019i, lgirdwood, marc-hb, mengxianglinx, peter-mitsis, SebastianBoe and tejlmand January 10, 2022 13:39

zephyrbot assigned nashif Jan 10, 2022

ceolin approved these changes Jan 10, 2022


marc-hb reviewed Jan 10, 2022


dcpleung approved these changes Jan 10, 2022


Andy Ross added 5 commits January 10, 2022 14:50

arch/xtensa: Use ZSR assignments for the CPU pointer

28f7a48


arch/xtensa: Use ZSR assignments for the alloca exception

14ecf70


arch/xtensa: Use ZSR assignments for stack flush markers

145843f


andyross force-pushed the xtensa-zsr branch from 5b772b5 to 4b55b49 Compare January 10, 2022 22:51

teburd approved these changes Jan 11, 2022


SebastianBoe removed their request for review January 11, 2022 08:52

MaureenHelm reviewed Jan 14, 2022


nashif merged commit d175c18 into zephyrproject-rtos:main Jan 20, 2022
