Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

arch/xtensa: Add special register allocation generator #41676

Merged
merged 5 commits into from
Jan 20, 2022

Conversation

andyross
Copy link
Contributor

Zephyr likes to use the various Xtensa scratch registers for its own purposes in several places. Unfortunately, owing to the configurability of the architecture, we have to use different registers for different platforms. This has been done so far with a collection of different tricks, some... less elegant than others.

Put it all in one place. This is a python script that emites a "zsr.h" header with register assignments for all the existing users.

[This particular PR is mostly a wash in terms of complexity, removing some hacks and adding an equivalent amount of python and cmake. But I have plans for this: a similar trick could be played to integrate the long-dormant second-level interrupt handler generator (or better: a rewritten replacement) into the build, so we don't have to have a copy of the output for every platform. Similar tricks could be played in the link to remove all the boilerplate around the vector table sections (those offsets are all in core-isa.h). Lots of future here, I promise.]

@github-actions github-actions bot added area: API Changes to public APIs area: Xtensa Xtensa Architecture labels Jan 10, 2022
@andyross
Copy link
Contributor Author

(Tried to pick a representative set of reviewers from across the zephyr/xtensa community. Please add anyone I missed.)

with open(outfile, "w") as f:
f.write("/* Generated File, see gen_zsr.py */\n")
f.write("#ifndef ZEPHYR_ZSR_H\n")
f.write("#define ZEPHYR_ZSR_H\n")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f.write("#define ZEPHYR_ZSR_H\n")
f.write("""
/* Generated File, see gen_zsr.py */
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
""")

set(CORE_ISA_IN ${CMAKE_BINARY_DIR}/zephyr/include/generated/core-isa-dM.c)
file(WRITE ${CORE_ISA_IN} "#include <xtensa/config/core-isa.h>")
add_custom_command(OUTPUT ${CORE_ISA_DM}
COMMAND ${CMAKE_C_COMPILER} -E -dM
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tejlmand is this the right level of toolchain abstraction? Just making sure. I mean maybe there is some ${CMAKE_CPP} or something.

Copy link
Contributor Author

@andyross andyross Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more than just preprocessing, -dM expands all the recursive definitions and emits a line of output like "#define A xxxx" for every macro it sees. It's a gcc feature, obviously, but supported just fine on xt-xcc and xt-clang (I checked). So it really does have to be the compiler binary here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect something like ${CMAKE_CPP} to expand to something like gcc -E. Who wouldn't? :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grep found this in cmake/dts.cmake. It seems to be the only place where CPPFLAGS are used.

  if(NOT DEFINED CMAKE_DTS_PREPROCESSOR)
    set(CMAKE_DTS_PREPROCESSOR ${CMAKE_C_COMPILER})
  endif()
  ...
  # Run the preprocessor on the DTS input files. We are leaving
  # linemarker directives enabled on purpose. This tells dtlib where
  # each line actually came from, which improves error reporting.
  execute_process(
    COMMAND ${CMAKE_DTS_PREPROCESSOR}
    -x assembler-with-cpp
    -nostdinc
    ${DTS_ROOT_SYSTEM_INCLUDE_DIRS}
    ${DTC_INCLUDE_FLAG_FOR_DTS}  # include the DTS source and overlays
    ${NOSYSDEF_CFLAG}
    -D__DTS__
    ${DTS_EXTRA_CPPFLAGS}
    -E   # Stop after preprocessing
    -MD  # Generate a dependency file as a side-effect
    -MF ${DTS_DEPS}
    -o ${DTS_POST_CPP}
    ${ZEPHYR_BASE}/misc/empty_file.c

Most of this code is old though, may predate any fancy toolchain abstraction.


# Generates a list of device-specific scratch register choices
set(ZSR_H ${CMAKE_BINARY_DIR}/zephyr/include/generated/zsr.h)
add_custom_command(OUTPUT ${ZSR_H} DEPENDS ${CORE_ISA_DM}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
add_custom_command(OUTPUT ${ZSR_H} DEPENDS ${CORE_ISA_DM}
add_custom_command(OUTPUT ${ZSR_H} MAIN_DEPENDENCY ${CORE_ISA_DM}

This is supposed to help some IDEs. Probably not important.

coreisa = sys.argv[1]
outfile = sys.argv[2]

syms = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you do the usual if __name__ == "__main__": dance for import+interactive debug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, uh... no? :) Just let this be my one bit of iconoclasm. I genuinely don't understand why python people think it's a good thing to indent an entire file by an extra tab stop completely needlessly and then put a noop function call at the end of the file. I mean, functional decomposition in a large script with lots of stuff going on? Sure. But this is a trivial straight-through logic kind of thing. We never did that in shell or perl, I don't see why the culture needs to change here. As this evolves, we can totally move it in that direction if needed. For for the logic as it stands, I just like it a lot better like this.

Copy link
Collaborator

@marc-hb marc-hb Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never did that in shell or perl,

Here are a few technical, non-subjective reasons to do this in shell: thesofproject/sof-test#740

I genuinely don't understand why python people think it's a good thing to indent an entire file...

Most of the reasons above seem to apply to Python the same. I bet you can find more on the Internet.

I agree it does not matter for "small" scripts. The size threshold is of course much more subjective; I feel like this one is very slightly crossing the line. No big deal.

Copy link
Collaborator

@marc-hb marc-hb Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I genuinely don't understand why python people think it's a good thing to indent an entire file...

You just made me wonder why it is customary to have a first level of indentation in all functions in all programming languages. In most languages there is little or no code outside functions so the first indentation level adds no information after all.

Maybe it just subconsciously follows standard typographic usage https://en.wikipedia.org/wiki/Margin_(typography)#The_Digital_Page

Off topic sorry.


with open(coreisa) as infile:
for line in infile.readlines():
m = re.match(r"^#define ([^ ]+) ?(.*)", line.rstrip())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this fails to match multiple spaces or tabs:

#define       XCHAL...

Not a problem thanks to some -dM guarantee(s)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the syntax is strict: exactly one space between "#define", the macro name, and its expansion. It's not like it's specified anywhere in a document though. Can't hurt to robustify.

for line in infile.readlines():
m = re.match(r"^#define ([^ ]+) ?(.*)", line.rstrip())
if m:
syms[m.group(1)] = 1 if m.group(2) == "" else m.group(2)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this quite cryptic sorry; when is group(2) empty and why? Could you add a couple comments or maybe even better: example(s)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preprocessor behavior of "#define FOO" without a value is to expand FOO to a literal 1. I don't know why -dM emits it like this, it seems like a wart to me. Will comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... never mind. I went back and checked, and in fact all three toolchains do this correctly and emit "#define FOO 1" and not "#define FOO". Not sure how I got it into my head that they didn't. Simplified.

Copy link
Collaborator

@marc-hb marc-hb Jan 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW Python regex have the useful \w+ and \S+ to match words, clearer and more robust than dealing with whitespace.

https://docs.python.org/3/library/re.html

Andy Ross added 5 commits January 10, 2022 14:50
Zephyr likes to use the various Xtensa scratch registers for its own
purposes in several places.  Unfortunately, owing to the
configurability of the architecture, we have to use different
registers for different platforms.  This has been done so far with a
collection of different tricks, some... less elegant than others.

Put it all in one place.  This is a python script that emites a
"zsr.h" header with register assignments for all the existing users.

Signed-off-by: Andy Ross <[email protected]>
Use the zsr.h assignments for the special register containing the
current CPU pointer.

Signed-off-by: Andy Ross <[email protected]>
This is actually Cadence-authored code, but its use of EXCSAVE1 as a
sideband input to the exception handler is very much in the same
family of tricks.  Use ZSR assignments here too.

Signed-off-by: Andy Ross <[email protected]>
The kernel coherence cache flush code was using a scratch register to
mark the top of the stack.  Likewise a good candidate for ZSR use.

Signed-off-by: Andy Ross <[email protected]>
We had a similar sequence for interrupt return, where we were
selecting (actually only for the benefit of qemu) the highest priority
EPCn/EPSn registers for our RFI instruction.  That works much better
in python the preprocessor.

Signed-off-by: Andy Ross <[email protected]>
@SebastianBoe SebastianBoe removed their request for review January 11, 2022 08:52
@hongshui3000
Copy link
Contributor

@andyross
I see some code in the PR that is not dealt with, such as the issue I mentioned.
#40974

@andyross
Copy link
Contributor Author

@hongshui3000 indeed, this is is just cleanup (though it does get the EPC/EPS usage off of the debug interrupt level, which I know was one of your concerns). I know you have an app architecture with non-Zephyr interrupt handling mixed with Zephyr interrupts. It's not that we refuse to support that, you just have to realize that it's "unsupported" architecturally and that none of us are likely to fix it for you. Please consider addressing the issues in an upstreamable way and submitting the fixes yourself.

@hongshui3000
Copy link
Contributor

@hongshui3000 indeed, this is is just cleanup (though it does get the EPC/EPS usage off of the debug interrupt level, which I know was one of your concerns). I know you have an app architecture with non-Zephyr interrupt handling mixed with Zephyr interrupts. It's not that we refuse to support that, you just have to realize that it's "unsupported" architecturally and that none of us are likely to fix it for you. Please consider addressing the issues in an upstreamable way and submitting the fixes yourself.

Ok, I see

Comment on lines +46 to +47
f.write(f"# define ZSR_{need} {regs[i]}\n")
f.write(f"# define ZSR_{need}_STR \"{regs[i]}\"\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious if any of the NXP platforms needed to be modified in addition to the Intel platforms, so I tried building this PR for nxp_adsp_imx8. The generated output looks like this:

/* Generated File, see gen_zsr.py */                                                                
#ifndef ZEPHYR_ZSR_H
#define ZEPHYR_ZSR_H
# define ZSR_ALLOCA MISC0
# define ZSR_ALLOCA_STR "MISC0"
# define ZSR_CPU MISC1
# define ZSR_CPU_STR "MISC1"
# define ZSR_FLUSH EXCSAVE1
# define ZSR_FLUSH_STR "EXCSAVE1"
# define ZSR_EXTRA0 EXCSAVE2
# define ZSR_EXTRA1 EXCSAVE3
# define ZSR_EXTRA2 EXCSAVE4
# define ZSR_RFI_LEVEL 3
# define ZSR_EPC EPC3
# define ZSR_EPS EPS3
#endif

Is the space between # and define deliberate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mildly? I wanted to distinguish the macro definitions from the include guards for anyone reading the generated file, but I wasn't willing to dedicate lines in the script to emitting blanks, I guess?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in theory this is only touching arch-layer code and everyone should pick it up seamlessly, but yeah: it's always possible there's downstream code somewhere already stepping on these specific register assignments that will need to be aware (or potentially integrated).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW GNU indent -ppi option does this. I find this type of indentation makes macros more readable (but OK: not for the whole file)

@andyross
Copy link
Contributor Author

Ping. Would be good to get this merged.

@nashif nashif merged commit d175c18 into zephyrproject-rtos:main Jan 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: API Changes to public APIs area: Build System area: Kernel area: Xtensa Xtensa Architecture platform: Intel ADSP Intel Audio platforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants