Adding section describing the software view of the load/store instrs #8

christian-herber-nxp
Collaborator

No description provided.

@christian-herber-nxp christian-herber-nxp linked an issue Mar 18, 2024 that may be closed by this pull request
Collaborator

@tariqkurd-repo tariqkurd-repo left a comment

I think this is fine, and it avoids the problem of overwriting the address operand if it is also one of the destination registers.
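The overwrite hazard mentioned here can be illustrated with a minimal Python sketch. It is only a toy model under stated assumptions, not text or behavior taken from the spec: the helper names (`load_word`, `ld_naive_split`, `ld_buffered`), the choice of x10/x11 as the destination pair, and the memory contents are all hypothetical.

```python
# Toy model (assumption): 32 registers of 32 bits, little-endian byte memory.
def load_word(mem, addr):
    """Read one 32-bit little-endian word from the modelled memory."""
    return int.from_bytes(mem[addr:addr + 4], "little")

def ld_naive_split(regs, mem, rd, rs1, offset=0):
    """Two word loads that each re-read rs1; unsafe when rs1 is rd or rd+1."""
    regs[rd] = load_word(mem, regs[rs1] + offset)
    # If rs1 == rd, the base address was just overwritten by the line above.
    regs[rd + 1] = load_word(mem, regs[rs1] + offset + 4)

def ld_buffered(regs, mem, rd, rs1, offset=0):
    """Compute the address once, gather both words, then write the pair."""
    base = regs[rs1] + offset
    lo = load_word(mem, base)
    hi = load_word(mem, base + 4)
    regs[rd], regs[rd + 1] = lo, hi

# Hypothetical memory image: the doubleword at address 16 has low word 32.
mem = bytearray(64)
mem[16:20] = (32).to_bytes(4, "little")
mem[20:24] = (0xAAAA0001).to_bytes(4, "little")
mem[36:40] = (0xDEADBEEF).to_bytes(4, "little")

regs_buffered = [0] * 32
regs_buffered[10] = 16
ld_buffered(regs_buffered, mem, rd=10, rs1=10)

regs_naive = [0] * 32
regs_naive[10] = 16
ld_naive_split(regs_naive, mem, rd=10, rs1=10)

print(hex(regs_buffered[11]), hex(regs_naive[11]))
```

In this model the buffered variant leaves the intended high word in x11, while the naive split reads its second word from an address derived from the freshly loaded low word.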

@christian-herber-nxp christian-herber-nxp merged commit 026036a into main Mar 19, 2024
1 check passed
@christian-herber-nxp christian-herber-nxp deleted the 7-further-question-about-instruction-atomicity-vs-faults branch March 19, 2024 08:09
@ubc-guy
Collaborator

ubc-guy commented Mar 19, 2024

with the description as currently written, ld and sd instructions may appear as a sequence of byte loads or stores, in any order and possibly repeated; further, the load performs a single atomic update to the register pair.

I have a few concerns with the way this is written:

a) atomic updates to the register pair require additional resources, in particular a dword-sized buffer that first accepts the bytes and then commits them; to avoid increasing latency by 1 cycle, the entire dword would also have to be bypassed with muxes. in other words, providing this simple software view requires an expensive hardware solution.

b) this atomic update text seems to contradict an earlier note within the current spec:

NOTE: Therefore, implementations are not required to ensure atomicity in loading/storing to/from the individual registers relating to a 64-bit operand.

c) the cm.pop instruction does not require atomic updates of its multi-register destinations

d) do lh/lw/sh/sw instructions behave similarly, i.e. do they break down into bytes? if so, there are no further concerns here. however, the only similar description I could find regarding a breakdown into bytes is for cm.pop and cm.push, so we'll continue on to (e) below.

e) from the Unpriv Manual, naturally aligned operations (e.g. halfword aligned for halfword operations) behave atomically, but misaligned ones are not guaranteed to. the precise text from Unpriv is below:

Furthermore, whereas naturally aligned loads and stores are guaranteed to execute atomically, misaligned loads and
stores might not, and hence require additional synchronization to ensure atomicity.

NOTE: We do not mandate atomicity for misaligned accesses so execution environment
implementations can use an invisible machine trap and a software handler to handle
some or all misaligned accesses. If hardware misaligned support is provided, software can
exploit this by simply using regular load and store instructions. Hardware can then
automatically optimize accesses depending on whether runtime addresses are aligned.

f) the Unpriv Manual seems to contradict itself between lh/lw/sh/sw and cm.push/cm.pop, which is not good.

g) i think the correct response is a compromise, where individual words (not bytes) are updated atomically by cm.push/cm.pop; the solution here, for ld/sd in rv32, would then be to also have individual words updated atomically (but not the pair).
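To make the difference between the granularities discussed in (a) through (g) concrete, here is a rough Python sketch that lists the register-pair values that could be observed (for example by a trap handler) while an RV32 ld commits its result. It is an illustration only: the function name `observable_states` is hypothetical, and the low-to-high commit order is an assumption made for simplicity, not something stated in this thread or in the spec.

```python
def observable_states(old, new, chunk_bytes):
    """Return the 64-bit pair values visible after each commit step.

    old, new    : 64-bit values of the register pair before and after the load
    chunk_bytes : commit granularity (8 = whole pair, 4 = word, 1 = byte)
    Assumption: chunks are committed from the lowest byte upwards.
    """
    states = [old]
    current = old
    for first_byte in range(0, 8, chunk_bytes):
        # Mask covering the chunk that is committed in this step.
        mask = ((1 << (8 * chunk_bytes)) - 1) << (8 * first_byte)
        current = (current & ~mask) | (new & mask)
        states.append(current)
    return states

OLD = 0x0000000000000000
NEW = 0x1122334455667788

for granularity in (8, 4, 1):
    states = observable_states(OLD, NEW, granularity)
    print(granularity, [hex(s) for s in states])
```

With a pair-atomic commit only the old and new values appear; with word-atomic commit there is one extra state in which a single register of the pair has been updated; with byte granularity each register can also be seen partially written.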

Successfully merging this pull request may close these issues.

further question about instruction atomicity vs. faults
4 participants