NonNull should document that From<&T> and friends preserve provenance #116181

Open
joshlf opened this issue Sep 26, 2023 · 11 comments · May be fixed by #130571
Labels
A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@joshlf
Contributor

joshlf commented Sep 26, 2023

The implementations of From<&T> and From<&mut T> for NonNull<T> preserve pointer provenance. This should be documented as a guarantee so callers can rely on it.

@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Sep 26, 2023
@GuillaumeGomez GuillaumeGomez added A-docs Area: documentation for any part of the project, including the compiler, standard library, and tools T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Sep 27, 2023
@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Oct 29, 2023
@mj10021
Contributor

mj10021 commented Nov 15, 2023

@rustbot claim

@RalfJung
Member

RalfJung commented Nov 29, 2023

> The implementations of From<&T> and From<&mut T> for NonNull<T> preserve pointer provenance.

In which sense do they do that? Functions that take references always generate a fresh aliasing tag for their arguments, so in that sense they decisively do not preserve provenance in Miri today.

@RalfJung
Member

Cc @rust-lang/opsem

@joshlf
Contributor Author

joshlf commented Dec 4, 2023

> > The implementations of From<&T> and From<&mut T> for NonNull<T> preserve pointer provenance.
>
> In which sense do they do that? Functions that take references always generate a fresh aliasing tag for their arguments, so in that sense they decisively do not preserve provenance in Miri today.

I'm confused by this. The examples in the offset_from docs and the discussion of "Original Pointers" in the provenance docs suggest that obtaining two pointers from the same &T will result in pointers with the same provenance. E.g., from offset_from:

let a = [0; 5];
let ptr1: *const i32 = &a[1];
let ptr2: *const i32 = &a[3];
unsafe {
    assert_eq!(ptr2.offset_from(ptr1), 2);
    assert_eq!(ptr1.offset_from(ptr2), -2);
    assert_eq!(ptr1.offset(2), ptr2);
    assert_eq!(ptr2.offset(-2), ptr1);
}

ptr1 and ptr2 may be used as arguments to offset_from, and this is considered sound despite the fact that they're obtained via separate &a[x] operations.

My understanding is that the point of provenance is to be able to restrict operations like offset_from, and in particular to require that two pointers must have the same provenance in order for them to be soundly compared in any way. If &T generates fresh provenance, I would expect this example code to exhibit UB. I assume I'm missing something here?

@RalfJung
Member

RalfJung commented Dec 5, 2023

Ah, you seem to confuse "has the same provenance" with "derived from a pointer to the same object".

We have not finalized what exactly the provenance of a pointer in Rust will be, but it will likely involve two components:

  • the allocation the pointer is derived from
  • a "tag" for tracking aliasing requirements (both stacked borrows and tree borrows have such a tag)

offset_from requires the first component to be the same, but by saying "they have the same provenance" you would say both components are the same and that's just not true.

The first component stays the same on every operation you do with that pointer. It is literally impossible to change. It gets set when the pointer is created (i.e., when the memory is allocated and the initial pointer is returned, or when an int2ptr cast happens). So it doesn't make a lot of sense to document this for each and every operation, IMO.
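
A minimal sketch of the allocation component, not taken from the thread: two `NonNull<i32>` values built from separate borrows of the same array can still be passed to `offset_from`, because that operation only needs both pointers to be derived from the same allocation. The helper name `distance` is hypothetical, and the safety comment states what the sketch assumes about its callers.

```rust
use std::ptr::NonNull;

/// Distance in elements from `from` to `to`.
/// Hypothetical helper: it assumes the caller passes pointers that were
/// derived from the same allocation.
fn distance(from: NonNull<i32>, to: NonNull<i32>) -> isize {
    // SAFETY (assumed by this sketch): both pointers point into the same
    // allocated object, and their byte distance is a multiple of
    // size_of::<i32>().
    unsafe { to.as_ptr().offset_from(from.as_ptr()) }
}

fn main() {
    let a = [0i32; 5];
    // Two separate borrows, so (in Miri's models) two different aliasing
    // tags, but the same underlying allocation.
    let p1 = NonNull::from(&a[1]);
    let p3 = NonNull::from(&a[3]);
    assert_eq!(distance(p1, p3), 2);
    assert_eq!(distance(p3, p1), -2);
}
```

The sketch never reads through `p1` or `p3`; it only compares their positions, which is the part that depends on the allocation alone.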

@joshlf
Contributor Author

joshlf commented Dec 5, 2023

Okay, I think I understand. I re-read the provenance docs in the ptr module. If I'm reading them correctly, the idea is that provenance in some sense "refers to" a particular Allocated Object, and all pointer operations require that the argument pointer(s) satisfy the following conditions:

  • Has provenance
  • That provenance refers to an Allocated Object, A
  • The pointer's address is contained in A

The provenance docs in the ptr module use the phrase "provenance over that memory", and the above is my attempt to clarify for myself what "provenance over that memory" means.

Does that match your understanding?

@RalfJung
Member

RalfJung commented Dec 7, 2023

Hm, no not quite. Zero-sized accesses do not require the pointer to have provenance, that's why they are allowed on ptr::invalid.

But other than that, what you say is correct, but does not completely characterize provenance. (IOW, the conditions you list are necessary but not sufficient for a memory access to be allowed.) We deliberately are leaving the full characterization of provenance open. Provenance might restrict a pointer further, for example:

  • We might not just require the pointer to be in some allocated object but in some part of the allocated object. (For instance, in Stacked Borrows, after let r = &s.field;, the provenance of r only allows this pointer and everything derived from it to be used inside that field. This is called "subobject provenance". C has subobject provenance for everything except arrays, but in Rust we probably want to not be quite so strict. Tree Borrows has no subobject provenance.)
  • Provenance also almost certainly has a temporal component, e.g. in Stacked Borrows and Tree Borrows, certain actions on other pointers can invalidate one pointer's provenance so that pointer may not be used any more for the affected memory (or even for any memory at all, though neither Tree Borrows nor Stacked Borrows are that strict). This is how we enforce no-alias requirements.

Also, I don't know what you mean by "all pointer operations". I am assuming you are referring to loads and stores here. There are other pointer operations, such as deref (*ptr, in place context), offset, wrapping_offset, place projection, offset_from.
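
A small sketch of two of the cases mentioned above, not taken from the thread. The first function performs a zero-sized read through a dangling `NonNull`. The second uses `wrapping_offset` to leave and re-enter an allocation before a real load. The type `Marker` and both function names are hypothetical; `NonNull::dangling` is used here instead of `ptr::invalid` only because it is a long-stable API.

```rust
use std::ptr::NonNull;

/// A zero-sized type, hypothetical and only for illustration.
#[derive(Debug, PartialEq)]
struct Marker;

fn read_zst() -> Marker {
    let p: NonNull<Marker> = NonNull::dangling();
    // SAFETY: the read covers zero bytes, and the pointer is non-null and
    // well aligned, so no allocation needs to back it.
    unsafe { p.as_ptr().read() }
}

fn wrapping_round_trip(bytes: &[u8; 4]) -> u8 {
    let base = bytes.as_ptr();
    // `wrapping_offset` is a safe method: the intermediate pointer may lie
    // outside the allocation as long as it is not used for an access there.
    let far = base.wrapping_offset(100);
    let back = far.wrapping_offset(-98);
    // SAFETY: `back` is `base + 2`, inside the array that `base` was
    // derived from, and this is a one-byte load.
    unsafe { *back }
}

fn main() {
    assert_eq!(read_zst(), Marker);
    assert_eq!(wrapping_round_trip(&[10, 20, 30, 40]), 30);
}
```

Only the final load in `wrapping_round_trip` is a memory access of nonzero size, so it is the only place where the pointer has to be valid for the affected bytes.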

@mj10021
Contributor

mj10021 commented Jan 16, 2024

@rustbot release-assignment

@joshlf
Contributor Author

joshlf commented May 11, 2024

Okay, based on your comment, I'm updating from:

> Okay, I think I understand. I re-read the provenance docs in the ptr module. If I'm reading them correctly, the idea is that provenance in some sense "refers to" a particular Allocated Object, and all pointer operations require that the argument pointer(s) satisfy the following conditions:
>
>   • Has provenance
>   • That provenance refers to an Allocated Object, A
>   • The pointer's address is contained in A

...to:

Given t: *const T (or *mut T) where size_of_val_raw(t) > 0, load and store operations on t must satisfy the following:

  • t has provenance
  • t has provenance that refers to an Allocated Object, A
  • t refers to a byte range entirely contained in A
  • t has provenance that is valid for all of the bytes to which it refers (this is sub-object provenance)
  • t's provenance is valid at the point in the program's execution where the load/store occurs (this is the temporal component)*

* I suppose you could also just argue that some operations can "remove" provenance from other pointers, so this is unnecessary to explicitly state.

Is this both necessary and sufficient (ignoring non-provenance-related rules, of course)? Or am I still missing aspects?


Based on this discussion, maybe a more appropriate promise would be that From<&'a mut T> produces a NonNull<T> with provenance which is valid for its referent and for the lifetime 'a?

@RalfJung
Member

RalfJung commented May 12, 2024

> Is this both necessary and sufficient (ignoring non-provenance-related rules, of course)? Or am I still missing aspects?

That sounds like a really complicated way to say: t must have provenance that is valid for the affected memory range and the given access (read/write) in the current program state. (Which says basically nothing since we're not defining when which provenance is valid for which ranges and accesses.)

But it does sound equivalent to that, yes.

> Based on this discussion, maybe a more appropriate promise would be that From<&'a mut T> produces a NonNull<T> with provenance which is valid for its referent and for the lifetime 'a?

There's nothing different about NonNull vs regular raw pointers. Whether it is actually valid for lifetime 'a depends on which other actions are taken -- for instance, if the parent mutable reference gets used again during 'a (which it can, since the NonNull does not carry a lifetime), then that may invalidate the raw pointer.
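
To illustrate the ordering issue described above, here is a hedged sketch that stays on the safe side: every use of the `NonNull` happens before the parent mutable reference is used again. The function name `bump_then_add` is hypothetical. The closing comment describes what the sketch deliberately avoids; it does not claim which aliasing model Rust will finally adopt.

```rust
use std::ptr::NonNull;

fn bump_then_add(x: &mut i32) -> i32 {
    // Reborrow `x` so the parent reference stays usable afterwards.
    let p: NonNull<i32> = NonNull::from(&mut *x);

    // SAFETY: `p` was just derived from `x`, and `x` has not been used
    // since, so nothing can have invalidated `p` yet.
    unsafe {
        *p.as_ptr() += 1;
        *p.as_ptr() += 1;
    }

    // The parent reference is used again here. `p` carries no lifetime, so
    // the compiler would still accept later uses of it, but this sketch
    // treats `p` as dead from this point on, since such a use may be
    // rejected by the aliasing models that Miri implements.
    *x += 10;
    *x
}

fn main() {
    let mut v = 0;
    assert_eq!(bump_then_add(&mut v), 12);
    assert_eq!(v, 12);
}
```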

The most sensible thing to say about From<&'a mut T>, in my eyes, is just to say that it behaves like casting the mutable reference to a *mut T and then returning that. (The "data" in a NonNull is the same as that of a *mut T, after all, it just has the extra promise of being non-null.)
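
The comparison above can be sketched as two functions, one using the `From` impl and one spelling out the cast to `*mut T`. The names `via_from` and `via_cast` are hypothetical, and the assertions can only observe addresses and the effect of a write; provenance itself is not observable from ordinary code.

```rust
use std::ptr::NonNull;

fn via_from(x: &mut u32) -> NonNull<u32> {
    NonNull::from(x)
}

fn via_cast(x: &mut u32) -> NonNull<u32> {
    // Coerce the mutable reference to a raw pointer first.
    let raw: *mut u32 = x;
    // SAFETY: a pointer obtained from a reference is never null.
    unsafe { NonNull::new_unchecked(raw) }
}

fn main() {
    let mut v = 7u32;

    // Both routes yield a pointer to the same address.
    let a = via_from(&mut v).as_ptr() as usize;
    let b = via_cast(&mut v).as_ptr() as usize;
    assert_eq!(a, b);

    // Writing through the result before touching `v` again.
    let p = via_from(&mut v);
    // SAFETY: `p` was just derived from `&mut v` and `v` has not been
    // used in between.
    unsafe { *p.as_ptr() = 9 };
    assert_eq!(v, 9);
}
```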

@joshlf
Contributor Author

joshlf commented Sep 19, 2024

Sounds good! Put up a PR: #130571

6 participants