[conv.lval] Add example of indeterminate values that are not valid for the type CWG2899 #7047

Eisenwave · 2024-06-05T07:44:32Z

If the result is an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.

It's not obvious how you would run into this case, given that memory is generally initialized for erroneous values, and this initialization can typically be done so that the values are valid.

To be honest, I don't understand how you can run into this case, and it's not going to be obvious to others who follow, so we should add an example of how this can happen.

jensmaurer · 2024-06-05T10:54:43Z

Your erroneous value might have used a bit-pattern that's not valid for the type, e.g. "2" for a bool. So, a simple

bool x; // not initialized

at block scope is the example.
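To make the case concrete without actually triggering undefined behavior, the byte behind a `bool` can be inspected as `unsigned char` instead of being read as a `bool`. The following is only a sketch: `raw_byte`, `looks_like_valid_bool` and `self_check` are hypothetical names, and it assumes `sizeof(bool) == 1` and an ABI where only 0x00 and 0x01 are valid, neither of which the standard guarantees.

```cpp
#include <cassert>
#include <cstring>

static_assert(sizeof(bool) == 1, "this sketch assumes a one-byte bool");

// Hypothetical helper: copies out the byte behind a bool without performing
// an lvalue-to-rvalue conversion on the bool itself.
unsigned char raw_byte(const bool& b) {
    unsigned char byte;
    std::memcpy(&byte, &b, 1);
    return byte;
}

// ASSUMPTION: only 0x00 (false) and 0x01 (true) are valid for bool.
// Other implementations may use a different set of valid bytes.
bool looks_like_valid_bool(unsigned char byte) {
    return byte == 0x00 || byte == 0x01;
}

// Usage sketch: overwrite the byte with 0x02 and look at it only as a byte.
void self_check() {
    bool b = true;
    assert(looks_like_valid_bool(raw_byte(b)));

    const unsigned char two = 0x02;
    std::memcpy(&b, &two, 1);  // b now holds a byte that is not valid for bool
    assert(raw_byte(b) == 0x02);
    assert(!looks_like_valid_bool(raw_byte(b)));
    // Evaluating `b` as a bool at this point is what [conv.lval] makes undefined.
}
```

An uninitialized `bool x;` whose erroneous value happens to use such a byte is in the same state as `b` after the second `memcpy`.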

Eisenwave · 2024-06-05T11:38:12Z

Hmm, I've kinda suspected that this could happen, and it's pretty unfortunate that it can.

On another note, I don't think that sentence 2 should be normative wording at all. Lvalue-to-rvalue conversion requires reading the value of an object, and by definition, a value must exist. A value representation of 0x02 for bool corresponds to no value, so reading the bool is impossible, and there is no "result" of an lvalue-to-rvalue conversion in the first place.

Two changes make sense to me:

- Don't imply that this effect is limited to erroneous values by singling them out. You cannot perform lvalue-to-rvalue conversion when there is no value in general.
- Turn the sentence into a note.

jensmaurer · 2024-06-05T13:15:40Z

Values can be invalid (e.g. trapping). For example, a pointer value that has a segment component where the segment no longer exists might trap when read.

The question here seems to be whether the value representation (i.e. set of bits) for a bool can have more than one bit. Presumably it can, for example one could say that 0x00 is false and 0xff is true. There are 8 bits to the value representation, but 0x01 is not a valid value for a bool.
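The point that validity depends on the implementation's chosen encoding can be sketched as a small decoder. Both encodings below are hypothetical models (the names `BoolEncoding`, `decode` and `check_encodings` are made up for illustration), not a description of any real ABI.

```cpp
#include <cassert>
#include <optional>

// Two HYPOTHETICAL encodings of bool in an 8-bit value representation.
enum class BoolEncoding {
    ZeroOne,     // 0x00 is false, 0x01 is true
    ZeroAllOnes  // 0x00 is false, 0xff is true
};

// Returns the value a byte corresponds to, or std::nullopt if the byte
// corresponds to no value of type bool under the given encoding.
std::optional<bool> decode(BoolEncoding enc, unsigned char byte) {
    if (byte == 0x00) {
        return false;
    }
    switch (enc) {
        case BoolEncoding::ZeroOne:
            if (byte == 0x01) return true;
            break;
        case BoolEncoding::ZeroAllOnes:
            if (byte == 0xff) return true;
            break;
    }
    return std::nullopt;  // all 8 bits are value bits, but no value matches
}

// Usage sketch: 0x01 is valid under one encoding and invalid under the other.
void check_encodings() {
    assert(decode(BoolEncoding::ZeroOne, 0x01).has_value());
    assert(!decode(BoolEncoding::ZeroAllOnes, 0x01).has_value());
    assert(decode(BoolEncoding::ZeroAllOnes, 0xff).has_value());
}
```

The `std::nullopt` result stands in for a bit pattern that is part of the value representation yet is not valid for the type.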

And yes, we believe that situation only makes it to [conv.lval] in the erroneous value case; it's caught (with UB) earlier in other situations.

"A value representation of 0x02 for bool corresponds to no value, so reading the bool is impossible, and there is no "result" of an lvalue-to-rvalue conversion in the first place."

I don't know what "impossible" means in standardese. The best I can come up with is "undefined behavior", which is exactly what we do here.

In practical terms, there is no question that

bool x; // at block scope

creates an object of type bool, and we must allow (for the erroneous behavior mechanics) for the bytes here to be initialized to something like 0xdeadbeef. Yet, we know there are implementations that will violate the "either true or false" semantics of bool when reading such a value from x. We have to allow for that.

Eisenwave · 2024-06-05T13:33:45Z

> The question here seems to be whether the value representation (i.e. set of bits) for a bool can have more than one bit.

Yes, I believe that's how it also works in the Itanium ABI. A bool is always required/guaranteed to be 0x00 or 0x01, which implies that the upper seven bits are not considered to be padding bits, but bits of the value representation that are always required to be zero.

> And yes, we believe that situation only makes it to [conv.lval] in the erroneous value case; it's caught (with UB) earlier in other situations.

I believe it never makes it to the last sentence of [conv.lval], even for erroneous values. More explanation at #7051

In short, [conv.lval] says that it "reads" the object ([defns.access]) but according to [defns.access], by definition, this means reading the value. A "value" by definition is "one discrete element of an implementation-defined set of values.". If the value representation 0x02 doesn't correspond to true or false, by definition, no value exists, and the first sentence of [conv.lval] already implies UB.

> I don't know what "impossible" means in standardese. The best I can come up with is "undefined behavior", which is exactly what we do here.

Yeah, that is what I mean. You would run into UB before running into that [conv.lval] case, always, including for erroneous values.

> Yet, we know there are implementations that will violate the "either true or false" semantics of bool when reading such a value from x. We have to allow for that.

Such implementations could consider any value representation other than 0x00 to correspond to true, for example. That's still perfectly valid, with the example (or other PRs I've made).

jensmaurer · 2024-06-05T15:10:32Z

> If the value representation 0x02 doesn't correspond to true or false, by definition, no value exists, and the first sentence of [conv.lval] already implies UB.

We strive never to imply undefined behavior. Undefined behavior should be spelled as such whenever it appears.

> Such implementations could consider any value representation other than 0x00 to correspond to true, for example.

That would be the easy case. No, they consider (in some situations) 0x02 as both true and false (or neither).
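One way such an inconsistency could arise is when generated code assumes the byte is always 0x00 or 0x01 and uses different shortcuts in different places. The sketch below only models that idea with plain `unsigned char` arithmetic; the function names are hypothetical, and it is not a claim about what any particular compiler emits.

```cpp
#include <cassert>

// Models `if (b)` compiled as "is the byte nonzero?".
bool branch_sees_true(unsigned char raw) {
    return raw != 0;
}

// Models `!b` compiled as "flip the lowest bit", which is only a correct
// negation when the byte is 0x00 or 0x01.
unsigned char negated_raw(unsigned char raw) {
    return static_cast<unsigned char>(raw ^ 0x01u);
}

// Usage sketch: valid bytes behave consistently, 0x02 does not.
void self_check() {
    // For valid bytes, exactly one of b and !b appears true.
    assert(branch_sees_true(0x01) && !branch_sees_true(negated_raw(0x01)));
    assert(!branch_sees_true(0x00) && branch_sees_true(negated_raw(0x00)));

    // For 0x02, both b and !b appear true under this model.
    assert(branch_sees_true(0x02));
    assert(negated_raw(0x02) == 0x03);
    assert(branch_sees_true(negated_raw(0x02)));
}
```

Under this model a byte of 0x02 makes both `b` and `!b` take the "true" branch, which is the kind of outcome that cannot be described by assigning 0x02 a single value.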

jensmaurer · 2024-06-05T15:12:45Z

Would something like

If the ~~result is~~ object has an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.

help?

Eisenwave · 2024-06-05T18:53:24Z

> We strive never to imply undefined behavior. Undefined behavior should be spelled as such whenever it appears.

I agree that this would be ideal. To be honest I feel like the clearest way forward would be to re-introduce the notion of a "trap representation" (now called "non-value representation" in C23) into the standard. This concept already exists (e.g. a 0x02 bit pattern may be a value representation for bool, but not correspond to any value), however, we don't give it a name, and this is making things harder and turning explicit UB into implied UB.

> No, they consider (in some situations) 0x02 as both true and false (or neither).

I'm not really getting what the meaning of that in terms of standardese would be (if it's both). Is 0x02 a value representation that corresponds to no value? Does it correspond to multiple values simultaneously? I don't believe such "quantum superstates" are allowed.

Eisenwave · 2024-06-05T18:55:28Z

> Would something like
>
> If the ~~result is~~ object has an erroneous value ([basic.indet]) and the bits in the value representation are not valid for the object's type, the behavior is undefined.
>
> help?

Yes, that is substantially better. Wording the effect in terms of the input instead of a "result" that (to my understanding) never actually exists is a major improvement.

jensmaurer · 2024-06-06T06:42:17Z

I've created CWG2899 to address this.

Eisenwave linked a pull request Jun 5, 2024 that will close this issue

[conv.lval] Add example of erroneous 'trap representation' being read #7049

Open

jensmaurer changed the title ~~[conv.lval] Add example of indeterminate values that are not valid for the type~~ [conv.lval] Add example of indeterminate values that are not valid for the type CWG2899 Jun 6, 2024
