Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inspecting errors thrown from ShadowRealm #353

Closed
legendecas opened this issue Mar 17, 2022 · 62 comments
Closed

Inspecting errors thrown from ShadowRealm #353

legendecas opened this issue Mar 17, 2022 · 62 comments

Comments

@legendecas
Copy link
Member

legendecas commented Mar 17, 2022

The errors thrown from ShadowRealm, withShadowRealm.prototype.evaluate, WrappedFunction.[[Call]], and ShadowRealm.prototype.importValue, are all wrapped with TypeError created in the caller realm. Exposing inspection of these errors to the executing realm can be very helpful in the real world. However, the spec didn't have a word if the creation of the TypeError can cause side effects in the executing realm.

Like say:

const fn = new ShadowRealm().evaluate(`
function createErrorObject() {
  return {
    get message() {
      globalThis.called = true;
      return String('foobar');
    }
  }
}

function foo() {
  throw createErrorObject();
}

foo;
`);

try {
  fn();
} catch (e) {
  e.message; // <= 
}

In this case, can the message getter be called? should the return value of the getter be wrapped? Or the implementation should only provide a non-observable inspection -- only exposing the inner name and message when the name and message of the error objects are data properties of primitive values?

Edit: although the title says "the error thrown from ShadowRealm", this problem also applies to the WrappedFunctions created for passing arguments into the ShadowRealm.

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

I would assume the thrown exception can't contain any object references from inside the shadow realm, which means it either can't have a getter, or, the object would be cloned such that the getter was wrapped. If the getter was wrapped, then it would set the shadow realm's globalThis.called, not the outer realm, which is fine.

@Jack-Works
Copy link
Member

I would assume the thrown exception can't contain any object references from inside the shadow realm, which means it either can't have a getter, or, the object would be cloned such that the getter was wrapped. If the getter was wrapped, then it would set the shadow realm's globalThis.called, not the outer realm, which is fine.

It can access the getter once and copy the name & message (but should we?).

I don't think we should wrap the getter.

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

it would make sense to me that all getters would be reified as own properties prior to crossing the realm boundary.

Since objects aren't permitted to cross the boundary, i don't think it's a reasonable expectation that obscure things about a thrown object will survive to the outer realm.

@legendecas
Copy link
Member Author

legendecas commented Mar 17, 2022

It sounds good to me to reify name and message as own properties on crossing the realm boundary. However, the problem is when the reification is performed regardless of if accessing thrown objects' name or message may cause side effects, the following case may be surprisingly a dead loop -- even the code below didn't access the TypeError created in the caller realm:

const fn = new ShadowRealm().evaluate(`
function createErrorObject() {
  return {
    get message() {
      throw createErrorObject();
    }
  }
}

function foo() {
  throw createErrorObject();
}

foo;
`);

try {
  fn();
} catch {}

@leobalter
Copy link
Member

This is interesting.

where (in specs) do we connect the getter of message to this new TypeError?

I'm not sure if I want to have a wrapped functions for it, but if triggering a get is too much, I'd rather not have any reference to a cross-realm message unless it is a data property.

@mhofman
Copy link
Member

mhofman commented Mar 17, 2022

What happens if the error object is a proxy that claims it has an own message data property ?

@mhofman
Copy link
Member

mhofman commented Mar 17, 2022

I think there are 3 options:

  • We always read the message to construct the error
  • We never read the message to construct the error
  • We create a new error object with a get message that basically wraps () => originalError.message

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

why get message on a new error instead of just making a data property? message isn't a getter on built-in errors.

@mhofman
Copy link
Member

mhofman commented Mar 17, 2022

To build the data property, you have to eagerly read the value on the original error (option 1), unless you make this new error object exotic.

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

ahhh right, then yes that's what i'd suggest.

@mhofman
Copy link
Member

mhofman commented Mar 17, 2022

To clarify, what do you suggest: eager read or using an exotic error object ?

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

Eager read - like getters and setters, exotics should be eliminated as much as is possible :-D

@mhofman
Copy link
Member

mhofman commented Mar 17, 2022

So I suppose the question is, what should the behavior be if accessing message on the original error throws (or if the original thrown thing isn't an error at all)

@ljharb
Copy link
Member

ljharb commented Mar 17, 2022

i'd assume whatever it throws would then be cloned in the same way as the original error was attempted to be, and that is what makes it across. Certainly if the original thing doesn't have an [[ErrorData]] slot i wouldn't expect .message to be accessed at all.

@Jack-Works
Copy link
Member

I think if reading the message throws, or the result is not a string, we ignore it.

@mhofman
Copy link
Member

mhofman commented Mar 23, 2022

We discussed this in the SES meeting today, and given that ShadowRealm is an advanced API, I don't think it makes sense to try and provide some improvements to the handling of Error instances thrown across the callable boundary. Aka, if a wrapped function throws an error, the caller should not get any more information than a generic TypeError, with no name or stack information about where the original error was thrown.

However it was raised that we may want to allow values that would otherwise be allowed through the callable boundary to be thrown, aka pass thrown primitives as-is, and wrap thrown functions. This may simplify the handling of objects, including error instances, for membranes working across the callable boundary.

@legendecas
Copy link
Member Author

@mhofman the problem about leaving the wrapping to userland is that values thrown from the shadow realm are not always from user codes.

Like, ShadowRealm.prototype.importValue can be rejected by the host with an error about target module not found, etc. In this case, there is no chance for userland code to wrap the thrown values. Without a mechanism to expose the info about the original error, it would be hard for people to figure out why ShadowRealm.prototype.importValue fails.

@mhofman
Copy link
Member

mhofman commented Apr 27, 2022

Afaik, both .evaluate() and .importValue() can already throw fully detailed errors for syntax or other evaluation exceptions in the caller realm without going through the attenuation of the callable boundary, or am I missing the point of your example?

@mhofman
Copy link
Member

mhofman commented Apr 27, 2022

Aka either the error is thrown by the ShadowRealm API itself, in which case it can include details about the error since the error is created in the caller realm, or the wrapped function is invoked, and the target code can make sure any error thrown during its execution is handled appropriately.

@legendecas
Copy link
Member Author

legendecas commented Apr 27, 2022

Specifically, I mean the HostImportModuleDynamically can reject the promise with host defined errors, and step 13 of https://tc39.es/proposal-shadowrealm/#sec-shadowrealmimportvalue replaces the rejection with TypeError and discard the host defined errors. In HostImportModuleDynamically, user code may not get a chance to be evaluated.

@mhofman
Copy link
Member

mhofman commented Apr 27, 2022

Ok so that case doesn't deal with the callable boundary, right? I'm not sure I grok the different places that can throw an error in this case, but maybe there is a way to let errors resulting from the parsing and loading of the module through without neutering them, while making sure runtime errors while evaluating the module end up as a caller realm TypeError, similarly to how PerformShadowRealmEval works?

@legendecas
Copy link
Member Author

Ok so that case doesn't deal with the callable boundary, right?

I'm not sure I get your point clearly. The host errors are created in the shadow realm (the executing context on HostImportModuleDynamically is the one shadow realm created), so the error values still need to go through the boundary.

I'm not sure I grok the different places that can throw an error in this case, but maybe there is a way to let errors resulting from the parsing and loading of the module through without neutering them, while making sure runtime errors while evaluating the module end up as a caller realm TypeError, similarly to how PerformShadowRealmEval works

Right. However, if I recall correctly, Hosts like Node.js can expose module loader hooks to userland so user code can still get a chance to evaluate to "load the module". In this case, the problem can still arise that inspecting the error may cause side effects.

@mhofman
Copy link
Member

mhofman commented Apr 27, 2022

By callable boundary in this case I meant the wrapping that is done by the wrapped function exotics.

Here it seems the issues you raise stem from the "entrypoints" into the ShadowRealm, aka HostImportModuleDynamically and possibly PerformShadowRealmEval, where something fails before obtaining a result value from the evaluation or import.

As I said, I don't fully grasp the layering for these operations, but as long as the following invariants are preserved, I think it's acceptable for errors to contain useful information:

  • User code (through host hooks or any other means) running in a given realm can never get direct access to "objects from another realm".
  • user visible stack traces from one realm do not seamlessly include stacks from another realm

(The above only applies where at least one ShadowRealm is involved, of course there is nothing preventing 2 legacy realms from leaking to each other)

@syg
Copy link

syg commented Apr 27, 2022

@mhofman I don't understand SES's recommendation here to not do anything about the errors. If HostImportModuleDynamically fails, it'll throw an error constructed in the shadow realm. That error contains the actionable message (e.g. "file not found") that we need to surface to the outside user of a shadowRealm.importValue() call. It's unacceptable to not surface that info IMO, otherwise how are you supposed to debug errors during importing?

At the very least I'd expect reading the .message property and copying it in the re-thrown error.

Edit: This opaque errors thing has bitten me already! In @legendecas's implementation in V8, we forgot to upload the imported module files in some importValue tests to test runner bots, and those bots were failing with a generic "cannot import value" error instead of a more descriptive "cannot find file" error. That made debugging much harder than it should've been.

@mhofman
Copy link
Member

mhofman commented Apr 27, 2022

Ok I probably didn't express myself correctly.

Conceptually I have no problems with errors thrown in the ShadowRealm context with no user code on the stack to contain debugging information visible from the caller realm. That includes the host's module resolution mechanism, parsing or any early syntax errors.

My concern is errors thrown when user code running in the context of the ShadowRealm is on the stack. That includes errors thrown while executing a module graph's top level code, including top level await.

The first can only ever happen during the HostImportModuleDynamically and PerformShadowRealmEval operations, not the wrapped function exotic's [[Call]]. I don't know how the spec text can be changed to accommodate this, if at all possible, but I am entirely open to improve the user debugging experience for this specific case.

@syg
Copy link

syg commented May 3, 2022

Hm, I don't have time to dig in right now on how to give an allowance in spec text for actionable error messages if they're not user-originated, but that's a pretty important thing to allow. I don't think there is a way to distinguish errors that are user-originated vs not. Perhaps we'd need to manually examine throw completions from HostImportModuleDynamically and PerformShadowRealmEval to re-wrap messages with the actionable intact.

@syg
Copy link

syg commented May 12, 2022

I've put this issue on the agenda for the June meeting to get clarity on what should or should not be exposed on errors across ShadowRealm boundaries. It's uncontroversial that errors should not pierce the callable boundary via direct object references, but not at all clear to me:

  • Whether stacks should also be prohibited (it's not a specced thing, but good to get discussion on in that it is a reality)
  • Whether messages can be passed through
  • Whether we should, and how to, distinguish user-thrown and platform-thrown errors

@mhofman
Copy link
Member

mhofman commented May 12, 2022

Thanks @syg. We plan to also discuss this in next week's SES meeting, if you're able to join.

@annevk
Copy link
Member

annevk commented May 23, 2022

So HTML does define serialization and deserialization of objects that have [[ErrorData]]. This should not end up contradicting that, right? How will that be ensured? (See also #336 which dealt with what seems like a very similar issue. My concerns are similar to those raised there, but perhaps I'm missing something.)

@mhofman
Copy link
Member

mhofman commented May 24, 2022

@annevk I'm failing to understand what you mean. Objects with an [[ErrorData]] slot do not have a special treatment through the callable boundary, and are currently just objects which will cause a new TypeError.

In what case does HTML serialization come into play?

FYI, I am looking at an explicit structured cloning like API as a followup proposal to handle sharing objects through the callable boundary, but that's out of scope for the current proposal.

@ljharb
Copy link
Member

ljharb commented Jul 6, 2022

imo it should definitely do a brand check, and it sounds fine to only do it for data properties.

@littledan
Copy link
Member

littledan commented Jul 6, 2022

I was proposing that there be no brand check, and that it would invoke a getter if present. The algorithm would just be a normal Get. What is the motivation for doing otherwise, or trying to limit observability?

@mhofman
Copy link
Member

mhofman commented Jul 6, 2022

I was just asking for a clarification, I do not currently hold a strong opinion one way or another (well I'd probably be against an observable Get on the original error deferred to the time of Get on the new error).

@caridy
Copy link
Collaborator

caridy commented Jul 6, 2022

Keep it mind something that was discussed in plenary, which is that there is no evidence, or strong desire to hide error information from the incubator realm, in fact it is desirable that the incubator can get a lot more info when possible. If course, this is a two ways street, and sometimes it is the shadow realm the one calling into the incubator realm, in which case the error should be completely hidden.

@legendecas
Copy link
Member Author

@littledan: I was proposing that there be no brand check, and that it would invoke a getter if present. The algorithm would just be a normal Get. What is the motivation for doing otherwise, or trying to limit observability?

I'd be happy to see a uniform approach to transferring exceptions across the boundary. But I'm curious about how to handle the possible failure on the normal Get -- like in the example raised previously #353 (comment).

@littledan
Copy link
Member

littledan commented Jul 7, 2022

  • @mhofman Yes, I was picturing that the Get would be eager, when crossing the boundary, not lazy through a getter, for exactly the reason you state, as well as for general simplicity. I think this addresses the issue @caridy mentions, right?
  • @legendecas Yes, it does seem like that loop case is important. I would say that, if the Get throws, we simply leave the .message property undefined (just as I am proposing would happen if it is something other than a string).

@ljharb
Copy link
Member

ljharb commented Jul 7, 2022

Leaving it undefined would conflate it with throwing an error with an actual undefined message property, no?

@acutmore
Copy link

acutmore commented Jul 14, 2022

With debugging being a core motivator, would one option be to keep the originally thrown error referenced in a hostInternalSlot of the new TypeError. Which hosts are permitted to read in a non-observable (to the application) way. e.g if the error was passed to the platforms built-in console.error it could log a more detailed message to stderr.

I can imagine a similar thing happening for errors thrown in their own task. e.g a setTimeout(() => { throw new Error(msg) }) happening inside the shadowRealm wouldn’t be passed back to the incubatorRealm, but the host could still log the uncaught exception.

@mhofman
Copy link
Member

mhofman commented Jul 14, 2022

During the SES call yesterday I suggested constructing a new shadow realm error that captures the message and name of the original error, and is used as a cause for a typeerror thrown across the callable boundary. I'm not entirely sure we need the indirection, but it's an option. Anyway, if we have a custom error type, we could store a link to the original error, but being non observable, does the link actually need to be specified or should that be left to implementors choice

@acutmore
Copy link

we could store a link to the original error, but being non observable, does the link actually need to be specified or should that be left to implementors choice

Maybe the normative spec specifies what, if any, data can observably be cloned to the new error, and an editorial note suggesting additional, non-observable from the application, developer-experience improvements?

@littledan
Copy link
Member

What engines have asked us for is guidance on what they should actually do, rather than latitude to do a wide range of things as such a note would leave them. (Note that a "MAY" is normative, not a note/editorial, and that might be what we want to do here.) Let's see if we can answer the implementers' question concretely.

@littledan
Copy link
Member

We discussed this issue at length in the SES call this week. After some thought, I believe we reached the shared conclusion that it should be OK to copy a string .message to the new error, and that this level of transparency makes sense to apply "reflexively", that is, when crossing the ShadowRealm boundary in either direction.

We did not settle on whether to do an arbitrary Get of the "message" property, or whether to avoid calling arbitrary JS code by only getting the own "message" property of an original error object (as marked by an internal slot that Errors have, which proves that it's not a Proxy). Although I argued for the former above (out of simplicity), I'd be OK with the latter as well.

@caridy
Copy link
Collaborator

caridy commented Jul 14, 2022

I believe we also agreed during the call that the type of error is not relevant or important, and that we will continue throwing the TypeError with the proper data attached to it.

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

I just discussed this issue at length again with @erights, and we arrived at the following observations:

  • Stack traces should be censored across the callable boundary. Code on one side of a ShadowRealm's callable boundary should not be able to discover any information about stack frames of functions that belong on the other side, including name and filename, but also the number of stack entries. It can however discover if there are stacks entries of another realm interspersed with or below its own. For example a blue realm calling into a yellow realm, calling back into a blue realm can be made aware there is another realm (yellow) invocation between its own (blue) stack frames, but not any more details.
    • Please note this censoring does not require the error to cross the callable boundary like when thrown, but mere access to the stack of a constructed error. For previous literature on the risks of exposing stack information, please refer to http://combex.com/papers/darpa-review/security-review.html#universalscope.
    • Please also note this censoring does not apply to developer tooling, only to program self introspection through Host/implementation APIs. Prior art on this includes async stack traces which are only available through developer tooling in current engines.
  • Host guarantees on boundary piercing #324 would apply to any host or implementation specific APIs for Errors that deal with structured stack information, and preclude them from piercing the object graph isolation created by the callable boundary. However given the first bullet point above on stack censoring, I cannot imagine this scenario occurring in the first place.
  • There is no reason a value which is allowed through the callable boundary as an argument or return value, shouldn't also be allowed through as a thrown value. Throwing a TypeError for those values is actually confusing. As such, we believe the proposal should be changed to simply call GetWrappedValue on the thrown value before rethrowing it. That also means no TypeError would need to be explicitly constructed outside of the one in the GetWrappedValue operation when exiting execution from the callable boundary.
  • We remembered that the spec leaves the "message" property of errors as implementation defined, and never specifies the content for errors thrown by the spec. As such we believe it would be acceptable to leave the "message" of the TypeError thrown by GetWrappedValue to be implementation defined, with a note allowing the implementation to un-observably include the value of the message and name string properties of an Error object in the "message" of this TypeError. Outside of the special case of Error objects, we do not think implementations should be allowed to reveal more information about the cause of the TypeError than the type of the object which triggered the TypeError.

@ljharb
Copy link
Member

ljharb commented Sep 13, 2022

Why should it not? A stack string can be freely passed across explicitly, and unlike type classes like "objects" etc, there's no way to prevent that. Developer tooling includes "sending data to remote loggers with javascript", so this isn't something that can be relegated only to the executing environment's privileged tools.

It seems perfectly fine to me to require that a membrane library do the censoring. Why is that not workable?

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

Passing a stack string explicitly through the callable boundary is different than an uncensored stack string being available by the mere act of constructing an error object (possibly implicitly).

It sounds perfectly fine to me to require debug tooling that runs with the same privilege of the code to be limited, the same way they already are regarding async stack traces. These libraries can wrap ShadowRealm with membrane like Error tracking if they want (and are purposely placed by the author in a position to do so) to restore full stack traces across ShadowRealms.

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

Btw, I do not expect logging libraries to ever need to actually do that. ShadowRealm is not intended to be an API most developer will use directly, and I expect libraries that use ShadowRealm to handle errors in accordance with their purpose. For these libraries, recomposing stack traces in user code when an error crosses the boundary is a lot easier than taming stack traces on all created errors.

@littledan
Copy link
Member

I'm having a lot of trouble understanding @mhofman 's proposal. Wouldn't GetWrappedValue on a normal error object fail, because it's a non-callable object?

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

Exactly. What I'm suggesting is that we provide guidelines that in that TypeError's message, the implementation is allowed to include the content of the name and message properties of the error object (as long as the implementation grabs those strings in an unobservable way)

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

E.g. an implementation would be allowed to do:

const thrower = (new ShadowRealm()).evaluate(`() => {
  throw new RangeError('0 is too small');
}`);

try {
  thrower();
} catch (err) {
  console.log(err); // TypeError: RangeError thrown across callable boundary: 0 is too small
}

@littledan
Copy link
Member

This behavior seems reasonable, but I'd prefer that we specify exactly what the message is, rather than asking implementations to come up with their own algorithm. I was proposing that we just include message and not the name because the name is generally not very meaningful. Also I don't know how to access anything about an object unobservably (given Proxy).

@mhofman
Copy link
Member

mhofman commented Sep 13, 2022

This would be the only place in the spec where we force a specific message on the implementation. Anywhere else the implementation is free to define its own message.

Regarding the unobservable access, the implementation is free to check whether the object is a proxy or has accessors. Or maybe we could make this stricter: if the object has an [[ErrorData]] slot, and an own message data property (the name would be trickier to specify since it's usually looked up the prototype chain). I believe leaving the constraint at "don't execute user code" is sufficient, and engines can decide how best to accomplish that. (I'm aware this would provide a capability to detect whether an error is proxy, but I think it's acceptable in this case)

@syg
Copy link

syg commented Oct 11, 2022

Agreed with @ljharb on #353 (comment). I see no reason to censor stacks. Ignoring the procedural question of what normative thing 262 can even say, ShadowRealms have use beyond SES's membrane model, so I'm also inclined to to think that if the SES security model requires it, then it's those libraries' job to ensure that property holds.

@leobalter
Copy link
Member

Thanks @syg!

It's common understanding that ShadowRealms is to be considered low-level while membrane systems are supposed to maintain any ShadowRealm output and not just use it as-is.

@caridy
Copy link
Collaborator

caridy commented Dec 8, 2022

Solved by #382

More details on the errors section in the explainer.

@caridy caridy closed this as completed in d370263 Jan 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants