
Add section on JSON Processing. #1202

Merged · 14 commits · Aug 9, 2023 · Changes from all commits
index.html: 74 changes (73 additions, 1 deletion)

@@ -3558,7 +3558,7 @@ <h3>Reserved Extension Points</h3>
<tr>
<td>`confidenceMethod`</td>
<td>
A property used for specifying one or more methods that a verifier
might use to increase their confidence that the value of an attribute in or of
a verifiable credential or verifiable presentation is accurate, including but not
limited to attributes such as `initialRecipient` (a/k/a `issuee`), `presenter`,
@@ -3952,6 +3952,78 @@ <h3>Media Type Precision</h3>
</p>
</section>

<section class="informative">
<h2>JSON Processing</h2>

<p>
While the media types for conforming documents defined in this
specification always express JSON-LD, JSON-LD processing is not required,
because every JSON-LD document is also a valid JSON document. Scenarios where
processing a <a>verifiable credential</a> or a <a>verifiable presentation</a>
as plain JSON is desirable include, but are not limited to, the following
(an illustrative sketch follows the list):
</p>

<ul>
<li>
Before securing or after verifying content
that requires <a href="https://csrc.nist.gov/glossary/term/data_integrity">data
integrity</a>, such as a
<a>verifiable credential</a> or <a>verifiable presentation</a>.
</li>
<li>
When performing JSON Schema validation, as described in Section
<a href="#data-schemas"></a>.
</li>
<li>
When serializing or deserializing <a>verifiable credentials</a> or
<a>verifiable presentations</a> into systems that store or index their contents.
</li>
<li>
When operating on <a>verifiable credentials</a> or <a>verifiable
presentations</a> in a software application, after verification or validation
has been performed by a securing mechanism that requires an understanding
of, and/or processing of, JSON-LD.
</li>
<li>
When an application chooses to process the media type using the `+json`
structured media type suffix.
</li>
</ul>
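Because a JSON-LD document is always a valid JSON document, any standard JSON
library suffices for the scenarios above. Below is a minimal, non-normative
Python sketch; the credential contents are hypothetical placeholders.

import json

# A minimal, hypothetical verifiable credential, handled as plain JSON.
vc_text = """
{
  "@context": ["https://www.w3.org/ns/credentials/v2"],
  "type": ["VerifiableCredential"],
  "issuer": "https://university.example/issuers/565049",
  "validFrom": "2023-08-01T00:00:00Z",
  "credentialSubject": {"id": "did:example:ebfeb1f712ebc6f1c276e12ec21"}
}
"""

# No JSON-LD processing is needed to parse the document or read its
# properties.
vc = json.loads(vc_text)
print(vc["issuer"], vc["credentialSubject"]["id"])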

<p>
That is, JSON processing is allowed as long as the document being consumed or
produced is a <a>conforming document</a>. If JSON processing is desired, an
implementer is advised to apply the following rule:
</p>

<ul>
<li>
Ensure that all values associated with a `@context` property are in the
Contributor:

These values are URLs or objects, right?

@msporny (Member, Author), Aug 3, 2023:

Yes, correct; the values can be whatever is legal for the VCDM @context value.

Contributor:

I really don't understand this section in general. What is the purpose of adding this? How does this help a developer implementing the VCDM 2.0?

Contributor:

I find this very helpful to implementers. It tells implementers how they can be sure they interoperate with others who use JSON-LD transformations, without having to use a JSON-LD library in their own applications. This has sometimes been a point of confusion. This section informs implementers how they can use VCs as JSON with confidence that it will be interoperable with someone else who may, for example, convert VCs to RDF.

I also think it's important to highlight to implementers that systems can be architected such that verification occurs in one component, wholly separate from other components of the application that need to read the VC, such that those other components only need to check the above rules to ensure they will continue to get interop with other people or software components that might use a JSON-LD library. This separation of concerns is helpful for a variety of reasons, not limited to code modularity, reusability, and maintenance, and to help divide labor amongst developers with different experience and knowledge.

I believe the community would greatly benefit from such an explanation.

Contributor:

I read through the PR many times and I still don't understand why it is actually there. What does this PR really want to express or explain? I don't understand what it is.

What I'm worried about is the following: the base media type is based on JSON-LD, and there are probably some situations where a specific constellation of @context entries can make a difference between JSON and JSON-LD processing, and in those cases JSON processing is probably not enough.

On the other hand, issuers should implement linked data best practices if they are issuing VCs in the base media type. That includes authoring proper vocabs and context definitions. In both scenarios, verifier and issuer need to have a decent understanding of how JSON-LD works.

Contributor:

The base media type is JSON-LD, and this implies JSON-LD processing in my opinion. Everything else feels like it has the potential to cause a lot of trouble.

@awoie Could you be more specific, maybe including the RFC reference?

JSON-LD 1.1, the data format you are referring to, declares the following:

> JSON-LD is designed to be usable directly as JSON, with no knowledge of RDF [RDF11-CONCEPTS].

> A JSON-LD document is always a valid JSON document. This ensures that all of the standard JSON libraries work seamlessly with JSON-LD documents.

IETF RFC 6839 also defines how the +json structured syntax suffix is used, for example in ld+json.

In my opinion, it is quite a big burden for adoption to require immediately authoring vocabs and context definitions. These have no meaning if everyone defines their own anyway; the power comes from unification, and that takes time.

@dlongley (Contributor), Aug 3, 2023:

@awoie,

> That includes authoring proper vocabs and context definitions. In both scenarios, verifier and issuer need to have a decent understanding of how JSON-LD works.

If an issuer (or creator of a shared vocab and context, such as the 1EdTech community that works on Open Badges) provides a spec that says "this is what all these properties mean and what values they can have", then a consuming application can write code against that just like it would with any other spec. What consumers do need to do, to ensure interoperability with other people in the ecosystem who do want to perform JSON-LD transformations, is follow the rules being expressed here. These rules might be explicitly provided (or now more easily linked to) in that spec, or that spec can be designed such that they are implicitly followed through the requirement that readers of the spec use JSON Schema and hash-checking and/or static context files.

None of this requires "a decent understanding of how JSON-LD works" for these consumers.

Contributor:

I didn't disagree with JSON-LD being valid JSON. I was just questioning the meaningfulness and justification of JSON-LD in this standard. I'm wondering what the value of JSON-LD processing is, and of the context being normative, in that case. It would really help to get answers on that in #1227.

@msporny (Member, Author):

Yes, we will document the benefits of JSON-LD in #1227.

What we have found is that governments, education institutions, and large enterprises DO define interop vocabularies, and have done so for many years. Developers creating simple, single-purpose systems are rarely exposed to this because they're not in the business of integrating the large, complex systems used by large enterprises and governments, nor are they exposed to connecting systems between different countries. The goal of the VCDM is to ensure that both simple systems and complex systems are supported. To support simple systems, we provide this section (which has always been supported in the VCDM spec) and the base vocabulary, so that people who don't want well-defined semantics can just use the base context and stop there. However, that only solves a small part of the problem space -- you can do identity cards, or simple JWT-style token use cases, but you can't scale beyond that level of simplicity. To address the larger enterprise, government, and cross-border use cases, you need to interop at a much larger scale with more complex data. This is where formal semantics come in -- again, you don't have to use those with the VCDM, but the VCDM has to support that mechanism in some way.

For the more complex tasks, where a lot of value can be unlocked, there is a need for formally defined vocabularies so that you know where you can and cannot map data between systems. Some examples are below, starting with education.

The Credential Transparency Initiative, which is broadly used in education:

https://credreg.net/ctdl/terms#classes

1EdTech (a standards-setting body for education) uses JSON-LD vocabularies heavily in its global standards:

https://1edtech.github.io/openbadges-specification/ob_v3p0.html

Rich Skill Descriptors are another initiative used by education:

https://www.openskillsnetwork.org/rsd

In the supply chain space, the GS1 Web Vocabulary is used for global trade (using JSON-LD and vocabularies):

https://www.gs1.org/voc/?show=properties

The Traceability Vocabulary is yet another:

https://w3id.org/traceability

... and the list goes on.

Contributor:

> Yes, we will document the benefits of JSON-LD in #1227. [...]

I'd argue that JSON-LD is not required in cross-border and cross-jurisdictional use cases either. There is evidence that ICAO and ISO did a pretty good job at that in the past. We have Digital Travel Credentials (DTC), the mobile driver's license (mDL), and mobile identity documents (mID), all defined in ISO. Also, AAMVA extends its non-JSON-LD-based vocab by including additional terms. What I'm asking for are the underlying reasons why cross-border use cases require JSON-LD and context, because apparently those use cases can be solved without them as well. Note that I'm totally fine with context and JSON-LD being included, but developers need to understand why they are doing what, and what the benefits and implications of the approaches are. Let's have that discussion in #1227.

expected order, the contents of the context files match known-good
cryptographic hashes for each file, and domain experts have deemed that the
contents are appropriate for the intended use case (a sketch of such a check
follows this list).
</li>
</ul>
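The following is a minimal, non-normative sketch of how a JSON-based processor
might apply this rule, written in Python. The EXPECTED_CONTEXTS table, the
example.org URL, and the digest values are hypothetical placeholders that an
application would configure after domain-expert review; the sketch also
assumes `@context` contains only URL strings.

import hashlib

# Hypothetical application configuration: the expected `@context` values in
# their expected order, each mapped to a known-good SHA-256 digest of a
# vetted static copy of that context file. Digests are placeholders here.
EXPECTED_CONTEXTS = {
    "https://www.w3.org/ns/credentials/v2": "placeholder-digest-1",
    "https://example.org/my-vocab/v1": "placeholder-digest-2",
}

def context_is_acceptable(document: dict, static_files: dict) -> bool:
    """Return True if `@context` holds exactly the expected values, in the
    expected order, and every vetted static context file (bytes, keyed by
    URL) matches its known-good cryptographic hash."""
    if document.get("@context") != list(EXPECTED_CONTEXTS):
        return False  # unexpected values or unexpected order
    return all(
        hashlib.sha256(static_files[url]).hexdigest() == digest
        for url, digest in EXPECTED_CONTEXTS.items()
    )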

<p>
Using static context files with a JSON Schema is one acceptable approach to
implementing the rule above. This can ensure proper term identification,
typing, and order when a JSON document is processed as JSON-LD.
</p>
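For illustration only, here is a sketch of that approach using the third-party
Python `jsonschema` package; the schema pins the `@context` array to exact
values in an exact order, and the example.org URL is a hypothetical
placeholder.

import jsonschema  # third-party package with JSON Schema 2020-12 support

CONTEXT_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "required": ["@context"],
    "properties": {
        "@context": {
            "type": "array",
            # Each position must hold exactly this value, in this order.
            "prefixItems": [
                {"const": "https://www.w3.org/ns/credentials/v2"},
                {"const": "https://example.org/my-vocab/v1"},
            ],
            "items": False,  # no additional, unexpected contexts
            "minItems": 2,
        }
    },
}

credential = {
    "@context": [
        "https://www.w3.org/ns/credentials/v2",
        "https://example.org/my-vocab/v1",
    ]
}

# Raises jsonschema.ValidationError if `@context` deviates in value or order.
jsonschema.validate(instance=credential, schema=CONTEXT_SCHEMA)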

<p>
The rule above guarantees semantic interoperability between JSON and JSON-LD for
literal JSON keys mapped to URIs by the `@context` mechanism. While JSON-LD
processors will use the specific mechanism provided and can verify that all
terms are correctly specified, JSON-based processors implicitly accept the same
semantics without performing any JSON-LD transformations, by applying the
above rule instead. In other words, the context in which the data exchange
happens is explicitly stated for both JSON and JSON-LD by using the same
mechanism. With respect to JSON-based processors, this is achieved in a
lightweight manner, without having to use JSON-LD processing libraries.
Contributor:

Overall I think this is a great addition.

I would like to see something similar for JSON-LD processing, highlighting its benefits, and in particular what the value of converting away from +ld+json to N-Quads is, with special focus on @type and @id, in a separate PR.

</p>

</section>
</section>

<section>