Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SHACL-C: analyze EBNF grammar complexity #67

Open
VladimirAlexiev opened this issue Aug 2, 2024 · 1 comment
Open

SHACL-C: analyze EBNF grammar complexity #67

VladimirAlexiev opened this issue Aug 2, 2024 · 1 comment

Comments

@VladimirAlexiev
Copy link
Contributor

VladimirAlexiev commented Aug 2, 2024

Followup to #52:

  • @afs : Have you decided on what complexity the grammar should be?

SPARQL is LL(1), which is simple in tech terms. It is compatible with LALR(1) there are parser generators for every programming ecosystem freely available.

@parrt @svenefftinge @zaach @lolo101 @tunnelvisionlabs Can you help us here?
Your names and github nicks were recommended by chatGPT.

@afs
Copy link
Contributor

afs commented Aug 2, 2024

Apache Jena's SHACL-C grammar is LL(1) using JavaCC. There is a tokenizer (longest match wins) and an LL(1) grammar parser.SPARQL, No special features of JavaCC are used (e.g. no local lookahead of 2 for example).

SHACL-C, SPARQL and Turtle call out the terminals separately from the grammar rules.

LL(1) has one big advantage - better error messages by default!

The SPARQL grammar in the SPARQL spec is LL(1) because it is produced from JavaCC. The SPARQL is large enough that a tool to manage and check it is essential. It is too easy to include things that weren't intended. In practice, there are points where strict LL(1) is potentially inefficient. Jena has alternative definitions that are rewritten (these are not in the input to the spec HTML production).

So I suggest for a spec, write for LL(1) to get widest coverage and identify any points where it deviates from that and justify them.

The RDF 1.2 Turtle recently had a proposal that requires a lookahead of 2 and left factoring the graph to remove the requirement made the grammar ugly. in teh end it wasn't necessary.

BNF in specs has to communicate as well as being a formal definition. Requiting implementers to modify the spec structure for their local tools risks weaker compliance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants