Document file specification

Tom Browder edited this page Aug 4, 2019 · 7 revisions

On the path towards a new documentation system, we first need to settle on a specification of the document page format. There is no such specification now, although some tests check certain aspects of the documentation (and of the rest of the files in the repository). Let's see what a documentation page really consists of.

  1. Content in Pod6 format.
  2. Document metadata.
  3. Content metadata, including, but not limited to, indexes.

Let's see first what kinds of metadata we are using.

Metadata for Perl 6 documentation files

There are several pieces of metadata.

  1. A broad topic, which is determined by the directory the file occupies in the repository. There are Language, Type and Program documents.
  2. A narrow topic, which is implicit in the file name. For instance, some Language files that deal with the transition from Perl 5 to Perl 6 get special treatment when processed.
  3. A section, which is conceptually the same as the narrow topic above, except that it is processed differently and is defined in the 00-POD-CONTROL file. It is essentially used to generate the language.html page.
  4. A tag at the beginning of the file, =begin pod :tag<perl6>. It's apparently unused, although it's a well-intentioned way of establishing a topic as in the first item above.
  5. Title and subtitle.
  6. Class or role definition, if it's a class.
  7. Example code output.

1 to 3 above are external to the file. 4 to 6 are internal. 2 and 3 are roughly the same, but they are dealt with in different ways. 2 and 4 are also roughly the same, but 4 is apparently unused and there's no specification of what values it should take and how to deal with them.
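
As a rough illustration of the internal pieces (items 4 to 6), a Type document might be laid out as sketched below. This is not copied from any file in the repository. It only combines the tag form quoted above with the usual =TITLE and =SUBTITLE blocks and a class definition written as an indented code block.

```perl6
# Illustrative sketch only, not a verbatim copy of a repository file.
=begin pod :tag<perl6>

=TITLE class Int

=SUBTITLE Integer (arbitrary-precision)

    class Int is Cool does Real { }

=end pod
```

Note that the broad topic, the narrow topic and the section (items 1 to 3) appear nowhere in the file itself.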

So, here's the spec proposal:

  1. All metadata should be included as pod config in the =begin pod line. (Note that the '=begin pod' line should be the first non-comment line in the file, there should be no other '=begin pod' line in the file, and the matching '=end pod' line should be the last non-comment line in the file.)
  2. The documents should have a topic with three possible values: Reference, Tutorial or Runtime (equivalent to the current Type, Language and Program). All of these might have subtopics.
    1. Reference would have these possible subtopic values: unit (class or role, or possibly package), routine (functions that are defined outside classes), or statement (whatever is not any of those, such as use or unit).
    2. Tutorial (the current Language) will have as many subtopics as there are sections now.
  3. Title and subtitle will be compulsory.
  4. Reference documents will include a definition, which will be compulsory and, if possible, checked against the actual definition in Rakudo.
  5. Documentation will include no code that could produce side effects in the pre-compilation phase.
  6. Superclasses and roles will be obtained from introspection. If some class needs to be hidden (NQP classes, for instance, or maybe native ones), this should be indicated through metadata.
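
A minimal sketch of what a Reference document could look like under this proposal follows. The config key names :topic, :subtopic and :definition are assumptions made up for this example, since the proposal does not fix any names yet. Only the general Pod6 config syntax (colon pairs on the =begin pod line) is taken as given.

```perl6
# Hypothetical key names (:topic, :subtopic, :definition); the proposal
# above does not define them. Comments like this one may precede =begin pod.
=begin pod :topic<Reference> :subtopic<unit> :definition('class Int is Cool does Real')

=TITLE class Int

=SUBTITLE Integer (arbitrary-precision)

=end pod
```

With everything on the =begin pod line, a tool can read the metadata from the config of the single top-level pod block, without looking at the directory or the file name.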

Internal metadata is good, and Pod6 does have a nice way of simply using it.

Document content

The documents not only provide content to be rendered in different ways; some sections are also rendered into secondary syntax and method-name files. How content is rendered depends on metadata.

Document content metadata

There are several types of document content metadata:

  1. Rakudo version metadata. Right now, this is not really coded as such, but as plain text, as in "This feature was introduced in Rakudo version whatever".

  2. Rakudo-specific or Perl 6-generic. In the same way, this is simply described as content.

  3. Rakudo bugs. This is related to the previous item, but it should be dealt with specifically, as said in this issue.

  4. Indexing and content-generation metadata. This is used in principle for creating unique URLs that point to the content; but, for the same reason, it can be used to create a single page that contains it. Again, there are several types of this indexing metadata.

    1. Explicit or implicit. There are some implicit rules that make something (for instance, class methods) show up in the index in a certain form. Explicit indexing uses Pod6's X<> form.
    2. Index category: broad categories used mainly for presenting the search index.
    3. Index kind and subkind. kind is additionally used for creating new, secondary content. subkind is probably not used, but I would be hard pressed to tell whether it is.
    4. "Real name". In many cases, the name of the thing you are indexing can't be (easily) used in URLs. Having a "real name" will help for indexing and search purposes.
  5. Class hierarchy metadata. This data is right now in type-graph.txt, which causes lots of synchronization problems: classes that are not included, and classes that are included but not documented. It is used for two different things, if I'm not wrong (which I might be): to generate the type graph (obviously), but also by htmlify.p6 to generate and append methods for all superclasses and mixed-in roles. This is also document-level metadata right now; it should maybe be moved to class-level metadata.

1 to 3 are not really used right now. And 4 has some obscure rules that are difficult to parse even for people who have been working with them for some time. There are no clear rules, either, on what should be indexed and why, or on what categories, kinds or even subkinds should be used. So we will have to find out the current indexing rules before actually proposing new ones.
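
For reference, this is roughly what explicit indexing with X<> looks like in the general Pod6 design (Synopsis 26): the text before the vertical bar is displayed, the text after it is the index entry, commas separate hierarchical levels and semicolons separate entries. The sentences and entries below are made up for illustration. How the documentation tools map such entries to categories, kinds and subkinds is exactly the part that still has to be worked out, so it is not shown.

```perl6
# Illustrative entries only; not taken from the documentation sources.
=begin pod

An X<array|arrays> is an ordered list of values.

A X<hash|hashes, definition of; associative arrays> maps keys to values.

=end pod
```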

5 is heavily used, but it's not clear whether it is really needed, or whether current introspection facilities would do a better job. Besides, type-graph.txt is actually out of sync with the documentation, and it might be even worse than that, because class hierarchies might have changed.
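
As a small sketch of the introspection route, the metaobject methods below return the kind of information that is currently kept by hand in type-graph.txt. Int is used only as an example type. Whether this would be enough to replace the file entirely is an open question, not something this snippet settles.

```perl6
# Classes in Int's method resolution order (Int itself first).
say Int.^mro;

# Roles done by Int.
say Int.^roles;
```

Since this data comes from the running compiler, it cannot drift out of sync with the actual class hierarchy the way a hand-maintained file can.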