Automate HCFILES workflow (also add "up one level" button to index.html pages) #1784

Merged: 31 commits, Jul 15, 2024

fghalasz
Member

This PR automates the building of HCFILES. It consists of 4 things:

  1. The scripts/do_hcfiles.sh script. This is a loadup-style script that will run HCFILES and MAKE-INDEX-HTMLS in a Medley directory. The *.pdf and index.html files are left in the Medley directory (git clean -f from the top level will clean them out). It assumes the apps.sysout is in loadups.

  2. The doHCFILES.yml GitHub Actions workflow, which takes the latest release, runs do_hcfiles.sh, and stores the resulting Medley directory on https://files.interlisp.org/medley. doHCFILES.yml can be called directly from the GitHub Actions page, but is also designed to run automatically as described in item 3 below.

  3. Modification to the buildReleaseIncDocker.yml workflow so that the doHCFILES workflow is run after a new release is created successfully.

  4. Not really part of the automation - but modified the MAKE-INDEX-HTMLS function so that all index.html files (except the top level one) have a button that "goes up one level" in the directory hierarchy of Medley files.

You can go to https://files.interlisp.org/medley now to see the latest Medley release files.

… files/directory names are preserved since (DIRECTORY) seems to return names ia all-caps, always
…l files. Move fio files to medley instead of source. Streamline doHCFILES workflow
…production; add doHCFILES workflow into buildReleraseInclDocker workflow
@fghalasz fghalasz added the enhancement New feature or request label Jul 15, 2024
@fghalasz fghalasz requested a review from masinter July 15, 2024 07:37
@fghalasz fghalasz self-assigned this Jul 15, 2024
@pamoroso
Contributor

Just a heads up that visiting https://files.interlisp.org yields Internal Server Error

Member

@masinter masinter left a comment

Looks good, seems to have produced good results.

I noticed that FNS MAKE-INDEX-HTMLS was on the file twice.

@masinter masinter merged commit 1ffcde1 into master Jul 15, 2024
13 checks passed
@masinter
Member

Just a heads up that visiting https://files.interlisp.org yields Internal Server Error

The medley source and derived PDFs are all under //files.interlisp.org/medley/

I think the current arrangement is fine, but I think we should wait a while before declaring these URLs are "stable".

@masinter
Member

It might be a good idea to include the dribble file of running HCFILES or even make a separate list of files that ended in a "FAILED" rather than a "DONE".

@fghalasz
Member Author

Just a heads up that visiting https://files.interlisp.org/ yields Internal Server Error

I will fix this to have better behavior - it will be in a PR in the Online repo, not the Medley repo. Meanwhile, as Larry mentions, the https://files.interlisp.org/medley URL does work as expected.

@fghalasz
Member Author

It might be a good idea to include the dribble file of running HCFILES or even make a separate list of files that ended in a "FAILED" rather than a "DONE".

Yes. I meant to at least preserve the dribble file. Plan is to put it in the loadups/ directory. Didn't get around to it in this PR, but will submit another PR soon incorporating this. Also need to document the do_hcfiles.sh script.

@nbriggs
Contributor

nbriggs commented Jul 15, 2024

Should it really be including the ".git..." (and other dot files) in what it publishes? And should it have something to set the types of (all the) files to text so that browsers don't just offer to download rather than display them?

@masinter
Member

Please avoid making the index.html files fancier. If you want a fancier format for the index.html files, do it in another file somewhere else. I would like it to be possible to have a read-only copy of the Medley sources that works by wget of the URLs. For DIRECTORY to work, you need to be able to enumerate. To make this complete, we can decorate the index.html with additional metadata. Perhaps other features could be managed by CSS so the HTML remains the same. It would be good to change the default font to something more appropriate for a directory listing, and to do whatever SEO we have to do to get Google etc. to index the files.
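
The enumeration requirement above can be illustrated with a rough Python sketch that treats a plain index.html as a directory listing. This is not part of the PR: the names `LinkCollector` and `list_directory` are hypothetical, and the sketch assumes each entry is an ordinary `<a href>` link, that subdirectory links end in `/`, and that the "up one level" link points outside the current directory.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an index.html page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def list_directory(index_html, base_url):
    """Return (files, dirs) as absolute URLs found under base_url.

    base_url is assumed to end in "/". Links that resolve outside
    base_url (such as an "up one level" link) are skipped.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    parser.close()
    entries = [urljoin(base_url, href) for href in parser.hrefs]
    entries = [u for u in entries if u.startswith(base_url) and u != base_url]
    files = [u for u in entries if not u.endswith("/")]
    dirs = [u for u in entries if u.endswith("/")]
    return files, dirs
```

Keeping the HTML simple is what makes a parser this small sufficient; any richer markup would need to preserve the same link structure for such a client to keep working.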

On Mon, Jul 15, 2024 at 1:06 PM Nick Briggs wrote:

Should it really be including the ".git..." (and other dot files) in what it publishes?

I wanted the contents of the repos without '.git'. I think that's what it does.

And should it have something to set the types of (all the) files to text so that browsers don't just offer to download rather than display them?

GitHub Pages sets the file type from the file extension and only allows a small number of file types. I think everything that doesn't have a known file extension is (or should be) served as application/octet-stream.


@nbriggs
Contributor

nbriggs commented Jul 16, 2024

It's currently including the dot files - this is taken from the current page (though here the links are visible):

Index page for {MEDLEY}

This is an index of the files just to link them in.

[.dockerignore](https://files.interlisp.org/medley/.dockerignore)
[.gitattributes](https://files.interlisp.org/medley/.gitattributes)
[.gitignore](https://files.interlisp.org/medley/.gitignore)
[.gitmodules](https://files.interlisp.org/medley/.gitmodules)
[.nojekyll](https://files.interlisp.org/medley/.nojekyll)
[BUILDING.md](https://files.interlisp.org/medley/BUILDING.md)
[clos/](https://files.interlisp.org/medley/clos/)
[CLTL2/](https://files.interlisp.org/medley/CLTL2/)
[CODE_OF_CONDUCT.md](https://files.interlisp.org/medley/CODE_OF_CONDUCT.md)
[CONTRIBUTING.md](https://files.interlisp.org/medley/CONTRIBUTING.md)

@masinter
Member

Including the dot files, except for .git, was intentional. They're in the repository and interesting for understanding our automation.
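
As an illustration of that selection rule only (not the code the workflow actually runs), a Python sketch might look like the following. The function name `publishable_files` is hypothetical, and the sketch assumes that anything named `.git` is skipped while every other dot file is kept.

```python
import os


def publishable_files(root):
    """List files under root as relative paths, skipping anything named .git."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk never descends into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            if name == ".git":
                # In a submodule checkout, .git is a file rather than a directory.
                continue
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            result.append(rel.replace(os.sep, "/"))
    return sorted(result)
```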

@masinter masinter deleted the fgh_hcfiles-workflow branch July 16, 2024 17:23
@fghalasz
Member Author

@nbriggs

And should it have something to set the types of (all the) files to text so that browsers don't just offer to download rather than display them?

I've updated the web server for files.interlisp.org so that it sets content type headers according to the following:

Extension .pdf => pdf
Extensions .md, .txt, .sh, .command, .awk => text
No extension: if file is binary => octet-stream; otherwise => text. (Lisp source files compute as binary.)
All other extensions (e.g., .LCOM, .DFASL, .sysout) => octet-stream

@fghalasz
Member Author

@pamoroso

Just a heads up that visiting https://files.interlisp.org/ yields Internal Server Error

Fixed on the files.interlisp.org server - it now returns an index.html file with a single link to the Medley source files. We can expand this later if we serve more types of files from this server.

@fghalasz
Member Author

Note that the indexed hcfiles includes all the files in loadups - including all sysouts. These files are not part of the repo per se. Should we arrange to skip these?

@masinter
Member

masinter commented Jul 17, 2024 via email

@masinter
Member

masinter commented Jul 17, 2024

oh
.html => text/html
.png => image/png (for the PDF=>html conversions in docs)
.css => text/css (?)
.jpeg => image/jpeg
.lisp => text/plain (in our repos, e.g. in CLOS.)

@fghalasz
Member Author

I should be clearer about how files.interlisp.org maps file extensions to content-type of the http response.

The baseline is the standard set of mappings found in [MIME Extension Mappings.xml](https://gist.github.com/adamfisher/16fe8c619ea389944d0f). This covers the extensions .png, .jpg, .css, .html, etc.

After the baseline mapping, there are three additional cases to consider:

  1. Extensions not included in the standard mime list such as .lisp, .dfasl, .lcom, .awk
  2. Extensions included in list but which we want to handle differently - such as .sh (which we want to handle as text/plain rather than application/x-sh)
  3. Files with no extensions including (most) dot files.

Right now these additional cases are handled as follows:

  1. Extensions .lisp, .awk, & .command are treated as text/plain; all other "unknown" extensions are treated as application/octet-stream.

  2. .sh is treated as text/plain

  3. For no extension files: if binary file => application/octet-stream; if text-file => text/plain (Interlisp source files => application/octet-stream since they compute as binary).
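
The rules above can be sketched in Python as follows. This is only an illustration, not the server's actual code: the names `content_type_for`, `looks_binary`, and `TEXT_OVERRIDES` are hypothetical, Python's `mimetypes` module stands in for the baseline mapping list, and the NUL-byte and UTF-8 check is an assumed stand-in for however the server decides that a file is binary.

```python
import mimetypes
import os

# Assumed override table: extensions served as text/plain regardless of the baseline.
TEXT_OVERRIDES = {".lisp", ".awk", ".command", ".sh"}


def looks_binary(data):
    """Assumed heuristic: NUL bytes or invalid UTF-8 mean the data is binary."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def content_type_for(filename, head=b""):
    """Pick a Content-Type for filename; head holds the file's leading bytes."""
    ext = os.path.splitext(filename)[1].lower()
    if ext in TEXT_OVERRIDES:
        # Cases 1 and 2: extensions handled as plain text.
        return "text/plain"
    if ext == "":
        # Case 3: no extension (splitext also reports none for names like .gitignore).
        return "application/octet-stream" if looks_binary(head) else "text/plain"
    # Baseline mapping, with octet-stream for every other unknown extension.
    guessed, _ = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"
```

For example, `content_type_for("do_hcfiles.sh")` yields `text/plain`, while an extensionless file whose leading bytes contain a NUL byte yields `application/octet-stream`. A real server would also need to take care that `head` is not cut in the middle of a multi-byte character, which this sketch ignores.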
