feat: add file input selection and deletion #173

Open
JaeAeich opened this issue Nov 13, 2023 · 1 comment
Labels
status: new Has not been triaged by admin

Comments

@JaeAeich
Contributor

JaeAeich commented Nov 13, 2023

Description

For the attachments, it would be nice (and I think also important) to see the names of the files that have been uploaded, especially since it may be necessary to select one of them as the "Workflow URL". I also think it is a better user experience if the user can delete individual files, and if a new selection of files does not clear the old selection. For example, I might select a primary CWL descriptor file and two secondary descriptor files, but accidentally pick a wrong file. I should then be able to remove the wrong file and select only the single missing descriptor file, without having to reselect the other two.

As an extension of the issue above, it would be great if, after selecting one or more files for upload, the form listed their file names with a button to remove each one, along with a checkbox that optionally allows selecting exactly one file as the Workflow URL. If a file is selected this way, the "Workflow URL" field should be auto-populated with the name of the file (and ideally be hidden or grayed out until the file is deselected). As an alternative (possibly even better), we could have an upload file button next to the "Workflow URL" field (maybe with an "OR" in between) that can only be used to select the primary descriptor file. If it is used, the Workflow URL is auto-populated. Additional files can then still be attached through the button/field at the bottom.

@JaeAeich JaeAeich added the status: new Has not been triaged by admin label Nov 13, 2023
@uniqueg
Member

uniqueg commented Nov 13, 2023

Thanks @JaeAeich!

Regarding the file attachments, I think the following clarification from the WES specification is important:

The workflow_attachment array may be used to upload files that are required to execute the workflow, including the primary workflow, tools imported by the workflow, other files referenced by the workflow, or files which are part of the input. The implementation should stage these files to a temporary directory and execute the workflow from there. These parts must have a Content-Disposition header with a "filename" provided for each part. Filenames may include subdirectories, but must not include references to parent directories with '..' -- implementations should guard against maliciously constructed filenames.

The ability to specify file paths will allow reconstructing a directory tree for the uploaded files. This is crucial, because workflow directories are generally not flat. In fact, best practices for most workflow languages prescribe complex nested workflow directory structures, e.g., Snakemake.

This means that we would have to find a way to:

  • Enable users to provide the desired (relative) file paths for each attached file
  • Sanitize these user-provided file paths and guard against code injection etc.
  • Pass the sanitized file paths in the Content-Disposition headers for each file
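
The sanitization step could be sketched roughly as follows. This is only a minimal illustration under assumptions: the function name `sanitizeRelativePath` and the exact rejection rules are hypothetical and not taken from the WES specification or from this project. The one rule that does come from the spec is that filenames must not reference parent directories with `..`.

```typescript
// Hypothetical helper; name and rules are assumptions for illustration only.
// Returns a normalized relative path or throws if the path looks unsafe.
function sanitizeRelativePath(raw: string): string {
  // Reject control characters (e.g. newlines) that could break a header value.
  if (/[\u0000-\u001f\u007f]/.test(raw)) {
    throw new Error("Control characters are not allowed in file paths");
  }
  // Use one separator style so the checks below are uniform.
  const unified = raw.trim().replace(/\\/g, "/");
  // Conservatively reject absolute paths and anything starting like "C:".
  if (unified.startsWith("/") || /^[A-Za-z]:/.test(unified)) {
    throw new Error(`Absolute paths are not allowed: ${raw}`);
  }
  // Drop empty and "." segments ("a//b", "./a").
  const segments = unified.split("/").filter((s) => s !== "" && s !== ".");
  if (segments.length === 0) {
    throw new Error("Empty file path");
  }
  // The WES spec forbids references to parent directories.
  if (segments.some((s) => s === "..")) {
    throw new Error(`Parent directory references are not allowed: ${raw}`);
  }
  return segments.join("/");
}
```

The sanitized value could then be passed as the third argument of `FormData.append(name, blob, filename)`, which browsers use to fill the `filename` of the part's Content-Disposition header. Whether every browser preserves subdirectories in that value would need to be verified, and the server still has to guard against malicious filenames on its side.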

We should put some thought into designing this in a way that is not too painful and error-prone for the user.

As a user-friendly alternative to setting file paths (and selecting multiple files) manually, we could allow users to upload entire directories of files. We would then parse these to automatically derive the file paths for the Content-Disposition headers from the directories' subdirectory structures.
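
To illustrate how paths could be derived from a directory upload: browsers that support `<input type="file" webkitdirectory>` expose a `webkitRelativePath` on each selected `File`, which starts with the name of the selected directory. The sketch below is a hedged example, not a proposal for the final implementation. The `SelectedFile` type and the `guessRelativePath` function are hypothetical names, and whether to strip the top-level directory is a design choice left open here.

```typescript
// Structural stand-in for the two File properties this sketch relies on.
interface SelectedFile {
  name: string;
  // Empty string unless the file was picked through a directory selection.
  webkitRelativePath: string;
}

// Hypothetical helper: best-guess path for the Content-Disposition filename.
function guessRelativePath(file: SelectedFile, stripRoot: boolean = true): string {
  // Files selected directly carry no directory information.
  if (file.webkitRelativePath === "") {
    return file.name;
  }
  const segments = file.webkitRelativePath.split("/");
  // The first segment is the directory the user selected; dropping it makes
  // paths relative to the workflow root (an assumption about what users want).
  if (stripRoot && segments.length > 1) {
    return segments.slice(1).join("/");
  }
  return segments.join("/");
}
```

With `stripRoot` left at its default, selecting a directory `my-wf` containing `workflow/Snakefile` would yield `workflow/Snakefile`. The result should still go through path sanitization before being sent.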

All files (whether selected manually or as part of a directory or its subdirectories) could then be used to populate a file table, which could be further amended by going through the file and directory selection process multiple times (double entries should be filtered automatically and a maximum number of files should be enforced as well).
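
A rough sketch of how repeated selections could be merged into the file table follows. It assumes that two entries count as duplicates when their paths are equal and that files beyond the limit are simply rejected. Both are assumptions, and `FileEntry`, `MergeResult` and `mergeSelections` are hypothetical names.

```typescript
interface FileEntry {
  path: string; // relative path shown in the file table
  size: number; // size in bytes
}

interface MergeResult {
  entries: FileEntry[];  // table contents after the merge
  duplicates: string[];  // incoming paths that were already present
  rejected: string[];    // incoming paths dropped because the limit was hit
}

// Adds a new selection to the existing table without clearing it.
function mergeSelections(
  current: FileEntry[],
  incoming: FileEntry[],
  maxFiles: number
): MergeResult {
  const seen = new Set(current.map((entry) => entry.path));
  const entries = [...current];
  const duplicates: string[] = [];
  const rejected: string[] = [];
  for (const entry of incoming) {
    if (seen.has(entry.path)) {
      duplicates.push(entry.path); // filter double entries
      continue;
    }
    if (entries.length >= maxFiles) {
      rejected.push(entry.path); // enforce the maximum number of files
      continue;
    }
    seen.add(entry.path);
    entries.push(entry);
  }
  return { entries, duplicates, rejected };
}
```

Returning the duplicates and rejected paths separately would let the UI tell the user which files were skipped and why, instead of dropping them silently.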

The user could manipulate the file table to remove individual files (and possibly entire subdirectories in one go, if sorted accordingly?) and to optionally select at most one primary descriptor file via a checkbox. Checking one of the checkboxes should auto-populate and hide the "Workflow URL" field until unchecked. Of course, checking the checkbox for a different file should uncheck the previously checked box, leave the "Workflow URL" field hidden and change its hidden value. The table could also include a column that is collapsed by default but can be expanded to manually edit the file path of each file. This column could be auto-populated on a best-guess basis, i.e., include any subdirectories if parsed from a selected directory, or not include any subdirectories if selected directly/manually.
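
The checkbox behavior described above can be modeled as a small piece of state, sketched below. All names (`TableState`, `togglePrimary`, `removeFile`, `isWorkflowUrlHidden`) are hypothetical. Clearing the "Workflow URL" value when the primary file is unchecked or removed is an assumption, since the discussion does not say what should happen to the value in that case.

```typescript
interface TableState {
  paths: string[];        // file paths currently listed in the table
  primary: string | null; // path checked as primary descriptor, if any
  workflowUrl: string;    // value of the "Workflow URL" field
}

// Checking a file makes it the only primary; checking it again unchecks it.
function togglePrimary(state: TableState, path: string): TableState {
  if (!state.paths.includes(path)) {
    return state; // ignore paths that are not in the table
  }
  if (state.primary === path) {
    // Assumption: unchecking clears the auto-populated value.
    return { ...state, primary: null, workflowUrl: "" };
  }
  // Selecting a different file implicitly unchecks the previous one.
  return { ...state, primary: path, workflowUrl: path };
}

// Removing the primary file also resets the "Workflow URL" field.
function removeFile(state: TableState, path: string): TableState {
  const paths = state.paths.filter((p) => p !== path);
  if (state.primary === path) {
    return { paths, primary: null, workflowUrl: "" };
  }
  return { ...state, paths };
}

// The field stays hidden for as long as any file is checked as primary.
function isWorkflowUrlHidden(state: TableState): boolean {
  return state.primary !== null;
}
```

Keeping this logic in pure functions, separate from rendering, would make the "at most one checked box" rule easy to test on its own.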

I think it would then make sense to put this file selection at the very top of the form to signal to the user the importance of selecting a workflow and all required files first, before doing anything else - especially because the user's choices on the file upload determine whether a "Workflow URL" needs to be provided or not.

As far as I can see, this design would cover the most common use cases:

  • A user could easily select an entire workflow directory to include all files in their correct relative locations
  • A user could easily select only the relevant subdirectories and/or root level files from a workflow directory (e.g., only the workflow/ and config/ subdirectories of a Snakemake workflow following best practices or only the
  • A user could remove any unnecessary files relatively easily
  • For complex/advanced use cases, a user could manually modify file paths

We could even extend the file table by adding another set of columns of checkboxes for auto-populating "Workflow parameters" and "Workflow engine parameters" (grayed out for any files that aren't .yaml or .json files). Again, only at most one file could be selected for auto-populating the contents. However, in this case, I would probably leave the fields editable, so that users could make use of template files.
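
Deciding which rows get an enabled checkbox for the parameter columns could be as simple as an extension check. The helper below is a hypothetical sketch. Treating the check as case-insensitive is an assumption, and only the `.yaml` and `.json` extensions named above are accepted.

```typescript
// Hypothetical helper: true if the file may be used to auto-populate
// "Workflow parameters" or "Workflow engine parameters".
function isParameterFileCandidate(path: string): boolean {
  const lower = path.toLowerCase();
  return lower.endsWith(".yaml") || lower.endsWith(".json");
}
```

Rows for which this returns `false` would have their parameter checkboxes grayed out.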
