A discussion on needed `/paths` properties and metadata #1851

yarikoptic · 2024-02-01T19:31:23Z

@jwodder please review, comment on, or accept wherever my suggested state annotations correspond to what we already have or to what I suggested in #1837 (comment).

doc/design/paths-endpoint-webdav.md

yarikoptic · 2024-02-01T19:35:52Z

doc/design/paths-endpoint-webdav.md

+* Asset metadata used by dandidav:
+    * `encodingFormat` (blobs only)
+    * `contentUrl` (API download URL for blobs, S3 URL for Zarrs)
+    * `digest["dandi:dandi-etag"]` (blobs only)


What is this one for?

ETag (reported in PROPFIND responses)

In principle it could be any other checksum, I guess, right?

Note: I thought that, besides the ETag functionality, a commonly known checksum could be used by the receiving end to validate the download, if that is the target purpose of such a listing.

For versioned Zarrs we would need "dandi:dandi-zarr-checksum" to use in conjunction with the zarr_id.

Well, the dandi:dandi-etag is the same as the ETag header reported by S3 when downloading a blob asset. In contrast, there's no single file that has a dandi:dandi-zarr-checksum as an ETag.

I meant that we can use dandi:dandi-zarr-checksum as a value for ETag on zarr collections (folders).

I'm not sure if it makes sense to apply etags to collections, as GETing a collection won't give you a response with that etag. (Cf. dandi/dandidav#26)
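To illustrate the validation idea raised above: since dandi:dandi-etag is described as matching the ETag S3 reports for a blob, a receiving end could in principle recompute it after download. The following is only a rough sketch, assuming the commonly observed S3 multipart convention (MD5 of the concatenated per-part MD5 digests, followed by the part count). The function name and the `part_size` parameter are hypothetical; this thread does not say how part sizes are chosen.

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Compute an S3 multipart-style ETag for `data`.

    `part_size` is a hypothetical parameter: the real part size used
    for an upload is not stated in this discussion.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Raw MD5 digest of each part, in upload order.
    part_digests = [
        hashlib.md5(data[i : i + part_size]).digest()
        for i in range(0, len(data), part_size)
    ]
    if not part_digests:
        # Assumption: treat empty input as a single empty part.
        part_digests = [hashlib.md5(b"").digest()]
    # MD5 of the concatenated digests, suffixed with the number of parts.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

A client would compare the result against the `digest["dandi:dandi-etag"]` value from the asset metadata. No such comparison is possible for a Zarr collection, which is the point made above about there being no single file behind dandi:dandi-zarr-checksum.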

yarikoptic · 2024-02-01T21:03:30Z

doc/design/paths-endpoint-webdav.md

+
+* Asset properties used by dandidav:
+    * `asset_id` (PRESENT)
+    * `blob_id`


What do we need blob_id for if we have contentUrl?

The contentUrl returned in .../assets/paths/ responses is a plain S3 URL, and serving it to the user in a web browser would result in the file being downloaded and named with the blob ID. In order for the downloaded file to be named with the asset's filename instead, we need a signed S3 URL, which is obtained via the API download URL in the contentUrl field of the metadata.

EDIT: Sorry, I misinterpreted the question. For blob ID, the code just checks whether blob_id or zarr_id is non-None in order to determine whether the asset is a blob or a Zarr.

OK, so what we actually need is result_type (Folder, AssetBlob, AssetZarr), as I suggested in #1837 (comment).
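The check described above (whether blob_id or zarr_id is non-None) can be sketched as follows. This is an illustration in Python rather than dandidav's actual code; the function name and the error handling are assumptions, and only the `AssetBlob` and `AssetZarr` labels come from the suggestion in this thread.

```python
from typing import Optional


def asset_kind(blob_id: Optional[str], zarr_id: Optional[str]) -> str:
    """Infer whether an asset is a blob or a Zarr from its two ID fields.

    Hypothetical helper: exactly one of the two IDs is expected to be set.
    """
    if blob_id is not None and zarr_id is not None:
        raise ValueError("asset has both blob_id and zarr_id")
    if blob_id is not None:
        return "AssetBlob"
    if zarr_id is not None:
        return "AssetZarr"
    raise ValueError("asset has neither blob_id nor zarr_id")
```

An explicit result_type field in the response would make this inference unnecessary, since the client would read the type directly instead of deriving it from which ID happens to be present.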

yarikoptic · 2024-02-01T21:04:47Z

doc/design/paths-endpoint-webdav.md

+* Asset properties used by dandidav:
+    * `asset_id` (PRESENT)
+    * `blob_id`
+    * `zarr_id`


I guess zarr_id is needed for traversal etc.? Or could the one embedded in contentUrl be used? (We probably do not want to rely on parsing it.)

So it feels like we might ask for the asset record to include blob_id (although I am not sure yet exactly what for) and zarr_id, hence:

Suggested change

-    * `zarr_id`
+    * `zarr_id` (DESIRED)

dandidav does parse S3 contentUrls, as that's necessary in order to discover the name of the S3 bucket without hardcoding it or requiring it to be passed on the command line.

So in principle we can just parse out the bucket, zarr_id and blob_id from contentUrl, correct?

Yes, but I'd rather dandidav not have to make too many assumptions about how our S3 URLs are structured.

Agreed. We would only need to be able to construct Zarr "http s3" URLs from manifests, so it would need to know that, right?
For blobs, contentUrl should be good enough.
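For reference, extracting the bucket name from an S3 contentUrl could look roughly like the sketch below. It assumes virtual-hosted-style URLs of the form `https://{bucket}.s3.amazonaws.com/{key}` (optionally with a region after `s3.`); that shape, the function name, and the example bucket are assumptions, not something this thread specifies. It deliberately stops at bucket and key and does not try to pull blob_id or zarr_id out of the key, in line with the concern about assuming too much about URL structure.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_url(url: str) -> Tuple[str, str]:
    """Split a virtual-hosted-style S3 URL into (bucket, key).

    Assumes a host like `<bucket>.s3.amazonaws.com` or
    `<bucket>.s3.<region>.amazonaws.com`; other layouts are rejected.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # Everything before the first ".s3." is taken to be the bucket name.
    bucket, sep, rest = host.partition(".s3.")
    if not sep or not bucket or not rest.endswith("amazonaws.com"):
        raise ValueError(f"not a recognized S3 URL: {url!r}")
    return bucket, parsed.path.lstrip("/")
```

For a hypothetical URL such as `https://example-bucket.s3.amazonaws.com/zarr/0000/`, this returns the bucket `example-bucket` and the key `zarr/0000/`.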

Note that asset metadata lists two contentUrls, an unsigned S3 URL and an API download URL of the form https://api.dandiarchive.org/api/assets/{asset_id}/download/.

For Zarr assets, dandidav needs the S3 URL in order to know how to query the Zarr's entries from S3.

However, for blob assets, dandidav currently uses the API download URL, as it provides a better UX. Specifically, if dandidav were to redirect GET requests for a blob to the blob's unsigned S3 URL, then the browser would save the response to a file named after the last component of the S3 URL, which is (always?) the blob ID. In order to get the downloaded files to be named after the filename portion of the asset's path instead, a signed S3 URL with a Content-Disposition has to be used, and that is acquired by redirecting to the API download URL, which in turn redirects to such a signed S3 URL.
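The choice between the two contentUrls described above can be sketched like this. It is a simplified illustration, not dandidav's implementation: the function name is hypothetical, and telling the two URLs apart by a path ending in `/download` is an assumption based on the API URL form quoted above.

```python
from typing import List
from urllib.parse import urlparse


def pick_content_url(content_urls: List[str], is_zarr: bool) -> str:
    """Choose which contentUrl to use for an asset.

    Zarrs need the S3 URL (to query entries from S3); blobs use the API
    download URL, which redirects to a signed S3 URL carrying a
    Content-Disposition header with the asset's filename.
    """

    def is_api_download(url: str) -> bool:
        # Assumption: API download URLs end in ".../download/".
        return urlparse(url).path.rstrip("/").endswith("/download")

    api_urls = [u for u in content_urls if is_api_download(u)]
    s3_urls = [u for u in content_urls if not is_api_download(u)]
    candidates = s3_urls if is_zarr else api_urls
    if not candidates:
        raise LookupError("no suitable contentUrl found")
    return candidates[0]
```

Redirecting a blob GET to the URL returned for `is_zarr=False` is what lets the browser save the file under the asset's filename rather than the blob ID.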

doc/design/paths-endpoint-webdav.md

Co-authored-by: John T. Wodder II <[email protected]>

Just a discussion pasted from #1837

c87cc8c

yarikoptic commented Feb 1, 2024

yarikoptic mentioned this pull request Feb 1, 2024

Add endpoint for querying a folder or asset path in a Dandiset #1837

Open

yarikoptic added 2 commits February 1, 2024 16:01

First batch of annotations on what we have already or desire

24e1188

Forgotten path

e3c4500

jwodder reviewed Feb 2, 2024

codespell ;)

9994f2c

Co-authored-by: John T. Wodder II <[email protected]>

waxlamp added the design-doc Involves creating or discussing a design document label Feb 8, 2024
