Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Derive asset access field from asset blob #2010

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

jjnesbitt
Copy link
Member

Closes #1997

Previously, this field was left unaltered, which meant it assumed whatever value was provided during creation through the API. This meant a default of dandi:OpenAccess most of the time, but it could have actually had any value. This has led to almost all of the currently embargoed assets to mistakenly display an access value of dandi:OpenAccess.

This PR changes this behavior by instead using the asset blob embargoed field to determine the value of 'access'.

Determine if asset is open access or embargoed based on if the asset
blob is embargoed.
Comment on lines 11 to 14
# Use the postgres jsonb '-' operator to delete the 'access' field from metadata
Asset.objects.filter(metadata__access__isnull=False).update(
metadata=RawSQL("metadata - 'access'", [])
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, I tested this locally and it works as expected. Here is the generated SQL:

UPDATE "api_asset"
   SET "metadata" = (metadata - 'access')
 WHERE ("api_asset"."metadata" -> 'access') IS NOT NULL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use Django ORM style for this? i.e., replace Asset.metadata with the access-less copy, followed by .save()?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that, but that means we'd be doing as many .save calls are there are assets, which seems unnecessarily burdensome, as well as the potential for a race condition (if someone were to update the asset at the same time, either their update may be lost, or this one could be). It would also probably take much longer.

@mvandenburgh
Copy link
Member

The AuditRecord model also stores a copy of Asset.metadata, which means that any asset audit records created up until this PR is merged will have an access field, while future ones won't.

This also made me realize we're just storing the metadata, not full_metadata in the asset audit record; i.e. all the "computed" fields are being left out of the audit records. I wonder if we should do the latter (@waxlamp).

@jjnesbitt
Copy link
Member Author

The AuditRecord model also stores a copy of Asset.metadata, which means that any asset audit records created up until this PR is merged will have an access field, while future ones won't.

Since the access field for assets has always been superfluous, I think it would be appropriate to also purge it from the existing audit records. What do you think @waxlamp?

@waxlamp
Copy link
Member

waxlamp commented Aug 24, 2024

This also made me realize we're just storing the metadata, not full_metadata in the asset audit record; i.e. all the "computed" fields are being left out of the audit records. I wonder if we should do the latter (@waxlamp).

I'm open to this. If the full metadata can be derived from the "small" metadata, I think leaving it as is would be ok too. But let's discuss next week.

@waxlamp
Copy link
Member

waxlamp commented Aug 24, 2024

The AuditRecord model also stores a copy of Asset.metadata, which means that any asset audit records created up until this PR is merged will have an access field, while future ones won't.

Since the access field for assets has always been superfluous, I think it would be appropriate to also purge it from the existing audit records. What do you think @waxlamp?

I think that would be ok.

Copy link
Member

@mvandenburgh mvandenburgh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just need the migration for the audit record table 👍

Published assets actually MUST have this field present, and so removing
it here would violate a check constraint
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Asset's access/status is dandi:OpenAccess even though it is uploaded to an embargoed Dandiset
3 participants