Validator MUST NOT accept identical files under different extensions #1107
Comments
Thanks for the report @arnodelorme, it seems like you're going through a lot of datasets these days :-) I agree that the validator should catch these cases. A given EEG file such as |
I haven't checked whether the BDF file is corrupted, but if it truly is, that raises another, already known, concern: we are not validating the contents of binary EEG files. This problem is hard to solve, because we would need to implement data format readers in JavaScript so that the bids-validator can go into the files and check their validity. Currently, this is done for NIfTI files (and only for NIfTI files). Many months ago I tried to implement a reader/validator for the BrainVision format in JavaScript here: https://github.com/sappelhoff/brainvision-validator/ ... see also #475. However, I ran into problems integrating it with the bids-validator, because the validator runs both in the browser and on the CLI, and the "file access" API for the browser is significantly different from, and more complicated than, accessing files from the CLI (or from programs written in Matlab or Python). I will open a separate issue for this point, and we certainly should address it as soon as we have some resources available (and by resources, I mean people who have expertise, energy, and time). |
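To make the idea of content validation concrete, here is a minimal sketch of a very shallow check: comparing the first 8 header bytes of a file against its extension. It is written in Python for brevity and is not what the bids-validator does; the function names `sniff_format` and `extension_matches_content` are hypothetical. The two signatures follow the published EDF and BDF header layouts (an EDF header starts with the version field "0" padded with spaces, a BDF header starts with byte 255 followed by "BIOSEMI"). Passing this check does not mean the file is valid, only that its header starts the way its extension promises.

```python
from pathlib import PurePosixPath

# Leading bytes of the fixed header, per the EDF and BDF format descriptions.
EDF_MAGIC = b"0       "     # version field: "0" followed by seven spaces
BDF_MAGIC = b"\xffBIOSEMI"  # byte 255 followed by ASCII "BIOSEMI"


def sniff_format(header: bytes):
    """Guess the extension implied by the first header bytes, or None if unknown."""
    if header.startswith(EDF_MAGIC):
        return ".edf"
    if header.startswith(BDF_MAGIC):
        return ".bdf"
    return None


def extension_matches_content(filename: str, header: bytes) -> bool:
    """True if the extension agrees with the sniffed header signature.

    Only meaningful for .edf and .bdf files; any other extension yields False
    because this sketch knows no other signatures.
    """
    return sniff_format(header) == PurePosixPath(filename).suffix.lower()
```

Because both functions operate on bytes that the caller has already read, the same check could sit behind either a browser or a CLI file-access layer; only the code that fetches the first 8 bytes would differ.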
In this issue, let's track our progress to prevent users from storing the same data under different extensions. This should be some rule that:
• IF a file sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.<ext> is present
• AND <ext> is from the list LIST_OF_ACCEPTED_DATA_FORMAT_EXTENSIONS
• THEN there MUST NOT be any other file with the same name and an extension from that list
This sounds difficult but possible to implement. |
Yes, this sounds like a good rule. |
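The rule proposed above could be sketched roughly as follows. This is an illustration in Python, not the bids-validator implementation (which is written in JavaScript). The contents of `ACCEPTED_DATA_FORMAT_EXTENSIONS` are an assumption standing in for LIST_OF_ACCEPTED_DATA_FORMAT_EXTENSIONS, and `find_duplicate_recordings` is a hypothetical name.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: illustrative stand-in for LIST_OF_ACCEPTED_DATA_FORMAT_EXTENSIONS,
# using one "entry point" extension per EEG data format.
ACCEPTED_DATA_FORMAT_EXTENSIONS = {".edf", ".bdf", ".set", ".vhdr"}


def find_duplicate_recordings(paths):
    """Map each path (without extension) to its data format extensions,
    keeping only recordings stored under more than one accepted extension."""
    extensions_by_stem = defaultdict(set)
    for path in paths:
        p = PurePosixPath(path)
        ext = p.suffix.lower()
        if ext in ACCEPTED_DATA_FORMAT_EXTENSIONS:
            # Group by the full path minus the extension, so files in
            # different folders are never confused with each other.
            extensions_by_stem[str(p.with_suffix(""))].add(ext)
    return {
        stem: sorted(exts)
        for stem, exts in extensions_by_stem.items()
        if len(exts) > 1
    }


files = [
    "sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.edf",
    "sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.bdf",
    "sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.json",
]
print(find_duplicate_recordings(files))
# {'sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg': ['.bdf', '.edf']}
```

Sidecar files such as the .json above are ignored because their extension is not in the list, so only competing data files trigger the rule.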
This BIDS dataset contains both an .edf and a .bdf file for the same recording (both are very small):
https://openneuro.org/datasets/ds002034/versions/1.0.1
sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.edf
sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.bdf
I believe it should not have passed the validator, since the same recording is present as two types of binary files and the BDF file is obviously corrupted.