Centralize derivatives conventions for BIDS datasets #94

Merged (17 commits) on Jan 31, 2023

Member

@valosekj valosekj commented Jan 24, 2023

Purpose

This PR intends to centralize the discussion about the derivatives/label conventions.

Motivation

Currently, many projects use their own derivatives convention, usually described in README, for example:

Description

This PR proposes using the _label-<region>_<task>.nii.gz tag, for example:

  • _label-SC_seg.nii.gz - SC binary segmentation
  • _label-GM_seg.nii.gz - GM binary segmentation
  • _label-SC_mask.nii.gz - binary mask with diameter of XXmm centered at the center of the SC
  • ...

For "tasks" such as centerline, disc, or pmj, the region is omitted; for example:

  • _label-centerline.nii.gz - binary SC centerline
  • _label-disc.nii.gz - voxels located at the posterior tip of each intervertebral disc
  • _label-pmj.nii.gz - single voxel with value of 50 corresponding to the pontomedullary junction (PMJ)
  • ...
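
As a minimal sketch of the naming scheme above (not part of the PR itself), the following Python helper assembles a derivative filename from a source image stem, an optional region, and a task. The names `derivative_filename` and `REGIONLESS_TASKS` are hypothetical and chosen here only for illustration; the suffixes come from the examples above. Rejecting a region for centerline, disc, and pmj is an assumption, since whether to include the region for those tasks is still listed as an open question.

```python
# Tasks for which the proposal omits the region (taken from the examples above).
REGIONLESS_TASKS = {"centerline", "disc", "pmj"}


def derivative_filename(stem, task, region=None, ext=".nii.gz"):
    """Build '<stem>_label-<region>_<task><ext>' or '<stem>_label-<task><ext>'.

    Hypothetical helper: `stem` is the source image name without extension,
    for example 'sub-001_T1w'.
    """
    if task in REGIONLESS_TASKS:
        # Assumption: a region is rejected for these tasks rather than ignored.
        if region is not None:
            raise ValueError(f"task '{task}' does not take a region")
        return f"{stem}_label-{task}{ext}"
    if region is None:
        raise ValueError(f"task '{task}' requires a region such as 'SC' or 'GM'")
    return f"{stem}_label-{region}_{task}{ext}"


print(derivative_filename("sub-001_T1w", "seg", region="SC"))
print(derivative_filename("sub-001_T1w", "disc"))
```

With these inputs the helper yields sub-001_T1w_label-SC_seg.nii.gz and sub-001_T1w_label-disc.nii.gz.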

The full description is provided within this PR here.

Also, this PR proposes the usage of derivatives/manual_labels (instead of derivatives/labels). Thanks to that, we can omit -manual from the filename of each file (i.e., sub-001_T1w_label-SC_seg-manual.nii.gz --> sub-001_T1w_label-SC_seg.nii.gz).

Pros

  • BIDS compatibility

Cons

Previous discussions

Useful links

TODO

Questions

  • _label-SC_seg.nii.gz vs _label-SC_dseg.nii.gz - BIDS suggests the _dseg.nii.gz suffix for discrete segmentation instead of _seg.nii.gz. However, _seg.nii.gz also seems to pass the bids-validator.
  • _label-centerline.nii.gz vs _label-SC-centerline.nii.gz - should we omit or include the region for "tasks" where it is obvious that they are related to the spinal cord (SC)?
  • derivatives/labels vs derivatives/label - this comment suggests singular (label), but so far, we have been using plural (labels).
  • derivatives/manual_labels/.../sub-XXX_T1w_label-SC_seg.nii.gz vs derivatives/labels/.../sub-XXX_T1w_label-SC_seg-manual.nii.gz - if we use manual_labels instead of labels, we can omit manual from filenames
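
To make the open questions concrete, here is a hedged Python sketch of a filename check for the proposed convention. The pattern and the function name `parse_label_filename` are assumptions made for illustration, not an official BIDS or project tool. The pattern accepts both seg and dseg because the choice between them is not settled here, and it does not accept a trailing -manual.

```python
import re

# Hypothetical pattern for the proposed convention. It accepts both `seg` and
# `dseg` because the choice between them is still an open question.
PATTERN = re.compile(
    r"^(?P<stem>.+)_label-"
    r"(?:(?P<region>[A-Za-z0-9]+)_(?P<task>seg|dseg|mask)"
    r"|(?P<regionless>centerline|disc|pmj))"
    r"\.nii\.gz$"
)


def parse_label_filename(name):
    """Return (stem, region, task), or None if the name does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    # Region-less tasks are captured by a separate group; region is then None.
    task = match.group("task") or match.group("regionless")
    return match.group("stem"), match.group("region"), task


print(parse_label_filename("sub-001_T1w_label-SC_seg.nii.gz"))
print(parse_label_filename("sub-001_T1w_label-disc.nii.gz"))
print(parse_label_filename("sub-001_T1w_seg-manual.nii.gz"))
```

The first two names match the pattern, while the third, written in the older style without a label entity, returns None.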

@jcohenadad jcohenadad changed the title Centralize derivatives convection Centralize derivatives conventions Jan 24, 2023
@jcohenadad jcohenadad changed the title Centralize derivatives conventions Centralize derivatives conventions for BIDS datasets Jan 24, 2023
@mariehbourget

Great initiative!
I'll cross-ref some research that I did previously in ivadomed with BIDS derivatives that may be helpful for you to centralize the convention.

In particular, I asked a question on the BIDS mailing list about file naming for derivative chains (derivatives of derivatives), but unfortunately I have not received any answer so far: https://groups.google.com/g/bids-discussion/c/6UDCso4mCXc/m/VvuG0Vk3CAAJ?utm_medium=email&utm_source=footer

Hope that helps!

@jcohenadad jcohenadad marked this pull request as ready for review January 31, 2023 16:51
Member

@jcohenadad jcohenadad left a comment

Great! A few comments to address before merging.

@valosekj valosekj merged commit 6b95075 into master Jan 31, 2023
@NadiaBlostein
Contributor

Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

@jcohenadad
Member

Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

Good question. I think BIDS examples tend to pluralize this. Example for "masks": https://bids-specification.readthedocs.io/en/latest/derivatives/imaging.html#masks

@NadiaBlostein
Contributor

@valosekj A couple additional questions:

  1. For disc level labels (what sct_label_vertebrae outputs with the suffix _seg_labeled_discs.{json, nii.gz}), should the "BIDSified" suffix be _label-disc_level.nii.gz?

  2. Is it okay to put everything in the same labels directory (as opposed to labels and manual_labels) if one changes the suffixes for the files in manual_labels from label-<region>.nii.gz to manual_label-<region>.nii.gz and the suffixes for the files in manual_labels_softseg from label-<region>.nii.gz to manual_label_softseg-<region>.nii.gz?

  3. A pedantic point about the BIDSification of these file names but which could be good to standardize "earlier on": should "label" in the file name suffixes also be pluralized? Or at this point, people can just read the documentation which is pretty clear as is.

Cheers!

@valosekj
Member Author

valosekj commented Feb 24, 2023

Sorry, @NadiaBlostein. Your first message slipped through the cracks due to my holiday.

Hi, to me it's still unclear whether labels or manual_labels (in derivatives/labels or derivatives/manual_labels) should be pluralized or not. Is the consensus to keep it pluralized, as can be seen in the documentation? Thank you!

I quickly checked several of our git-annexed datasets, and all use pluralized forms such as labels, labels_softseg, or manual_labels. So, yes, I think our consensus is to keep it pluralized, as can be seen in the documentation.

  1. For disc level labels (what sct_label_vertebrae outputs with the suffix _seg_labeled_discs.{json, nii.gz}), should the "BIDSified" suffix be _label-disc_level.nii.gz?

I think our consensus is label-disc.nii.gz (i.e., without level).

  2. Is it okay to put everything in the same labels directory (as opposed to labels and manual_labels) if one changes the suffixes for the files in manual_labels from label-<region>.nii.gz to manual_label-<region>.nii.gz and the suffixes for the files in manual_labels_softseg from label-<region>.nii.gz to manual_label_softseg-<region>.nii.gz?

Good question! We use -manual in file names like this:

data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz

But based on the new convention, the valid approach is also:

data-multi-subject/derivatives/manual_labels/sub-amu01/anat/sub-amu01_T1w_seg.nii.gz

(i.e., using manual_labels instead of labels and omitting -manual from the file name)
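
The mapping between the two layouts can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an existing project script: the helper name `to_manual_labels` is hypothetical, and it only handles files ending in -manual.nii.gz inside a labels directory that directly follows derivatives.

```python
from pathlib import PurePosixPath


def to_manual_labels(path):
    """Map 'derivatives/labels/.../X-manual.nii.gz' to
    'derivatives/manual_labels/.../X.nii.gz'.

    Hypothetical helper; it rewrites only the first 'labels' directory that
    directly follows 'derivatives'.
    """
    parts = list(PurePosixPath(path).parts)
    for i in range(len(parts) - 1):
        if parts[i] == "derivatives" and parts[i + 1] == "labels":
            parts[i + 1] = "manual_labels"
            break
    else:
        # The loop finished without finding the directory pair.
        raise ValueError("no 'derivatives/labels' directory in path")
    suffix = "-manual.nii.gz"
    if not parts[-1].endswith(suffix):
        raise ValueError("file name does not end with '-manual.nii.gz'")
    # Drop '-manual' from the file name but keep the '.nii.gz' extension.
    parts[-1] = parts[-1][: -len(suffix)] + ".nii.gz"
    return str(PurePosixPath(*parts))


old = "data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz"
print(to_manual_labels(old))
```

Applied to the first path above, this produces the second one.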

  3. A pedantic point about the BIDSification of these file names but which could be good to standardize "earlier on": should "label" in the file name suffixes also be pluralized? Or at this point, people can just read the documentation which is pretty clear as is.

The label entity in the file name should be singular; see the BIDS documentation here

@NadiaBlostein
Contributor

Thank you @valosekj !

Continuing on with point 2: how is one then to differentiate disc labels (_label-disc.nii.gz) and intervertebral level labels (output by sct_label_vertebrae)? Could one add this suffix label-disc_level.nii.gz to the documentation?

According to the new convention, shouldn't
data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz be data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_label-SC_seg-manual.nii.gz ?

Thank you for answering my questions!!!

@valosekj
Member Author

Continuing on with point 2: how is one then to differentiate disc labels (_label-disc.nii.gz) and intervertebral level labels (output by sct_label_vertebrae)? Could one add this suffix label-disc_level.nii.gz to the documentation?

Good point! We usually store only intervertebral disc labels (_label-disc.nii.gz). Then there is no need for differentiation of sct_label_vertebrae output 😅

According to the new convention, shouldn't data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_seg-manual.nii.gz be data-multi-subject/derivatives/labels/sub-amu01/anat/sub-amu01_T1w_label-SC_seg-manual.nii.gz ?

Yes, you are right! spine-generic/data-multi-subject uses the "old convention" because it was initially curated >2 years ago. I documented it here. Thank you for this relevant point!

@mguaypaq mguaypaq deleted the jv/centralize_derivatives_convection branch June 11, 2024 15:45

5 participants