Adding micr_XPCTzarr dataset #455

Merged 20 commits on Aug 27, 2024
13 changes: 7 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
@@ -188,12 +188,13 @@ TABLE BELOW IS GENERATED AUTOMATICALLY.
DO NOT EDIT DIRECTLY.
-->

| name | description | datatypes | suffixes | link to full data | maintained by |
|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------|:-----------------------------|:--------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------|:---------------------------------------------|
| [eeg_ds003645s_hed_demo](https://github.com/bids-standard/bids-examples/tree/master/eeg_ds003645s_hed_demo) | Shows usage of Hierarchical Event Descriptor (HED) in .tsv files | anat, beh, eeg, micr, motion | KSSSleep, SPIM, beh, channels, coordsystem, defacemask, eeg, electrodes, events, headshape, motion, photo, samples, scans | [link](https://openneuro.org/datasets/ds003645) | [@VisLab](https://github.com/VisLab) |
| [micr_SEM](https://github.com/bids-standard/bids-examples/tree/master/micr_SEM) | Example SEM dataset in PNG format with 1 sample imaged over 2 sessions | micr | SEM, photo, samples, sessions | [link](https://doi.org/10.5281/zenodo.5498378) | [@jcohenadad](https://github.com/jcohenadad) |
| [micr_SEMzarr](https://github.com/bids-standard/bids-examples/tree/master/micr_SEMzarr) | Example SEM dataset in PNG and OME-ZARR format with 1 sample imaged over 2 sessions | micr | SEM, SPIM, samples, sessions | n/a | [@TheChymera](https://github.com/TheChymera) |
| [micr_SPIM](https://github.com/bids-standard/bids-examples/tree/master/micr_SPIM) | Example SPIM dataset in OME-TIFF format with 2 samples from the same subject with 4 chunks each | micr | SPIM, photo, samples | [link](https://doi.org/10.5281/zenodo.5517223) | [@jcohenadad](https://github.com/jcohenadad) |
| name | description | datatypes | suffixes | link to full data | maintained by |
|:------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------|:-----------------------------|:--------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------|:---------------------------------------------|
| [eeg_ds003645s_hed_demo](https://github.com/bids-standard/bids-examples/tree/master/eeg_ds003645s_hed_demo) | Shows usage of Hierarchical Event Descriptor (HED) in .tsv files | anat, beh, eeg, micr, motion | KSSSleep, SPIM, beh, channels, coordsystem, defacemask, eeg, electrodes, events, headshape, motion, photo, samples, scans | [link](https://openneuro.org/datasets/ds003645) | [@VisLab](https://github.com/VisLab) |
| [micr_SEM](https://github.com/bids-standard/bids-examples/tree/master/micr_SEM) | Example SEM dataset in PNG format with 1 sample imaged over 2 sessions | micr | SEM, photo, samples, sessions | [link](https://doi.org/10.5281/zenodo.5498378) | [@jcohenadad](https://github.com/jcohenadad) |
| [micr_SEMzarr](https://github.com/bids-standard/bids-examples/tree/master/micr_SEMzarr) | Example SEM dataset in PNG and OME-ZARR format with 1 sample imaged over 2 sessions | micr | SEM, SPIM, samples, sessions | n/a | [@TheChymera](https://github.com/TheChymera) |
| [micr_SPIM](https://github.com/bids-standard/bids-examples/tree/master/micr_SPIM) | Example SPIM dataset in OME-TIFF format with 2 samples from the same subject with 4 chunks each | micr | SPIM, photo, samples | [link](https://doi.org/10.5281/zenodo.5517223) | [@jcohenadad](https://github.com/jcohenadad) |
| [micr_XPCTzarr](https://github.com/bids-standard/bids-examples/tree/master/micr_XPCTzarr) | Example XPCT dataset in OME-ZARR format with 1 sample imaged | micr | XPCT, photo, samples, sessions | [link](https://human-organ-atlas.esrf.eu/datasets/572252538) | [@chourroutm](https://github.com/chourroutm) |

### Motion

1 change: 1 addition & 0 deletions dataset_listing.tsv
@@ -59,6 +59,7 @@ pet005 T1w, PET [@mnoergaard](https://github.com/mnoergaard) anat, pet T1w, eve
micr_SEM Example SEM dataset in PNG format with 1 sample imaged over 2 sessions [link](https://doi.org/10.5281/zenodo.5498378) [@jcohenadad](https://github.com/jcohenadad) micr SEM, photo, samples, sessions
micr_SEMzarr Example SEM dataset in PNG and OME-ZARR format with 1 sample imaged over 2 sessions [@TheChymera](https://github.com/TheChymera) micr SEM, SPIM, samples, sessions
micr_SPIM Example SPIM dataset in OME-TIFF format with 2 samples from the same subject with 4 chunks each [link](https://doi.org/10.5281/zenodo.5517223) [@jcohenadad](https://github.com/jcohenadad) micr SPIM, photo, samples
micr_XPCTzarr Example XPCT dataset in OME-ZARR format with 1 sample imaged [link](https://human-organ-atlas.esrf.eu/datasets/572252538) [@chourroutm](https://github.com/chourroutm) micr XPCT, photo, samples, sessions
fnirs_tapping Example fNIRS measurement with three conditions from five subjects [link](https://doi.org/10.5281/zenodo.5529797) [@rob-luke](https://github.com/rob-luke) nirs channels, coordsystem, events, nirs, optodes, scans
fnirs_automaticity 24 subjects performing (non-)automatic finger tapping and foot stepping [link](https://doi.org/10.34973/vesb-mh30) [@robertoostenveld](https://github.com/robertoostenveld) nirs channels, coordsystem, events, nirs, optodes, practicelogbook, scans
motion_systemvalidation Example dataset of two different motion captured system recorded almost simultaneously, but no brain data [link](https://doi.org/10.6084/m9.figshare.20238006.v2) [@JuliusWelzel](https://github.com/JuliusWelzel) motion channels, motion, scans
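The README table carries the same information as `dataset_listing.tsv`, with the columns in a different order and the dataset name turned into a link. The sketch below shows one way such a row could be derived. It is not the repository's actual generator: the function name `tsv_row_to_markdown` and the TSV column order (inferred from the rows visible in this diff, whose header is not shown) are assumptions.

```python
# Column order assumed from the rows visible in this diff (hypothetical).
TSV_COLUMNS = ["name", "description", "link to full data",
               "maintained by", "datatypes", "suffixes"]
README_COLUMNS = ["name", "description", "datatypes", "suffixes",
                  "link to full data", "maintained by"]
REPO_TREE = "https://github.com/bids-standard/bids-examples/tree/master"


def tsv_row_to_markdown(line: str) -> str:
    """Turn one tab-separated listing row into a Markdown table row."""
    values = line.rstrip("\n").split("\t")
    row = dict(zip(TSV_COLUMNS, values))
    # The README links each dataset name to its folder in the repository.
    row["name"] = f"[{row['name']}]({REPO_TREE}/{row['name']})"
    # An empty link cell is shown as "n/a" in the README.
    row["link to full data"] = row["link to full data"] or "n/a"
    return "| " + " | ".join(row[column] for column in README_COLUMNS) + " |"


example = "\t".join([
    "micr_XPCTzarr",
    "Example XPCT dataset in OME-ZARR format with 1 sample imaged",
    "[link](https://human-organ-atlas.esrf.eu/datasets/572252538)",
    "[@chourroutm](https://github.com/chourroutm)",
    "micr",
    "XPCT, photo, samples, sessions",
])
print(tsv_row_to_markdown(example))
```

The sketch does not pad cells to a common width, so its output is valid Markdown but not column-aligned like the generated table.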
3 changes: 3 additions & 0 deletions micr_XPCTzarr/README.md
@@ -0,0 +1,3 @@
One human brain sample imaged over one session.
Example dataset with empty (truncated) image files standing in for X-ray phase-contrast tomography (XPCT) data, to illustrate the BIDS convention.
The image is stored in the NGFF (OME-Zarr) nested directory format.
77 changes: 77 additions & 0 deletions micr_XPCTzarr/code/data_retrieval.py
@@ -0,0 +1,77 @@
from pathlib import Path
import zarr.convenience
import dask.array
import ome_zarr, ome_zarr.io, ome_zarr.writer
from numcodecs import Blosc, Delta
import time

"""
You may use the following command to prepare a Python 3.8+ environment for the download of the dataset: `pip install -r data_retrieval_reqs.txt`
"""

data_path = Path("../sub-LADAF-2020-31/ses-01/micr/") # path to session
data_path.mkdir(parents=True, exist_ok=True) # create the directory and any missing parents

# short utility function: open a remote N5 store and return one array lazily
def zarr_array(url, selector):
    print('Retrieving data from', url)
    n5_store = zarr.N5FSStore(url)
    root = zarr.group(store=n5_store)
    return dask.array.from_zarr(root[selector])

dataset_full = zarr_array("gs://ucl-hip-ct-35a68e99feaae8932b1d44da0358940b/LADAF-2020-31/brain/25.08um_complete-organ_bm05/","s0") # get a dask.array.Array that points to the whole N5 dataset

# default filters and compressors made the script crash; the ones below work:
filters = [Delta(dtype='i4')]
compressor = Blosc(cname='zstd', clevel=1, shuffle=Blosc.SHUFFLE)

path_roi = data_path / "sub-LADAF-2020-31_ses-01_sample-brain_XPCT.ome.zarr" # full name of the dataset following BIDS specification

print('Writing in', str(path_roi))

tic = time.time()

store = ome_zarr.io.parse_url(path_roi,mode="a").store # NB: `mode="a"` should allow overwrite but it does not at the moment, see https://github.com/ome/ome-zarr-py/issues/376
root = zarr.group(store=store)
ome_zarr.writer.write_image(
    image=dataset_full,
    group=root,
    scaler=None,
    # axis order of the dataset following BIDS specification;
    # NGFF 0.4 names the key "unit" (singular)
    axes=[
        {
            "name": "z",
            "type": "space",
            "unit": "micrometer"  # voxel size is an isotropic 25.08 um
        },
        {
            "name": "y",
            "type": "space",
            "unit": "micrometer"  # voxel size is an isotropic 25.08 um
        },
        {
            "name": "x",
            "type": "space",
            "unit": "micrometer"  # voxel size is an isotropic 25.08 um
        }
    ],
    coordinate_transformations=[
        [
            {
                "scale": [
                    25.08,
                    25.08,
                    25.08
                ],  # voxel size is an isotropic 25.08 um
                "type": "scale"
            }
        ]
    ],
    storage_options=dict(
        chunks=(512, 512, 512),  # chunk size can be adjusted to your needs
        filters=filters,  # default filters made the script crash
        compressor=compressor  # default compressors made the script crash
    )
)

toc = time.time()

print('Writing completed in', (toc - tic)/60, 'min!')
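To get a feel for what the `chunks=(512,512,512)` setting implies, the following sketch estimates the chunk grid and the uncompressed size of the volume. The shape is taken from `VolumeZ`, `VolumeY` and `VolumeX` in the sidecar JSON of this PR; mapping them to the (z, y, x) axes and assuming 2 bytes per voxel (16-bit data) are assumptions, and the helper names are illustrative.

```python
import math


def chunk_grid(shape, chunks):
    """Number of chunks along each axis; partial edge chunks count as one."""
    return tuple(math.ceil(size / chunk) for size, chunk in zip(shape, chunks))


def uncompressed_bytes(shape, itemsize):
    """Size of the array in bytes before filters and compression."""
    return math.prod(shape) * itemsize


# Assumed (z, y, x) shape and 16-bit voxels; see the lead-in above.
shape = (6990, 5965, 5965)
grid = chunk_grid(shape, (512, 512, 512))
print("chunks per axis:", grid, "total:", math.prod(grid))
print("uncompressed size:", uncompressed_bytes(shape, 2) / 1e9, "GB")
```

Smaller chunks mean more files in the nested directory store, while larger chunks mean more data read for each small region of interest.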
5 changes: 5 additions & 0 deletions micr_XPCTzarr/code/data_retrieval_reqs.txt
@@ -0,0 +1,5 @@
dask[complete]
zarr
ome-zarr
numcodecs
gcsfs
39 changes: 39 additions & 0 deletions micr_XPCTzarr/code/metadata_formatter.py
@@ -0,0 +1,39 @@
import urllib.request
import json

"""
You may use the following command to prepare a Python 3.8+ environment for running this script: `pip install -r metadata_formatter_reqs.txt`
"""

# The following metadata file can be downloaded from https://human-organ-atlas.esrf.eu/datasets/572252538
metadata_txt_file_uri = 'https://ids.esrf.fr/ids/getData?sessionId=182d0a3b-de3b-4602-8caf-9bd91dc5b0e5&datafileIds=572252539' # This URI expires quite often

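json_dict = {}

with urllib.request.urlopen(metadata_txt_file_uri) as req:
    for line in req:
        text = line.decode('utf-8')
        if text.startswith('#') or not text.strip():
            continue  # skip comments and blank lines
        s = text.replace('\t', '').replace('\r\n', '').replace('"', '').replace('N.A.', 'n/a').split('=', 1)
        if len(s) != 2:
            continue  # not a key=value line
        # strip stray whitespace so keys do not end with a space
        key, value = s[0].strip(), s[1].strip()
        # convert numeric values to int or float where possible
        try:
            value = int(value)
        except ValueError:
            try:
                value = float(value)
            except ValueError:
                pass
        json_dict[key] = value

json_obj = json.dumps(json_dict)
print(json_obj)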

"""
The metadata has been split into several files:
- micr_XPCTzarr/samples.json
- micr_XPCTzarr/sub-LADAF-2020-31/sub-LADAF-2020-31_sessions.tsv
- micr_XPCTzarr/sub-LADAF-2020-31/ses-01/micr/sub-LADAF-2020-31_ses-01_sample-brain_XPCT.json
Note that some fields are redundant.
"""
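The parsing step of this script can be tried offline, without the expiring URI. The sketch below applies the same idea (skip comments and blank lines, split on `=`, map `N.A.` to `n/a`, coerce numbers) to a list of strings. The function name `parse_metadata_lines` and the sample lines are hypothetical; the real metadata file may be laid out differently.

```python
def parse_metadata_lines(lines):
    """Parse 'key = value' lines, skipping '#' comments and blank lines."""
    parsed = {}
    for raw in lines:
        text = raw.strip()
        if not text or text.startswith("#") or "=" not in text:
            continue
        # Split on the first '=' only, then trim whitespace and quotes.
        key, value = (part.strip().strip('"') for part in text.split("=", 1))
        if value == "N.A.":
            value = "n/a"
        # Try int first, then float; leave the string as-is otherwise.
        for cast in (int, float):
            try:
                value = cast(value)
                break
            except ValueError:
                continue
        parsed[key] = value
    return parsed


# Hypothetical sample lines, not copied from the real metadata file.
sample = [
    "# acquisition",
    "Projections = 9990",
    "Magnification = 0.24",
    'RefN = "N.A."',
    "",
]
print(parse_metadata_lines(sample))
```

Trimming whitespace around each key is a deliberate choice here: it keeps keys such as `ZStep` free of trailing spaces when the source file pads the `=` sign.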
1 change: 1 addition & 0 deletions micr_XPCTzarr/code/metadata_formatter_reqs.txt
@@ -0,0 +1 @@
# no third-party packages needed: urllib.request and json are part of the standard library
5 changes: 5 additions & 0 deletions micr_XPCTzarr/code/truncate_data.sh
@@ -0,0 +1,5 @@
#!/bin/bash
# Bash script to truncate all the files containing the binary chunked data as per ../../CONTRIBUTING.md
# Modified from ../../CONTRIBUTING.md

find ../sub-LADAF-2020-31/ses-01/micr/sub-LADAF-2020-31_ses-01_sample-brain_XPCT.ome.zarr/ -type f -regex ".*/[0-9]*" -exec truncate -s 0 {} +
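For platforms without `find` and `truncate`, the same truncation can be sketched in Python. This is an illustrative equivalent, not part of the PR: the function name `truncate_chunks` is hypothetical, and it assumes, like the regex above, that chunk files are exactly those whose file name consists only of digits.

```python
import re
import tempfile
from pathlib import Path

# Chunk files in the nested layout have purely numeric names (e.g. 0/0/0/0).
CHUNK_NAME = re.compile(r"[0-9]+")


def truncate_chunks(root: Path) -> int:
    """Empty every numerically named file under root; return how many."""
    count = 0
    for path in root.rglob("*"):
        if path.is_file() and CHUNK_NAME.fullmatch(path.name):
            path.write_bytes(b"")  # same effect as `truncate -s 0`
            count += 1
    return count


# Demonstration on a throwaway directory rather than the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "example.ome.zarr"
    (root / "0" / "0" / "0").mkdir(parents=True)
    (root / "0" / "0" / "0" / "0").write_bytes(b"\x00" * 16)
    (root / "0" / ".zarray").write_text("{}")
    print("truncated files:", truncate_chunks(root))
```

Metadata files such as `.zarray`, `.zattrs` and `.zgroup` are left untouched because their names are not numeric.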
15 changes: 15 additions & 0 deletions micr_XPCTzarr/dataset_description.json
@@ -0,0 +1,15 @@
{
"Name": "micr_XPCTzarr",
"Authors": [
"Matthieu Chourrout",
"David Stansby",
"Guillaume Gaisne",
"Claire L. Walsh",
"Peter D. Lee"
],
"BIDSVersion": "1.10.0",
"License": "CC-BY-4.0",
"DatasetDOI": "doi:10.15151/ESRF-DC-572252655",
"DatasetType": "raw",
"HowToAcknowledge": "Please cite this dataset as: Tafforeau, P., Walsh, C., Wagner, W. L., Daniyal J. Jafree, Bellier, A., Werlein, C., Kühnel, M. P., Boller, E., Walker-Samuel, S., Robertus, J. L., Long, D. A., Jacob, J., Marussi, S., Eeline Brown, Holroyd, N., Jonigk, D. D., Ackermann, M., & Lee, P. D. (2021). Complete brain from the body donor LADAF-2020-31 (Version 1) [dataset]. European Synchrotron Radiation Facility. https://doi.org/10.15151/ESRF-DC-572252655"
}
36 changes: 36 additions & 0 deletions micr_XPCTzarr/participants.json
@@ -0,0 +1,36 @@
{
"participant_id": {
"Description": "Unique alphanumeric participant ID starting with sub-"
},
"participant_name": {
"Description": "full name or pseudonym of the participant, which can contain non-alphanumeric characters"
},
"sex": {
"Description": "sex of the participant",
"Levels": {
"M": "male",
"F": "female"
}
},
"age": {
"Description": "age of the participant",
"Units": "year"
},
"weight": {
"Description": "weight of the participant",
"Units": "kg"
},
"height": {
"Description": "height of the participant",
"Units": "cm"
},
"species": {
"Description": "binomial species name from the NCBI Taxonomy (https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi)"
},
"medical_information": {
"Description": "medical information as provided by the biobank"
},
"institute": {
"Description": "name of the biobank"
}
}
2 changes: 2 additions & 0 deletions micr_XPCTzarr/participants.tsv
@@ -0,0 +1,2 @@
participant_id participant_name sex age species weight height institute medical_information
sub-LADAF-2020-31 LADAF-2020-31 F 69 Homo sapiens 40 145 Laboratoire d'Anatomie des Alpes Françaises type 2 diabetes, pelvic radiation to treat cancer of the uterus, right colectomy (benign lesion on histopathology), bilateral nephrostomy for acute obstructive renal failure, cystectomy, omentectomy and peritoneal carcinoma with occlusive syndrome
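A quick consistency check between `participants.tsv` and `participants.json` is to confirm that every column in the TSV header has an entry in the JSON sidecar. The sketch below does that with the standard library only; the function name `undocumented_columns` is hypothetical, and the sidecar is reduced to its keys for brevity.

```python
import json


def undocumented_columns(tsv_text: str, sidecar_text: str):
    """Return TSV header columns that have no entry in the JSON sidecar."""
    header = tsv_text.splitlines()[0].split("\t")
    described = json.loads(sidecar_text)
    return [column for column in header if column not in described]


# Header as added in this PR; sidecar trimmed to its top-level keys.
header = "\t".join([
    "participant_id", "participant_name", "sex", "age", "species",
    "weight", "height", "institute", "medical_information",
])
sidecar = json.dumps({
    key: {} for key in [
        "participant_id", "participant_name", "sex", "age", "weight",
        "height", "species", "medical_information", "institute",
    ]
})
print(undocumented_columns(header + "\n", sidecar))
```

An empty list means every column is described; any name it returns points to a column that still needs a `Description` in the sidecar.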
17 changes: 17 additions & 0 deletions micr_XPCTzarr/samples.json
@@ -0,0 +1,17 @@
{
"sample_id": {
"Description": "Sample ID"
},
"participant_id": {
"Description": "Participant ID from whom tissue samples have been acquired"
},
"sample_type": {
"Description": "Type of sample from ENCODE Biosample Type (https://www.encodeproject.org/profiles/biosample_type)"
},
"sample_info": {
"Description": "One-line title of the sample"
},
"sample_preparation": {
"Description": "Specific preparation of the sample"
}
}
2 changes: 2 additions & 0 deletions micr_XPCTzarr/samples.tsv
@@ -0,0 +1,2 @@
sample_id participant_id sample_type sample_info sample_preparation
sample-brain sub-LADAF-2020-31 tissue complete brain from the body donor program of the Laboratoire d'Anatomie des Alpes Françaises (LADAF) formalin fixed, progressive transfer to ethanol 70% with gentle vacuum degassing at each step, mounted with mixed agar gel at 70% ethanol, n.b. some damage due to overly rapid vacuum degassing
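The `sample_id` value `sample-brain` reappears as the `sample-brain` entity in the image file name used elsewhere in this PR. The sketch below shows how that name is assembled from its entities. The helper `bids_micr_name` is hypothetical and covers only the entities used in this dataset, not the full BIDS entity table.

```python
def bids_micr_name(sub, sample, suffix, extension, ses=None):
    """Join BIDS entities in the order sub, ses (optional), sample."""
    entities = [f"sub-{sub}"]
    if ses is not None:
        entities.append(f"ses-{ses}")
    entities.append(f"sample-{sample}")
    return "_".join(entities) + f"_{suffix}{extension}"


name = bids_micr_name("LADAF-2020-31", "brain", "XPCT", ".ome.zarr", ses="01")
print(name)
```

Swapping the extension for `.json` gives the name of the matching sidecar file.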
@@ -0,0 +1,57 @@
{
"InstitutionName": "European Synchrotron Radiation Facility",
"StationName": "BM05 EBS dipole wiggler 0.85T",
"BodyPart": "BRAIN",
"SampleEnvironment": "ex vivo",
"SampleFixation": "formalin",
"SampleEmbedding": "mixed agar gel at 70% ethanol",
"PixelSize": [
25.08,
25.08,
25.08
],
"PixelSizeUnits": "um",
"AcquisitionParameters": {
"XStep": "n/a",
"XStages": 1,
"YStep": "n/a",
"YStages": 1,
"ZStep": "2.2 mm",
"ZStages": "2x79",
"Projections": 9990,
"RefN": "n/a",
"DarkN": 400,
"RefOn": "n/a",
"ScanningMode": "continuous",
"ExposureTime": "0.036 s",
"AccExposureTime": "0.006 s",
"AccFramesCount": 6,
"PropDistance": "3475 mm",
"Filters": "Mo 0.1mm Al 2mm SiO2 bars 12*5mm diameter",
"DetAvgEnergy": "93 keV",
"ScanGeometry": "quarter-acquisition, one scan in half-acquisition plus one annular scan",
"ScanRange": "360 deg",
"SensorName": "sCMOS PCO edge 4.2 CLHS",
"SensorMode": "rolling shutter",
"SensorPixelSize": "6.5 um",
"Magnification": 0.24,
"XPixelNum": 2048,
"YPixelNum": 176,
"OpticsType": "dzoom optic from BM05 based on Hasselblad 120mm F/4 macro objective",
"Scintillator": "LuAG:Ce 2000 um",
"SurfDoseRate": "10.5 Gy/s",
"VoiDoseRate": "10.5 Gy/s",
"VoiIntegDose": "2.48 kGy",
"ScanTime": "7.88 min",
"SeriesTime": "22 h"
},
"ProcessingParameters": {
"RefApproach": "reference jar with 70% ethanol, single reference",
"VolumeX": 5965,
"VolumeY": 5965,
"VolumeZ": 6990,
"32to16BitsMin": -0.04,
"32to16BitsMax": 0.1,
"Jp2ComprRatio": 10
}
}
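Two quantities can be derived from this sidecar as a sanity check: the physical size of the reconstructed volume (`VolumeX/Y/Z` times `PixelSize`) and the relation between `ExposureTime`, `AccExposureTime` and `AccFramesCount`. The sketch below is illustrative only; the helper names are hypothetical, and reading `ExposureTime` as the per-frame time multiplied by the accumulated frame count is an assumption about how the fields relate.

```python
import math


def physical_extent_mm(shape_voxels, pixel_size, units="um"):
    """Convert a voxel count per axis into a physical length in millimetres."""
    to_mm = {"mm": 1.0, "um": 1e-3, "nm": 1e-6}[units]
    return tuple(n * size * to_mm for n, size in zip(shape_voxels, pixel_size))


def parse_quantity(text):
    """Split a string such as '0.036 s' into a float value and a unit."""
    value, unit = text.split(maxsplit=1)
    return float(value), unit


# Values copied from the sidecar above, in (x, y, z) order.
extent = physical_extent_mm((5965, 5965, 6990), (25.08, 25.08, 25.08), "um")
print("extent in mm:", extent)

exposure, _ = parse_quantity("0.036 s")
per_frame, _ = parse_quantity("0.006 s")
# Assumed relation: total exposure = per-frame exposure x accumulated frames.
print("consistent:", math.isclose(exposure, per_frame * 6))
```

Storing values as strings with embedded units, as `AcquisitionParameters` does, makes a small parser like `parse_quantity` necessary before any arithmetic.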
@@ -0,0 +1,40 @@
{
"multiscales": [
{
"axes": [
{
"name": "z",
"type": "space",
"unit": "micrometer"
},
{
"name": "y",
"type": "space",
"unit": "micrometer"
},
{
"name": "x",
"type": "space",
"unit": "micrometer"
}
],
"datasets": [
{
"coordinateTransformations": [
{
"scale": [
25.08,
25.08,
25.08
],
"type": "scale"
}
],
"path": "0"
}
],
"name": "/",
"version": "0.4"
}
]
}
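The voxel size recorded in this `.zattrs` file can be read back with the standard library alone, which is handy when the chunk files are empty and the array itself cannot be opened meaningfully. The sketch below pairs each axis name with its scale factor. The function name `axis_scales` is hypothetical, and the embedded JSON is a trimmed copy of the structure above.

```python
import json


def axis_scales(zattrs_text, level=0):
    """Map each axis name to the scale factor of one resolution level."""
    multiscale = json.loads(zattrs_text)["multiscales"][0]
    names = [axis["name"] for axis in multiscale["axes"]]
    transforms = multiscale["datasets"][level]["coordinateTransformations"]
    for transform in transforms:
        if transform["type"] == "scale":
            return dict(zip(names, transform["scale"]))
    return {}


# Trimmed copy of the metadata above (axis units omitted for brevity).
zattrs = json.dumps({
    "multiscales": [{
        "axes": [
            {"name": "z", "type": "space"},
            {"name": "y", "type": "space"},
            {"name": "x", "type": "space"},
        ],
        "datasets": [{
            "coordinateTransformations": [
                {"scale": [25.08, 25.08, 25.08], "type": "scale"}
            ],
            "path": "0",
        }],
        "version": "0.4",
    }]
})
print(axis_scales(zattrs))
```

With a single resolution level, as here, `level=0` is the only valid index; a pyramid would list one `datasets` entry per level.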
@@ -0,0 +1,3 @@
{
"zarr_format": 2
}