Look for axial T2/T2* images #26

Open
plbenveniste opened this issue Jul 18, 2024 · 5 comments

@plbenveniste
Collaborator

Opening this issue to reference all the data that we have, which is axial T2 and T2* images.

@plbenveniste plbenveniste self-assigned this Jul 18, 2024
@plbenveniste
Collaborator Author

@naga-karthik This is being delayed by data-management issues related to the BIDSification of the beijing and karolinska datasets, but I haven't forgotten about it.

@naga-karthik
Member

Sure, no worries! Thank you for the update! :)

@plbenveniste
Collaborator Author

plbenveniste commented Aug 14, 2024

Here are the datasets I looked into:

  • basel-mp2rage: MP2RAGE
  • bavaria-quebec-spine-ms-unstitched: T2w
  • canproco: PSIR and STIR contrast
  • nih-ms-mp2rage: MPRAGE
  • sct-testing-large: T1w, T2w and T2*w

I didn't look at the following datasets (ms-nmo-beijing, ms-nyu, ms-karolinska-2020, ms-basel-2020, umass-ms-ge-hdxt1.5, umass-ms-ge-pioneer3, umass-ms-siemens-espree1.5, umass-ms-ge-excite1.5 and ms-basel-2018) as they don't contain segmented T2w images (or, in the case of ms-karolinska-2020, the segmented images are included in sct-testing-large).

Code used
import json

path_json = "/Users/plbenveniste/moneta/users/pierrelouis/ms-lesion-agnostic/msd_data/dataset_2024-07-24_seed42_lesionOnly.json"

# Load the json file
with open(path_json, 'r') as f:
    data = json.load(f)

# Concatenate the images from all splits
images = data['test'] + data['train'] + data['validation']

# Initialize the list of images
axT2w_images = []

# Iterate over the images and keep the axial T2w ones
for image in images:
    if image['contrast'] == 'T2w' and image['orientation'] == 'ax':
        axT2w_images.append(image)

print(f"Number of axial T2w images: {len(axT2w_images)}")

# Group the images per site and print the number of images per site
sites = {}
for image in axT2w_images:
    site = image['site']
    if site not in sites:
        sites[site] = []
    sites[site].append(image)

# Use a distinct name so that the full `images` list above is not overwritten
for site, site_images in sites.items():
    print(f"Site: {site}, number of images: {len(site_images)}")

The output was the following:

Number of axial T2w images: 1234
Site: bavaria-quebec, number of images: 986
Site: sct-testing-large, number of images: 248

@plbenveniste
Collaborator Author

In the previous comment I was only looking at the images which have segmented lesions.

Here is the revised version, with images which are axial T2w but don't necessarily contain lesions.

I used the same code but on this file: /Users/plbenveniste/moneta/users/pierrelouis/ms-lesion-agnostic/msd_data/dataset_2024-09-10_seed42.json

Here is the output:

Number of axial T2w images: 2320
Site: bavaria-quebec, number of images: 1999
Site: sct-testing-large, number of images: 321
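
Since the same counting logic is run on two different JSON files, it could be wrapped in a small function. This is only a minimal sketch: the function name `count_axial_t2w_per_site` and the toy entries are made up for illustration, and it assumes each entry carries the `site`, `contrast` and `orientation` keys used in the code above.

```python
from collections import Counter


def count_axial_t2w_per_site(data):
    """Count axial T2w entries per site across the train, validation and test splits."""
    images = data.get("train", []) + data.get("validation", []) + data.get("test", [])
    return Counter(
        image["site"]
        for image in images
        if image["contrast"] == "T2w" and image["orientation"] == "ax"
    )


# Toy data with the same keys as the dataset JSON (the values are made up)
toy_data = {
    "train": [
        {"site": "bavaria-quebec", "contrast": "T2w", "orientation": "ax"},
        {"site": "canproco", "contrast": "PSIR", "orientation": "sag"},
    ],
    "validation": [
        {"site": "sct-testing-large", "contrast": "T2w", "orientation": "ax"},
    ],
    "test": [
        {"site": "bavaria-quebec", "contrast": "T2w", "orientation": "sag"},
    ],
}

toy_counts = count_axial_t2w_per_site(toy_data)
print(f"Number of axial T2w images: {sum(toy_counts.values())}")
for site, n_images in toy_counts.items():
    print(f"Site: {site}, number of images: {n_images}")
```

On a real file, `data` would come from `json.load` as in the original code, so the same function can be pointed at either the lesion-only JSON or the full one.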

@plbenveniste
Collaborator Author

I used the following code to filter and only get the T2w and axial data:

Code used for filtering
import json

path_json = "/Users/plbenveniste/moneta/users/pierrelouis/ms-lesion-agnostic/msd_data/dataset_2024-09-10_seed42.json"

# Load the json file
with open(path_json, 'r') as f:
    data = json.load(f)

# Get splits
train_images = data['train']
val_images = data['validation']
test_images = data['test']

# Filter the images
filtered_train_data = [entry for entry in train_images if entry['contrast'] == "T2w" and entry["orientation"] == "ax"]
filtered_val_data = [entry for entry in val_images if entry['contrast'] == "T2w" and entry["orientation"] == "ax"]
filtered_test_data = [entry for entry in test_images if entry['contrast'] == "T2w" and entry["orientation"] == "ax"]

# Save the filtered data
data['train'] = filtered_train_data
data['validation'] = filtered_val_data
data['test'] = filtered_test_data

# Update the number of images stored in the json file
data['numTest'] = len(filtered_test_data)
data['numTraining'] = len(filtered_train_data)
data['numValidation'] = len(filtered_val_data)

data["name"] = "ms-lesion-segmentation_T2w_axial_data"

# Save the filtered json file
path_filtered_json = "/Users/plbenveniste/moneta/users/pierrelouis/ms-lesion-agnostic/msd_data/dataset_T2w_ax.json"

with open(path_filtered_json, 'w') as f:
    json.dump(data, f, indent=4)
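
Before sharing a filtered file, a quick consistency check can catch mismatches between the splits and the stored counts. The sketch below is an assumption-laden example, not part of the original script: `check_filtered_dataset` is a hypothetical helper, and the toy dictionary only mimics the keys used above (`numTraining`, `numValidation`, `numTest`, `contrast`, `orientation`).

```python
def check_filtered_dataset(data):
    """Return a list of problems found in a filtered dataset dictionary (empty if none)."""
    problems = []
    count_keys = {"train": "numTraining", "validation": "numValidation", "test": "numTest"}
    for split, count_key in count_keys.items():
        entries = data[split]
        # The stored count should match the number of entries in the split
        if data[count_key] != len(entries):
            problems.append(f"{count_key}={data[count_key]} but {split} has {len(entries)} entries")
        # Every remaining entry should be an axial T2w image
        for entry in entries:
            if entry["contrast"] != "T2w" or entry["orientation"] != "ax":
                problems.append(f"{split} contains a non-axial-T2w entry: {entry}")
    return problems


# Toy filtered dataset (made-up values) with one deliberately wrong count
toy_filtered = {
    "numTraining": 1,
    "numValidation": 0,
    "numTest": 2,
    "train": [{"contrast": "T2w", "orientation": "ax"}],
    "validation": [],
    "test": [{"contrast": "T2w", "orientation": "ax"}],
}

toy_problems = check_filtered_dataset(toy_filtered)
for problem in toy_problems:
    print(problem)
```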

Here is the required output file, @naga-karthik:
dataset_T2w_ax.json

I suggest using find-and-replace (Ctrl+F) to replace the paths with those of your own copies of the bavaria and sct-testing-large datasets.
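
As an alternative to editing the file by hand, the path replacement could be scripted. This is a rough sketch under stated assumptions: `replace_path_prefix` is a hypothetical helper, the `image` key and both path prefixes in the toy example are made up, and the function simply rewrites every string value that starts with the old prefix, whatever key it is stored under.

```python
def replace_path_prefix(obj, old_prefix, new_prefix):
    """Recursively replace old_prefix with new_prefix at the start of every string value."""
    if isinstance(obj, str):
        if obj.startswith(old_prefix):
            return new_prefix + obj[len(old_prefix):]
        return obj
    if isinstance(obj, list):
        return [replace_path_prefix(item, old_prefix, new_prefix) for item in obj]
    if isinstance(obj, dict):
        return {
            key: replace_path_prefix(value, old_prefix, new_prefix)
            for key, value in obj.items()
        }
    # Numbers, booleans and None are returned unchanged
    return obj


# Toy entry: the "image" key and both prefixes are hypothetical
toy_dataset = {
    "name": "ms-lesion-segmentation_T2w_axial_data",
    "numTest": 1,
    "test": [
        {
            "image": "/old/root/bavaria-quebec/sub-01_T2w.nii.gz",
            "site": "bavaria-quebec",
            "contrast": "T2w",
            "orientation": "ax",
        }
    ],
}

updated_dataset = replace_path_prefix(toy_dataset, "/old/root/", "/new/root/")
print(updated_dataset["test"][0]["image"])
```

With the real file, the dictionary would be loaded with `json.load`, passed through the function once per dataset root, and written back with `json.dump`.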

To convert this to the nnUNet format, you can use the following command:

python nnunet/convert_msd_to_nnunet.py --input dataset_T2w_ax.json -o /nnUNet_raw --tasknumber XXX

Beware: with this code, the train and validation splits are both stored in the nnUNet train folder.
