Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Workflow and CBMAWorkflow classes. Support pairwise CBMA workflows #809

Merged
merged 40 commits into from
Jun 22, 2023

Conversation

JulioAPeraza
Copy link
Collaborator

Closes None.

Changes proposed in this pull request:

  • Convert cbma_workflow function into a class (CBMAWorkflow).
  • Add Workflow base class.
  • Support pairwise estimators in Diagnostics.
  • Support pairwise estimators in Workflow.
  • Support pairwise estimators in Report.

@JulioAPeraza JulioAPeraza added the enhancement New feature or request label Jun 2, 2023
@JulioAPeraza
Copy link
Collaborator Author

@jdkent We are getting an error while building the documentation. It seems related to the nimads source file:

WARNING: /home/docs/checkouts/readthedocs.org/user_builds/nimare/checkouts/809/examples/01_datasets/05_plot_nimads.py failed to execute correctly: Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/nimare/checkouts/809/examples/01_datasets/05_plot_nimads.py", line 28, in <module>
    nimads_studyset = download_file("https://neurostore.org/api/studysets/Cv2LLUqG76W9?nested=true")
  File "/home/docs/checkouts/readthedocs.org/user_builds/nimare/checkouts/809/examples/01_datasets/05_plot_nimads.py", line 25, in download_file
    return response.json()
  File "/home/docs/checkouts/readthedocs.org/user_builds/nimare/envs/809/lib/python3.8/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

@codecov
Copy link

codecov bot commented Jun 2, 2023

Codecov Report

Patch coverage: 95.73% and project coverage change: +0.02 🎉

Comparison is base (3801b67) 88.37% compared to head (d46e3ca) 88.39%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #809      +/-   ##
==========================================
+ Coverage   88.37%   88.39%   +0.02%     
==========================================
  Files          46       47       +1     
  Lines        5943     6024      +81     
==========================================
+ Hits         5252     5325      +73     
- Misses        691      699       +8     
Impacted Files Coverage Δ
nimare/reports/base.py 94.92% <83.33%> (-3.44%) ⬇️
nimare/diagnostics.py 98.72% <96.22%> (-1.28%) ⬇️
nimare/reports/figures.py 98.09% <100.00%> (+0.07%) ⬆️
nimare/workflows/__init__.py 100.00% <100.00%> (ø)
nimare/workflows/ale.py 95.12% <100.00%> (+0.06%) ⬆️
nimare/workflows/base.py 100.00% <100.00%> (ø)
nimare/workflows/cbma.py 100.00% <100.00%> (ø)

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jdkent
Copy link
Member

jdkent commented Jun 2, 2023

Thanks for the heads up, I should have fixed the issue now.

nimare/diagnostics.py Outdated Show resolved Hide resolved
Copy link
Member

@jdkent jdkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just have a couple comments so far.

Could you add a report that showcases the output of a pairwise estimator?
you can use this existing example: 08_plot_cbma_subtraction_conjunction.py

@@ -204,8 +202,14 @@ def transform(self, result):

contribution_tables.append(contribution_table.reset_index())

# Concat PositiveTail and NegativeTail tables
contribution_table = pd.concat(contribution_tables, ignore_index=True, sort=False)
if pairwaise_estimators or len(label_maps) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For MKDAChi2, the z-values can be positive or negative (you can check the z-map results from running the test in this pull request: #811).

I think we would want to display both Positive and Negative Tail cluster with Pairwise estimators.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we would want to display both Positive and Negative Tail cluster with Pairwise estimators.

We are displaying Positive and Negative tail cluster tables and maps. Here, I'm only excluding the Negative tail contribution table, which will be zero for studies in id1. I think if we want to display that table we would need to plot it in a separate figure and use the id2 for estimating the values. WDYT?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in the report, we only show the summary for id1 and coordinates1 from dataset1. Should we include another section for the second group?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little more questionable, overall, I would say yes, we do want to display id2 for the coordinates. for the contribution table, this is a bit debatable. With ALESubtraction, yes, I think it is useful to know how both groups contributed to the existing clusters.

For the MKDAChi2 I think it is less useful to see a contribution table (and with jacknife it would be computationally prohibitive if one was using the entire neurosynth dataset).

But the notion that ALESubtraction is used for more similarly sized groups and MKDAChi2 is used for comparing a relatively smaller dataset to a huge dataset is only reflective of how we are currently the algorithms, there's nothing stopping you from using ALESubtraction with neurosynth.

So I think the answer would be creating an option/flag for the user to decide whether they want to show the second group or not. could be called display_second_group or something, with a default value of False.

__all__ = ["ale_sleuth_workflow", "cbma_workflow", "macm_workflow"]
__all__ = [
"ale_sleuth_workflow",
"Workflow",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove the Workflow baseclass from __all__, this is not a public facing class (e.g., do not want users importing and trying to use it by mistake.

Copy link
Member

@jdkent jdkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for all your work on this!, I have a couple comments/questions for idea on how to restructure small sections of code.

@@ -274,9 +295,12 @@ def _transform(self, expid, label_map, result):
# with one missing a study in its inputs.
estimator = copy.deepcopy(result.estimator)

dset = estimator.dataset
if self._is_pairwaise_estimator:
all_ids = estimator.inputs_["id1"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would all ids be id1 and id2, the test case in the test suite is an unusual scenerio where the same dataset is used for id1 and id2.

temp_dset = (
estimator.dataset1.slice(other_ids)
if sign == "PositiveTail"
else estimator.dataset2.slice(other_ids)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for "NegativeTail" is the temp result fitting dataset2 (minus one analysis) and dataset2, but should be dataset1, dataset2 (minus one analysis)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I forgot to cover these cases.

Comment on lines 315 to 320
elif "stat_desc-group1MinusGroup2" in result.maps:
target_value_map = "stat_desc-group1MinusGroup2"
elif "z_desc-specificity" in result.maps:
target_value_map = "z_desc-specificity"
else:
target_value_map = "z"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be replaced by taking the union of two sets? one set being ("est", "stat", "stat_desc-group1MinusGroup2", "z_desc-specificity") and the other being the result.maps, then if the union of the sets is empty, have the target_value_map be "z".

Or taking a step back, the purpose of this is to find the uncorrected values, correct? (if the target_image was the corrected values, so one could theoretically manipulate the target_image name to create the target_value_map name, then it would just be a parameter instead of a list of if statements.

Copy link
Collaborator Author

@JulioAPeraza JulioAPeraza Jun 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be replaced by taking the union of two sets?

Good idea!

if the target_image was the corrected values, so one could theoretically manipulate the target_image name to create the target_value_map name, then it would just be a parameter instead of a list of if statements.

That's what I initially tried to do, but I think the logic here is to use stat maps if possible, if they are not available then use z maps.

@JulioAPeraza JulioAPeraza marked this pull request as draft June 21, 2023 16:06
Copy link
Member

@jdkent jdkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "PositiveTail" and "NegativeTail" should be defined as variables and described since they are written as strings a few too many times, so it could be hard to get the context of what the string means and more difficult to change in the future if we have to change every string instead of a variable.

@JulioAPeraza
Copy link
Collaborator Author

Great point.

@JulioAPeraza JulioAPeraza marked this pull request as ready for review June 21, 2023 20:24
@JulioAPeraza JulioAPeraza marked this pull request as draft June 21, 2023 22:11
@JulioAPeraza JulioAPeraza marked this pull request as ready for review June 21, 2023 22:34
Copy link
Member

@jdkent jdkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, just another minor comment, I think after that it looks good to me.

@@ -129,7 +131,7 @@
)
related_corrected_results = jackknife.transform(related_corrected_results)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't from this pull request, but I think this variable should have a different name, like related_diagnostic_results, so that related_corrected_results is not overwritten. follows the convention of going from regular results to corrected results.

Copy link
Member

@jdkent jdkent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks @JulioAPeraza

@jdkent jdkent merged commit c4346a2 into neurostuff:main Jun 22, 2023
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants