Add shape check to Dataset initialization #106

Merged
tsalo merged 5 commits into neurostuff:master from the check-shapes branch
Jun 12, 2022

Conversation

JulioAPeraza
Collaborator

Closes #99.

Changes proposed in this pull request:

  • Add _check_inputs_shape() function to utils.py to avoid repetitive code.
  • Add shape check to Dataset initialization:
    • Check whether the number of rows of y matches X.
    • Check whether the number of rows and columns of y match v.
    • Check whether the number of rows and columns of y match n.
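
For illustration, the checks listed above could be sketched roughly as follows. This is not the code merged in this PR: the function name `check_inputs_shape_sketch` is hypothetical, the `row`/`column` keywords are taken from the test calls quoted in the review, and the handling of `None` and the error messages are assumptions.

```python
import numpy as np


def check_inputs_shape_sketch(param1, param2, name1, name2, row=False, column=False):
    """Raise ValueError if two 2-D inputs disagree on rows and/or columns.

    Hypothetical stand-in, not the function merged in this PR.
    """
    if not (row or column):
        raise ValueError("At least one of 'row' or 'column' must be True.")

    # Assumption: optional inputs such as v or n may be None, which skips the check.
    if param1 is None or param2 is None:
        return

    shape1, shape2 = np.shape(param1), np.shape(param2)
    if row and shape1[0] != shape2[0]:
        raise ValueError(
            f"{name1} and {name2} must have the same number of rows "
            f"({shape1[0]} vs {shape2[0]})."
        )
    if column and shape1[1] != shape2[1]:
        raise ValueError(
            f"{name1} and {name2} must have the same number of columns "
            f"({shape1[1]} vs {shape2[1]})."
        )


# The three checks from the list, run on toy arrays.
y = np.zeros((5, 2))
X = np.zeros((5, 3))
v = np.ones((5, 2))
n = None  # assumed to be optional

check_inputs_shape_sketch(y, X, "y", "X", row=True)
check_inputs_shape_sketch(y, v, "y", "v", row=True, column=True)
check_inputs_shape_sketch(y, n, "y", "n", row=True, column=True)
```

Keeping the comparison in one helper means the Dataset initializer only has to state which pairs of inputs must agree, and on which dimensions.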

Note: I didn't check the number of columns of X against the length of X_names, because _get_predictors() already raises an exception on any mismatch.

JulioAPeraza added the bug label ("Something isn't working") on Jun 9, 2022
@codecov-commenter

codecov-commenter commented Jun 9, 2022

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.88%. Comparing base (0a99840) to head (86a1eb5).
Report is 24 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #106      +/-   ##
==========================================
+ Coverage   87.60%   87.88%   +0.28%     
==========================================
  Files          13       13              
  Lines         863      883      +20     
==========================================
+ Hits          756      776      +20     
  Misses        107      107              


Member

@tsalo tsalo left a comment

I think we'll need to refactor _check_inputs_shape if we ever need to extend it to work with N-dimensional arrays, but I don't know if that is necessary at the moment, so I'm happy to approve it now. Thanks!

@tsalo
Member

tsalo commented Jun 10, 2022

Oh BTW you should add yourself to the Zenodo file. I totally forgot about that.

Comment on lines +31 to +37
# Raise error if the number of rows and columns of v don't match y
with pytest.raises(ValueError):
    utils._check_inputs_shape(y, v, "y", "v", row=True, column=True)

# Raise error if neither row nor column is True
with pytest.raises(ValueError):
    utils._check_inputs_shape(y, n, "y", "n")
Member

Sorry, I know I approved already, but I just realized that, while the function allows Nones, that behavior isn't tested here. Can you test Nones? Not every Dataset will have v or n.

@JulioAPeraza
Collaborator Author

> I think we'll need to refactor _check_inputs_shape if we ever need to extend it to work with N-dimensional arrays

That's a great idea. I think that can be implemented by looping over a list of axes given by the user and checking that the shapes match along each one:

utils._check_inputs_shape(y, X, "y", "X", axis=[0])
utils._check_inputs_shape(y, n, "y", "n", axis=[0, 1])
utils._check_inputs_shape(X, np.array(X_names)[None, :], "X", "X_names", axis=[1])
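
One way the axis-based variant could work is sketched below. It is only an illustration of the idea in the calls above, not code from this PR: the function name `check_shape_along_axes` is hypothetical, and skipping `None` inputs and reporting out-of-range axes as `ValueError` are assumptions.

```python
import numpy as np


def check_shape_along_axes(param1, param2, name1, name2, axis):
    """Raise ValueError if param1 and param2 differ in size along any listed axis.

    Hypothetical sketch; works for arrays with any number of dimensions.
    """
    # Assumption: optional inputs may be None, which skips the check.
    if param1 is None or param2 is None:
        return

    shape1, shape2 = np.shape(param1), np.shape(param2)
    for ax in axis:
        try:
            size1, size2 = shape1[ax], shape2[ax]
        except IndexError:
            raise ValueError(
                f"axis {ax} is out of range for {name1} {shape1} or {name2} {shape2}."
            ) from None
        if size1 != size2:
            raise ValueError(
                f"{name1} and {name2} differ along axis {ax} ({size1} vs {size2})."
            )


# The three calls from the comment above, on toy arrays.
y = np.zeros((5, 2))
X = np.zeros((5, 3))
n = np.ones((5, 2))
X_names = ["a", "b", "c"]

check_shape_along_axes(y, X, "y", "X", axis=[0])
check_shape_along_axes(y, n, "y", "n", axis=[0, 1])
check_shape_along_axes(X, np.array(X_names)[None, :], "X", "X_names", axis=[1])
```

With a list of axes, the separate row and column flags are no longer needed, and the same call extends to arrays with more than two dimensions.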

pymare/utils.py Outdated
Comment on lines 58 to 60
elif (param1 is None) or (param2 is None):
    # If param1 or param2 is None, we don't need to check the shape
    pass
Member

Is this necessary?

Collaborator Author

No, I don't think we need that.

Member

If you're okay dropping this clause, then, I'll be happy to approve.

Collaborator Author

Thanks! Would you like to add support for N-dimensional arrays in this PR? I think I have got it working.

Member

@tsalo tsalo Jun 10, 2022

I don't think we need N-dim support yet. I'm happy to merge as-is.

EDIT: Once the extra clause is removed, I mean.

@tsalo tsalo merged commit 19af399 into neurostuff:master Jun 12, 2022
@JulioAPeraza JulioAPeraza deleted the check-shapes branch June 13, 2022 15:56
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Datasets do not check input array shapes/sizes
3 participants