What values are being "clipped to the interval [0, 98]" #2145

Open
douglasmacdonald opened this issue Jun 29, 2024 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

@douglasmacdonald

douglasmacdonald commented Jun 29, 2024

Issue

I am not sure what the following paragraph means, as the values following it are not clipped between 0 and 98, and I can't see where in the document this clipping is being applied.

"Below we have min/max values calculated across the dataset per band. The values were clipped to the interval [0, 98] to stretch the band values and avoid outliers influencing the band histograms." https://github.com/microsoft/torchgeo/blob/v0.5.2/docs/tutorials/transforms.ipynb

See also

https://torchgeo.readthedocs.io/en/stable/tutorials/transforms.html#Dataset-Bands-and-Statistics

Fix

It would be helpful if it were clarified which values are being clipped.

douglasmacdonald added the documentation label Jun 29, 2024
@douglasmacdonald
Author

I might need to read this more carefully. Is it saying that the original data between 0 and 98% of the maximum value was selected before the max and min values were calculated? If this is the case, a reference to the original work might be useful.

@isaaccorley
Collaborator

You're right that it should actually read the 0th and 98th percentiles. The values were computed on the entire train set. Clipping to percentiles is a common way to filter outliers before linearly stretching multispectral data, similar to the functionality available in ArcGIS or QGIS. You can see more about this and other options here: https://www.nv5geospatialsoftware.com/docs/BackgroundStretchTypes.html
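For anyone unsure what a percentile clip followed by a linear stretch looks like in practice, here is a minimal sketch. It is not the code used to produce the tutorial's statistics: the function names `percentile` and `percentile_stretch` and the sample band are hypothetical, and it works on a plain Python list for one band so that it runs without dependencies. On real arrays, NumPy's `np.percentile` (which interpolates linearly by default) and `np.clip` would be the usual choice.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    below = int(pos)
    above = min(below + 1, len(sorted_vals) - 1)
    frac = pos - below
    return sorted_vals[below] + (sorted_vals[above] - sorted_vals[below]) * frac


def percentile_stretch(band, lower_q=0.0, upper_q=98.0):
    """Clip one band to its [lower_q, upper_q] percentiles, then rescale to [0, 1]."""
    ordered = sorted(band)
    lo = percentile(ordered, lower_q)
    hi = percentile(ordered, upper_q)
    if hi == lo:
        # Constant band: nothing to stretch.
        return [0.0 for _ in band], lo, hi
    # Clip each value into [lo, hi], then map that interval linearly onto [0, 1].
    stretched = [(min(max(v, lo), hi) - lo) / (hi - lo) for v in band]
    return stretched, lo, hi


# Hypothetical band: 100 ordinary pixel values plus one extreme outlier.
band = [float(v) for v in range(100, 200)] + [10000.0]
stretched, lo, hi = percentile_stretch(band)
print(lo, hi)         # per-band min/max after clipping
print(stretched[-1])  # the outlier saturates at the top of the range
```

The point of the sketch is that the reported per-band min/max are the percentile values themselves (`lo` and `hi`), not the literal numbers 0 and 98, and that a single outlier no longer dominates the stretched range.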

@adamjstewart
Collaborator

@douglasmacdonald want to open a PR to clarify the documentation?

@douglasmacdonald
Author

@adamjstewart, thanks for the nudge. I can't do a PR just now as I am poking around some other data. Although it seems like a simple job, any corrections I made would be guesses rather than from my own knowledge, and it would take some work to be sure I was correct. Give me a few months and I might start trying to contribute. Thank you for your work on TorchGeo.
