Compare vector results with ROOT's. #36

Draft · wants to merge 10 commits into main

Conversation

@jpivarski (Member) commented Mar 17, 2021

@ianna This is the branch I said I'd make to get you started with ROOT tests. The tests in this repo aren't PR-numbered; they're organized hierarchically—I think that makes more sense for this kind of library. The ROOT tests are all under tests/root, though some of the tests that would be needed to get up to 100% coverage won't be ROOT-related.

The ROOT tests need this line:

# If ROOT is not available, skip these tests.
ROOT = pytest.importorskip("ROOT")

so they only run when ROOT is available (one, but not all, of the CI tests).

For clearer output in a terminal (and CI), I'm using pytest's parametrize rather than writing a loop in a test. pytest will give independent error messages and show print output only for the combinations that fail.

# ROOT.Math.PxPyPzEVector constructor arguments to get all the weird cases.
constructor = [
    (0, 0, 0, 0),
    (0, 0, 0, 10),
    (0, 0, 0, -10),
    (1, 2, 3, 0),
    (1, 2, 3, 10),
    (1, 2, 3, -10),
    (1, 2, 3, 2.5),
    (1, 2, 3, -2.5),
]
# Coordinate conversion methods to apply to the VectorObject4D.
coordinates = [
    "to_xyzt",
    "to_xythetat",
    "to_xyetat",
    "to_rhophizt",
    "to_rhophithetat",
    "to_rhophietat",
    "to_xyztau",
    "to_xythetatau",
    "to_xyetatau",
    "to_rhophiztau",
    "to_rhophithetatau",
    "to_rhophietatau",
]
# Run a test that compares ROOT's 'M2()' with vector's 'tau2' for all cases.
@pytest.mark.parametrize("constructor", constructor)
@pytest.mark.parametrize("coordinates", coordinates)
def test_M2(constructor, coordinates):
    assert ROOT.Math.PxPyPzEVector(*constructor).M2() == pytest.approx(
        getattr(
            vector.obj(**dict(zip(["x", "y", "z", "t"], constructor))), coordinates
        )().tau2
    )

The test itself constructs a ROOT object and a vector object using the vector.obj(coord1=a, coord2=b, ...) syntax (through a **dict because it's parametrized). Then it converts the vector object into each of the coordinate systems, so that all formulas for the parameter of interest (M2/tau2 below) get tested. Finally, this has to go through pytest.approx because the round-trip through a different coordinate system introduces numerical error.

assert ROOT.Math.PxPyPzEVector(*constructor).M2() == pytest.approx(
    getattr(
        vector.obj(**dict(zip(["x", "y", "z", "t"], constructor))), coordinates
    )().tau2
)

The ROOT types to test are the Math::GenVector classes from 2005, not the original TVector2, TVector3, and TLorentzVector from 1999 (though the latter are still used, so it wouldn't be bad to look at them—it's just that the new-style vectors are normative). Wherever a new-style ROOT vector returns a non-NaN result, we need to return the same result. If ROOT returns NaN and there is a sensible value to return, such as ROOT.Math.PxPyPzEVector(0, 0, 0, 0).Rapidity(), which could be 0, then we're free to make it so.
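For example, a test pinning down such a case might look like the following (a sketch I'm adding here, not code from the branch; it assumes we choose 0 and that the property is named rapidity, following our lowercase convention):

import math

def test_Rapidity_of_zero_vector():
    # ROOT returns NaN here (a 0/0 case), so we are free to choose the value;
    # 0 is the sensible choice mentioned above.
    assert math.isnan(ROOT.Math.PxPyPzEVector(0, 0, 0, 0).Rapidity())
    assert vector.obj(x=0, y=0, z=0, t=0).rapidity == pytest.approx(0)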

The Math::GenVector classes, parameters, and their vector equivalents are given below. I've included links to the ROOT source, since it can be useful to see the code to know what the relevant cases are. For example, ROOT treats forward-timelike, spacelike, and backward-timelike Lorentz vectors differently. In the spacelike region, they carry a negative sign from mass-squared to mass (instead of letting it be imaginary), but they completely exclude the backward-timelike region by clipping to the nearest spacelike point with the same 3D components. Thus, it's not enough to just check positive, zero, and negative; it's also relevant whether the mass magnitude is greater than or less than the momentum magnitude.

I don't think it's important to check ROOT's single-precision floating point classes; just double-precision is fine. On our side, the type of vector is determined by the set of keywords passed to the vector.obj function:

  • ROOT.Math.XYVector(x, y) like vector.obj(x=x, y=y)
    • ROOT's Dot is our dot (in general, we have lowercase method and property names)
    • ROOT's Mag2 is our rho2
    • ROOT's Mag is our rho
    • Phi
    • Rotate is our rotateZ
    • Unit
    • X, Y
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.Polar2DVector(rho, phi) like vector.obj(rho=rho, phi=phi)
    • Dot
    • Mag2/rho2
    • Mag/rho
    • Phi
    • Rotate/rotateZ
    • Unit
    • X, Y
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.XYZVector(x, y, z) like vector.obj(x=x, y=y, z=z)
    • Dot
    • Cross
    • Mag2/mag2 (for 3D, the name corresponds)
    • Mag/mag (same deal)
    • Perp2 is our rho2 (our name is consistent across 2D, 3D, 4D)
    • Perp is our rho (same deal, though ROOT also has a Rho, but not Rho2)
    • Phi
    • Eta
    • Theta
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • Unit
    • X, Y, Z
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.Polar3DVector(r, theta, phi) like vector.obj(rho=r*sin(theta), theta=theta, phi=phi)
    • Dot
    • Cross
    • Mag2/mag2 (for 3D, the name corresponds)
    • Mag/mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • Unit
    • X, Y, Z
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.RhoZPhiVector(rho, z, phi) like vector.obj(rho=rho, z=z, phi=phi)
    • Dot
    • Cross
    • Mag2/mag2 (for 3D, the name corresponds)
    • Mag/mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • Unit
    • X, Y, Z
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.RhoEtaPhiVector(rho, eta, phi) like vector.obj(rho=rho, eta=eta, phi=phi)
    • Dot
    • Cross
    • Mag2/mag2 (for 3D, the name corresponds)
    • Mag/mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • Unit
    • X, Y, Z
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.PxPyPzEVector(x, y, z, t) like vector.obj(px=x, py=y, pz=z, E=t) (I don't see how this is different from ROOT.Math.XYZTVector(x, y, z, t); maybe some method names?)
    • Dot
    • P2 is our mag2 (ROOT's 4D Mag2 is the dot product of the vector with itself, what we call tau2 or mass2)
    • P is our mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • Mass2 is our tau2 (or mass2 if it's a momentum vector and has kinematic synonyms)
    • Mass is tau (or mass)
    • Rapidity
    • Beta (scalar)
    • Gamma (scalar)
    • BoostToCM is our beta3 (it doesn't boost: it returns a velocity vector for which c=1)
    • ColinearRapidity (we don't have an equivalent, but perhaps we should)
    • Et2 (to have a method for transverse energy, you have to construct a vector.obj with momentum coordinates)
    • Et
    • Mt2 (same for transverse mass: it's only on momentum vectors)
    • Mt
    • isLightlike/is_lightlike
    • isSpacelike/is_spacelike
    • isTimelike/is_timelike
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • so are the boosts
    • Unit
    • X, Y, Z, T
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.PxPyPzMVector(x, y, z, tau) like vector.obj(px=x, py=y, pz=z, mass=tau)
    • Dot
    • P2 is our mag2 (ROOT's 4D Mag2 is the dot product of the vector with itself, what we call tau2 or mass2)
    • P is our mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • Mass2 is our tau2 (or mass2 if it's a momentum vector and has kinematic synonyms)
    • Mass is tau (or mass)
    • Rapidity
    • Beta (scalar)
    • Gamma (scalar)
    • BoostToCM is our beta3 (it doesn't boost: it returns a velocity vector for which c=1)
    • ColinearRapidity (we don't have an equivalent, but perhaps we should)
    • Et2 (to have a method for transverse energy, you have to construct a vector.obj with momentum coordinates)
    • Et
    • Mt2 (same for transverse mass: it's only on momentum vectors)
    • Mt
    • isLightlike/is_lightlike
    • isSpacelike/is_spacelike
    • isTimelike/is_timelike
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • so are the boosts
    • Unit
    • X, Y, Z, T
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.PtEtaPhiEVector(rho, eta, phi, t) like vector.obj(pt=rho, eta=eta, phi=phi, E=t)
    • Dot
    • P2 is our mag2 (ROOT's 4D Mag2 is the dot product of the vector with itself, what we call tau2 or mass2)
    • P is our mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • Mass2 is our tau2 (or mass2 if it's a momentum vector and has kinematic synonyms)
    • Mass is tau (or mass)
    • Rapidity
    • Beta (scalar)
    • Gamma (scalar)
    • BoostToCM is our beta3 (it doesn't boost: it returns a velocity vector for which c=1)
    • ColinearRapidity (we don't have an equivalent, but perhaps we should)
    • Et2 (to have a method for transverse energy, you have to construct a vector.obj with momentum coordinates)
    • Et
    • Mt2 (same for transverse mass: it's only on momentum vectors)
    • Mt
    • isLightlike/is_lightlike
    • isSpacelike/is_spacelike
    • isTimelike/is_timelike
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • so are the boosts
    • Unit
    • X, Y, Z, T
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.PtEtaPhiMVector(rho, eta, phi, tau) like vector.obj(pt=rho, eta=eta, phi=phi, mass=tau)
    • Dot
    • P2 is our mag2 (ROOT's 4D Mag2 is the dot product of the vector with itself, what we call tau2 or mass2)
    • P is our mag (same deal)
    • Perp2/rho2
    • Perp/rho
    • Phi
    • Eta
    • Theta
    • Mass2 is our tau2 (or mass2 if it's a momentum vector and has kinematic synonyms)
    • Mass is tau (or mass)
    • Rapidity
    • Beta (scalar)
    • Gamma (scalar)
    • BoostToCM is our beta3 (it doesn't boost: it returns a velocity vector for which c=1)
    • ColinearRapidity (we don't have an equivalent, but perhaps we should)
    • Et2 (to have a method for transverse energy, you have to construct a vector.obj with momentum coordinates)
    • Et
    • Mt2 (same for transverse mass: it's only on momentum vectors)
    • Mt
    • isLightlike/is_lightlike
    • isSpacelike/is_spacelike
    • isTimelike/is_timelike
    • ROOT's rotateX, rotateY, rotateZ, and rotate_axis are in its VectorUtil namespace (see below)
    • so are the boosts
    • Unit
    • X, Y, Z, T
    • __add__ (addition by a vector)
    • __sub__ (subtraction by a vector)
    • __neg__ (unary negation of a vector)
    • __mul__ (multiplication by a scalar)
    • __truediv__ (division by a scalar)
    • __eq__ (vector equality), but since you're going through different coordinate systems, use isclose
  • ROOT.Math.EulerAngles(phi, theta, psi) like calling rotate_euler(phi, theta, psi) on a 3D or 4D vector object
    • apply (the default order in rotate_euler corresponds to ROOT's choice)
  • ROOT.Math.Quaternion(u, i, j, k) like calling rotate_quaternion(u, i, j, k) on a 3D or 4D vector object
    • apply
  • ROOT.Math.RotationX like rotateX
    • apply
  • ROOT.Math.RotationY like rotateY
    • apply
  • ROOT.Math.RotationZ like rotateZ
    • apply (our rotateZ applies to 2D vectors as well as 3D vectors)
  • ROOT.Math.Transform3D like transform3D
    • apply (note that you have to make an object that supplies coordinates through __getitem__ as "xx", "xy", etc. A simple dict will do)
  • ROOT.Math.LorentzRotation is the 10-component symmetric part of a transform4D, which is a general 16-component matrix
    • apply
  • ROOT.Math.VectorUtil (a comparison sketch follows this list)
    • DeltaPhi
    • DeltaR2
    • DeltaR2RapidityPhi (we don't have an equivalent, but maybe we should)
    • DeltaR
    • DeltaRapidityPhi (we don't have an equivalent)
    • CosTheta (we don't have an equivalent, but it would be called deltacostheta)
    • Angle (we do have an equivalent, though it's named deltaangle; "delta" for everything that's a difference between two vectors)
    • ProjVector (we don't have an equivalent)
    • PerpVector (we don't have an equivalent)
    • Perp2 (we don't have an equivalent)
    • Perp (we don't have an equivalent)
    • InvariantMass (we don't have this, but maybe it would be good to compute mass without also computing the other coordinates, as our __add__ does)
    • InvariantMass2 (same deal)
    • The rest are all equivalent to methods that we do have.
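As a concrete example of the VectorUtil comparisons (a sketch I'm adding here, not code from the branch), DeltaPhi could be tested over the Cartesian product of the constructor list from the M2 example, assuming vector names the method deltaphi:

# Compare ROOT.Math.VectorUtil.DeltaPhi with vector's deltaphi for all pairs.
@pytest.mark.parametrize("v1", constructor)
@pytest.mark.parametrize("v2", constructor)
def test_DeltaPhi(v1, v2):
    root1 = ROOT.Math.PxPyPzEVector(*v1)
    root2 = ROOT.Math.PxPyPzEVector(*v2)
    vec1 = vector.obj(**dict(zip(["x", "y", "z", "t"], v1)))
    vec2 = vector.obj(**dict(zip(["x", "y", "z", "t"], v2)))
    # abs() sidesteps any difference in sign convention
    # (phi1 - phi2 versus phi2 - phi1) between the two libraries.
    assert abs(ROOT.Math.VectorUtil.DeltaPhi(root1, root2)) == pytest.approx(
        abs(vec1.deltaphi(vec2))
    )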

I think this note is done!

@jpivarski jpivarski marked this pull request as draft March 17, 2021 14:59
@henryiii (Member) commented Mar 18, 2021

Another powerful tool is a pytest fixture. You can write:

@pytest.fixture(params=constructor_list)
def constructor(request):
    return request.param

@pytest.fixture(params=coordinates_list)
def coordinates(request):
    return request.param

def test_M2(constructor, coordinates):
    ...

def test_other(constructor, coordinates):
    ...

Now, you can just use constructor and coordinates anywhere without the boilerplate of extra decorators, and even more importantly, you can move the fixture definitions to a conftest.py file; then every test file in the same directory or below will have access to this fixture! You can also make the fixture smarter, such as providing nice IDs for each parametrization, and all tests pick up the changes for free. This is just the surface of fixtures: you can control where they get constructed, you can do setup/cleanup, etc. This is how pytest's internal fixtures (capsys, tmp_path, etc.) work.
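For instance (my sketch, not from the comment above), a conftest.py with readable IDs might look like:

# conftest.py
import pytest

constructor_list = [
    (0, 0, 0, 0),
    (1, 2, 3, 10),
    # ... the rest of the cases
]

# ids turns each parameter tuple into a readable test ID like "xyzt=(1, 2, 3, 10)".
@pytest.fixture(params=constructor_list, ids=lambda p: f"xyzt={p}")
def constructor(request):
    return request.param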

@henryiii (Member) commented:

Also, this to me looks very much like property-based testing. This might be a good resource: https://www.youtube.com/watch?v=tiy031EoDXo. If static types have been added, you might even be able to get a head start writing the tests. :)

@jpivarski (Member, Author) commented:

Good suggestion! I had to watch the video to find out what "property-based testing" is, and it's a bit like that because we're using ROOT to generate the expected results. It's a bit more than that because there are some cases that have ambiguous answers—the result in some coordinate systems is NaN because there was a 0/0 or square root/logarithm of a negative number. In these cases, we get to choose what we want the answer to be, ideally to correspond to what it would be in another coordinate system, if a non-NaN value exists there.

What I recommended above is to take what ROOT returns as the definition of truth for the ambiguous cases. My thinking is that there might be some analysis logic that relies on a particular behavior at extreme points and we want that logic to be undisturbed. If ROOT's output were unambiguously wrong—a value that mathematically ought to be finite returned as NaN or the wrong finite value—then we shouldn't follow it; we should submit a bug report to ROOT. But if there is no "good" answer, there's no reason to rock the boat.

I was also recommending "partition testing" (another term I just learned) for choosing numerical cases. We can get 100% line coverage while leaving out some important cases, and you have to look at the ROOT code to know what those cases are. In the ROOT code, you see if statements checking for negative mass-squared and choosing a very odd return value of

mass_squared = copysign(mass*mass, mass)

so that we can carry information about spacelike vectors without complex numbers. Then when it comes to computing the energy, they do this:

energy_squared = maximum(mass_squared + momentum_squared, 0)

cutting off the backward-timelike part of the space. Those are clearly intentional choices; they effectively map the space of masses, which would ordinarily be positive-real and positive-imaginary but not in general complex, onto a space of positive and negative reals. Someone must have decided that that was "best," and the logic of analysis code might legitimately depend on it. So we preserve it.
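In NumPy terms, those conventions might be sketched like this (my paraphrase of the formulas quoted above, not a port of the actual ROOT code):

import numpy as np

def root_style_mass_squared(mass):
    # A negative mass encodes a spacelike vector: carry the sign through.
    return np.copysign(mass * mass, mass)

def root_style_mass(mass_squared):
    # Inverse mapping: negative mass-squared gives a negative real mass
    # rather than an imaginary one.
    return np.copysign(np.sqrt(np.absolute(mass_squared)), mass_squared)

def root_style_energy_squared(mass_squared, momentum_squared):
    # Backward-timelike vectors are clipped: energy-squared never goes below 0.
    return np.maximum(mass_squared + momentum_squared, 0)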

However, that means that there are qualitatively different partitions in the space of points to test. We only need one from each, but if there are two vectors going into a formula, then we need a Cartesian product of one from each. I think that could be done rather nicely with

@pytest.mark.parametrize("v1", constructor) 
@pytest.mark.parametrize("v2", constructor) 
def test_dot(v1, v2):
    ...
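Filled in, the body might look like this (my sketch, under the same assumptions as the M2 test above):

@pytest.mark.parametrize("v1", constructor)
@pytest.mark.parametrize("v2", constructor)
def test_dot(v1, v2):
    vec1 = vector.obj(**dict(zip(["x", "y", "z", "t"], v1)))
    vec2 = vector.obj(**dict(zip(["x", "y", "z", "t"], v2)))
    # Both libraries define Mass2/tau2 as the dot product of a vector with
    # itself, so the metric conventions should agree.
    assert ROOT.Math.PxPyPzEVector(*v1).Dot(
        ROOT.Math.PxPyPzEVector(*v2)
    ) == pytest.approx(vec1.dot(vec2))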

Your suggestion of promoting coordinates to a fixture is a good one: we might want to use it everywhere. The case is weaker for constructor, which probably should have been qualified as xyzt_constructor, since we would use a different set of values to pick points from the different partitions for other ROOT class constructors (PxPyPzE4D, PxPyPzM4D, PtEtaPhiE4D, PtEtaPhiM4D, etc.).

@henryiii (Member) commented Mar 18, 2021

I would generally use a fixture in all cases except when there is exactly one usage. It's about the same number of lines (fewer if you have several usages), and it factors the listing (which you have already done) out of the test. Unlike the mark, it also doesn't build up a lot of decorators if you have a Cartesian product. However, there's no harm in having fixtures in a single file that are not available to the other files.

Fixtures are also better when you want to actually do the construction or some other transform on the data before using it every time; you can return anything you'd like, not just a vanilla request.param.

xyzt_constructor is still a fine fixture. Maybe a local one, but still. :)

@henryiii (Member) commented:

I haven't used Hypothesis before, but I think it has extensive tooling for selecting from distributions with special cases. I'm reviewing a SciPy abstract for a talk on Hypothesis, which is why that's on my mind. :)

@ianna (Collaborator) commented Mar 19, 2021

@jpivarski and @henryiii - thanks!

@ianna (Collaborator) commented Mar 19, 2021

@henryiii - thanks for the tip: Hypothesis looks very nice indeed! I liked the idea of its ghostwriter so much that I tested it immediately :-)

The command line interface needs click and the writer needs black. It all integrates well with coverage. It looks like it supports numpy and pandas.

The generated code is neat, but some work has to be done on the strategy definitions. Here is an example:

# This test code was written by the `hypothesis.extra.ghostwriter` module
# and is provided under the Creative Commons Zero public domain dedication.

import vector
from hypothesis import given, strategies as st

# TODO: replace st.nothing() with appropriate strategies


@given(azimuthal=st.nothing())
def test_fuzz_MomentumObject2D(azimuthal):
    vector.MomentumObject2D(azimuthal=azimuthal)


@given(azimuthal=st.nothing(), longitudinal=st.nothing())
def test_fuzz_MomentumObject3D(azimuthal, longitudinal):
    vector.MomentumObject3D(azimuthal=azimuthal, longitudinal=longitudinal)


@given(azimuthal=st.nothing(), longitudinal=st.nothing(), temporal=st.nothing())
def test_fuzz_MomentumObject4D(azimuthal, longitudinal, temporal):
    vector.MomentumObject4D(
        azimuthal=azimuthal, longitudinal=longitudinal, temporal=temporal
    )


@given(azimuthal=st.nothing())
def test_fuzz_VectorObject2D(azimuthal):
    vector.VectorObject2D(azimuthal=azimuthal)


@given(azimuthal=st.nothing(), longitudinal=st.nothing())
def test_fuzz_VectorObject3D(azimuthal, longitudinal):
    vector.VectorObject3D(azimuthal=azimuthal, longitudinal=longitudinal)


@given(azimuthal=st.nothing(), longitudinal=st.nothing(), temporal=st.nothing())
def test_fuzz_VectorObject4D(azimuthal, longitudinal, temporal):
    vector.VectorObject4D(
        azimuthal=azimuthal, longitudinal=longitudinal, temporal=temporal
    )

@henryiii (Member) commented:

Are there type annotations? If there are no type annotations, it can't infer anything about the functions. In general, I would recommend adding type annotations as you go: it's a bit tricky to add them after the fact (it's easier to forget exactly what you support), it helps promote good static design, and it's usually easy to do while you write. Unlike a docstring, an annotation is tested, so it can't get "stale" if things change; mypy will start complaining immediately if you invalidate it.

But anyway, that's still a pretty nice start: you could replace the st.nothing() with floating-point strategies. You'd need to provide some information that would not be in the type system anyway, like >0, etc., on some of them. (Now if Python had a built-in contracts library...)
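For example (a sketch; the bounds and the rho2 check are my choices, not from the ghostwriter output):

import pytest
import vector
from hypothesis import given, strategies as st

# Bounded floats: no NaN or infinity, and magnitudes small enough that
# squaring doesn't overflow.
finite = st.floats(min_value=-1e6, max_value=1e6)

@given(x=finite, y=finite)
def test_fuzz_rho2(x, y):
    assert vector.obj(x=x, y=y).rho2 == pytest.approx(x**2 + y**2)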

@henryiii (Member) commented:

It should support NumPy; it's in the NumPy test suite. In fact, adding Hypothesis caught a NumPy bug. :)

@henryiii (Member) commented:

PS: You have now used Hypothesis more than I have. :) Though I see several really useful places to put it into histogram test suites. :)

@ianna (Collaborator) commented Mar 19, 2021

> Are there type annotations? If there are no type annotations, it can't infer anything about the functions. In general, I would recommend adding type annotations as you go: it's a bit tricky to add them after the fact (it's easier to forget exactly what you support), it helps promote good static design, and it's usually easy to do while you write. Unlike a docstring, an annotation is tested, so it can't get "stale" if things change; mypy will start complaining immediately if you invalidate it.
>
> But anyway, that's still a pretty nice start: you could replace the st.nothing() with floating-point strategies. You'd need to provide some information that would not be in the type system anyway, like >0, etc., on some of them. (Now if Python had a built-in contracts library...)

Yes, that's what I did. These generated tests cover 43% of src/vector/methods.py. It's a good start :-)

@ianna (Collaborator) commented Mar 19, 2021

> PS: You have now used Hypothesis more than I have. :) Though I see several really useful places to put it into histogram test suites. :)

I thought of AwkwardArray :-)

@jpivarski (Member, Author) commented:

> Are there type annotations?

Not yet, but I do plan to add them. I'm thinking it will be easier to add them in a burst, just as documentation is always easier to add in a burst than as-you-go, and for the same reason. During development, I don't know what (exactly) the types are going to be and what (exactly) the functions are going to do. Once things get stable enough that they're not changing all the time—the annealing temperature cools—then we can crystallize it with such constraints to prevent it from wandering from its intended behavior.

There are some parts of this that are going to require some careful thought. The compute functions, for instance, exist to be duck-typed, and they use a minimum of Python language features (PR #38) to ensure that all the backends we have in mind will work on them. By "minimum of Python language features," I'm including such rules as "no if statements" (which would be a problem for tracing JITs like JAX and TensorFlow). Doing the Numba implementation now, I almost had to restructure them again before I found a Numba work-around (couldn't pass lib as an argument where lib is a module, though I can enclose it as a closure to one JIT-compiled function and have it get passed from one JIT-compiled function to another—there are a lot of technical restrictions like this).

To break the vector class instances down into objects that a particular backend can accept in compute functions, it was necessary to handle their types at runtime with the dispatch_map in each compute submodule. This dispatch_map must be interpreted differently by each backend (Numba is using it to build lowered functions, whereas the object and NumPy backends simply wrap the results).

I'm hoping it will be possible to express these rules in the static annotations by annotating the methods on the class objects and telling the type-checker to assume that the implementation provides such a type correctly. Viewed from outside, the classes have well-behaved, static types, but their implementations have to do dynamic typing to ensure that they can use functions that are shared across backends. (This "sharing across backends" requirement is important, since there are a lot of them.) We can make those type assertions strongly because the code provides the right type by construction, rather than by verification. I just hope that we can express that assurance in the static type language. (That is, "don't check the internals of this function, but trust me, it returns an XYZ. Do all external type-checking as normal.")

@ianna (Collaborator) commented Mar 22, 2021

@jpivarski - Here is a link to a modified test that uses both fixtures and Hypothesis. I think Hypothesis can define the OverflowError and NaN checks as its strategy; these are corner cases in its tests. Also, the coordinates fixture was reset between function calls but not between test cases generated by @given(...), so I've changed it to a module-scoped fixture to make Hypothesis happy.

Hypothesis does find a few distinct failures; the cases are added to the constructor list.
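For reference, the scoping change is just this (a sketch; coordinates_list as in the fixtures above):

# Function-scoped fixtures are reset between Hypothesis-generated examples,
# which Hypothesis flags as a health-check error; module scope avoids that.
@pytest.fixture(params=coordinates_list, scope="module")
def coordinates(request):
    return request.param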

@jpivarski (Member, Author) commented:

I wrote a long response about this and then my computer crashed. I'll write the gist of it again.

I understand the motivation for Hypothesis, but I don't think it suits our goal. The overflow cases that Hypothesis found depend either on ROOT's limits (which we aren't obliged to reproduce!) or on details in the choices of formulas in the vector.compute module that don't have physical meaning. For example, we could make different trig substitutions in these formulas that change the performance of computation and the limit of numerical precision, but the new formulas should be considered equally "correct."

In different contexts from ours, overflow errors can be disastrous, from segfaults to exploitable security vulnerabilities. I understand why Hypothesis is checking for those extremes and wants to clamp down on a particular behavior to ensure that the behavior doesn't change without notice. But that's a different context from our case, in which we expect numerical precision to be inexact and to break down for large numbers.

On the other hand, an automated search is not going to find the boundaries that do matter for us—the points where an if statement in ROOT or np.absolute, np.maximum, etc. in our code discontinuously changes behavior. I gave the example of ROOT's handling of square roots of negative mass squared. One "correct" interpretation would be to say that it should be NaN, since the square root of a negative number is not defined on the reals. ROOT's choice is to say mass = -sqrt(abs(mass_squared)) if mass_squared is negative but bigger than -momentum_squared and to say mass = 0 for the third case. Thus, there are three distinct formulas involved (counting mass = sqrt(mass_squared) and mass = -sqrt(abs(mass_squared)) as separate formulas) and we absolutely have to include at least one point from each of these regions in a test. However, an automated search wouldn't find that there are turning points or where these turning points are: it is necessary to do that by eye.

As I see it, these tests need to satisfy two goals:

  1. We need to find errors in our formulas. We do this partly by round-tripping through coordinate systems (hoping there aren't exactly cancelling errors in the coordinate transformations!), and partly by comparing with ROOT (because it has been tested by thousands of physicists). Any significant disagreement (i.e. beyond pytest.approx's tolerance) for non-extreme values should either result in a fix to our code or a bug-report to ROOT, after being carefully examined. If there are multiple formulas that take effect in different intervals of inputs, then we must test each.
  2. When there is no "correct" answer but ROOT returns a non-NaN value anyway, we should reproduce ROOT's result because analyses can legitimately depend on it. The negative mass-squared example I gave above is a great example of that. Note that this is one-sided: if ROOT returns a NaN value, we can decide what the answer should be. (The simplest would be to follow ROOT's lead and also make it NaN, but that's a choice.)

But performance (which Hypothesis is not checking) and numerical limits (which it is) are not supposed to be the same between Vector and ROOT, or even among Vector's backends. The VectorObject* classes use Python objects and VectorNumpy* use NumPy arrays, which have very different behaviors for integers—Python integers have arbitrary precision; NumPy integers are fixed-width. That ceases to be true for VectorObject* when they pass through Numba, which replaces their integers with fixed-width equivalents. If the tests are opinionated about these differences, we'll get a lot of false positives about irrelevant changes.

I usually think of tests as "pins," holding cloth in place so that it doesn't move around while you're trying to work on it. If you pin something that is supposed to move, that can be just as bad as not pinning something that isn't. Here, what matters is the approximate outputs of points in each qualitatively different domain of inputs, for non-extreme input values.

@henryiii (Member) commented:

What I've been doing (and would recommend) is setting reasonable limits for Hypothesis (since numerical differences for extreme numbers aren't interesting, and I tend to avoid generating NaN or infinity) and always using pytest.approx (which I recently found works perfectly on numpy arrays, which is fantastic). You can set nan_ok on pytest.approx if you have some NaNs to compare, or you can just avoid them. I find it's helpful to know where you differ from the reference: even if that difference is expected, it's good to know it exists (and exclude it).
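For instance (a sketch of both features):

import numpy as np
import pytest

# pytest.approx compares NumPy arrays elementwise; nan_ok=True makes NaN
# match NaN instead of failing the comparison.
expected = np.array([1.0, np.nan])
assert np.array([1.0 + 1e-9, np.nan]) == pytest.approx(expected, nan_ok=True)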

@henryiii (Member) commented:

I would generally recommend getting into the habit of adding simple annotations when they're obvious. If you write a function that returns a bool, you should add -> bool when you write it. If you refactor it so that it no longer returns a bool, you can change the type annotation (and mypy will insist on it). For complex types, such as things you'll have to develop Protocols for, Generic types, etc., it's possibly best to wait and add them later, though I find it helps you think about the design. If you are already doing this, why not write down the types while you are thinking about them, instead of later?

As to typing of simple functions: in general, the more explicit the types, the better, and internal types help keep the codebase consistent, so ideally the types for the simple functions could be protocol(s). Also, at runtime, types are just stored in an extra variable, so they don't get in the way of any usages. You definitely do have to require something of the arguments inside the functions, so that could be the contents of the Protocol. But if needed, you can define them as Any, and # type: ignore or cast them inside classes when using them. Even Python's own standard library (in typeshed) does this occasionally, so it's quite valid. The most important thing for users is to have all external functions nicely typed.
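A minimal sketch of the Protocol idea (the names are illustrative, not vector's actual internals):

from typing import Protocol

class AzimuthalXY(Protocol):
    # A compute function may rely only on what the Protocol declares.
    x: float
    y: float

def rho2(vec: AzimuthalXY) -> float:
    return vec.x**2 + vec.y**2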

@ianna (Collaborator) commented May 10, 2021

@jpivarski and @henryiii - would it be possible to grant me access to this repo? Thanks!

git push origin ianna/compare-with-ROOT
remote: Permission to scikit-hep/vector.git denied to ianna.

@jpivarski (Member, Author) commented:

I've given you access. (You should be getting an email link.)

@ianna ianna changed the base branch from develop to main December 21, 2021 15:07