Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes in mutable fields not correctly detected #27

Open
oTree-org opened this issue Sep 5, 2017 · 1 comment
Open

Changes in mutable fields not correctly detected #27

oTree-org opened this issue Sep 5, 2017 · 1 comment

Comments

@oTree-org
Copy link

If a field is mutable, SaveTheChange stores a deepcopy of it in _mutable_fields, and then when you save the object, compares the deepcopy with the current value:

[name for name, value in six.iteritems(instance._mutable_fields) if hasattr(instance, name) and getattr(instance, name) != value]

I see a couple of problems with this technique.

Problem 1
The first is that the it results in false positives:

>>> a = object()
>>> a == a
True
>>> a == copy.deepcopy(a)
False

I tested setting a field to object() and SaveTheChange updated that field even when I just accessed it without modifying it.

Problem 2
Using != doesn't work properly on all objects. For example, for NumPy arrays, a == b doesn't evaluate to a boolean, it evaluates to an array. So, SaveTheChange crashes if you try to store a a NumPy array on a field:

..\venv\lib\site-packages\otree_save_the_change\decorators.py:43: in save
    continue_saving, args, kwargs = save_hook(self, *args, **kwargs)
..\venv\lib\site-packages\otree_save_the_change\decorators.py:121: in _save_the_change_save_hook
    [name for name, value in six.iteritems(instance._mutable_fields) if hasattr(instance, name) and
getattr(instance, name) != value]
..\venv\lib\site-packages\otree_save_the_change\decorators.py:121: in <listcomp>
    [name for name, value in six.iteritems(instance._mutable_fields) if hasattr(instance, name) and
getattr(instance, name) != value]
E   ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or
a.all()

In this case, getattr(instance, name) != value evaluates to an array, and then the and tries to convert it to a boolean, which raises the ValueError.

Suggested solution
What if instead of deepcopy, we fill _mutable_fields with the output of .get_prep_value? And then in .save(), we compare that with .get_prep_value on the new value of the field? Because .get_prep_value is what determines what will actually go in the database anyway, right?

I think we would only need to call .get_prep_value at most twice for each mutable field: when it is first added to _mutable_fields, and in _save.

@karanlyons
Copy link
Owner

karanlyons commented Sep 5, 2017

Yeah, mutable fields are a nightmare. a == b not evaluating to some truthy/falsey value feels a bit insane to me, but if there are model fields out in the wild expecting this behavior then I guess we need to support them.

Problem 1 is because object’s standard equality check is just (I believe) a reference check. Most proper primitives that map to fields should (and do) implement cmp/eq/ne/etc. correctly.

Problem 2, though, is gonna bug me. The idea of get_prep_value would probably work, but I’m not sure what sort of performance hit we’re taking over the current approach. We’d be then making that call on the first time an attribute is accessed and then potentially for every time it is set, and a final time at save(). This may not actually be much worse than what is already going on, though.

Where it gets potentially tricky is that get_prep_value isn’t guaranteed to be pure: It should be pure, but then again a should == a, so we’re potentially just closing one door and opening another.

One solution I may float is trying to support certain object comparisons differently. In the case of numpy it looks like __eq__ returns another array compared element-wise. Coercing that to a bool doesn’t work, which means we’d at the very least need to special case numpy arrays to use (a == b).all()). But now we’re adding a set of lookups to lambdas for the actual comparison work, which may be worse than any other solution already presented.

I’m actually a bit annoyed with how numpy is doing this. == or a the very least bool(foo) should be True or False.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants