Distinguishing annotated types from non-annotated ones in serialization_strategy #116

Merged
merged 6 commits into from
Jun 18, 2023
61 changes: 60 additions & 1 deletion README.md
@@ -74,6 +74,7 @@ Table of contents
* [Subclasses without a common field](#subclasses-without-a-common-field)
* [Class level discriminator](#class-level-discriminator)
* [Working with union of classes](#working-with-union-of-classes)
* [Extending existing types](#extending-existing-types)
* [Code generation options](#code-generation-options)
* [Add `omit_none` keyword argument](#add-omit_none-keyword-argument)
* [Add `by_alias` keyword argument](#add-by_alias-keyword-argument)
@@ -1127,6 +1128,10 @@ dictionary = instance.to_dict()
# {'x': '2021', 'y': '2021-01-01'}
```

Note that you can register different methods for multiple logical types that
are based on the same type by using `NewType` and `Annotated`.
See [Extending existing types](#extending-existing-types) for details.

#### `aliases` config option

Sometimes it's better to write the field aliases in one place. You can mix
@@ -1800,7 +1805,7 @@ of all classes and an attempt to deserialize each of them.
Usually this approach can be used when you have multiple classes without a
common superclass or when you only need to deserialize some of the subclasses.
In the following example we will use `include_supertypes=True` to
deserialize 2 subclasses out of 3:
deserialize two subclasses out of three:

```python
from dataclasses import dataclass
@@ -1897,6 +1902,60 @@ assert plate == Plate(
)
```

### Extending existing types

There are situations where you might want some values of the same type to be
treated as their own type. You can create new logical types with
[`NewType`](https://docs.python.org/3/library/typing.html#newtype) or
[`Annotated`](https://docs.python.org/3/library/typing.html#typing.Annotated)
and register serialization strategies for them:

```python
from typing import Mapping, NewType, Annotated
from dataclasses import dataclass
from mashumaro import DataClassDictMixin

SessionID = NewType("SessionID", str)
AccountID = Annotated[str, "AccountID"]

@dataclass
class Context(DataClassDictMixin):
    account_sessions: Mapping[AccountID, SessionID]

    class Config:
        serialization_strategy = {
            AccountID: {
                "deserialize": lambda x: ...,
                "serialize": lambda x: ...,
            },
            SessionID: {
                "deserialize": lambda x: ...,
                "serialize": lambda x: ...,
            }
        }
```
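
As a rough illustration of why the two aliases can have separate entries, the
stdlib-only sketch below (no mashumaro involved) shows that
`Annotated[str, "AccountID"]` and a `NewType` are dictionary keys distinct from
plain `str`, and that the `Annotated` metadata is only visible when type hints
are resolved with `include_extras=True`. The `strategies` mapping, its string
values, and the `Example` class are hypothetical stand-ins, not part of the
library:

```python
from dataclasses import dataclass
from typing import Annotated, NewType, get_type_hints

SessionID = NewType("SessionID", str)
AccountID = Annotated[str, "AccountID"]

# Hypothetical registry standing in for a serialization_strategy mapping.
strategies = {
    str: "plain",
    AccountID: "account",
    SessionID: "session",
}


@dataclass
class Example:
    account: AccountID
    session: SessionID
    note: str


# Without include_extras the Annotated wrapper is stripped down to str.
plain_hints = get_type_hints(Example)
# With include_extras=True the Annotated alias survives and can be looked up.
full_hints = get_type_hints(Example, include_extras=True)

print(strategies[plain_hints["account"]])  # looked up by str
print(strategies[full_hints["account"]])   # looked up by the Annotated alias
print(strategies[full_hints["session"]])   # NewType is kept in both cases
```

The point of the sketch is only that the annotated form has to be preserved
during type resolution for a per-annotation strategy to be found.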

Although using `NewType` is usually the most reliable way to avoid logical
errors, it comes with noticeable overhead. If you create dataclass instances
manually, type checkers will force you to wrap each value in a call to your
`NewType` callable, which degrades performance:

```shell
python -m timeit -s "from typing import NewType; MyInt = NewType('MyInt', int)" "MyInt(42)"
10000000 loops, best of 5: 31.1 nsec per loop

python -m timeit -s "from typing import NewType; MyInt = NewType('MyInt', int)" "42"
50000000 loops, best of 5: 4.35 nsec per loop
```

However, when you instantiate dataclasses using the `from_*` methods, there is
no performance degradation, because the generated code does not wrap the value
in the callable. Therefore, if performance matters more to you than having type
checkers catch logical errors when you create or modify dataclass instances
manually, take a closer look at `Annotated`.
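
A minimal sketch of the runtime difference described above, using only the
standard library: calling a `NewType` is a real function call that returns its
argument unchanged, whereas an `Annotated` alias needs no call at all. The
variable names are illustrative:

```python
from typing import Annotated, NewType

SessionID = NewType("SessionID", str)
AccountID = Annotated[str, "AccountID"]

raw = "abc123"

# Type checkers require this call; at runtime it just returns `raw`.
session = SessionID(raw)

# An Annotated alias is satisfied by the plain value, so nothing is called.
account: AccountID = raw

print(session is raw)   # the very same object comes back
print(type(session))    # still a plain str at runtime
```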

### Code generation options

#### Add `omit_none` keyword argument
6 changes: 1 addition & 5 deletions mashumaro/core/meta/code/builder.py
@@ -159,10 +159,6 @@ def get_field_resolved_type_params(
        cls = self._get_field_class(field_name)
        return self.resolved_type_params[cls]

    @property
    def field_types(self) -> typing.Dict[str, typing.Any]:
        return self.__get_field_types()

    def get_field_types(
        self, include_extras: bool = False
    ) -> typing.Dict[str, typing.Any]:
@@ -697,7 +693,7 @@ def get_pack_method_name(
    def _add_pack_method_lines(self, method_name: str) -> None:
        config = self.get_config()
        try:
            field_types = self.field_types
            field_types = self.get_field_types(include_extras=True)
        except UnresolvedTypeReferenceError:
            if (
                not self.allow_postponed_evaluation
5 changes: 4 additions & 1 deletion mashumaro/core/meta/types/pack.py
@@ -107,7 +107,10 @@ def get_overridden_serialization_method(
    serialize_option = spec.field_ctx.metadata.get("serialize")
    if serialize_option is not None:
        return serialize_option
    for typ in (spec.type, spec.origin_type):
    checking_types = [spec.type, spec.origin_type]
    if spec.annotated_type:
        checking_types.insert(0, spec.annotated_type)
    for typ in checking_types:
        for strategy in spec.builder.iter_serialization_strategies(
            spec.field_ctx.metadata, typ
        ):
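
The lookup order introduced in this hunk can be sketched in isolation as
follows. This is a simplified stand-in, not the library's code:
`pick_serializer` and the flat `strategies` dict are hypothetical, and the real
implementation iterates strategies through the builder rather than doing a
plain dict lookup.

```python
from datetime import date
from typing import Annotated


def pick_serializer(strategies, annotated_type, plain_type, origin_type):
    """Return the first strategy found, trying the annotated type first."""
    candidates = [plain_type, origin_type]
    if annotated_type is not None:
        # The annotated form takes priority over the bare type.
        candidates.insert(0, annotated_type)
    for typ in candidates:
        if typ in strategies:
            return strategies[typ]
    return None


strategies = {
    Annotated[date, "foo"]: date.toordinal,
    date: date.isoformat,
}

d = date(2023, 6, 12)

# A registered annotation wins over the plain `date` entry.
foo = pick_serializer(strategies, Annotated[date, "foo"], date, date)
# An unregistered annotation falls back to the plain `date` entry.
baz = pick_serializer(strategies, Annotated[date, "baz"], date, date)

print(foo(d))
print(baz(d))
```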
47 changes: 46 additions & 1 deletion tests/test_annotated.py
@@ -1,9 +1,10 @@
from dataclasses import dataclass
from datetime import date
from datetime import date, datetime

from typing_extensions import Annotated

from mashumaro import DataClassDictMixin
from mashumaro.config import BaseConfig


def test_annotated():
@@ -14,3 +15,47 @@ class DataClass(DataClassDictMixin):
    obj = DataClass(date(2022, 2, 6))
    assert DataClass.from_dict({"x": "2022-02-06"}) == obj
    assert obj.to_dict() == {"x": "2022-02-06"}


def test_annotated_with_overridden_methods():
    @dataclass
    class DataClass(DataClassDictMixin):
        foo: Annotated[date, "foo"]
        bar: Annotated[date, "bar"]
        baz: Annotated[date, "baz"]

        class Config(BaseConfig):
            serialization_strategy = {
                Annotated[date, "foo"]: {
                    "serialize": date.toordinal,
                    "deserialize": date.fromordinal,
                },
                Annotated[date, "bar"]: {
                    "serialize": date.isoformat,
                    "deserialize": date.fromisoformat,
                },
                date: {
                    "serialize": lambda x: x.strftime("%Y%m%d"),
                    "deserialize": (
                        lambda x: datetime.strptime(x, "%Y%m%d").date()
                    ),
                },
            }

    obj = DataClass(
        foo=date(2023, 6, 12),
        bar=date(2023, 6, 12),
        baz=date(2023, 6, 12),
    )
    obj.foo.strftime("%Y%M%D")
    assert (
        DataClass.from_dict(
            {"foo": 738683, "bar": "2023-06-12", "baz": "20230612"}
        )
        == obj
    )
    assert obj.to_dict() == {
        "foo": 738683,
        "bar": "2023-06-12",
        "baz": "20230612",
    }
6 changes: 3 additions & 3 deletions tests/test_jsonschema/test_jsonschema_generation.py
@@ -895,7 +895,7 @@ def test_jsonschema_for_unsupported_type():
build_json_schema(object)


def test_overriden_serialization_method_without_signature():
def test_overridden_serialization_method_without_signature():
@dataclass
class DataClass:
x: datetime.datetime
@@ -921,7 +921,7 @@ class Config(BaseConfig):
)


def test_overriden_serialization_method_without_return_annotation():
def test_overridden_serialization_method_without_return_annotation():
def as_timestamp(dt: datetime.datetime): # pragma no cover
return dt.timestamp()

@@ -939,7 +939,7 @@ class Config(BaseConfig):
assert build_json_schema(DataClass).properties["y"] == EmptyJSONSchema()


def test_overriden_serialization_method_with_return_annotation():
def test_overridden_serialization_method_with_return_annotation():
def as_timestamp(dt: datetime.datetime) -> float:
return dt.timestamp() # pragma no cover
