Commit

Add documentation for Python.
ielis committed Apr 30, 2024
1 parent 2e3540a commit e30ef0e
Showing 3 changed files with 143 additions and 0 deletions.
140 changes: 140 additions & 0 deletions docs/python.rst
@@ -0,0 +1,140 @@
.. _rstpython:

###################################
Working with Phenopackets in Python
###################################

As with :ref:`Java <rstjava>`, the :ref:`Phenopacket Schema <rstschema>` can be considered the source of truth
for the specification, and the JSON produced by any implementation can be used to interoperate
with other services. Nevertheless, we **strongly** recommend using the `phenopackets` library available
from the Python Package Index (PyPI), or the Python bindings generated by the Protobuf compiler from the Protobuf files.

Here we provide a brief overview of the `phenopackets` library.


Install `phenopackets` into your Python environment
***************************************************

The `phenopackets` package can be installed from PyPI by running:

.. code-block:: shell

  python3 -m pip install phenopackets

This uses `pip` to install `phenopackets` together with its required dependencies.


Create building blocks programmatically
***************************************

Let's start by importing all building blocks of Phenopacket Schema v2:

>>> import phenopackets.schema.v2 as pps2

Now we can access all building blocks of Phenopacket Schema v2 via the `pps2` alias.

For instance, we can create an :ref:`Ontology class <rstontologyclass>` that corresponds to a Human Phenotype Ontology
term for *Spherocytosis* (`HP:0004444`):

>>> spherocytosis = pps2.OntologyClass(id='HP:0004444', label='Spherocytosis')
>>> spherocytosis # doctest: +NORMALIZE_WHITESPACE
id: "HP:0004444"
label: "Spherocytosis"

All schema building blocks, including `OntologyClass`, are available under the `pps2` alias and can be created with constructors that accept keyword arguments.
The constructors reject attributes that are not defined in the schema:

>>> pps2.OntologyClass(foo='bar')
Traceback (most recent call last):
...
ValueError: Protocol message OntologyClass has no "foo" field.

We do not have to provide all attributes at creation time. We can also set the fields one by one
using Python property syntax to achieve the same outcome:

>>> spherocytosis2 = pps2.OntologyClass()
>>> spherocytosis2.id = 'HP:0004444'
>>> spherocytosis2.label = 'Spherocytosis'
>>> spherocytosis == spherocytosis2
True

However, setting field values with property syntax only works for
`singular <https://protobuf.dev/reference/python/python-generated/#singular-fields-proto3>`_ scalar (non-message) fields,
such as `bool`, `int`, `str`, or `float`. Assignment does *NOT* work for message fields:

>>> pf = pps2.PhenotypicFeature()
>>> pf.type = spherocytosis
Traceback (most recent call last):
...
AttributeError: Assignment not allowed to field "type" in protocol message object.

To set a message field from another message, we use the `CopyFrom` method:

>>> pf.type.CopyFrom(spherocytosis)
>>> pf # doctest: +NORMALIZE_WHITESPACE
type {
  id: "HP:0004444"
  label: "Spherocytosis"
}

Finally, a repeated field can be populated using list-like semantics:

>>> modifiers = (
...     pps2.OntologyClass(id='HP:0003623', label='Neonatal onset'),
...     pps2.OntologyClass(id='HP:0011010', label='Chronic'),
... )
>>> pf.modifiers.extend(modifiers)
>>> pf # doctest: +NORMALIZE_WHITESPACE
type {
  id: "HP:0004444"
  label: "Spherocytosis"
}
modifiers {
  id: "HP:0003623"
  label: "Neonatal onset"
}
modifiers {
  id: "HP:0011010"
  label: "Chronic"
}

See the `Protobuf documentation <https://protobuf.dev/reference/python/python-generated/#repeated-fields>`_
for more information.


Building blocks I/O
*******************

Given an instance populated with data, we can serialize its content into Protobuf's binary wire format:

>>> binary_str = pf.SerializeToString()
>>> binary_str
b'\x12\x1b\n\nHP:0004444\x12\rSpherocytosis*\x1c\n\nHP:0003623\x12\x0eNeonatal onset*\x15\n\nHP:0011010\x12\x07Chronic'

and get the same content back:

>>> pf2 = pps2.PhenotypicFeature()
>>> _ = pf2.ParseFromString(binary_str)
>>> pf == pf2
True

We can also dump the content of a building block to a *JSON* string or to a `dict` of Python objects using the
`MessageToJson <https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html#google.protobuf.json_format.MessageToJson>`_
or `MessageToDict <https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html#google.protobuf.json_format.MessageToDict>`_
functions, respectively:

>>> from google.protobuf.json_format import MessageToDict
>>> json_dict = MessageToDict(pf)
>>> json_dict
{'type': {'id': 'HP:0004444', 'label': 'Spherocytosis'}, 'modifiers': [{'id': 'HP:0003623', 'label': 'Neonatal onset'}, {'id': 'HP:0011010', 'label': 'Chronic'}]}

We complete the JSON round trip using the
`Parse <https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html#google.protobuf.json_format.Parse>`_
or `ParseDict <https://googleapis.dev/python/protobuf/latest/google/protobuf/json_format.html#google.protobuf.json_format.ParseDict>`_
functions:

>>> from google.protobuf.json_format import ParseDict
>>> pf2 = ParseDict(json_dict, pps2.PhenotypicFeature())
>>> pf == pf2
True

1 change: 1 addition & 0 deletions docs/working.rst
@@ -20,6 +20,7 @@ produced as part of the build (:ref:`rstjavabuild`).
:maxdepth: 1

Working with Phenopackets in Java <java>
Working with Phenopackets in Python <python>
Working with Phenopackets in C++ <cpp>

Security disclaimer
2 changes: 2 additions & 0 deletions python/pyproject.toml
@@ -53,4 +53,6 @@ package-dir = { "" = "src" }
[tool.pytest.ini_options]
testpaths = [
    "tests",
    "../docs",
]
addopts = "--doctest-modules --doctest-glob=\"*.rst\""
