(feat): io submodule #1682

Open
wants to merge 40 commits into base: main
Changes from 28 commits
40 commits
c5be996
(chore): create `io` submodule
ilan-gold Sep 19, 2024
a665f24
(fix): import submodule last
ilan-gold Sep 19, 2024
23655f8
(fix): delete erroneous file
ilan-gold Sep 19, 2024
59d2856
(refactor): make module redirect a resuable function
ilan-gold Sep 19, 2024
f3e0615
(feat): add redirect to top-level `__init__` for `io`
ilan-gold Sep 19, 2024
481d9aa
(fix): remove the actual imports
ilan-gold Sep 19, 2024
2575657
(chore): backwards compat layer
ilan-gold Sep 20, 2024
f36f786
(chore): docs
ilan-gold Sep 20, 2024
0f1380e
(chore): still export `read_zarr` and `read_h5ad` from main
ilan-gold Sep 20, 2024
d5fad3f
(fix): module name in error
ilan-gold Sep 20, 2024
4216164
(chore): fix all misplaced usages
ilan-gold Sep 20, 2024
cedc598
(fix): module error
ilan-gold Sep 20, 2024
fc25d7b
(fix): more old usgaes
ilan-gold Sep 20, 2024
da8762a
(chore): suppress scanpy warnings
ilan-gold Sep 20, 2024
3a04859
(fix): mass fix of imports
ilan-gold Sep 20, 2024
b9c5fbe
(chore): more depreated usages
ilan-gold Sep 20, 2024
d2e58ac
(fix): `experimental` redirect
ilan-gold Sep 20, 2024
a3a87f4
(fix): f-string fix
ilan-gold Sep 20, 2024
1174453
(fix): need to use `warnings` not `pytest.warns` bc import caching?
ilan-gold Sep 20, 2024
eb01d5f
(fix): docs
ilan-gold Sep 20, 2024
315d1dc
(fix): `test_readwrite` warning catches
ilan-gold Sep 20, 2024
8552f21
(chore): don't warn for `_io` import
ilan-gold Sep 20, 2024
912cc29
(chore): doc fixes
ilan-gold Sep 20, 2024
54b0ae1
(fix): compat for benchmark
ilan-gold Sep 20, 2024
7a8212d
(fix): more benchmarks fixes
ilan-gold Sep 20, 2024
4d1e3f1
(fix): add back `read` for now - will do on next release
ilan-gold Sep 20, 2024
b2f3f7c
(chore): remove erroneous diffs
ilan-gold Sep 20, 2024
7ac02c1
(chore): remove unnecessary circular import handling
ilan-gold Sep 20, 2024
c6f5e2b
(chore): release note
ilan-gold Sep 20, 2024
e796799
(fix): need to allow `_io.xyz.xyz` because so many use it...
ilan-gold Sep 20, 2024
0a8422c
(fix): `read_elem_as_dask` import
ilan-gold Sep 20, 2024
2561edf
(chore): add note on the public API
ilan-gold Sep 20, 2024
9e882b4
(chore): add back a deprecation warning
ilan-gold Sep 20, 2024
d672422
(fix): docs
ilan-gold Sep 20, 2024
2c32b65
(fix): typo
ilan-gold Sep 20, 2024
d56a87e
simpler
flying-sheep Sep 20, 2024
0fe6836
fix tests
flying-sheep Sep 20, 2024
50783ff
(fix): better import logic
ilan-gold Sep 20, 2024
247f6a3
Merge branch 'ig/io_module' of github.com:scverse/anndata into ig/io_…
ilan-gold Sep 20, 2024
a947d3e
simplify io module
flying-sheep Sep 20, 2024
24 changes: 12 additions & 12 deletions docs/api.md
@@ -33,8 +33,8 @@ Reading anndata’s native formats `.h5ad` and `zarr`.
.. autosummary::
:toctree: generated/

-read_h5ad
-read_zarr
+io.read_h5ad
+io.read_zarr
```

Reading individual portions ({attr}`~AnnData.obs`, {attr}`~AnnData.varm` etc.) of the {class}`AnnData` object.
@@ -43,8 +43,8 @@ Reading individual portions ({attr}`~AnnData.obs`, {attr}`~AnnData.varm` etc.) o
.. autosummary::
:toctree: generated/

-read_elem
-sparse_dataset
+io.read_elem
+io.sparse_dataset
```

Reading file formats that cannot represent all aspects of {class}`AnnData` objects.
@@ -57,13 +57,13 @@ You might have more success by assembling the {class}`AnnData` object yourself f
.. autosummary::
:toctree: generated/

-read_csv
-read_excel
-read_hdf
-read_loom
-read_mtx
-read_text
-read_umi_tools
+io.read_csv
+io.read_excel
+io.read_hdf
+io.read_loom
+io.read_mtx
+io.read_text
+io.read_umi_tools
```

## Writing
@@ -84,7 +84,7 @@ Writing individual portions ({attr}`~AnnData.obs`, {attr}`~AnnData.varm` etc.) o
.. autosummary::
:toctree: generated/

-write_elem
+io.write_elem
```

Writing formats that cannot represent all aspects of {class}`AnnData` objects.
2 changes: 1 addition & 1 deletion docs/benchmark-read-write.ipynb
@@ -159,7 +159,7 @@
],
"source": [
"%%time\n",
-"adata = ad.read_loom(\"test.loom\")"
+"adata = ad.io.read_loom(\"test.loom\")"
]
}
],
2 changes: 1 addition & 1 deletion docs/fileformat-prose.md
@@ -635,7 +635,7 @@ function:

```python
>>> import awkward as ak
->>> from anndata import read_elem
+>>> from anndata.io import read_elem
>>> awkward_group = store["varm/transcript"]
>>> ak.from_buffers(
... awkward_group.attrs["form"],
4 changes: 2 additions & 2 deletions docs/release-notes/0.10.0.md
@@ -13,7 +13,7 @@

* Concatenate on-disk anndata objects with {func}`anndata.experimental.concat_on_disk` {pr}`955` {user}`selmanozleyen`
* AnnData can now hold dask arrays with `scipy.sparse.spmatrix` chunks {pr}`1114` {user}`ivirshup`
-* Public API for interacting with on disk sparse arrays: {func}`~anndata.sparse_dataset`, {class}`~anndata.abc.CSRDataset`, and {class}`~anndata.abc.CSCDataset` {pr}`765` {user}`ilan-gold` {user}`ivirshup`
+* Public API for interacting with on disk sparse arrays: {func}`~anndata.io.sparse_dataset`, {class}`~anndata.abc.CSRDataset`, and {class}`~anndata.abc.CSCDataset` {pr}`765` {user}`ilan-gold` {user}`ivirshup`
* Improved performance for simple slices of OOC sparse arrays {pr}`1131` {user}`ivirshup`

**Improved errors and warnings**
@@ -37,7 +37,7 @@

#### Deprecations

-* Deprecate `anndata.read`, which was just an alias for {func}`anndata.read_h5ad` {pr}`1108` {user}`ivirshup`.
+* Deprecate `anndata.read`, which was just an alias for {func}`anndata.io.read_h5ad` {pr}`1108` {user}`ivirshup`.
* `dtype` argument to `AnnData` constructor is now deprecated {pr}`1153` {user}`ivirshup`

#### Bug fixes
2 changes: 1 addition & 1 deletion docs/release-notes/0.10.2.md
@@ -4,7 +4,7 @@
#### Bug fixes

* Added compatibility layer for packages relying on `anndata._core.sparse_dataset.SparseDataset`.
-Note that this API is *deprecated* and new code should use {class}`~anndata.abc.CSRDataset`, {class}`~anndata.abc.CSCDataset`, and {func}`~anndata.sparse_dataset` instead.
+Note that this API is *deprecated* and new code should use {class}`~anndata.abc.CSRDataset`, {class}`~anndata.abc.CSCDataset`, and {func}`~anndata.io.sparse_dataset` instead.
{pr}`1185` {user}`ivirshup`
* Handle deprecation warning from `pd.Categorical.map` thrown during `anndata.concat` {pr}`1189` {user}`flying-sheep` {user}`ivirshup`
* Fixed extra steps being included in IO tracebacks {pr}`1193` {user}`flying-sheep`
4 changes: 2 additions & 2 deletions docs/release-notes/0.10.8.md
@@ -5,8 +5,8 @@

* Write out `64bit` indptr when appropriate for {func}`~anndata.experimental.concat_on_disk` {pr}`1493` {user}`ilan-gold`
* Support for Numpy 2 {pr}`1499` {user}`flying-sheep`
-* Fix {func}`~anndata.sparse_dataset` docstring test on account of new {mod}`scipy` version {pr}`1514` {user}`ilan-gold`
+* Fix {func}`~anndata.io.sparse_dataset` docstring test on account of new {mod}`scipy` version {pr}`1514` {user}`ilan-gold`

#### Documentation

-* Improved example for {func}`~anndata.sparse_dataset` {pr}`1468` {user}`ivirshup`
+* Improved example for {func}`~anndata.io.sparse_dataset` {pr}`1468` {user}`ivirshup`
6 changes: 3 additions & 3 deletions docs/release-notes/0.11.0rc1.md
@@ -4,7 +4,7 @@
### Breaking changes

- Removed deprecated modules `anndata.core` and `anndata.readwrite` {user}`ivirshup` ({pr}`1197`)
-- No longer export `sparse_dataset` from `anndata.experimental`, instead exporting {func}`anndata.sparse_dataset` {user}`ilan-gold` ({pr}`1642`)
+- No longer export `sparse_dataset` from `anndata.experimental`, instead exporting {func}`anndata.io.sparse_dataset` {user}`ilan-gold` ({pr}`1642`)
- Move `RWAble` and `InMemoryElem` out of `experimental`, renaming `RWAble` to {type}`~anndata.typing.AxisStorable` and `InMemoryElem` to {type}`~anndata.typing.RWAble` {user}`ilan-gold` ({pr}`1643`)

### Development Process
@@ -28,5 +28,5 @@
- Read and write support for nullable string arrays ({class}`pandas.arrays.StringArray`).
Use pandas’ {doc}`pandas:user_guide/options` `mode.string_storage` to control which storage mode is used when reading `dtype="string"` columns.
{user}`flying-sheep` ({pr}`1558`)
-- Export {func}`~anndata.write_elem` and {func}`~anndata.read_elem` directly from the main package instead of `experimental` {user}`ilan-gold` ({pr}`1598`)
-- Allow reading sparse data (via {func}`~anndata.read_elem` or {func}`~anndata.sparse_dataset`) into either {class}`scipy.sparse.csr_array` or {class}`scipy.sparse.csc_array` via {attr}`anndata.settings.shall_use_sparse_array_on_read` {user}`ilan-gold` ({pr}`1633`)
+- Export {func}`~anndata.io.write_elem` and {func}`~anndata.io.read_elem` directly from the main package instead of `experimental` {user}`ilan-gold` ({pr}`1598`)
+- Allow reading sparse data (via {func}`~anndata.io.read_elem` or {func}`~anndata.io.sparse_dataset`) into either {class}`scipy.sparse.csr_array` or {class}`scipy.sparse.csc_array` via {attr}`anndata.settings.shall_use_sparse_array_on_read` {user}`ilan-gold` ({pr}`1633`)
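
Note for downstream packages: because `read_elem` and `write_elem` have moved between `experimental`, the top-level package, and now `io` across releases, code that must support several anndata versions may want to resolve the name from whichever location exists. The following is only a sketch under that assumption; `import_first` is a hypothetical helper written for this note, not part of anndata.

```python
import importlib


def import_first(attr, *module_names):
    """Return `attr` from the first listed module that can be imported and has it."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            # Module is absent in this version, or does not export the name.
            continue
    raise ImportError(f"cannot find {attr!r} in any of {module_names!r}")
```

A caller could then write `read_elem = import_first("read_elem", "anndata.io", "anndata.experimental")`, listing the newest location first so that no deprecation warning is triggered on current releases.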
2 changes: 1 addition & 1 deletion docs/release-notes/0.5.0.md
@@ -5,6 +5,6 @@
- automatically remove unused categories after slicing
- read/write [.loom](https://loompy.org) files using loompy 2
- fixed read/write for a few text file formats
-- read [UMI tools] files: {func}`~anndata.read_umi_tools`
+- read [UMI tools] files: {func}`~anndata.io.read_umi_tools`

[umi tools]: https://github.com/CGATOxford/UMI-tools
4 changes: 2 additions & 2 deletions docs/release-notes/0.6.x.md
@@ -15,9 +15,9 @@
`0.6.16` {smaller}`A Wolf`
- maintain dtype upon copy.
`0.6.13` {smaller}`A Wolf`
-- {attr}`~anndata.AnnData.layers` inspired by [.loom](https://loompy.org) files allows their information lossless reading via {func}`~anndata.read_loom`.
+- {attr}`~anndata.AnnData.layers` inspired by [.loom](https://loompy.org) files allows their information lossless reading via {func}`~anndata.io.read_loom`.
`0.6.7`–`0.6.9` {pr}`46` & {pr}`48` {smaller}`S Rybakov`
-- support for reading zarr files: {func}`~anndata.read_zarr`
+- support for reading zarr files: {func}`~anndata.io.read_zarr`
`0.6.7` {pr}`38` {smaller}`T White`
- initialization from pandas DataFrames
`0.6.` {smaller}`A Wolf`
4 changes: 2 additions & 2 deletions docs/release-notes/0.7.6.md
@@ -18,5 +18,5 @@

#### Deprecations

-- Passing positional arguments to {func}`anndata.read_loom` besides the path is now deprecated {pr}`538` {smaller}`I Virshup`
-- {func}`anndata.read_loom` arguments `obsm_names` and `varm_names` are now deprecated in favour of `obsm_mapping` and `varm_mapping` {pr}`538` {smaller}`I Virshup`
+- Passing positional arguments to {func}`anndata.io.read_loom` besides the path is now deprecated {pr}`538` {smaller}`I Virshup`
+- {func}`anndata.io.read_loom` arguments `obsm_names` and `varm_names` are now deprecated in favour of `obsm_mapping` and `varm_mapping` {pr}`538` {smaller}`I Virshup`
4 changes: 2 additions & 2 deletions docs/release-notes/0.8.0.md
@@ -15,14 +15,14 @@ This should make it much easier to support new datatypes, use partial access, an

- Each element should be tagged with an `encoding_type` and `encoding_version`. See updated docs on the {doc}`file format </fileformat-prose>`
- Support for nullable integer and boolean data arrays. More data types to come!
-- Experimental support for low level access to the IO API via {func}`~anndata.read_elem` and {func}`~anndata.write_elem`
+- Experimental support for low level access to the IO API via {func}`~anndata.io.read_elem` and {func}`~anndata.io.write_elem`

#### Features

- Added PyTorch dataloader {class}`~anndata.experimental.AnnLoader` and lazy concatenation object {class}`~anndata.experimental.AnnCollection`. See the [tutorials] {pr}`416` {smaller}`S Rybakov`
- Compatibility with `h5ad` files written from Julia {pr}`569` {smaller}`I Kats`
- Many logging messages that should have been warnings are now warnings {pr}`650` {smaller}`I Virshup`
-- Significantly more efficient {func}`anndata.read_umi_tools` {pr}`661` {smaller}`I Virshup`
+- Significantly more efficient {func}`anndata.io.read_umi_tools` {pr}`661` {smaller}`I Virshup`
- Fixed deepcopy of a copy of a view retaining sparse matrix view mixin type {pr}`670` {smaller}`M Klein`
- In many cases {attr}`~anndata.AnnData.X` can now be `None` {pr}`463` {smaller}`R Cannoodt` {pr}`677` {smaller}`I Virshup`. Remaining work is documented in {issue}`467`.
- Removed hard `xlrd` dependency {smaller}`I Virshup`
2 changes: 1 addition & 1 deletion docs/release-notes/0.9.0.md
@@ -21,7 +21,7 @@

- {doc}`/interoperability`: new page on interoperability with other packages {pr}`831` {user}`ivirshup`

-- Expanded docstring more documentation for `backed` argument of {func}`anndata.read_h5ad` {pr}`812` {user}`jeskowagner`
+- Expanded docstring more documentation for `backed` argument of {func}`anndata.io.read_h5ad` {pr}`812` {user}`jeskowagner`

- Documented how to use alternative compression methods for the `h5ad` file format, see {meth}`AnnData.write_h5ad() <anndata.AnnData.write_h5ad>` {pr}`857` {user}`nigeil`

2 changes: 1 addition & 1 deletion docs/release-notes/0.9.2.md
@@ -6,4 +6,4 @@
* Views of `awkward.Array`s now work with `awkward>=2.3` {pr}`1040` {user}`ivirshup`
* Fix ufuncs of views like `adata.X[:10].cov(axis=0)` returning views {pr}`1043` {user}`flying-sheep`
* Fix instantiating AnnData where `.X` is a `DataFrame` with an integer valued index {pr}`1002` {user}`flying-sheep`
-* Fix {func}`~anndata.read_zarr` when used on `zarr.Group` {pr}`1057` {user}`ivirshup`
+* Fix {func}`~anndata.io.read_zarr` when used on `zarr.Group` {pr}`1057` {user}`ivirshup`
59 changes: 33 additions & 26 deletions src/anndata/__init__.py
@@ -2,6 +2,12 @@

from __future__ import annotations

+from types import MappingProxyType
+from typing import TYPE_CHECKING
+
+if TYPE_CHECKING:
+    from typing import Any
+
try: # See https://github.com/maresb/hatch-vcs-footgun-example
from setuptools_scm import get_version

@@ -24,32 +30,21 @@
from ._core.anndata import AnnData
from ._core.merge import concat
from ._core.raw import Raw
-from ._core.sparse_dataset import sparse_dataset
-from ._io import (
-    read_csv,
-    read_excel,
-    read_h5ad,
-    read_hdf,
-    read_loom,
-    read_mtx,
-    read_text,
-    read_umi_tools,
-    read_zarr,
-)
-from ._io.specs import read_elem, write_elem
from ._settings import settings
from ._warnings import (
    ExperimentalFeatureWarning,
    ImplicitModificationWarning,
    OldFormatWarning,
    WriteWarning,
)
+from .io import read_h5ad, read_zarr
+from .utils import module_get_attr_redirect

# Submodules need to be imported last
-from . import abc, experimental, typing # noqa: E402 isort: skip
+from . import abc, experimental, typing, io # noqa: E402 isort: skip

# We use these in tests by attribute access
-from . import _io, logging # noqa: F401, E402 isort: skip
+from . import logging # noqa: F401, E402 isort: skip


def read(*args, **kwargs):
@@ -63,6 +58,26 @@ def read(*args, **kwargs):
    return read_h5ad(*args, **kwargs)


+_DEPRECATED = MappingProxyType(
+    dict(
+        (method, f"io.{method}")
+        for method in [
+            "read_loom",
+            "read_hdf",
+            "read_excel",
+            "read_umi_tools",
+            "read_csv",
+            "read_text",
+            "read_mtx",
+        ]
+    )
+)
+
+
+def __getattr__(attr_name: str) -> Any:
+    return module_get_attr_redirect(attr_name, deprecated_mapping=_DEPRECATED)
+
+
__all__ = [
# Attributes
"__version__",
@@ -71,23 +86,15 @@ def read(*args, **kwargs):
    "abc",
    "experimental",
    "typing",
+    "io",
    # Classes
    "AnnData",
    "Raw",
    # Functions
    "concat",
-    "sparse_dataset",
-    "read_h5ad",
-    "read_loom",
-    "read_hdf",
-    "read_excel",
-    "read_umi_tools",
-    "read_csv",
-    "read_text",
-    "read_mtx",
    "read_zarr",
-    "read_elem",
-    "write_elem",
+    "read_h5ad",
+    "read",
    # Warnings
    "OldFormatWarning",
    "WriteWarning",
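
Note on the redirect above: the body of `module_get_attr_redirect` is not part of this diff, so the following is only a sketch of how a module-level `__getattr__` (PEP 562) could forward deprecated top-level names to their new home while emitting a warning. The package `pkg`, the helper `redirect`, the message wording, and the choice of `FutureWarning` are all assumptions made for illustration; they are not taken from anndata.

```python
import types
import warnings
from types import MappingProxyType

# Hypothetical stand-in for a package with an `io` submodule (not the real anndata).
pkg = types.ModuleType("pkg")
pkg.io = types.ModuleType("pkg.io")
pkg.io.read_csv = lambda path: f"parsed {path}"

# Read-only map from each old top-level name to its new dotted location.
_DEPRECATED = MappingProxyType({name: f"io.{name}" for name in ["read_csv"]})


def redirect(attr_name, deprecated_mapping, root):
    """Resolve a deprecated top-level name and warn about its new location."""
    new_path = deprecated_mapping.get(attr_name)
    if new_path is None:
        raise AttributeError(f"module {root.__name__!r} has no attribute {attr_name!r}")
    warnings.warn(
        f"Importing {attr_name} from `{root.__name__}` is deprecated. "
        f"Import {root.__name__}.{new_path} instead.",
        FutureWarning,
        stacklevel=2,
    )
    # Walk the dotted path, e.g. "io.read_csv" -> root.io.read_csv.
    obj = root
    for part in new_path.split("."):
        obj = getattr(obj, part)
    return obj


# PEP 562: a module's __getattr__ is consulted only when normal lookup fails,
# so names that still exist on the module never pass through the redirect.
pkg.__getattr__ = lambda name: redirect(name, _DEPRECATED, pkg)
```

With this in place, `pkg.read_csv` still resolves to `pkg.io.read_csv` but emits a warning, whereas an unknown name raises `AttributeError` as usual. Keeping the mapping in a `MappingProxyType` prevents it from being modified at runtime.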
32 changes: 16 additions & 16 deletions src/anndata/_core/anndata.py
@@ -134,15 +134,15 @@

See Also
--------
-read_h5ad
-read_csv
-read_excel
-read_hdf
-read_loom
-read_zarr
-read_mtx
-read_text
-read_umi_tools
+io.read_h5ad
+io.read_csv
+io.read_excel
+io.read_hdf
+io.read_loom
+io.read_zarr
+io.read_mtx
+io.read_text
+io.read_umi_tools

Notes
-----
@@ -996,7 +996,7 @@
self._X = None

def _set_backed(self, attr, value):
-from .._io.utils import write_attribute
+from ..io.utils import write_attribute

Codecov / codecov/patch: added line #L999 in src/anndata/_core/anndata.py was not covered by tests.
write_attribute(self.file._file, attr, value)

@@ -1399,7 +1399,7 @@
.. code:: python

import anndata
-backed = anndata.read_h5ad("file.h5ad", backed="r")
+backed = anndata.io.read_h5ad("file.h5ad", backed="r")
mem = backed[backed.obs["cluster"] == "a", :].to_memory()
"""
new = {}
@@ -1444,7 +1444,7 @@
else:
return self._mutated_copy()
else:
-from .._io import read_h5ad, write_h5ad
+from ..io import read_h5ad, write_h5ad

if filename is None:
raise ValueError(
@@ -1858,7 +1858,7 @@
Sparse arrays in AnnData object to write as dense. Currently only
supports `X` and `raw/X`.
"""
-from .._io import write_h5ad
+from ..io import write_h5ad

if filename is None and not self.isbacked:
raise ValueError("Provide a filename!")
@@ -1894,7 +1894,7 @@
sep
Separator for the data.
"""
-from .._io import write_csvs
+from ..io import write_csvs

write_csvs(dirname, self, skip_data=skip_data, sep=sep)

@@ -1907,7 +1907,7 @@
filename
The filename.
"""
-from .._io import write_loom
+from ..io import write_loom

write_loom(filename, self, write_obsm_varm=write_obsm_varm)

@@ -1926,7 +1926,7 @@
chunks
Chunk shape.
"""
-from .._io import write_zarr
+from ..io import write_zarr

write_zarr(store, self, chunks=chunks)

4 changes: 2 additions & 2 deletions src/anndata/_core/sparse_dataset.py
@@ -613,8 +613,8 @@ def sparse_dataset(group: GroupStorageType) -> abc.CSRDataset | abc.CSCDataset:

>>> import scanpy as sc
>>> import h5py
->>> from anndata import sparse_dataset
->>> from anndata import read_elem
+>>> from anndata.io import sparse_dataset
+>>> from anndata.io import read_elem
>>> sc.datasets.pbmc68k_reduced().raw.to_adata().write_h5ad("pbmc.h5ad")

Initialize a sparse dataset from storage