Select between native and Python implementations of lz4 and lzo (#51)
Two new extras, "lz4" and "lzo", are defined, which install the
Python packages needed for native lz4 and lzo decompression.

When installed the dissect.util.compression.lz4 and
dissect.util.compression.lzo modules will point to these native
versions. Otherwise they will point to the pure Python implementations
in this project.

The native (when available) and Python modules can be accessed
explicitly through:

- dissect.util.compression.lz4_native
- dissect.util.compression.lz4_python
- dissect.util.compression.lzo_native
- dissect.util.compression.lzo_python
pyrco committed Jul 26, 2024
1 parent 73aac2c commit 7ac7a3a
Showing 4 changed files with 85 additions and 2 deletions.
12 changes: 11 additions & 1 deletion README.md
@@ -17,7 +17,17 @@ Information on the supported Python versions can be found in the Getting Started
pip install dissect.util
```

This module is also automatically installed if you install the `dissect` package.
`dissect.util` includes pure Python implementations of the lz4 and lzo decompression algorithms. To automatically use
the faster, native (C-based) lz4 and lzo implementations in other Dissect projects, install the package with the lz4 and
lzo extras:

```bash
pip install "dissect.util[lz4,lzo]"
```

Unfortunately there is no binary `python-lzo` wheel for PyPy installations on Windows, so it won't be installed there.

This module including the lz4 and lzo extras is also automatically installed if you install the `dissect` package.
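
As a rough way to see which implementation an environment is likely to end up with, the sketch below checks whether the top-level `lz4` and `lzo` modules can be found. The helper name `expected_backend` is hypothetical and not part of `dissect.util`; the package itself decides by actually importing the native modules, so treat this only as an approximation.

```python
from importlib.util import find_spec


def expected_backend(module_name: str) -> str:
    """Guess the backend from whether a top-level module can be located.

    Hypothetical helper: it only looks the module up, it does not import it.
    """
    return "native" if find_spec(module_name) is not None else "pure Python"


for name in ("lz4", "lzo"):
    print(f"{name}: {expected_backend(name)}")
```

If both lines report "pure Python", the extras were most likely not installed in the active environment.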

## Build and test instructions

57 changes: 57 additions & 0 deletions dissect/util/compression/__init__.py
@@ -0,0 +1,57 @@
from dissect.util.compression import lz4 as lz4_python
from dissect.util.compression import lzo as lzo_python

# This selects between the native version of lz4 (when installed) and our own
# pure-Python implementation.
#
# By doing a:
# from dissect.util.compression import lz4
#
# in another project will automatically give you one or the other.
#
# The native version is also available as dissect.util.compression.lz4_native
# (when installed) and the pure Python version is always available as
# dissect.util.compression.lz4_python.
#
# Note that the pure Python implementation is not a full replacement of the
# native lz4 Python package: only the decompress() function is implemented.
try:
    import lz4.block as lz4
    import lz4.block as lz4_native
except ImportError:
    lz4 = lz4_python
    lz4_native = None

# This selects between the native version of lzo (when installed) and our own
# pure-Python implementation.
#
# By doing a:
# from dissect.util.compression import lzo
#
# in another project will automatically give you one or the other.
#
# The native version is also available as dissect.util.compression.lzo_native
# (when installed) and the pure Python version is always available as
# dissect.util.compression.lzo_python.
#
# Note that the pure Python implementation is not a full replacement of the
# native lzo Python package: only the decompress() function is implemented.
try:
    import lzo
    import lzo as lzo_native
except ImportError:
    lzo = lzo_python
    lzo_native = None

__all__ = [
    "lz4",
    "lz4_native",
    "lz4_python",
    "lznt1",
    "lzo",
    "lzo_native",
    "lzo_python",
    "lzxpress",
    "lzxpress_huffman",
    "sevenbit",
]
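
The selection logic in this file is the usual "try the native import, fall back on `ImportError`" pattern. A minimal, generic sketch of that pattern follows; `select_backend` and the module name `hypothetical_missing_codec` are illustrative assumptions and do not exist in `dissect.util`, and `json` is only a stand-in for a pure Python fallback module.

```python
import importlib
import json  # stand-in for a pure Python fallback module
from types import ModuleType
from typing import Optional


def select_backend(native_name: str, fallback: ModuleType) -> tuple[ModuleType, Optional[ModuleType]]:
    """Return (selected, native); native is None when it cannot be imported."""
    try:
        native = importlib.import_module(native_name)
    except ImportError:
        # Native package is not installed: use the fallback and record that.
        return fallback, None
    return native, native


# The made-up module name cannot be imported, so the fallback is selected.
selected, native = select_backend("hypothetical_missing_codec", json)
print(selected.__name__, native)
```

Exposing both the selected module and the explicit native handle mirrors how the file above publishes `lz4` alongside `lz4_native` and `lz4_python`, so callers can test for `None` instead of repeating the import attempt.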
8 changes: 7 additions & 1 deletion dissect/util/compression/lz4.py
@@ -23,12 +23,18 @@ def _get_length(src: BinaryIO, length: int) -> int:


def decompress(
    src: Union[bytes, BinaryIO], max_length: int = -1, return_bytearray: bool = False, return_bytes_read: bool = False
    src: Union[bytes, BinaryIO],
    uncompressed_size: int = -1,
    max_length: int = -1,
    return_bytearray: bool = False,
    return_bytes_read: bool = False,
) -> Union[bytes, tuple[bytes, int]]:
    """LZ4 decompress from a file-like object up to a certain length. Assumes no header.

    Args:
        src: File-like object to decompress from.
        uncompressed_size: Ignored, present for compatibility with native lz4. The ``max_length``
            parameter sort-of but not completely has the same function.
        max_length: Decompress up to this many result bytes.
        return_bytearray: Whether to return ``bytearray`` or ``bytes``.
        return_bytes_read: Whether to return a tuple of ``(data, bytes_read)`` or just the data.
10 changes: 10 additions & 0 deletions pyproject.toml
@@ -31,6 +31,16 @@ homepage = "https://dissect.tools"
documentation = "https://docs.dissect.tools/en/latest/projects/dissect.util"
repository = "https://github.com/fox-it/dissect.util"

[project.optional-dependencies]
lzo = [
    # There are no Windows PyPy wheels available for python-lzo
    # So we use a pure python fallback for it.
    "python-lzo; platform_system != 'Windows' or platform_python_implementation != 'PyPy'",
]
lz4 = [
    "lz4",
]

[project.scripts]
dump-nskeyedarchiver = "dissect.util.tools.dump_nskeyedarchiver:main"
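
To make the environment marker on `python-lzo` concrete, the sketch below evaluates the same boolean expression by hand. In the marker grammar, `platform_system` corresponds to `platform.system()` and `platform_python_implementation` to `platform.python_implementation()`; the function name `lzo_requirement_applies` is a hypothetical helper, not something pip or this project provides.

```python
import platform


def lzo_requirement_applies(system: str, implementation: str) -> bool:
    """Mirror the marker: install python-lzo unless running PyPy on Windows."""
    return system != "Windows" or implementation != "PyPy"


# Evaluate for the interpreter running this snippet.
print(lzo_requirement_applies(platform.system(), platform.python_implementation()))
```

The expression is false only when both conditions fail at once, so CPython on Windows and PyPy on other platforms still get the native package.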
