Low-level API

blosc2.compress(src, typesize=8, clevel=9, shuffle=1, cname='blosclz')

Compress src, with a given type size.

Parameters
src: bytes-like object (supporting the buffer interface)

The data to be compressed.

typesize: int (optional), from 1 to 255

The data type size.

clevel: int (optional)

The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.

shuffle: int (optional)

The shuffle filter to be activated. Allowed values are blosc2.NOFILTER, blosc2.SHUFFLE and blosc2.BITSHUFFLE. The default is blosc2.SHUFFLE.

cname: string (optional)

The name of the compressor used internally in Blosc. It can be any of those supported by Blosc (blosclz, lz4, lz4hc, zlib, zstd, and possibly others). The default is blosclz.

Returns
out: str / bytes

The compressed data in form of a Python str / bytes object.

Raises
TypeError

If src doesn’t support the buffer interface.

ValueError

If src is too long. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not a valid codec.

Examples

>>> import array
>>> import blosc2
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc2.compress(a_bytesobj, typesize=4)
>>> len(c_bytesobj) < len(a_bytesobj)
True
blosc2.decompress(src, dst=None, as_bytearray=False)

Decompresses a bytes-like compressed object.

Parameters
src: bytes-like object

The data to be decompressed. Must be a bytes-like object that supports the Python Buffer Protocol, like bytes, bytearray, memoryview, or numpy.ndarray.

dst: NumPy object or bytearray

The destination NumPy object or bytearray to fill, whose length must be greater than 0. The user must make sure that it has enough capacity for hosting the decompressed data. Default is None, meaning that a new bytes or bytearray object is created, filled and returned.

as_bytearray: bool (optional)

If this flag is True, the return type will be a bytearray object instead of a bytes object.

Returns
If dst=None
out: str / bytes or bytearray

The decompressed data in form of a Python str / bytes object. If as_bytearray is True then this will be a bytearray object, otherwise this will be a str / bytes object.

If dst is not None
out: None

As the result will already be in dst.

Raises
RuntimeError

If the compressed data is corrupted or the output buffer is not large enough. If a bytes object could not be created.

TypeError

If src does not support the Buffer Protocol.

ValueError

If the length of src is smaller than the minimum. If dst is not None and its length is 0.

Examples

>>> import array
>>> import blosc2
>>> a = array.array('i', range(1000*1000))
>>> a_bytesobj = a.tobytes()
>>> c_bytesobj = blosc2.compress(a_bytesobj, typesize=4)
>>> a_bytesobj2 = blosc2.decompress(c_bytesobj)
>>> a_bytesobj == a_bytesobj2
True
>>> b"" == blosc2.decompress(blosc2.compress(b"", 1))
True
>>> b"1"*7 == blosc2.decompress(blosc2.compress(b"1"*7, 8))
True
>>> type(blosc2.decompress(blosc2.compress(b"1"*7, 8),
...                                      as_bytearray=True)) is bytearray
True
>>> import numpy
>>> arr = numpy.arange(10)
>>> comp_arr = blosc2.compress(arr)
>>> dest = numpy.empty(arr.shape, arr.dtype)
>>> blosc2.decompress(comp_arr, dst=dest)
>>> numpy.array_equal(arr, dest)
True
blosc2.pack_array(arr, clevel=9, shuffle=1, cname='blosclz')

Pack (compress) a NumPy array. It is equivalent to the pack function.

Parameters
arr: ndarray

The NumPy array to be packed.

clevel: int (optional)

The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.

shuffle: int (optional)

The shuffle filter to be activated. Allowed values are blosc2.NOFILTER, blosc2.SHUFFLE and blosc2.BITSHUFFLE. The default is blosc2.SHUFFLE.

cname: string (optional)

The name of the compressor used internally in Blosc. It can be any of those supported by Blosc (blosclz, lz4, lz4hc, zlib, zstd, and possibly others). The default is blosclz.

Returns
out: str / bytes

The packed array in form of a Python str / bytes object.

Raises
AttributeError

If the object does not have an itemsize attribute.

ValueError

If array.itemsize * array.size is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.

See also

pack(object)

Examples

>>> import blosc2
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc2.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True
blosc2.pack(obj, clevel=9, shuffle=1, cname='blosclz')

Pack (compress) a Python object.

Parameters
obj: Python object with itemsize attribute

The Python object to be packed.

clevel: int (optional)

The compression level from 0 (no compression) to 9 (maximum compression). The default is 9.

shuffle: int (optional)

The shuffle filter to be activated. Allowed values are blosc2.NOFILTER, blosc2.SHUFFLE and blosc2.BITSHUFFLE. The default is blosc2.SHUFFLE.

cname: string (optional)

The name of the compressor used internally in Blosc. It can be any of those supported by Blosc (blosclz, lz4, lz4hc, zlib, zstd, and possibly others). The default is blosclz.

Returns
out: str / bytes

The packed object in form of a Python str / bytes object.

Raises
AttributeError

If the object does not have an itemsize attribute. If the object does not have a size attribute.

ValueError

If obj.itemsize * obj.size is larger than the maximum allowed buffer size. If typesize is not within the allowed range. If clevel is not within the allowed range. If cname is not within the supported compressors.

Examples

>>> import blosc2
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc2.pack(a)
>>> len(parray) < a.size*a.itemsize
True
blosc2.unpack_array(packed_array, **kwargs)

Unpack (decompress) a packed NumPy array.

Parameters
packed_array: str / bytes

The packed array to be decompressed.

**kwargs: fix_imports / encoding / errors

Optional parameters that can be passed to the pickle.loads API: https://docs.python.org/3/library/pickle.html#pickle.loads

Returns
out: ndarray

The decompressed data in form of a NumPy array.

Raises
TypeError

If packed_array is not of type bytes or string.

Examples

>>> import blosc2
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc2.pack_array(a)
>>> len(parray) < a.size*a.itemsize
True
>>> a2 = blosc2.unpack_array(parray)
>>> numpy.array_equal(a, a2)
True
>>> a = numpy.array(['å', 'ç', 'ø'])
>>> parray = blosc2.pack_array(a)
>>> a2 = blosc2.unpack_array(parray)
>>> numpy.array_equal(a, a2)
True
blosc2.unpack(packed_object, **kwargs)

Unpack (decompress) an object.

Parameters
packed_object: str / bytes

The packed object to be decompressed.

**kwargs: fix_imports / encoding / errors

Optional parameters that can be passed to the pickle.loads API: https://docs.python.org/3/library/pickle.html#pickle.loads

Returns
out: object

The decompressed data in form of the original object.

Raises
TypeError

If packed_object is not of type bytes or string.

Examples

>>> import blosc2
>>> import numpy
>>> a = numpy.arange(1e6)
>>> parray = blosc2.pack(a)
>>> len(parray) < a.size*a.itemsize
True
>>> a2 = blosc2.unpack(parray)
>>> numpy.array_equal(a, a2)
True
>>> a = numpy.array(['å', 'ç', 'ø'])
>>> parray = blosc2.pack(a)
>>> a2 = blosc2.unpack(parray)
>>> numpy.array_equal(a, a2)
True
blosc2.clib_info(cname)

Return information about a compression library in the C library.

Parameters
cname: str

The compressor name.

Returns
out: tuple

The associated library name and version.

blosc2.compressor_list()

Return the list of compressors available in the C library.

Returns
out: list

The list of names.

blosc2.detect_number_of_cores()

Detect the number of cores in this system.

Returns
out: int

The number of cores in this system.

blosc2.free_resources()

Free possible memory temporaries and thread resources.

Returns
out: None

Notes

Blosc maintains a pool of threads waiting for work as well as some temporary space. You can use this function to release these resources when you are not going to use Blosc for a long while.

Examples

>>> blosc2.free_resources()
blosc2.get_clib(bytesobj)

Return the name of the compression library for Blosc bytesobj buffer.

Parameters
bytesobj: str / bytes

The compressed buffer.

Returns
out: str

The name of the compression library.

blosc2.print_versions()

Print all the versions of software that python-blosc relies on.

blosc2.set_blocksize(blocksize=0)

Force the use of a specific blocksize. If 0, an automatic blocksize will be used (the default).

Notes

This is a low-level function and is recommended for expert users only.

Examples

>>> blosc2.set_blocksize(512)
>>> blosc2.set_blocksize(0)
blosc2.set_nthreads(nthreads)

Set the number of threads to be used during Blosc operation.

Parameters
nthreads: int

The number of threads to be used during Blosc operation.

Returns
out: int

The previous number of used threads.

Raises
ValueError

If nthreads is larger than the maximum number of threads Blosc can use. If nthreads is not a positive integer.

Notes

The maximum number of threads for Blosc is \(2^{31} - 1\). In some cases Blosc gets better results if you set the number of threads to a value slightly below your number of cores (via detect_number_of_cores).

Examples

Set the number of threads to 2 and then to 1:

>>> oldn = blosc2.set_nthreads(2)
>>> blosc2.set_nthreads(1)
2

blosc2.set_releasegil(gilstate)

Sets a boolean on whether to release the Python global interpreter lock (GIL) during c-blosc compress and decompress operations or not. This defaults to False.

Parameters
gilstate: bool

True to release the GIL

Notes

Designed to be used with larger chunk sizes and a ThreadPool. There is a small performance penalty with releasing the GIL that will more harshly penalize small block sizes.

Examples

>>> oldReleaseState = blosc2.set_releasegil(True)
blosc2.set_compressor(compname)

Set the compressor to be used. The supported ones are blosclz, lz4, lz4hc, zlib and zstd. If this function is not called, then blosclz will be used.

Parameters
compname: str

The name of the compressor to be used.

Returns
out: int

The code for the compressor (>=0).

Raises
ValueError

If the compressor is not recognized, or there is no support for it.

blosc2.get_compressor()

Get the current compressor that is used for compression.

Returns
out: str

The name of the compressor.

blosc2.get_blocksize()

Get the internal blocksize to be used during compression.

Returns
out: int

The size in bytes of the internal block size.

blosc2.compress2(src, **kwargs)

Compress src with the given compression params (if given).

Parameters
src: bytes-like object (supporting the buffer interface)

The data to be compressed.

Returns
out: str/bytes

The compressed data in form of a Python str / bytes object.

Other Parameters
kwargs: dict, optional

Keyword arguments supported:

compcode: int

The compressor code. It can be blosc2.BLOSCLZ (the default one), blosc2.LZ4, blosc2.LZ4HC, blosc2.ZLIB, blosc2.ZSTD, and possibly others.

compcode_meta: int

The metadata for the compressor code, 0 by default.

clevel: int

The compression level from 0 (no compression) to 9 (maximum compression). By default: 5.

use_dict: bool

Use dicts or not when compressing (only for ZSTD). By default False.

typesize: int from 1 to 255

The data type size. By default: 8.

nthreads: int

The number of threads to use internally (1 by default).

blocksize: int

The requested size of the compressed blocks. If 0 (the default) blosc2 chooses it automatically.

splitmode: int

The splitmode for the blocks. It can be blosc2.ALWAYS_SPLIT, blosc2.NEVER_SPLIT, blosc2.AUTO_SPLIT and blosc2.FORWARD_COMPAT_SPLIT. The default value is blosc2.FORWARD_COMPAT_SPLIT.

filters: list

The sequence of filters. By default: {0, 0, 0, 0, 0, blosc2.BLOSC_SHUFFLE}.

filters_meta: list

The metadata for filters. By default: {0, 0, 0, 0, 0, 0}.

Raises
RuntimeError

If the data cannot be compressed into dst. If an internal error occurred, probably because some parameter is not a valid parameter.

blosc2.decompress2(src, dst=None, **kwargs)

Decompress src with the given decompression params (if given).

Parameters
src: bytes-like object

The data to be decompressed. Must be a bytes-like object that supports the Python Buffer Protocol, like bytes, bytearray, memoryview, or numpy.ndarray.

dst: NumPy object or bytearray

The destination NumPy object or bytearray to fill, whose length must be greater than 0. The user must make sure that it has enough capacity for hosting the decompressed data. Default is None, meaning that a new bytes object is created, filled and returned.

Returns
out: str/bytes

The decompressed data in form of a Python str / bytes object if dst is None. Otherwise, it will return None because the result will already be in dst.

Other Parameters
kwargs: dict, optional

Keyword arguments supported:

nthreads: int

The number of threads to use internally (1 by default).

Raises
RuntimeError

If the data cannot be decompressed into dst. If an internal error occurred, probably because some parameter is not a valid one. If dst is None and a bytes object could not be created to store the result.

TypeError

If src does not support the Buffer Protocol.

ValueError

If the length of src is smaller than the minimum. If dst is not None and its length is 0.