Blosc: A blocking, shuffling and lossless compression library

  Author                   Contact           URL
  ------------------------ ----------------- ----------------------
  Blosc Development Team   blosc@blosc.org   http://www.blosc.org

What is it?

Blosc is a high-performance compressor optimized for binary data. It has
been designed to transmit data to the processor cache faster than the
traditional, non-compressed, direct memory fetch approach via a memcpy()
call. Blosc is the first compressor (that I’m aware of) that is meant
not only to reduce the size of large datasets on-disk or in-memory, but
also to accelerate memory-bound computations.

It uses the blocking technique to reduce activity on the memory bus as
much as possible. In short, this technique works by dividing datasets
into blocks that are small enough to fit in the caches of modern
processors and performing compression / decompression there. It also
leverages, if available, SIMD instructions (SSE2, AVX2) and the
multi-threading capabilities of CPUs to accelerate the
compression / decompression process as much as possible.
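
The blocking idea can be illustrated with a minimal Python sketch. This
is not Blosc's implementation (Blosc does this in C with its own
container format); it only assumes the standard zlib module as a
stand-in codec, and the BLOCK_SIZE value and function names are
hypothetical, chosen for illustration.

```python
import zlib

# Hypothetical block size, picked so that one block could plausibly fit
# in a CPU cache; Blosc chooses its own block sizes.
BLOCK_SIZE = 32 * 1024


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and compress each independently."""
    return [
        zlib.compress(data[start:start + block_size])
        for start in range(0, len(data), block_size)
    ]


def decompress_blocks(blocks: list) -> bytes:
    """Decompress each block and concatenate the results in order."""
    return b"".join(zlib.decompress(block) for block in blocks)


data = bytes(range(256)) * 1024        # 256 KiB of sample data
blocks = compress_blocks(data)         # one compressed chunk per block
restored = decompress_blocks(blocks)   # round-trips to the original
```

Because every block is compressed on its own, each one can be handled
while it is resident in cache, and blocks can be processed
independently of one another.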

See some benchmarks about Blosc performance.

Blosc is distributed using the BSD license, see LICENSES/BLOSC.txt for
details.

Meta-compression and other differences over existing compressors

C-Blosc is not like other compressors: it should rather be called a
meta-compressor, because it can use different compressors and filters
(programs that generally improve the compression ratio). It can still be
called a compressor, though, because it already ships with several
compressors and filters, so it can actually work like a regular codec.

Currently C-Blosc supports BloscLZ (a compressor heavily based on
FastLZ, http://fastlz.org/), LZ4 and LZ4HC
(https://github.com/Cyan4973/lz4), Snappy
(https://github.com/google/snappy), Zlib (http://www.zlib.net/) and Zstd
(http://www.zstd.net).

C-Blosc also comes with highly optimized shuffle and bitshuffle filters,
which can use SSE2 or AVX2 instructions if available (for info on how
and why shuffling works see here). Additional compressors or filters may
be added in the future.

Blosc is in charge of coordinating the different compressors and filters
so that they can leverage the blocking technique as well as
multi-threaded execution (if several cores are available) automatically.
As a result, every codec and filter works at very high speeds, even if
it was not originally designed for blocking or multi-threading.
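
As a rough illustration of this coordination, the sketch below hands
independent blocks to a thread pool and treats the codec as a pluggable
callable. It is an assumption-laden toy, not how C-Blosc is written:
the names split, parallel_compress and parallel_decompress, the
BLOCK_SIZE and NTHREADS values, and the use of Python's zlib module as
the codec are all hypothetical choices for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 64 * 1024  # hypothetical block size
NTHREADS = 4            # hypothetical number of worker threads


def split(data: bytes, block_size: int) -> list:
    """Cut data into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def parallel_compress(data, codec=zlib.compress, nthreads=NTHREADS):
    """Compress all blocks with `codec`; pool.map() keeps block order."""
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return list(pool.map(codec, split(data, BLOCK_SIZE)))


def parallel_decompress(blocks, codec=zlib.decompress, nthreads=NTHREADS):
    """Decompress all blocks with `codec` and join them in order."""
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        return b"".join(pool.map(codec, blocks))


payload = b"blosc " * 200_000            # 1,200,000 bytes of sample data
compressed = parallel_compress(payload)  # list of compressed blocks
roundtrip = parallel_decompress(compressed)
```

The codec itself knows nothing about blocks or threads; the
coordinating layer supplies both, which is the role the text above
assigns to Blosc.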

Finally, C-Blosc is especially well suited to binary data because it can
take advantage of the type size meta-information to improve the
compression ratio by using the integrated shuffle and bitshuffle
filters.
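
To see why the type size matters, here is a minimal pure-Python sketch
of a byte shuffle. It is an illustration only: the shuffle and
unshuffle functions below are hypothetical helpers written for this
example, not the optimized SSE2/AVX2 filters shipped with C-Blosc, and
zlib is used merely as a convenient codec to compare sizes.

```python
import struct
import zlib


def shuffle(data: bytes, typesize: int) -> bytes:
    """Group the i-th byte of every element together (byte transposition)."""
    if len(data) % typesize != 0:
        raise ValueError("data length must be a multiple of typesize")
    return b"".join(data[i::typesize] for i in range(typesize))


def unshuffle(data: bytes, typesize: int) -> bytes:
    """Invert shuffle(): interleave the byte planes back into elements."""
    nelems = len(data) // typesize
    out = bytearray(len(data))
    for i in range(typesize):
        out[i::typesize] = data[i * nelems:(i + 1) * nelems]
    return bytes(out)


# 10,000 slowly increasing 4-byte little-endian integers.
values = range(100_000, 110_000)
raw = struct.pack(f"<{len(values)}i", *values)

shuffled = shuffle(raw, typesize=4)
plain_size = len(zlib.compress(raw))          # codec on the raw bytes
shuffled_size = len(zlib.compress(shuffled))  # codec after shuffling
```

After shuffling, the high-order bytes of neighbouring values (which
rarely change) end up next to each other, so the codec sees long
repetitive runs; on data like this, shuffled_size comes out smaller
than plain_size.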

When taken together, all these features set Blosc apart from other
compression libraries.

Compiling the Blosc library

Blosc can be built, tested and installed using CMake. The following
procedure describes the “out of source” build.


      $ cd c-blosc
      $ mkdir build
      $ cd build

Now run CMake configuration and optionally specify the installation
directory (e.g. ‘/usr’ or ‘/usr/local’):


      $ cmake -DCMAKE_INSTALL_PREFIX=your_install_prefix_directory ..

CMake allows you to configure Blosc in many different ways, such as
preferring internal or external sources for compressors, or enabling or
disabling them. Please note that configuration can also be performed
using the UI tools provided by CMake (ccmake or cmake-gui):


      $ ccmake ..      # run a curses-based interface
      $ cmake-gui ..   # run a graphical interface

Build, test and install Blosc:


      $ cmake --build .
      $ ctest
      $ cmake --build . --target install

The static and dynamic versions of the Blosc library, together with the
header files, will be installed into the specified CMAKE_INSTALL_PREFIX.

Codec support with CMake

C-Blosc comes with full sources for LZ4, LZ4HC, Snappy, Zlib and Zstd.
In general, you should not worry about not having (or CMake not finding)
these libraries on your system, because by default the included sources
are automatically compiled and included in the C-Blosc library. This
effectively means that you can count on complete support for all the
codecs in all Blosc deployments (unless you explicitly exclude support
for some of them).

If you want to force Blosc to use external codec libraries instead of
the included sources, you can do so:


      $ cmake -DPREFER_EXTERNAL_ZSTD=ON ..

You can also disable support for some compression libraries:


      $ cmake -DDEACTIVATE_SNAPPY=ON ..  # in case you don't have a C++ compiler

Examples

In the examples/ directory you can find hints on how to use Blosc inside
your app.

Supported platforms

Blosc is meant to support all platforms where a C89-compliant C compiler
can be found. The most heavily tested ones are Intel (Linux, Mac OSX and
Windows) and ARM (Linux), but exotic ones such as the IBM Blue Gene Q
embedded “A2” processor are reported to work too.

Mac OSX troubleshooting

If you run into compilation troubles when using Mac OSX, please make
sure that you have installed the command line developer tools. You can
always install them with:


      $ xcode-select --install

Wrapper for Python

Blosc has an official wrapper for Python. See:

https://github.com/Blosc/python-blosc

Command line interface and serialization format for Blosc

Blosc can be used from the command line by using Bloscpack. See:

https://github.com/Blosc/bloscpack

Filter for HDF5

For those who want to use Blosc as a filter in the HDF5 library, there
is a sample implementation in the hdf5-blosc project at:

https://github.com/Blosc/hdf5-blosc

Mailing list

There is an official mailing list for Blosc at:

blosc@googlegroups.com http://groups.google.es/group/blosc

Acknowledgments

See THANKS.rst.

------------------------------------------------------------------------

Enjoy data!
