Snappy compression for Windows
Quoting from upstream Snappy homepage:
Snappy is a compression/decompression library. It does not aim for maximum compression, or compatibility with any other compression library; instead, it aims for very high speeds and reasonable compression. For instance, compared to the fastest mode of zlib, Snappy is an order of magnitude faster for most inputs, but the resulting compressed files are anywhere from 20% to 100% bigger.
This is a Windows port of Snappy for C++, .NET, and the command line. There are some other ports of Snappy to Windows listed at the end of this page, but this one aims to be the most complete, the most up to date, and the most stable one. Snappy for Windows is provided free of charge under a permissive BSD license.
Tutorial for C++
The C++ NuGet package contains source code that is compiled together with your project. Your project will therefore have no DLL dependencies and no C++ runtime issues. It does, however, mean that a Debug build of your project will contain a slower Debug build of Snappy.
<PackageReference Include="Snappy" Version="220.127.116.11" />
Alternatively, you can download plain 7-Zip archive of the DLLs and associated LIBs and headers. Sources are available from GitHub and Bitbucket. C++ code and binaries are distributed under BSD license.
After downloading the library, your first step is to include the Snappy C API header (snappy-c.h in upstream Snappy).
You can then compress a buffer of data like this:
    char compressed[1000];
    size_t length = 1000;  // in: capacity of the buffer, out: compressed size
    snappy_status status = snappy_compress(
        "Hello World!", 12, compressed, &length);
After calling this function, the buffer compressed will contain the string "Hello World!" compressed by Snappy, and length will contain the length of the compressed data.
Note that the compressed version can be slightly larger than input in extreme cases.
Snappy allows you to calculate the required size of the output buffer:
    char uncompressed[] = "Hello World!";
    // sizeof counts the terminating null character, so 13 bytes are compressed here.
    size_t length = snappy_max_compressed_length(sizeof(uncompressed));
    char *compressed = new char[length];
    // ... compress like above
You can decompress a compressed buffer like this:
    char uncompressed[1000];
    size_t uncompressedLength = 1000;  // in: capacity of the buffer, out: uncompressed size
    snappy_status status = snappy_uncompress(
        compressed, compressedLength, uncompressed, &uncompressedLength);
This works the same as compression above except the process is reversed. The buffer uncompressed will contain our "Hello World!" string and uncompressedLength will be 12.
Again, we can determine the required size of the output buffer in advance, but this time it is not a simple function of the input size. Snappy stores the size of the uncompressed data in the header of the compressed block. It has a function that retrieves the exact length of the uncompressed data in O(1) time:
    size_t uncompressedLength;
    snappy_status status = snappy_uncompressed_length(
        compressed, compressedLength, &uncompressedLength);
Snappy also has a function to validate a compressed buffer. There is also a whole alternate C++ API that additionally supports pluggable source/sink interfaces, which allow you to compress a number of buffers into a single solid block. See the Snappy header files, which contain documentation for all the public APIs.
Tutorial for .NET
The .NET DLL is AnyCPU, but it automatically forwards all calls to one of the two native DLLs depending on whether the current process is 32-bit or 64-bit. The two native DLLs are embedded as resources and unpacked into a temporary location before first use.
<PackageReference Include="Snappy.NET" Version="18.104.22.168" />
Alternatively, you can download plain 7-Zip archive of the DLLs. Sources are available from GitHub and Bitbucket. .NET code and underlying native libraries are distributed under BSD license.
After downloading the library, you can import the Snappy namespace with a using directive. You will gain access to the low-level SnappyCodec API that provides block-level compression. Snappy.NET additionally contains SnappyStream, which implements a streaming API for streams of unbounded size.
SnappyCodec class provides you with a pair of simple compression/uncompression methods:
    public static byte[] Compress(byte[] input);
    public static byte[] Uncompress(byte[] input);
It cannot be simpler than that. If you would like to squeeze out the last bit of performance, there are two corresponding no-copy methods along with methods for estimating output buffer size. Explore the documentation via IntelliSense or by peeking into the SnappyCodec class.
The SnappyStream class is very similar to the GZipStream class in .NET Framework. SnappyStream creates output incompatible with SnappyCodec: SnappyCodec is for compressing fixed-size blocks, while SnappyStream is intended for unbounded streams. SnappyStream is, however, compatible with other implementations of the Snappy framing specification.
You can create a Snappy-compressed file like this:
    using (var file = File.OpenWrite("mydata.sz"))
    using (var compressor = new SnappyStream(file, CompressionMode.Compress))
    using (var writer = new StreamWriter(compressor))
        writer.WriteLine("Hello World!");
Decompression is similarly easy:
    using (var file = File.OpenRead("mydata.sz"))
    using (var decoder = new SnappyStream(file, CompressionMode.Decompress))
    using (var reader = new StreamReader(decoder))
        Console.Write(reader.ReadToEnd());
If you are on .NET 4.5, SnappyStream provides you with async variants of all I/O methods.
If you would like to do advanced stream manipulation, you can use
Snappy for Windows includes command-line tools snzip and snunzip that can be used to manipulate Snappy files on the command line. These tools are compatible with other tools implementing Snappy framing specification.
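Download the 7-Zip archive, extract it somewhere, and find the bin folder in the extracted package. You can then compress a file by passing its name to snzip, for example snzip test.dat. That will produce the file test.dat.sz in the same folder. You can decompress it again by passing the compressed file to snunzip, for example snunzip test.dat.sz.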
Here's the list of options you can use with snzip:
    snzip, snunzip - Snappy compression command-line tool

    Options:
      -d --decompress --uncompress
           Run in decompression mode. This is default if started as 'snunzip'.
      -c --stdout --to-stdout
           Output to standard output instead of file.
      -t --test
           Only test integrity of the compressed file. Don't actually unpack it.
      -v --verbose
           Verbose output.
      -V --version
           Version. Display the version number and compilation options then quit.
      -h --help
           Display this information and quit.
Sources are available from GitHub and Bitbucket.
Tests have been ported to Windows as well and they show that this Windows port is correct and fast. The various benchmark types should be interpreted as follows:
- BM_ZFlat - compression speed and compression ratio (compressed size / uncompressed size)
- BM_UFlat - decompression speed
- BM_UValidate - validation of compressed stream
Speed benchmarks should be taken with a grain of salt. The CPU used for testing was a very fast Core i7 3.4GHz with all data fitting in its L3 cache. Benchmarks have been done on a single core with all the other cores unoccupied.
32-bit test results
    C:\...\snappy-visual-cpp>Release\runtests.exe
    Running microbenchmarks.
    Benchmark       Time(ns)  CPU(ns) Iterations
    ---------------------------------------------------
    BM_UFlat/0         86114    90398       1208  1.1GB/s    html
    BM_UFlat/1        841390   852105        238  785.8MB/s  urls
    BM_UFlat/2          5516     5662      33057  20.9GB/s   jpg
    BM_UFlat/3           131      128    1333333  1.4GB/s    jpg_200
    BM_UFlat/4         28449    28278       6620  3.1GB/s    pdf
    BM_UFlat/5        345159   347857        583  1.1GB/s    html4
    BM_UFlat/6         30515    30464       6657  770.2MB/s  cp
    BM_UFlat/7         13173    13010      15588  817.3MB/s  c
    BM_UFlat/8          3805     3782      53619  938.2MB/s  lsp
    BM_UFlat/9       1249163  1275477        159  769.9MB/s  xls
    BM_UFlat/10          303      308     606060  617.5MB/s  xls_200
    BM_UFlat/11       276564   262186        714  553.2MB/s  txt1
    BM_UFlat/12       243323   241429        840  494.5MB/s  txt2
    BM_UFlat/13       740712   690778        271  589.2MB/s  txt3
    BM_UFlat/14      1011155  1050782        193  437.3MB/s  txt4
    BM_UFlat/15       402018   408050        497  1.2GB/s    bin
    BM_UFlat/16          266      273     512820  696.7MB/s  bin_200
    BM_UFlat/17        50228    51057       3972  714.3MB/s  sum
    BM_UFlat/18         4866     4829      38759  834.6MB/s  man
    BM_UFlat/19        81618    81972       2474  1.3GB/s    pb
    BM_UFlat/20       304378   305423        664  575.5MB/s  gaviota
    BM_UValidate/0     39102    39478       5137  2.4GB/s    html
    BM_UValidate/1    435403   406075        461  1.6GB/s    urls
    BM_UValidate/2       190      196     952380  601.5GB/s  jpg
    BM_UValidate/3        73       70    2222222  2.7GB/s    jpg_200
    BM_UValidate/4     13005    12112      15455  7.3GB/s    pdf
    BM_ZFlat/0        194230   193143       1050  505.6MB/s  html    (22.31 %)
    BM_ZFlat/1       2245170  2184010        100  306.6MB/s  urls    (47.77 %)
    BM_ZFlat/2         36788    29345       5316  4.0GB/s    jpg     (99.87 %)
    BM_ZFlat/3           554      568     246913  335.4MB/s  jpg_200 (79.00 %)
    BM_ZFlat/4         95347    87970       2128  1022.6MB/s pdf     (82.07 %)
    BM_ZFlat/5        748713   745591        272  523.9MB/s  html4   (22.51 %)
    BM_ZFlat/6         83891    80759       2318  290.5MB/s  cp      (48.12 %)
    BM_ZFlat/7         31498    32165       4365  330.6MB/s  c       (42.40 %)
    BM_ZFlat/8          9403     9542      21253  371.9MB/s  lsp     (48.37 %)
    BM_ZFlat/9       2274370  2340020        100  419.7MB/s  xls     (41.23 %)
    BM_ZFlat/10          705      694     224719  274.8MB/s  xls_200 (78.00 %)
    BM_ZFlat/11       681761   680540        298  213.1MB/s  txt1    (57.87 %)
    BM_ZFlat/12       602058   598233        339  199.6MB/s  txt2    (61.93 %)
    BM_ZFlat/13      1859000  1877787        108  216.7MB/s  txt3    (54.92 %)
    BM_ZFlat/14      2377800  2340020        100  196.4MB/s  txt4    (66.22 %)
    BM_ZFlat/15       663420   676003        300  724.0MB/s  bin     (18.11 %)
    BM_ZFlat/16          203      197     869565  966.5MB/s  bin_200 (7.50 %)
    BM_ZFlat/17       139682   132860       1409  274.5MB/s  sum     (48.96 %)
    BM_ZFlat/18        12794    12318      15197  327.3MB/s  man     (59.36 %)
    BM_ZFlat/19       159285   160316       1265  705.4MB/s  pb      (19.64 %)
    BM_ZFlat/20       555553   558680        363  314.6MB/s  gaviota (37.72 %)
    Running correctness tests.
    All tests passed.
64-bit test results
    C:\...\snappy-visual-cpp>x64\Release\runtests.exe
    Running microbenchmarks.
    Benchmark       Time(ns)  CPU(ns) Iterations
    ---------------------------------------------------
    BM_UFlat/0         59839    59391       1576  1.6GB/s    html
    BM_UFlat/1        616407   625929        324  1.0GB/s    urls
    BM_UFlat/2          7089     7301      23501  16.2GB/s   jpg
    BM_UFlat/3            83       77    1818181  2.4GB/s    jpg_200
    BM_UFlat/4         19813    19610       9546  4.5GB/s    pdf
    BM_UFlat/5        249815   253501        800  1.5GB/s    html4
    BM_UFlat/6         19922    19143       9779  1.2GB/s    cp
    BM_UFlat/7          9434     9266      20202  1.1GB/s    c
    BM_UFlat/8          2584     2499      74906  1.4GB/s    lsp
    BM_UFlat/9        936381   943260        215  1.0GB/s    xls
    BM_UFlat/10          215      206     377358  922.7MB/s  xls_200
    BM_UFlat/11       205981   204849        990  708.0MB/s  txt1
    BM_UFlat/12       183019   188127       1078  634.6MB/s  txt2
    BM_UFlat/13       542320   530314        353  767.4MB/s  txt3
    BM_UFlat/14       757496   709094        264  648.1MB/s  txt4
    BM_UFlat/15       306537   304506        666  1.6GB/s    bin
    BM_UFlat/16          209      215     869565  886.0MB/s  bin_200
    BM_UFlat/17        35983    35401       5288  1.0GB/s    sum
    BM_UFlat/18         3530     3437      58997  1.1GB/s    man
    BM_UFlat/19        57265    59177       3427  1.9GB/s    pb
    BM_UFlat/20       216308   204145        917  861.1MB/s  gaviota
    BM_UValidate/0     35711    36149       5610  2.6GB/s    html
    BM_UValidate/1    415625   387579        483  1.7GB/s    urls
    BM_UValidate/2       161      154     606060  765.6GB/s  jpg
    BM_UValidate/3        52       51    3333333  3.6GB/s    jpg_200
    BM_UValidate/4     12067    12249      16556  7.2GB/s    pdf
    BM_ZFlat/0        127627   129502       1566  754.1MB/s  html    (22.31 %)
    BM_ZFlat/1       1647181  1676041        121  399.5MB/s  urls    (47.77 %)
    BM_ZFlat/2         33583    23252       6038  5.1GB/s    jpg     (99.87 %)
    BM_ZFlat/3           464      463     303030  411.7MB/s  jpg_200 (79.00 %)
    BM_ZFlat/4         53554    55213       3673  1.6GB/s    pdf     (82.07 %)
    BM_ZFlat/5        527481   525391        386  743.5MB/s  html4   (22.51 %)
    BM_ZFlat/6         53204    52971       3534  442.9MB/s  cp      (48.12 %)
    BM_ZFlat/7         19206    19123       9789  556.0MB/s  c       (42.40 %)
    BM_ZFlat/8          6457     5389      28943  658.4MB/s  lsp     (48.37 %)
    BM_ZFlat/9       1650286  1662303        122  590.8MB/s  xls     (41.23 %)
    BM_ZFlat/10          589      574     298507  331.8MB/s  xls_200 (78.00 %)
    BM_ZFlat/11       494736   499509        406  290.4MB/s  txt1    (57.87 %)
    BM_ZFlat/12       450292   452680        448  263.7MB/s  txt2    (61.93 %)
    BM_ZFlat/13      1332614  1370277        148  297.0MB/s  txt3    (54.92 %)
    BM_ZFlat/14      1838211  1860559        109  247.0MB/s  txt4    (66.22 %)
    BM_ZFlat/15       504965   468002        400  1.0GB/s    bin     (18.11 %)
    BM_ZFlat/16          203      196     952380  970.4MB/s  bin_200 (7.50 %)
    BM_ZFlat/17        97675    88636       1936  411.4MB/s  sum     (48.96 %)
    BM_ZFlat/18         8915     8723      21459  462.1MB/s  man     (59.36 %)
    BM_ZFlat/19       115207   119224       1701  948.6MB/s  pb      (19.64 %)
    BM_ZFlat/20       410089   404792        501  434.3MB/s  gaviota (37.72 %)
    Running correctness tests.
    Crazy decompression lengths not checked on 64-bit build
    All tests passed.
.NET performance numbers are about the same, perhaps a tiny bit lower.
SnappyStream class uses hardware-accelerated CRC-32C wherever possible.
There have been many efforts to port Snappy to Windows. This port of Snappy for Windows aims to be the most complete, the most up to date, and the most stable one. I will briefly mention existing ports and their relative strengths and weaknesses here.
Snappy for .NET
It was developed mostly to compare performance with the LZ4 compressor, which is a close relative of Snappy. It includes a native DLL build as well as a .NET wrapper. I have copied the bit counting optimization from this port.
It has a couple of usability flaws though. It exposes only C APIs, not C++ APIs. The .NET wrapper requires developers to copy native DLLs around instead of embedding them, and the native DLLs require installation of the Visual C++ redistributable. There are no NuGet packages. It hasn't been updated for over a year.
Snappy for .NET on CodePlex
This is a pure .NET reimplementation of Snappy. Its readme plainly states that it is a work in progress. It has been saying that for over a year. I need something stable in my projects. Any kind of "work in progress" is out of the question for me.
The project is nevertheless maintained. There have been some recent commits. Perhaps someday it will be mature enough. I will then include it in my port as a pure .NET fallback in case the native libraries cannot be loaded.
Snappy.Sharp on GitHub
This is another pure .NET reimplementation of Snappy. There has unfortunately been no commit for over 3 years. It looks abandoned. The readme contains no warnings about unfinished stuff, but there's no performance report either.
SnappySharp on GitHub
This seems to have been an attempt to create a .NET wrapper for Snappy. There is a single commit made 2 years ago, which contains an empty .NET project. I assume this project was abandoned before any progress had been made.
Snappy.Net on GitHub
This Windows port of Snappy is maintained by Robert Važan. Nearly all of the C++ code implementing the core Snappy algorithm was taken from upstream Snappy project. You can submit issues on GitHub (C++, .NET, tools) or Bitbucket (C++, .NET, tools), including requests for documentation. Pull requests are also welcome via GitHub (C++, .NET, tools) or Bitbucket (C++, .NET, tools).
Known bugs and issues:
- There's no managed .NET implementation. The library will fail under Mono, on ARM processors, and in various restricted contexts (ASP.NET, ClickOnce).
- C++/CLI, mixed mode assembly, or something similar should be used to avoid unpacking the native DLLs in temporary folder.