mirror of
				https://gitea.com/Lydanne/buildx.git
				synced 2025-10-25 13:13:45 +08:00 
			
		
		
		
	
		
			
				
	
	
		
# Huff0 entropy compression

This package provides Huff0 encoding and decoding as used in zstd.

[Huff0](https://github.com/Cyan4973/FiniteStateEntropy#new-generation-entropy-coders)
is a Huffman codec designed for modern CPUs. It features OoO (Out of Order) operations on multiple ALUs
(Arithmetic Logic Units), achieving extremely fast compression and decompression speeds.

It can be used to compress input that contains many repeated values into a small number of bytes.
Unlike LZ coders, it does not perform any multi-byte [dictionary coding](https://en.wikipedia.org/wiki/Dictionary_coder),
but it can be used as a secondary step after compressors (like Snappy) that do not do entropy encoding.

* [Godoc documentation](https://godoc.org/github.com/klauspost/compress/huff0)

## News

This package is used as part of the [zstandard](https://github.com/klauspost/compress/tree/master/zstd#zstd) compression and decompression package,
which ensures that most functionality is well tested.

# Usage

This package provides a low-level interface for compressing single, independent blocks.

Each block is separate, and there are no built-in integrity checks.
This means that the caller should keep track of block sizes and also add checksums if needed.

Compressing a block is done via the [`Compress1X`](https://godoc.org/github.com/klauspost/compress/huff0#Compress1X) and
[`Compress4X`](https://godoc.org/github.com/klauspost/compress/huff0#Compress4X) functions.
You must provide the input and will receive the output and possibly an error.

These error values can be returned:

| Error               | Description                                                                 |
|---------------------|-----------------------------------------------------------------------------|
| `<nil>`             | Everything ok, output is returned                                           |
| `ErrIncompressible` | Returned when input is judged to be too hard to compress                    |
| `ErrUseRLE`         | Returned from the compressor when the input is a single byte value repeated |
| `ErrTooBig`         | Returned if the input block exceeds the maximum allowed size (128 KiB)      |
| `(error)`           | An internal error occurred.                                                 |

As can be seen above, some of these errors are returned even under normal operation, so it is important to handle them.


To reduce allocations you can provide a [`Scratch`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch) object
that can be re-used for successive calls. Both compression and decompression accept a `Scratch` object, and the same
object can be used for both.

Be aware that when re-using a `Scratch` object the *output* buffer is also re-used, so if you are still using the previous output
you must set the `Out` field in the scratch to nil before the next call. The same buffer is used for compression and decompression output.

The `Scratch` object retains state that allows previous tables to be re-used for encoding and decoding.

## Tables and re-use

Huff0 allows tables from the previous block to be re-used, which saves space when that is expected to give better/faster results.

The `Scratch` object allows you to set a [`ReusePolicy`](https://godoc.org/github.com/klauspost/compress/huff0#ReusePolicy)
that controls this behaviour. See the documentation for details. The policy can be altered between each block.

Note, however, that this information is *not* stored in the output block. It is up to the users of the package to
record whether [`ReadTable`](https://godoc.org/github.com/klauspost/compress/huff0#ReadTable) should be called,
based on the boolean reported back from the `Compress1X` or `Compress4X` call.


If you want to store the table separately from the data, you can access them as `OutTable` and `OutData` on the
[`Scratch`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch) object.

## Decompressing

The first step in decoding is to initialize the decoding table through [`ReadTable`](https://godoc.org/github.com/klauspost/compress/huff0#ReadTable).
You can supply the complete block to `ReadTable`, and it will return the data part of the block,
which can then be given to the decompressor.

Decompressing is done by calling the [`Decompress1X`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch.Decompress1X)
or [`Decompress4X`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch.Decompress4X) function.

For concurrently decompressing content with a fixed table, a stateless [`Decoder`](https://godoc.org/github.com/klauspost/compress/huff0#Decoder) can be requested, which will remain correct as long as the scratch is unchanged. The capacity of the provided slice indicates the expected output size.

You must provide the output from the compression stage at exactly the size you got back. If you receive an error,
your input was likely corrupted.


It is important to note that successful decoding does *not* mean your output matches your original input.
There are no integrity checks, so the absence of an error from the decompressor does not assure that your data is valid.

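
Because the package leaves integrity checking to the caller, one possible approach is to wrap each compressed block in a small frame that records both sizes and a checksum of the original data. The layout below (two 32-bit lengths followed by a CRC-32) is purely an assumption made for this sketch; it is not a format defined by huff0 or zstd, and `sealBlock`, `openBlock` and `verifyBlock` are hypothetical helpers that use only the standard library.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// frameHeaderSize covers the original length, the compressed length and a
// CRC-32 of the original data, each stored as 4 little-endian bytes.
const frameHeaderSize = 12

var errCorrupt = errors.New("frame: corrupt or truncated block")

// sealBlock prepends the header to the compressed payload.
func sealBlock(original, compressed []byte) []byte {
	frame := make([]byte, frameHeaderSize, frameHeaderSize+len(compressed))
	binary.LittleEndian.PutUint32(frame[0:], uint32(len(original)))
	binary.LittleEndian.PutUint32(frame[4:], uint32(len(compressed)))
	binary.LittleEndian.PutUint32(frame[8:], crc32.ChecksumIEEE(original))
	return append(frame, compressed...)
}

// openBlock returns the expected decoded size, the stored checksum and the
// compressed payload at exactly its recorded size.
func openBlock(frame []byte) (decodedSize int, sum uint32, compressed []byte, err error) {
	if len(frame) < frameHeaderSize {
		return 0, 0, nil, errCorrupt
	}
	decodedSize = int(binary.LittleEndian.Uint32(frame[0:]))
	n := int(binary.LittleEndian.Uint32(frame[4:]))
	if n != len(frame)-frameHeaderSize {
		return 0, 0, nil, errCorrupt
	}
	sum = binary.LittleEndian.Uint32(frame[8:])
	return decodedSize, sum, frame[frameHeaderSize:], nil
}

// verifyBlock checks decompressed output against the values from the header.
func verifyBlock(decoded []byte, decodedSize int, sum uint32) error {
	if len(decoded) != decodedSize || crc32.ChecksumIEEE(decoded) != sum {
		return errCorrupt
	}
	return nil
}

func main() {
	original := []byte("hello hello hello")
	compressed := []byte{1, 2, 3} // placeholder for real compressor output

	frame := sealBlock(original, compressed)
	size, sum, payload, err := openBlock(frame)
	fmt.Println(size, len(payload), err)
	fmt.Println(verifyBlock(original, size, sum))
}
```

The decoded size from the header can also serve as the capacity of the destination slice passed to the decompressor.
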

# Contributing

Contributions are always welcome. Be aware that adding public functions will require good justification and breaking
changes will likely not be accepted. If in doubt, open an issue before writing the PR.