User Guide¶
PPM (prediction by partial matching) is a well-known compression technique based on context modeling and prediction. PPM models use a set of previous symbols in the uncompressed symbol stream to predict the next symbol in the stream.
PPMd is an implementation of PPMII by Dmitry Shkarin.
The ppmd-cffi package uses the core C files from p7zip. The core library provides only bare encoding and decoding functions, with no metadata or header handling. This means you need to know the compression parameters and the input/output data sizes yourself.
Getting started¶
Install¶
ppmd-cffi is written in Python and C, bound together with CFFI. It can be installed from PyPI (the Python Package Index) using the standard pip command, as follows:
$ pip install ppmd-cffi
Command line¶
ppmd-cffi provides a command line script to handle .ppmd files.
To compress a file
$ ppmd target.dat
To decompress a .ppmd file
$ ppmd -x target.ppmd
To decompress to STDOUT
$ ppmd -x -c target.ppmd
Programming Interfaces¶
.ppmd file compression/decompression¶
The ppmd-cffi project provides two classes that compress and decompress .ppmd archive files. The PpmdCompressor class provides the compression method compress(), and the PpmdDecompressor class provides the extraction method decompress().
Both classes take a version= argument, which defaults to 8, meaning PPMd ver. I. The classes also take target, fname and ftime arguments, which describe the target file and its properties. target should be a file-like object that has a write() method. fname and ftime are file properties that are stored in the archive as metadata. fname should be a string, and ftime should be a datetime object.
The order and mem_in_mb parameters can be varied; they set the model order and the memory size in megabytes.
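As a minimal sketch of preparing the fname and ftime values described above, the helper below derives both from a file on disk using only the standard library. The name file_properties is hypothetical and not part of ppmd-cffi, and the choice of a naive UTC datetime is an assumption that mirrors the compression examples in this guide.

```python
import datetime
import pathlib


def file_properties(path):
    """Return (fname, ftime) for a file.

    fname is the file name as a str; ftime is the modification time as a
    naive datetime in UTC (an assumption, matching the guide's examples).
    """
    p = pathlib.Path(path)
    mtime = p.stat().st_mtime
    # Build an aware UTC datetime first, then drop tzinfo to get naive UTC.
    aware = datetime.datetime.fromtimestamp(mtime, tz=datetime.timezone.utc)
    return p.name, aware.replace(tzinfo=None)
```

The two returned values can then be passed as the fname and ftime arguments when constructing a compressor.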
Compression with PPMd ver. H¶
import pathlib
from datetime import datetime
from ppmd import PpmdCompressor  # assumes the package is importable as ppmd

targetfile = pathlib.Path('target.dat')
fname = 'target.dat'
ftime = datetime.utcfromtimestamp(targetfile.stat().st_mtime)
archivefile = pathlib.Path('archive.ppmd')  # a Path object, so that .open() works
order = 6
mem_in_mb = 16
with archivefile.open('wb') as target:
    with targetfile.open('rb') as src:
        with PpmdCompressor(target, fname, ftime, order, mem_in_mb, version=7) as compressor:
            compressor.compress(src)
Compression with PPMd ver. I¶
import pathlib
from datetime import datetime
from ppmd import PpmdCompressor  # assumes the package is importable as ppmd

targetfile = pathlib.Path('target.dat')
fname = 'target.dat'
ftime = datetime.utcfromtimestamp(targetfile.stat().st_mtime)
archivefile = pathlib.Path('archive.ppmd')  # a Path object, so that .open() works
order = 6
mem_in_mb = 8
with archivefile.open('wb') as target:
    with targetfile.open('rb') as src:
        with PpmdCompressor(target, fname, ftime, order, mem_in_mb, version=8) as compressor:
            compressor.compress(src)
Decompression¶
When a PpmdDecompressor object is constructed, it reads the header from the specified archive file. The header holds the compression parameters, such as the PPMd version, order and memory size. It also holds the file name and timestamp. PpmdDecompressor selects the proper decoder based on the header data. You need to handle the file name and timestamp yourself. The decompress() method writes the data to the specified file-like object, which should have a write() method.
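import os
import pathlib
from datetime import timezone
from ppmd import PpmdDecompressor  # assumes the package is importable as ppmd

targetfile = pathlib.Path('target.ppmd')
target_size = targetfile.stat().st_size  # assumption: size of the archive file in bytes
parent = targetfile.parent  # extract next to the archive
with targetfile.open('rb') as target:
    with PpmdDecompressor(target, target_size) as decompressor:
        extractedfile = parent.joinpath(decompressor.filename)
        with extractedfile.open('wb') as ofile:
            decompressor.decompress(ofile)
        # assumption: ftime is a naive datetime in UTC, as in the compression examples
        timestamp = decompressor.ftime.replace(tzinfo=timezone.utc).timestamp()
        os.utime(str(extractedfile), times=(timestamp, timestamp))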
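Since the file name and timestamp have to be handled by the caller, the following sketch shows one way to apply a stored datetime to an extracted file. The helper name restore_mtime is hypothetical and not part of ppmd-cffi. It assumes that a naive datetime means UTC, which matches how the compression examples build ftime, but the library itself may use a different convention.

```python
import datetime
import os


def restore_mtime(path, ftime):
    """Set the access and modification time of path from a datetime.

    A naive ftime is treated as UTC (an assumption); an aware ftime is
    used as it is. Returns the POSIX timestamp that was applied.
    """
    if ftime.tzinfo is None:
        ftime = ftime.replace(tzinfo=datetime.timezone.utc)
    timestamp = ftime.timestamp()
    os.utime(str(path), times=(timestamp, timestamp))
    return timestamp
```

Treating the naive case explicitly avoids the local-time interpretation that a plain call to timestamp() on a naive datetime would apply.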
Bare encoding/decoding PPMd data¶
There are several classes for handling bare PPMd data. Note: the mem parameter is given in bytes, not in MB.
- Ppmd7Encoder(dst, order, mem)
- Ppmd7Decoder(src, order, mem)
- Ppmd8Encoder(dst, order, mem, restore)
- Ppmd8Decoder(src, order, mem, restore)
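Because the archive classes take the memory size in megabytes while the bare classes above take it in bytes, a small conversion step is needed when moving between the two. The sketch below shows that arithmetic; the function name mb_to_bytes is hypothetical and not part of ppmd-cffi.

```python
def mb_to_bytes(mem_in_mb):
    """Convert a memory size in megabytes (MiB) to bytes."""
    if mem_in_mb <= 0:
        raise ValueError("mem_in_mb must be a positive integer")
    # 1 MB is taken as 2**20 bytes, so shifting left by 20 multiplies by 1048576.
    return mem_in_mb << 20


mem = mb_to_bytes(16)  # 16777216, a value suitable for the mem parameter
```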