s-bsdipa, a mutation of BSDiff
==============================

Colin Percival's BSDiff, imported from FreeBSD and transformed into
a library that creates and applies binary difference patches; please
see the header comment of s-bsdipa-lib.h for more.  In general:

- One includes s-bsdipa-lib.h and uses the all-in-memory s_bsdipa_diff()
  and s_bsdipa_patch() functions to create and apply patches.
  I.e., memory in (for example mmap(2)ed), (heap) memory out.

- Necessary compression / storage preparation can be achieved
  easily by including s-bsdipa-io.h after defining s_BSDIPA_IO as
  desired, followed by using the corresponding s_bsdipa_io_write_*()
  and _read_*() functions.  These still do not perform direct I/O, but
  call a supplied hook with fully prepared buffers or store in (heap)
  memory, respectively.  Multiple _IO methods are provided.

- In general the lib/ directory of the source repository is self-
  contained, and may be copied for inclusion in other projects.

  If the s_BSDIPA_SMALL approach is taken, lib/libdivsufsort/ and
  lib/divsufsort.h may also be removed.

- Please see the introductory header comments of s-bsdipa-lib.h and
  s-bsdipa-io.h for more.

- The directory s-bsdipa contains a self-contained (except for
  compression libraries) program which can create and apply patches
  (like a combined FreeBSD bsdiff and bspatch program).
  It times execution and tracks memory usage on stderr.

- The directory perl contains the self-contained BsDiPa CPAN module.
  (Perl ships with ZLIB; liblzma/XZ and libbz2/BZ2 support is
  detected at compile time.)
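
The "supplied hook with fully prepared buffers" idea from the list above
can be pictured with a minimal C sketch.  It does not use the real
s-bsdipa-io.h interface: demo_write_hook, demo_sink, demo_sink_hook and
demo_emit are hypothetical names chosen for illustration only; the header
comments document the actual signatures.  The producer performs no I/O
itself, it only hands finished buffers to whatever hook the caller
supplies; here the hook simply collects them in heap memory.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hook type (NOT the s-bsdipa-io.h one): receives one fully
 * prepared buffer per call; returns 0 on success, nonzero on error. */
typedef int (*demo_write_hook)(void *cookie, void const *dat, size_t len);

/* Hypothetical sink that stores everything it is handed in heap memory. */
struct demo_sink{
	unsigned char *s_dat;
	size_t s_len;
};

static int
demo_sink_hook(void *cookie, void const *dat, size_t len){
	struct demo_sink *sp = cookie;
	unsigned char *np;

	if(len == 0)
		return 0;
	if(len > (size_t)-1 - sp->s_len) /* size_t overflow */
		return -1;
	if((np = realloc(sp->s_dat, sp->s_len + len)) == NULL)
		return -1;
	memcpy(&np[sp->s_len], dat, len);
	sp->s_dat = np;
	sp->s_len += len;
	return 0;
}

/* Stand-in for a writer: performs no I/O, only calls the supplied hook. */
static int
demo_emit(demo_write_hook hook, void *cookie){
	static char const hdr[] = "HEAD", body[] = "payload";
	int rv;

	assert(hook != NULL);
	if((rv = (*hook)(cookie, hdr, sizeof(hdr) - 1)) == 0)
		rv = (*hook)(cookie, body, sizeof(body) - 1);
	return rv;
}
```

A caller would initialize a struct demo_sink to {NULL, 0}, pass it together
with demo_sink_hook to demo_emit(), and afterwards own (and free(3)) s_dat.
A hook that write(2)s to a file descriptor would slot in the same way.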

Licenses (full text included in s-bsdipa-lib.h):
  libdivsufsort(/LICENSE): MIT
  s-bsdiff.c: BSD-2-clause
  s-bspatch.c, s-bsdipa-lib.h, s-bsdipa-io.h, s-bsdipa.c: ISC

Repository:
  browse: https?://git.sdaoden.eu/browse/s-bsdipa.git
  clone:  https?://git.sdaoden.eu/scm/s-bsdipa.git

  alternatively: https://github.com/sdaoden/s-bsdipa

Contact: steffen at sdaoden dot eu.

1. s-bsdipa example
2. Releases

1. s-bsdipa example
-------------------

Tested via

  $ CFLAGS='-O3 -DNDEBUG' \
    make s_BSDIPA_32=y s_BSDIPA_CFLAGS='$(SUFX)' clean all
  $ mv s-bsdipa/s-bsdipa s-bsdipa32
  $ CFLAGS='-O3 -DNDEBUG' \
    make s_BSDIPA_CFLAGS='$(SUFX)' clean all
  $ mv s-bsdipa/s-bsdipa s-bsdipa64

  $ for c in -J -j -Z -z; do
      for b in 32 64; do
        echo $b/$c
        ./s-bsdipa$b $c -9 diff .2 .1 .P$b$c
        ./s-bsdipa$b $c -f7 diff .2 .1 .P$b$c
        ./s-bsdipa$b $c -f5 diff .2 .1 .P$b$c
        ./s-bsdipa$b patch .1 .P$b$c .R$b$c
      done
    done

Text (manuals of S-nail, v14.9.25 vs v14.10.0-alpha):

  $ file .1 .2
  .1: troff or preprocessor input, ASCII text
  .2: troff or preprocessor input, ASCII text

  $ ll .1 .2
  -rw-r----- 1 steffen steffen 428420 May  6 23:59 .1
  -rw-r----- 1 steffen steffen 390770 May  7 00:00 .2

  32/-J
  # 57072 result bytes | 114 allocs: all=708315454 peek=706338602
  # Code 0:071 secs, XZ I/O 0:082 secs
  # 57072 result bytes | 114 allocs: all=196610366 peek=194633514
  # Code 0:070 secs, XZ I/O 0:068 secs
  # 56848 result bytes | 114 allocs: all=100141374 peek=98164522
  # Code 0:077 secs, XZ I/O 0:060 secs
  # 390770 result bytes | 11 allocs: all=9252521 peek=8861638
  # Code 0:000 secs, XZ I/O 0:003 secs
  64/-J
  # 57120 result bytes | 114 allocs: all=710339046 peek=706385342
  # Code 0:073 secs, XZ I/O 0:105 secs
  # 57120 result bytes | 114 allocs: all=198633958 peek=194680254
  # Code 0:071 secs, XZ I/O 0:092 secs
  # 56964 result bytes | 114 allocs: all=102164966 peek=98211262
  # Code 0:079 secs, XZ I/O 0:086 secs
  # 390770 result bytes | 11 allocs: all=9299261 peek=8908378
  # Code 0:000 secs, XZ I/O 0:003 secs
  32/-j
  # 58761 result bytes | 104 allocs: all=10060911 peek=8084059
  # Code 0:070 secs, BZ2 I/O 0:013 secs
  # 58761 result bytes | 104 allocs: all=8460911 peek=6484059
  # Code 0:069 secs, BZ2 I/O 0:013 secs
  # 58761 result bytes | 104 allocs: all=6860911 peek=4884059
  # Code 0:069 secs, BZ2 I/O 0:013 secs
  # 390770 result bytes | 4 allocs: all=2892425 peek=2501654
  # Code 0:000 secs, BZ2 I/O 0:006 secs
  64/-j
  # 58180 result bytes | 104 allocs: all=12084503 peek=8130799
  # Code 0:079 secs, BZ2 I/O 0:013 secs
  # 58180 result bytes | 104 allocs: all=10484503 peek=6530799
  # Code 0:078 secs, BZ2 I/O 0:013 secs
  # 58180 result bytes | 104 allocs: all=8884503 peek=4930799
  # Code 0:071 secs, BZ2 I/O 0:014 secs
  # 390770 result bytes | 4 allocs: all=2939165 peek=2548394
  # Code 0:000 secs, BZ2 I/O 0:008 secs
  32/-Z
  # 60156 result bytes | 102 allocs: all=876613546 peek=874636694
  # Code 0:069 secs, ZSTD I/O 0:561 secs
  # 61350 result bytes | 102 allocs: all=41220652 peek=39243800
  # Code 0:070 secs, ZSTD I/O 0:035 secs
  # 62119 result bytes | 102 allocs: all=49460012 peek=47483160
  # Code 0:070 secs, ZSTD I/O 0:025 secs
  # 390770 result bytes | 4 allocs: all=5511857 peek=5121086
  # Code 0:000 secs, ZSTD I/O 0:000 secs
  64/-Z
  # 61178 result bytes | 102 allocs: all=878637138 peek=874683434
  # Code 0:072 secs, ZSTD I/O 0:581 secs
  # 62558 result bytes | 102 allocs: all=43244244 peek=39290540
  # Code 0:071 secs, ZSTD I/O 0:047 secs
  # 64264 result bytes | 102 allocs: all=51483604 peek=47529900
  # Code 0:072 secs, ZSTD I/O 0:028 secs
  # 390770 result bytes | 4 allocs: all=5558597 peek=5167826
  # Code 0:000 secs, ZSTD I/O 0:000 secs
  32/-z
  # 64391 result bytes | 105 allocs: all=2810971 peek=2367623
  # Code 0:071 secs, ZLIB I/O 0:064 secs
  # 65835 result bytes | 105 allocs: all=2810971 peek=2367623
  # Code 0:069 secs, ZLIB I/O 0:016 secs
  # 67632 result bytes | 105 allocs: all=2810971 peek=2367623
  # Code 0:069 secs, ZLIB I/O 0:008 secs
  # 390770 result bytes | 4 allocs: all=868209 peek=828281
  # Code 0:000 secs, ZLIB I/O 0:001 secs
  64/-z
  # 66421 result bytes | 105 allocs: all=4834563 peek=4344475
  # Code 0:073 secs, ZLIB I/O 0:119 secs
  # 68606 result bytes | 105 allocs: all=4834563 peek=4344475
  # Code 0:073 secs, ZLIB I/O 0:016 secs
  # 70442 result bytes | 105 allocs: all=4834563 peek=4344475
  # Code 0:072 secs, ZLIB I/O 0:008 secs
  # 390770 result bytes | 4 allocs: all=914949 peek=875021
  # Code 0:000 secs, ZLIB I/O 0:001 secs

Binary (lynx v2-9-2u vs v2-9-2x, Linux x86-64):

  $ ll .1 .2
  -rwxr-x--- 1 steffen steffen 1678512 May  7 00:19 .1*
  -rwxr-x--- 1 steffen steffen 1682544 May  7 00:20 .2*

  32/-J
  # 355988 result bytes | 84 allocs: all=715464804 peek=708487584
  # Code 0:695 secs, XZ I/O 0:510 secs
  # 355988 result bytes | 84 allocs: all=203759716 peek=196782496
  # Code 0:604 secs, XZ I/O 0:450 secs
  # 359448 result bytes | 84 allocs: all=107290724 peek=100313504
  # Code 0:606 secs, XZ I/O 0:356 secs
  # 1682544 result bytes | 11 allocs: all=11821069 peek=10138412
  # Code 0:002 secs, XZ I/O 0:017 secs
  64/-J
  # 355880 result bytes | 84 allocs: all=722474004 peek=708519564
  # Code 0:636 secs, XZ I/O 0:478 secs
  # 355880 result bytes | 84 allocs: all=210768916 peek=196814476
  # Code 0:636 secs, XZ I/O 0:463 secs
  # 359476 result bytes | 84 allocs: all=114299924 peek=100345484
  # Code 0:636 secs, XZ I/O 0:371 secs
  # 1682544 result bytes | 11 allocs: all=11852809 peek=10170152
  # Code 0:002 secs, XZ I/O 0:017 secs
  32/-j
  # 410229 result bytes | 74 allocs: all=17210261 peek=10233041
  # Code 0:602 secs, BZ2 I/O 0:091 secs
  # 408275 result bytes | 74 allocs: all=15610261 peek=8659765
  # Code 0:607 secs, BZ2 I/O 0:091 secs
  # 397830 result bytes | 74 allocs: all=14010261 peek=8659765
  # Code 0:598 secs, BZ2 I/O 0:089 secs
  # 1682544 result bytes | 4 allocs: all=5460973 peek=3778428
  # Code 0:002 secs, BZ2 I/O 0:040 secs
  64/-j
  # 408806 result bytes | 74 allocs: all=24219461 peek=15636985
  # Code 0:641 secs, BZ2 I/O 0:091 secs
  # 406774 result bytes | 74 allocs: all=22619461 peek=15636985
  # Code 0:635 secs, BZ2 I/O 0:091 secs
  # 396466 result bytes | 74 allocs: all=21019461 peek=15636985
  # Code 0:637 secs, BZ2 I/O 0:089 secs
  # 1682544 result bytes | 4 allocs: all=5492713 peek=3810168
  # Code 0:002 secs, BZ2 I/O 0:040 secs
  32/-Z
  # 379014 result bytes | 72 allocs: all=883762896 peek=876785676
  # Code 0:605 secs, ZSTD I/O 1:003 secs
  # 399581 result bytes | 72 allocs: all=48370002 peek=41392782
  # Code 0:599 secs, ZSTD I/O 0:206 secs
  # 411942 result bytes | 72 allocs: all=56609362 peek=49632142
  # Code 0:606 secs, ZSTD I/O 0:070 secs
  # 1682544 result bytes | 4 allocs: all=8080405 peek=6397860
  # Code 0:002 secs, ZSTD I/O 0:002 secs
  64/-Z
  # 379017 result bytes | 72 allocs: all=890772096 peek=876817656
  # Code 0:635 secs, ZSTD I/O 1:016 secs
  # 402134 result bytes | 72 allocs: all=55379202 peek=41424762
  # Code 0:642 secs, ZSTD I/O 0:214 secs
  # 413229 result bytes | 72 allocs: all=63618562 peek=49664122
  # Code 0:637 secs, ZSTD I/O 0:072 secs
  # 1682544 result bytes | 4 allocs: all=8112145 peek=6429600
  # Code 0:002 secs, ZSTD I/O 0:003 secs
  32/-z
  # 413044 result bytes | 75 allocs: all=9960321 peek=8659765
  # Code 0:604 secs, ZLIB I/O 0:464 secs
  # 420678 result bytes | 75 allocs: all=9960321 peek=8659765
  # Code 0:600 secs, ZLIB I/O 0:086 secs
  # 432110 result bytes | 75 allocs: all=9960321 peek=8659765
  # Code 0:605 secs, ZLIB I/O 0:042 secs
  # 1682544 result bytes | 4 allocs: all=3436757 peek=3396829
  # Code 0:002 secs, ZLIB I/O 0:006 secs
  64/-z
  # 414385 result bytes | 75 allocs: all=16969521 peek=15636985
  # Code 0:636 secs, ZLIB I/O 0:512 secs
  # 422267 result bytes | 75 allocs: all=16969521 peek=15636985
  # Code 0:641 secs, ZLIB I/O 0:086 secs
  # 433588 result bytes | 75 allocs: all=16969521 peek=15636985
  # Code 0:644 secs, ZLIB I/O 0:042 secs
  # 1682544 result bytes | 4 allocs: all=3468497 peek=3428569
  # Code 0:002 secs, ZLIB I/O 0:006 secs
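
To put the "result bytes" figures into relation, the patch size can be
expressed as a share of the data it restores.  The helper below is a small
sketch of that arithmetic only; demo_patch_permille is a hypothetical name
and not part of s-bsdipa.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: patch size in parts-per-thousand of the restored
 * size, rounded to nearest.  Integer-only; this sketch assumes that
 * patch_bytes * 1000 does not overflow 64 bits. */
static uint64_t
demo_patch_permille(uint64_t patch_bytes, uint64_t restored_bytes){
	assert(restored_bytes > 0);
	return (patch_bytes * 1000 + restored_bytes / 2) / restored_bytes;
}
```

With the 32-bit -J -9 figures listed above, demo_patch_permille(57072,
390770) gives 146 (about 14.6 percent) for the manuals, and
demo_patch_permille(355988, 1682544) gives 212 (about 21.2 percent) for the
lynx binaries.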

2. Releases
-----------

v0.9.1, 2026-05-09:
  + perl/, s-bsdipa/: detect availability of I/O layers via compilation
    test, instead of through existence of command line utilities.

  - s-bspatch.c: verify all control tuples were consumed, plus tweaks.
    This file is now ISC copyright.
  - Add optional zstd (Zstandard, libzstd) _IO method support.

v0.9.0, 2025-12-24:
  + Breaks backward compatibility of bsdipa_patch() as that assumes
    patches satisfy content constraints that are only satisfied by
    bsdipa_diff() of v0.9.0!!

  + Import of Colin Percival's original qsufsort() algorithm; it was
    replaced with libdivsufsort in FreeBSD, and Colin Percival pointed
    me to this because of existing bugfixes.  The original variant is
    smaller code, but suffers from a performance penalty on large files
    (about 15% for unrelated 5 megabyte binaries) -- it is faster for
    small files, however, so having it around is very beneficial.
  ++ By default both algorithms are compiled in, their usage depends on
     the data size.
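
  The size-dependent choice could look roughly like the following sketch.
  It is an illustration, not the library's code: the enum, the function
  name and especially the threshold value are hypothetical, and the real
  switch-over point is defined by the library sources.

```c
#include <stddef.h>

enum demo_sort{
	demo_SORT_QSUFSORT,   /* original algorithm: faster for small input */
	demo_SORT_DIVSUFSORT  /* libdivsufsort: faster for large input */
};

/* HYPOTHETICAL switch-over point, chosen for illustration only. */
#define demo_SORT_THRESHOLD ((size_t)1 << 20)

/* Pick the suffix sorting variant according to the input size. */
static enum demo_sort
demo_sort_choose(size_t datalen){
	return (datalen < demo_SORT_THRESHOLD) ? demo_SORT_QSUFSORT
		: demo_SORT_DIVSUFSORT;
}
```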

v0.8.1, 2025-12-20:
  + Notes:
    Not released, bsdipa_patch() includes a patch content constraint
    test that is not satisfied before v0.9.0!
  + Warning:
    Miscompilations (of libdivsufsort) with gcc 15.2.0 and -O3 and above!
    As well as in sanitizer compilations.
    (clang 21.1.6 ok.)  (All on Linux.)
  - Tighten tested constraints on patch content.
  - Optimize away needless work when applying patch.
  - Fix beflen!=0 aftlen==0 "algorithm error" in BSDiff resulting in SEGV.
  - ZLIB I/O: allow compression config via cookie.
  - perl: make official core_try_oneshot_set(), add core_diff_level_set().
  - Add optional BZ2 (BZIP2, libbz2) _IO method support.

v0.8.0, 2025-07-03:
  - ABI and API breakage.
  - Fixes a cast that could have lost bits on systems with a 64-bit
    bsdipa_off_t and a 32-bit size_t (if any).
  - Adds an "is equal data" state.
  - Adds optional XZ (LZMA2, liblzma, XZ utils) _IO method support.
  - Adds I/O cookie support (yet only for XZ): cookie can be reused by
    successive calls to diff/patch, which can dramatically reduce
    resource acquire/release cycles.
  - s-bsdipa is now a real program, with options, manual, test, etc.
  - Coverity.com (project 31371 / s-bsdipa) still sees us 0.0.

v0.7.0, 2025-02-19:
  - CHANGE: honour s_bsdipa_patch_ctx::pc_max_allowed_restored_len
    already on the s-bsdipa-io.h layer, directly after having called
    s_bsdipa_patch_parse_header().  (I.e., before the ".pc_patch_dat
    allocation" even.)
  - FIX for s-bsdipa example program: when compiled without NDEBUG
    it would munmap(2) invalidated pointer/length combo.

v0.6.1, 2025-02-17:
  - Coverity.com (project 31371 / s-bsdipa) FIXes for the s-bsdipa
    example program: one unused value, one fd resource leak.
    (Tool design changed without that being adopted in early design
    stage: obviously not enough iterations and/or too much fuzz.)
  - bsdipa_patch() CHANGE: until now field lengths were not verified
    in the (unusual) .pc_patch_dat==NULL case, as the user was expected
    to have done this before; instead, always check everything.
  -- Do not increment minor number nonetheless, no ABI change.

v0.6.0, 2025-01-31:
  - Adds struct s_bsdipa_patch_ctx::pc_max_allowed_restored_len, which
    allows configuring the maximum allowed size of the restored data.
    (Mostly for perl or other possible script/xy interfaces, the
    C interface as such has s_bsdipa_header::h_before_len ...)
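
  The idea behind such a limit is to reject a patch whose header announces
  more restored data than the caller is willing to accept, before any
  memory is allocated for it.  A minimal sketch of that kind of check
  follows; demo_header, its field and demo_restore_alloc are hypothetical
  stand-ins, not the s-bsdipa-lib.h definitions, and treating 0 as "no
  limit" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, already parsed patch header (NOT struct s_bsdipa_header). */
struct demo_header{
	uint64_t h_restored_len; /* size the patch claims to restore */
};

/* Returns a buffer for the restored data, or NULL if the claimed size
 * exceeds max_allowed (0 meaning "no limit" in this sketch), does not fit
 * in size_t, or memory is exhausted.
 * The limit is tested before malloc(3) is ever called. */
static void *
demo_restore_alloc(struct demo_header const *hp, uint64_t max_allowed){
	assert(hp != NULL);
	if(max_allowed != 0 && hp->h_restored_len > max_allowed)
		return NULL;
	if(hp->h_restored_len >= SIZE_MAX)
		return NULL;
	/* One extra byte keeps a zero-length request well-defined */
	return malloc((size_t)hp->h_restored_len + 1);
}
```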

v0.5.3, 2025-01-17:
  - FIXes totally false buffer usage blindly introduced to fix
    (correct .. but nonetheless false) cpantesters.org assertion
    failure.  (That is: it is binary data so NUL termination is a fake,
    .. but that is how it has to be, stupid!)
    What a mess.

v0.5.2, 2025-01-09:
  - CHANGE/FIX: ensure patch fits in _OFF_MAX, including control data.
    s_bsdipa_patch_parse_header() did verify that on the patch side,
    but on the diff side we did not yet care, as in theory the data
    could have been stored in individual chunks.
  - FIX: perl CPAN testers started failing (in a second round?)
    due to assertion failures regarding a missing SV_HAS_TRAILING_NUL.
    Therefore ensure our memory results have one additional byte and
    always terminate them.
  - more perl module creation related tweaks.

v0.5.1, 2025-01-05:
  - perl module creation related tweaks.

v0.5.0, 2024-12-26: (first release)

# s-ts-mode
