Commit Briefs

Neels Hofmeyr

patience: do not swallow identical neighbors (neels/patience2)

This does not make much sense, because if common-unique lines swallow their neighboring ones, they count less, and another bad, shorter sequence may gain more weight than a very long sequence that was combined to just one common-unique chunk. It also much simplifies the code and avoids bugs we had to implement complex fixes for before.


Stefan Sperling

set diff box recursion limit to UINT_MAX by default

In practice, recursion is already limited by our Myers max-effort cut and this heuristic should generally provide a better split than a limit on the number of diff boxes. A recursion limit is only required for diff configs that do not include the Myers algorithm, which we currently don't have. Discussed with Neels


Stefan Sperling

set a minimum myers effort limit to avoid an early short-cut on small files

discussed with Neels


Stefan Sperling

allow for telling the difference between empty and non-existent files

Adjust labels used in diff output accordingly; non-existent files should have the label "/dev/null"




Christian Weisgerber

add a missing include for uint8_t and switch from <inttypes.h> to <stdint.h>

ok millert stsp




Stefan Sperling

add another test case where a context line appears as both - and +

This line appears as a context line with regular diff(1) and git diff: crd->crd_key = sd->mds.mdd_crypto.scr_key[0]; [[[ + crd->crd_alg = sd->mds.mdd_crypto.scr_alg; + crd->crd_klen = sd->mds.mdd_crypto.scr_klen; crd->crd_key = sd->mds.mdd_crypto.scr_key[0]; - bcopy(&blk, crd->crd_iv, sizeof(blk)); + memcpy(crd->crd_iv, &blkno, sizeof(blkno)); ]]] Our diff produces a different result where this context line is both deleted and added: [[[ - crd->crd_alg = CRYPTO_AES_XTS; + crd->crd_alg = sd->mds.mdd_crypto.scr_alg; + crd->crd_klen = sd->mds.mdd_crypto.scr_klen; + crd->crd_key = sd->mds.mdd_crypto.scr_key[0]; + memcpy(crd->crd_iv, &blkno, sizeof(blkno)); + } - switch (sd->mds.mdd_crypto.scr_meta->scm_alg) { - case SR_CRYPTOA_AES_XTS_128: - crd->crd_klen = 256; - break; - case SR_CRYPTOA_AES_XTS_256: - crd->crd_klen = 512; - break; - default: - goto unwind; - } - crd->crd_key = sd->mds.mdd_crypto.scr_key[0]; - bcopy(&blk, crd->crd_iv, sizeof(blk)); - } - crwu->cr_wu = wu; - crwu->cr_crp->crp_opaque = crwu; - return (crwu); ]]]


Branches













Tags

This repository contains no tags

Tree

.gitignorecommits | blame
LICENCEcommits | blame
READMEcommits | blame
compat/
diff/
diff-version.mkcommits | blame
include/
lib/
man/
test/

README

This is a collection of diff algorithms, to test various combinations.

The initial aim was to provide a faster diff implementation for got
(gameoftrees.org) with a BSD license, at the u2k20 OpenBSD hackathon.
A side effect could be improving OpenBSD's /usr/bin/diff utility.

At the time of writing, this is little more than a playground / benchmark basis
/ diff algorithm analysis platform. What could be done:
- add profiling and test series to rate diff algorithm combinations.
- interface with / merge into got.

The Myers and Patience Diff algorithm implementations found here are based on
the explanations found in these blog post series:
  https://blog.jcoglan.com/2017/02/12/the-myers-diff-algorithm-part-1/ ff.
and
  https://blog.jcoglan.com/2017/09/19/the-patience-diff-algorithm/ ff.
-- possibly the single most comprehensive explanations of these algorithms.
Many thanks for this valuable door opener!
The source code itself is not based on the code found in those blogs, but
written from scratch with the knowledge gained.

Compile:
  make -C diff

Test:
  make -C test/