Previously we were assuming that the first/last line pairs in a hunk
contained the earliest/latest lines on both sides. This isn't true
when there are no common items between the lines.
This fixes some display issues in load_before/after.js, and also
includes a new integration test that is smaller and easier to eyeball.
Fixes #133
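As a rough illustration of the idea (not the actual difftastic code), the
bounds of a hunk can be found by scanning every line pair rather than
trusting the first and last pair. The LinePair type and the side_bounds and
hunk_bounds functions are hypothetical names made up for this sketch.

```rust
/// Hypothetical representation: one (lhs, rhs) line pair in a hunk.
/// A side is `None` when the line exists only on the other side.
type LinePair = (Option<usize>, Option<usize>);

/// Smallest and largest line number seen on one side, if any.
fn side_bounds(lines: impl Iterator<Item = usize>) -> Option<(usize, usize)> {
    lines.fold(None, |acc, n| match acc {
        None => Some((n, n)),
        Some((lo, hi)) => Some((lo.min(n), hi.max(n))),
    })
}

/// Scan every pair in the hunk instead of assuming the first and last
/// pairs hold the earliest and latest lines on both sides.
fn hunk_bounds(pairs: &[LinePair]) -> (Option<(usize, usize)>, Option<(usize, usize)>) {
    let lhs = side_bounds(pairs.iter().filter_map(|p| p.0));
    let rhs = side_bounds(pairs.iter().filter_map(|p| p.1));
    (lhs, rhs)
}
```

With pairs such as (Some(10), None), (None, Some(4)), (Some(11), Some(5)),
the first pair has no right-hand line at all, so the earliest right-hand
line only shows up when the whole hunk is scanned.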
Previously we'd get tripped up by cases where choosing equal
delimiters would be considered the same as entering each delimiter
separately, making diffs worse.
Fixes #147
The diff crate has a great ergonomic API, but it doesn't implement
Myers' algorithm and performs badly on large inputs.
https://github.com/utkarshkukreti/diff.rs/issues/1
Now that we have a wrapper wu_diff that provides a similar API,
replace the remaining call sites to diff::slice(). These are
relatively cold, so this is a small performance improvement (1%
instruction reduction).
This further improves performance on large text files. On the sample
files in #153, this improves performance from 99B instructions to 29B
instructions on my machine.
For large files, tree-sitter syntax highlighting is much more
expensive than the parse itself. We spend most of the runtime
advancing the tree-sitter query cursor.
This doesn't affect runtime of normal usage, but it helps debugging
and makes flamegraphs more readable.
Spotted in #153
In #153 a user reported that difftastic never terminated on a 140,000
line file. This was due to the diff crate using a very large amount of time
and memory.
The diff crate does not use Myers' algorithm, which has a
divide-and-conquer approach using snakes:
https://blog.jcoglan.com/2017/03/22/myers-diff-in-linear-space-theory/
wu-diff does implement Myers' algorithm and performs much better on
these large files.
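
For reference, a minimal sketch of the greedy forward pass from Myers'
O(ND) paper. This is not the code in wu-diff or difftastic, and it is not
the linear-space divide-and-conquer variant described in the link above.
It only computes the length of the shortest edit script, and the function
name myers_distance is made up for this example.

```rust
/// Number of insertions plus deletions needed to turn `a` into `b`,
/// found by exploring edit paths in order of increasing cost `d`.
fn myers_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let n = a.len() as isize;
    let m = b.len() as isize;
    let max = n + m;
    if max == 0 {
        return 0;
    }
    // v[k + offset] holds the furthest x reached so far on diagonal k = x - y.
    let offset = max;
    let mut v = vec![0isize; (2 * max + 1) as usize];
    for d in 0..=max {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Either step down from diagonal k + 1 (an insertion) or
            // step right from diagonal k - 1 (a deletion).
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the "snake": a run of equal items costs nothing.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    unreachable!("an edit script of length n + m always exists")
}
```

The runtime grows with the size of the difference rather than the product
of the input lengths, which is why this family of algorithms copes better
with large, mostly similar files.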
This reverts commit 7544874a55. It turns
out there are cases where this is still necessary (see new sample
file). It's also performance neutral.
This bug became more obvious with the recent 'skip unchanged'
optimisations. The optimisation changed the number of preceding nodes and
exposed this bug more often.
Introduce a new type EnteredDelimiter that tracks entering/leaving
list nodes. The PopEither and PopBoth cases reflect the choices more
accurately than a 2-tuple of options.
This is a performance hit (slow_before.rs runtime has increased by
49%) but it's important for diff correctness.
Fixes #147
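
A simplified sketch of what a type like EnteredDelimiter might look like.
The variant names come from this commit, but the payloads, the NodeId
alias and the helper functions below are assumptions made for
illustration, not the real difftastic definitions. Right-hand helpers
would mirror the left-hand ones and are omitted.

```rust
/// Hypothetical stand-in for a reference to a list node.
type NodeId = u32;

#[derive(Debug, Clone, PartialEq, Eq)]
enum EnteredDelimiter {
    /// Delimiters entered separately on each side. Either side may be
    /// left on its own.
    PopEither { lhs: Vec<NodeId>, rhs: Vec<NodeId> },
    /// A pair of delimiters entered together. They must be left together.
    PopBoth(NodeId, NodeId),
}

/// Enter a matched pair of delimiters.
fn push_both(stack: &mut Vec<EnteredDelimiter>, lhs: NodeId, rhs: NodeId) {
    stack.push(EnteredDelimiter::PopBoth(lhs, rhs));
}

/// Enter a delimiter on the left-hand side only.
fn push_lhs(stack: &mut Vec<EnteredDelimiter>, id: NodeId) {
    match stack.last_mut() {
        Some(EnteredDelimiter::PopEither { lhs, .. }) => lhs.push(id),
        _ => stack.push(EnteredDelimiter::PopEither { lhs: vec![id], rhs: vec![] }),
    }
}

/// Leave a matched pair. Only allowed when the innermost entry is PopBoth.
fn pop_both(stack: &mut Vec<EnteredDelimiter>) -> Option<(NodeId, NodeId)> {
    match stack.pop() {
        Some(EnteredDelimiter::PopBoth(l, r)) => Some((l, r)),
        Some(other) => {
            // Not a matched pair: put it back and refuse.
            stack.push(other);
            None
        }
        None => None,
    }
}

/// Leave the innermost left-hand delimiter that was entered separately.
fn pop_lhs(stack: &mut Vec<EnteredDelimiter>) -> Option<NodeId> {
    let popped = match stack.last_mut() {
        Some(EnteredDelimiter::PopEither { lhs, .. }) => lhs.pop(),
        _ => None,
    };
    // Drop the entry once nothing is left on either side.
    let now_empty = matches!(
        stack.last(),
        Some(EnteredDelimiter::PopEither { lhs, rhs }) if lhs.is_empty() && rhs.is_empty()
    );
    if now_empty {
        stack.pop();
    }
    popped
}
```

Because the two cases are separate variants, a pair entered together can
never be confused with two delimiters entered one at a time, which a
2-tuple of options cannot express as directly.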