This produces substantially better diff results, and fixes the 'last
item in the list shown as changed' problem.
This can produce slower diffing. typing_before.ml takes 10% more
instructions and slow_before.rs takes 110% more instructions.
This is a more traditional graph representation. It is slightly easier
to reason about, and it's clearer that graph node creation time
dominates graph exploration.
This is a performance regression, but it enables better exploration
of parenthesis nesting (see next commit). typing_before.ml has
regressed from 3.75B instructions to 3.85B instructions (about 3%)
and slow_before.rs has regressed from 1.73B instructions to 2.15B
instructions (about 24%).
This change has also made the diff output for slow_before.rs slightly
worse (note the `lhs` variable is now claimed as changed in more
cases). It's not clear why, but it presumably means that the node
visit order has changed slightly.
Closes #324
This removes the need to special-case Perl, and is necessary for
CMake (which has nodes bracket_comment and line_comment that aren't
marked as 'extra').
This makes the 'lists are sufficiently similar' heuristic more
aggressive. Previously we'd look for lists with common start or end
children and the same delimiters.
This worked badly for cases like:
LHS: (novel-lhs (a b c d e))
RHS: (novel-rhs (a b c d e))
Instead, look for sublists that are unique on both sides and occur on
both the LHS and RHS root being considered. This allows us to match up
many more cases.
Consider lists to be sufficiently similar exclusively using this
(surprisingly effective) heuristic, and don't consider outer
delimiters.
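A minimal sketch of the idea, not the actual implementation: the
`Sexp` type and the `count_sublists` and `sufficiently_similar` names
are hypothetical, and a real syntax tree carries more information
(delimiters, positions) than this toy S-expression. It treats two
roots as similar when some sublist occurs exactly once under each of
them.

```rust
use std::collections::HashMap;

/// A toy S-expression (hypothetical type): an atom or a list of children.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Sexp {
    Atom(String),
    List(Vec<Sexp>),
}

/// Count how often each sublist occurs anywhere below `node`.
/// The root itself is not counted, only its descendants.
fn count_sublists<'a>(node: &'a Sexp, counts: &mut HashMap<&'a Sexp, usize>) {
    if let Sexp::List(children) = node {
        for child in children {
            if let Sexp::List(_) = child {
                *counts.entry(child).or_insert(0) += 1;
            }
            count_sublists(child, counts);
        }
    }
}

/// True if some sublist occurs exactly once under `lhs` and exactly
/// once under `rhs`. Outer delimiters are deliberately ignored.
fn sufficiently_similar<'a>(lhs: &'a Sexp, rhs: &'a Sexp) -> bool {
    let mut lhs_counts = HashMap::new();
    let mut rhs_counts = HashMap::new();
    count_sublists(lhs, &mut lhs_counts);
    count_sublists(rhs, &mut rhs_counts);

    lhs_counts
        .iter()
        .any(|(sublist, &n)| n == 1 && rhs_counts.get(sublist) == Some(&1))
}
```

With the example above, (a b c d e) occurs once on each side, so the
two roots are matched up even though their first children differ.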
This substantially improves performance in many cases, particularly
for files that are fairly flat (many toplevel lists with little
nesting).
Fixes #306
When git calls us, we always know the file name. If we're called with
two arguments and one is /dev/null, use the other for language
detection and display.