Commit Graph

8190 Commits (e38d14a14486610bc279a5a057faf2852b0ebcc4)
 

Author SHA1 Message Date
Wilfred Hughes e38d14a144 Prefer aligning blank lines in display
After we've aligned lines based on diff results, we have intermediate
lines that we need to align somehow. Previously, we'd just take them
in order, aligning the first on the LHS with the first on the RHS and
so on.

If the intermediate lines start or end with a sequence of blank lines,
prefer aligning the blank lines. If we have both, arbitrarily choose
the ending blank lines.

This has produced better results in many of the sample files, although
in the case of slow_before.rs we've just changed from a leading blank
line alignment to a trailing blank line alignment.
2022-03-17 22:16:45 +07:00
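
A minimal sketch of the trailing case described above, not the code from this commit. The function names and the `(Option<usize>, Option<usize>)` pair representation are assumptions for illustration. Lines are paired in order, except that blank lines ending both sides are paired with each other first.

```rust
/// A line counts as blank if it contains only whitespace.
fn is_blank(line: &str) -> bool {
    line.trim().is_empty()
}

/// Number of consecutive blank lines at the end of `lines`.
fn trailing_blank_count(lines: &[&str]) -> usize {
    lines.iter().rev().take_while(|l| is_blank(l)).count()
}

/// Pair up intermediate lines from both sides (hypothetical helper).
/// Each pair holds an index into `lhs` and an index into `rhs`;
/// `None` means the line on the other side has no partner.
pub fn align_intermediate(lhs: &[&str], rhs: &[&str]) -> Vec<(Option<usize>, Option<usize>)> {
    // Blank lines that end both sides are reserved for each other.
    let shared_tail = trailing_blank_count(lhs).min(trailing_blank_count(rhs));
    let lhs_head = lhs.len() - shared_tail;
    let rhs_head = rhs.len() - shared_tail;

    let mut pairs = Vec::new();
    // Everything before the shared tail is paired in order,
    // padding the shorter side with None.
    for i in 0..lhs_head.max(rhs_head) {
        let l = if i < lhs_head { Some(i) } else { None };
        let r = if i < rhs_head { Some(i) } else { None };
        pairs.push((l, r));
    }
    // The shared trailing blank lines are paired with each other.
    for i in 0..shared_tail {
        pairs.push((Some(lhs_head + i), Some(rhs_head + i)));
    }
    pairs
}
```

With `["a", "b", ""]` on the left and `["x", ""]` on the right, plain in-order pairing would put `"b"` next to the blank line. This sketch pairs the two blank lines instead and leaves `"b"` unpaired.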
Wilfred Hughes a51b81e86d cargo fmt 2022-03-17 20:45:09 +07:00
Wilfred Hughes eb59e15cd8 Silence some of the noisier clippy lints 2022-03-17 20:13:27 +07:00
Wilfred Hughes 2ce09e0a56 Move functions only used in context calculations to context.rs 2022-03-17 09:32:34 +07:00
Wilfred Hughes 6d58247465 Preserve the outer delimiter when shrinking
Previously, we'd always discard the outer delimiter if it matched on
both sides. This prevented the tree diff finding optimal diffs.

Fixes #124
2022-03-16 23:38:05 +07:00
Wilfred Hughes f4f12003cb Fix minor clippy lints 2022-03-16 22:42:31 +07:00
Wilfred Hughes d709aa98b5 Fix typo 2022-03-15 21:52:09 +07:00
Wilfred Hughes 83eea58fbc Display cost when printing routes 2022-03-15 21:33:01 +07:00
Wilfred Hughes 02976415dc Remove unnecessary braces 2022-03-15 21:19:05 +07:00
Wilfred Hughes 40d3ccd06c Fix confusion between byte length and codepoint length in styling
We should split lines based on their codepoint length, so all our
lengths are on codepoint boundaries. We can then safely index by byte position.

All the positions are measured in bytes, not code points. Tweak
function names to make this explicit.

Fixes #149
2022-03-15 09:50:13 +07:00
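
To illustrate the distinction, here is a minimal sketch (the function name is hypothetical, not taken from this commit). It splits a string by code point count while reporting positions in bytes. Because every split lands on a code point boundary, slicing by the returned byte offsets cannot panic.

```rust
/// Split `s` into pieces of at most `max_chars` code points.
/// Each piece is returned with its starting position in `s`,
/// measured in bytes (not code points).
pub fn split_by_codepoints(s: &str, max_chars: usize) -> Vec<(usize, &str)> {
    assert!(max_chars > 0, "max_chars must be positive");

    let mut pieces = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    let mut count = 0; // code points seen in the current piece

    // char_indices yields the byte offset of each code point, so
    // every offset we slice at is a valid UTF-8 boundary.
    for (byte_idx, _ch) in s.char_indices() {
        if count == max_chars {
            pieces.push((start, &s[start..byte_idx]));
            start = byte_idx;
            count = 0;
        }
        count += 1;
    }
    if start < s.len() {
        pieces.push((start, &s[start..]));
    }
    pieces
}
```

For `"héllo"` with `max_chars` of 2, the second piece starts at byte 3 rather than 2, because `é` occupies two bytes in UTF-8.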
Wilfred Hughes b2229d66a8 Always display all lines in a hunk
Previously we were assuming that the first/last line pairs in a hunk
contained the earliest/latest lines on both sides. This isn't true
when there are no common items between the lines.

This fixes some display issues in load_before/after.js, but also adds a
new integration test that is smaller and easier to eyeball.

Fixes #133
2022-03-15 00:11:30 +07:00
Wilfred Hughes 6d9dc8322f Tweak motto 2022-03-13 22:14:03 +07:00
Wilfred Hughes 8c469e3d36 Ensure slider correction sets the opposite node on both sides
Fixes #154
2022-03-13 21:23:58 +07:00
Wilfred Hughes 19d11951ce Update changelog for c0d1faae6 2022-03-13 20:26:59 +07:00
Wilfred Hughes 7e8bf37bbc Use NonZeroU32 for unique IDs
This is a modest memory saving (reducing instruction counts by 3%
too), and it's nice having a distinct type for IDs.
2022-03-13 18:41:39 +07:00
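
A minimal sketch of the idea, assuming a newtype wrapper and a simple counter. `NodeId` and `IdGen` are hypothetical names, not the types from this commit. One way `NonZeroU32` can save memory is that `Option<NonZeroU32>` uses zero to represent `None`, so it needs no extra space. Whether the commit's saving comes from optional IDs is an assumption here.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical distinct type for IDs, so they cannot be
/// confused with ordinary integers such as counts.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct NodeId(NonZeroU32);

/// Hypothetical generator handing out IDs starting from 1.
pub struct IdGen {
    next: u32,
}

impl IdGen {
    pub fn new() -> Self {
        IdGen { next: 1 }
    }

    /// Return a fresh ID that has not been handed out before.
    pub fn fresh(&mut self) -> NodeId {
        let id = NonZeroU32::new(self.next).expect("IDs start at 1, so are never zero");
        self.next += 1;
        NodeId(id)
    }
}

/// Sizes in bytes of an optional plain `u32` and an optional `NonZeroU32`.
pub fn option_sizes() -> (usize, usize) {
    (size_of::<Option<u32>>(), size_of::<Option<NonZeroU32>>())
}
```

The standard library guarantees that `Option<NonZeroU32>` has the same size as `u32`, whereas `Option<u32>` needs additional space for its discriminant.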
Wilfred Hughes 0eb9f82aed Include node ID when printing ChangeKind 2022-03-13 18:28:33 +07:00
Wilfred Hughes c0d1faae63 Keep exploring the graph even when we find matched delimiters
Previously we'd get tripped up by cases where choosing equal
delimiters would be considered the same as entering each delimiter
separately, making diffs worse.

Fixes #147
2022-03-13 15:41:54 +07:00
Wilfred Hughes bd6a1cbdc6 Store can_pop_either in Vertex 2022-03-13 13:56:18 +07:00
Wilfred Hughes 85fc2c6340 cargo fmt 2022-03-13 13:16:40 +07:00
Wilfred Hughes d416ee289f Improve debug printing
Clarify that we're printing count, not node IDs.
2022-03-13 12:43:50 +07:00
Wilfred Hughes 615c18d63e Clarify cost choices more 2022-03-12 18:03:26 +07:00
Wilfred Hughes 20aa06db5f Remove obsolete comment 2022-03-12 18:01:08 +07:00
Wilfred Hughes fdf5c2f88e Improve bug report template wording
Closes #138
2022-03-12 17:44:54 +07:00
Wilfred Hughes 50a3eb958e Update issue templates 2022-03-12 17:37:14 +07:00
Wilfred Hughes 9439a932e6 Optimise hunk search within matched_lines
This reduces the instruction count from 29B to 22B for the sample
file in #153.
2022-03-12 14:40:49 +07:00
Wilfred Hughes dad463daf5 Use Myers' diff everywhere
The diff crate has a great ergonomic API, but it doesn't implement
Myers' algorithm and performs badly on large inputs.

https://github.com/utkarshkukreti/diff.rs/issues/1

Now that we have a wrapper wu_diff that provides a similar API,
replace the remaining call sites to diff::slice(). These are
relatively cold, so this is a small performance improvement (1%
instruction reduction).
2022-03-12 12:29:34 +07:00
Wilfred Hughes dbf088bd25 cargo fmt 2022-03-12 12:22:38 +07:00
Wilfred Hughes 6210921104 Use Myers' diff for word-level diffing too
This further improves performance on large text files. On the sample
files in #153, this improves performance from 99B instructions to 29B
instructions on my machine.
2022-03-12 12:19:57 +07:00
Wilfred Hughes 5d8af55231 Prefer myers_diff types 2022-03-12 12:17:26 +07:00
Wilfred Hughes edee567e61 Factor out a myers_diff module 2022-03-12 12:15:59 +07:00
Wilfred Hughes dba68d1d2a Don't run syntax highlighting when dumping the tree-sitter output
For large files, tree-sitter syntax highlighting is much more
expensive than the parse itself. We spend most of the runtime
advancing the tree-sitter query cursor.

This doesn't affect runtime of normal usage, but it helps debugging
and makes flamegraphs more readable.

Spotted in #153
2022-03-12 11:53:39 +07:00
Wilfred Hughes 2f8ccd94da Add sample file showing slow perf on large adjacent lists
Part of #156.
2022-03-12 11:28:28 +07:00
Wilfred Hughes afb1b369f4 Switch to wu-diff for textual diffing
In #153 a user reported difftastic never terminated on a 140,000
file. This was due to the diff crate using a very large amount of time
and memory.

The diff crate does not use Myers' algorithm, which has a
divide-and-conquer approach using snakes:

https://blog.jcoglan.com/2017/03/22/myers-diff-in-linear-space-theory/

wu-diff does implement Myers' algorithm and performs much better on
these large files.
2022-03-10 23:12:25 +07:00
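
For readers unfamiliar with the algorithm, here is a minimal sketch of the greedy forward search from Myers' algorithm. It is not the wu-diff implementation, and it is not the linear-space divide-and-conquer variant from the linked article. It only computes the length of the shortest edit script, and the function name is an assumption for illustration.

```rust
/// Length of the shortest edit script (insertions plus deletions)
/// that turns `a` into `b`, using the greedy search from Myers'
/// O(ND) algorithm.
pub fn myers_edit_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let n = a.len() as isize;
    let m = b.len() as isize;
    let max = n + m;
    if max == 0 {
        return 0;
    }

    // Diagonal k (where k = x - y) is stored at index k + offset.
    let offset = max;
    // v[k + offset] is the furthest x reached so far on diagonal k.
    let mut v = vec![0isize; (2 * max + 1) as usize];

    for d in 0..=max {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Extend from whichever neighbouring diagonal reaches
            // further: step down (insertion) or right (deletion).
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the snake: matching items cost nothing.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    unreachable!("an edit script of length n + m always exists")
}
```

The search takes O(ND) time, where D is the edit distance, so it is fast when the two inputs are mostly similar, which is the common case when diffing files.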
Wilfred Hughes e8d9ffa61c Remove unwanted debugging 2022-03-10 22:59:28 +07:00
Wilfred Hughes c81911be51 Allow the unchanged tree threshold to be changed with an env var
This is helpful when debugging production use cases. Fixes #155
2022-03-10 21:15:57 +07:00
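
A minimal sketch of this kind of override, assuming a numeric threshold with a built-in default. The variable name `EXAMPLE_UNCHANGED_THRESHOLD`, the default value and the function names are all hypothetical, since the commit message does not state them.

```rust
use std::env;

/// Hypothetical default, used when no override is given.
const DEFAULT_THRESHOLD: usize = 4;

/// Parse an optional raw setting, falling back to the default
/// when it is missing or is not a valid non-negative integer.
pub fn parse_threshold(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .unwrap_or(DEFAULT_THRESHOLD)
}

/// Read the threshold from the environment. The variable name
/// here is an assumption for illustration only.
pub fn threshold_from_env() -> usize {
    parse_threshold(env::var("EXAMPLE_UNCHANGED_THRESHOLD").ok().as_deref())
}
```

Keeping the parsing separate from the environment lookup makes the fallback behaviour easy to test without modifying process-wide state.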
Wilfred Hughes a3a2bfb317 Roll version 2022-03-10 00:13:26 +07:00
Wilfred Hughes 918b6a1ba3 Don't consider Hack files with <?hh headers and .php extension as PHP 2022-03-10 00:03:57 +07:00
Wilfred Hughes a10e91e1cf Add symlinks for building PHP parser 2022-03-09 23:55:18 +07:00
Wilfred Hughes ed0bde6b91 Adding support for PHP 2022-03-09 23:52:31 +07:00
Wilfred Hughes 51a4da6d7c Add 'vendor/tree-sitter-php/' from commit '0ce134234214427b6aeb2735e93a307881c6cd6f'
git-subtree-dir: vendor/tree-sitter-php
git-subtree-mainline: 020983cd85
git-subtree-split: 0ce1342342
2022-03-09 23:36:23 +07:00
Wilfred Hughes 020983cd85 Merge branch 'split_unchanged_regions'
This fixes #116
2022-03-09 23:12:02 +07:00
Wilfred Hughes 0927a2f9e6 Update changelog 2022-03-09 23:11:04 +07:00
Wilfred Hughes d8c16e561d Update regression tests 2022-03-09 23:04:44 +07:00
Wilfred Hughes f0e90b2aea Improve test naming and ensure they're exercising the relevant parts 2022-03-09 23:02:16 +07:00
Wilfred Hughes cc384bfa6d Add shrinking and get tests passing 2022-03-09 22:45:53 +07:00
Wilfred Hughes 80b9762216 Set threshold and document 2022-03-09 09:40:53 +07:00
Wilfred Hughes d1c060ca17 Make skip unchanged logic less aggressive 2022-03-09 09:40:53 +07:00
Wilfred Hughes f39721d792 Update most of the integration tests 2022-03-09 09:40:53 +07:00
Wilfred Hughes 1895df8f3b Allow unchanged nodes to be checked recursively, and hook up to main.rs 2022-03-09 09:40:53 +07:00
Wilfred Hughes fb7d2d7c86 Get basic examples working with impl 2022-03-09 09:40:53 +07:00