8197 Commits (06ffa0267a38a5b29e7adf2a42b33681365010e5)
 

Author SHA1 Message Date
Wilfred Hughes 06ffa0267a Next release will be a patch release with a single crash fix 2022-03-19 10:24:30 +07:00
Wilfred Hughes d06f357c93 Fix crash when outer delimiters are discarded when skipping unchanged
If we skip some nodes inside a list whose delimiters are unchanged, we
need to mark the outer list as unchanged.

Split ChangeState::Unchanged into UnchangedNode and UnchangedDelimiter
to make this clearer, and add a test.
2022-03-18 16:00:05 +07:00
Wilfred Hughes 6d3b1ffbdd Update test names to reflect current function names 2022-03-18 15:35:04 +07:00
Wilfred Hughes 038463d0e4 Add Debug for ChangeState to help debugging 2022-03-18 13:26:39 +07:00
Wilfred Hughes 0e04e199e4 Remove NovelTree edges from graph
This is a small speedup (up to 6% reduction in instructions) and makes
the graph logic easier to reason about.

In principle this can change diffing results, but all of the sample
files are unaffected.

This logic was intended to solve the problem of a small number of
nodes being matched up in a very large expression. We can do this as
cleanup after diffing, which should be faster and more effective
(see #162).
2022-03-18 12:31:22 +07:00
Wilfred Hughes 0379b49109 Clarify non-goals 2022-03-17 22:41:47 +07:00
Wilfred Hughes b5187d98d0 Roll version 2022-03-17 22:22:42 +07:00
Wilfred Hughes e38d14a144 Prefer aligning blank lines in display
After we've aligned lines based on diff results, we have intermediate
lines that we need to align somehow. Previously, we'd just take them
in order, aligning the first on the LHS with the first on the RHS and
so on.

If the intermediate lines start or end with a sequence of blank lines,
prefer aligning the blank lines. If we have both, arbitrarily choose
the ending blank lines.

This has produced better results in many of the sample files, although
in the case of slow_before.rs we've just changed from a leading blank
line alignment to a trailing blank line alignment.
2022-03-17 22:16:45 +07:00
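The alignment preference described in e38d14a144 can be illustrated with a minimal sketch. This is not the project's code: `align_intermediate`, `ends_blank` and the `Pair` alias are hypothetical names, and the logic is simplified to "pair lines from the end when both sides end with a blank line, otherwise pair them in order", which also keeps leading blank lines together.

```rust
/// One displayed row: a line from the LHS and/or a line from the RHS.
/// (Hypothetical type, used only for this illustration.)
type Pair<'a> = (Option<&'a str>, Option<&'a str>);

/// True if the last line exists and contains only whitespace.
fn ends_blank(lines: &[&str]) -> bool {
    lines.last().map_or(false, |l| l.trim().is_empty())
}

/// Pair up intermediate lines. If both sides end with a blank line,
/// align from the end so the trailing blank lines share a row;
/// otherwise pair lines in order from the start.
fn align_intermediate<'a>(lhs: &[&'a str], rhs: &[&'a str]) -> Vec<Pair<'a>> {
    let len = lhs.len().max(rhs.len());
    let align_end = ends_blank(lhs) && ends_blank(rhs);
    let mut rows = Vec::with_capacity(len);

    for i in 0..len {
        let row = if align_end {
            // Pad the shorter side at the front so the ends line up.
            let l_pad = len - lhs.len();
            let r_pad = len - rhs.len();
            (
                i.checked_sub(l_pad).map(|j| lhs[j]),
                i.checked_sub(r_pad).map(|j| rhs[j]),
            )
        } else {
            (lhs.get(i).copied(), rhs.get(i).copied())
        };
        rows.push(row);
    }
    rows
}
```

With `["a", "b", ""]` on the left and `["x", ""]` on the right, the two blank lines end up on the same row and `"a"` is left unpaired at the top.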
Wilfred Hughes a51b81e86d cargo fmt 2022-03-17 20:45:09 +07:00
Wilfred Hughes eb59e15cd8 Silence some of the noisier clippy lints 2022-03-17 20:13:27 +07:00
Wilfred Hughes 2ce09e0a56 Move functions only used in context calculations to context.rs 2022-03-17 09:32:34 +07:00
Wilfred Hughes 6d58247465 Preserve the outer delimiter when shrinking
Previously, we'd always discard the outer delimiter if it matched on
both sides. This prevented the tree diff finding optimal diffs.

Fixes #124
2022-03-16 23:38:05 +07:00
Wilfred Hughes f4f12003cb Fix minor clippy lints 2022-03-16 22:42:31 +07:00
Wilfred Hughes d709aa98b5 Fix typo 2022-03-15 21:52:09 +07:00
Wilfred Hughes 83eea58fbc Display cost when printing routes 2022-03-15 21:33:01 +07:00
Wilfred Hughes 02976415dc Remove unnecessary braces 2022-03-15 21:19:05 +07:00
Wilfred Hughes 40d3ccd06c Fix confusion between byte length and codepoint length in styling
We should split lines based on their codepoint length, so all our
lengths are on codepoint boundaries. We can then safely index by byte position.

All the positions are measured in bytes, not code points. Tweak
function names to make this explicit.

Fixes #149
2022-03-15 09:50:13 +07:00
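The byte/codepoint distinction in 40d3ccd06c can be shown with a small sketch. It assumes a hypothetical helper `split_by_codepoints` (not a function from the project) that limits each piece by code point count while slicing the string at byte offsets reported by `char_indices`, so every slice lands on a character boundary.

```rust
/// Split `s` into pieces of at most `max_chars` code points each.
/// Lengths are counted in code points, but slicing uses byte offsets
/// from `char_indices`, which are always valid char boundaries.
fn split_by_codepoints(s: &str, max_chars: usize) -> Vec<&str> {
    assert!(max_chars > 0, "max_chars must be positive");

    let mut parts = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    let mut count = 0; // code points in the current piece

    for (byte_idx, _) in s.char_indices() {
        if count == max_chars {
            parts.push(&s[start..byte_idx]);
            start = byte_idx;
            count = 0;
        }
        count += 1;
    }
    if start < s.len() {
        parts.push(&s[start..]);
    }
    parts
}
```

For example, `"héllo"` split at 2 code points gives `"hé"`, `"ll"`, `"o"`; the first piece is 2 code points but 3 bytes long, which is exactly the mismatch that makes naive byte slicing panic.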
Wilfred Hughes b2229d66a8 Always display all lines in a hunk
Previously we were assuming that the first/last line pairs in a hunk
contained the earliest/latest lines on both sides. This isn't true
when there are no common items between the lines.

This fixes some display issues in load_before/after.js, and also adds a
new integration test that is smaller and easier to eyeball.

Fixes #133
2022-03-15 00:11:30 +07:00
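A rough sketch of the idea behind b2229d66a8: rather than trusting the first and last line pairs of a hunk, scan every pair for the smallest and largest line number on each side. The `LinePair` alias and the `bounds` and `hunk_bounds` functions are hypothetical and only illustrate the approach; the project's actual types are not shown in this log.

```rust
/// A row in a hunk: the LHS and RHS line numbers, either of which may
/// be absent. (Hypothetical representation for illustration.)
type LinePair = (Option<u32>, Option<u32>);

/// Smallest and largest value in the iterator, or None if it is empty.
fn bounds(lines: impl Iterator<Item = u32>) -> Option<(u32, u32)> {
    lines.fold(None, |acc, n| match acc {
        None => Some((n, n)),
        Some((lo, hi)) => Some((lo.min(n), hi.max(n))),
    })
}

/// Earliest and latest line on each side, computed over all pairs
/// rather than assuming the first and last pairs hold the extremes.
fn hunk_bounds(pairs: &[LinePair]) -> (Option<(u32, u32)>, Option<(u32, u32)>) {
    let lhs = bounds(pairs.iter().filter_map(|(l, _)| *l));
    let rhs = bounds(pairs.iter().filter_map(|(_, r)| *r));
    (lhs, rhs)
}
```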
Wilfred Hughes 6d9dc8322f Tweak motto 2022-03-13 22:14:03 +07:00
Wilfred Hughes 8c469e3d36 Ensure slider correction sets the opposite node on both sides
Fixes #154
2022-03-13 21:23:58 +07:00
Wilfred Hughes 19d11951ce Update changelog for c0d1faae6 2022-03-13 20:26:59 +07:00
Wilfred Hughes 7e8bf37bbc Use NonZeroU32 for unique IDs
This is a modest memory saving (reducing instruction counts by 3%
too), and it's nice having a distinct type for IDs.
2022-03-13 18:41:39 +07:00
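To illustrate why `NonZeroU32` can save memory for IDs, here is a minimal sketch. `NodeId` and `IdGenerator` are hypothetical names, not the project's types. The point is that the compiler can use the forbidden zero value to represent `None`, so an optional ID needs no extra space.

```rust
use std::num::NonZeroU32;

/// A distinct type for IDs, so they cannot be confused with other integers.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
struct NodeId(NonZeroU32);

impl NodeId {
    fn get(self) -> u32 {
        self.0.get()
    }
}

/// Hands out IDs starting at 1, so zero is never produced.
struct IdGenerator {
    next: u32,
}

impl IdGenerator {
    fn new() -> Self {
        Self { next: 1 }
    }

    fn fresh(&mut self) -> NodeId {
        let id = NonZeroU32::new(self.next).expect("IDs start at 1");
        // This sketch does not handle running out of IDs at u32::MAX.
        self.next += 1;
        NodeId(id)
    }
}
```

`std::mem::size_of::<Option<NonZeroU32>>()` is 4 bytes, whereas `Option<u32>` needs 8, which is where a saving of this kind would come from wherever IDs are stored as optional fields.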
Wilfred Hughes 0eb9f82aed Include node ID when printing ChangeKind 2022-03-13 18:28:33 +07:00
Wilfred Hughes c0d1faae63 Keep exploring the graph even when we find matched delimiters
Previously we'd get tripped up by cases where choosing equal
delimiters would be considered the same as entering each delimiter
separately, making diffs worse.

Fixes #147
2022-03-13 15:41:54 +07:00
Wilfred Hughes bd6a1cbdc6 Store can_pop_either in Vertex 2022-03-13 13:56:18 +07:00
Wilfred Hughes 85fc2c6340 cargo fmt 2022-03-13 13:16:40 +07:00
Wilfred Hughes d416ee289f Improve debug printing
Clarify that we're printing count, not node IDs.
2022-03-13 12:43:50 +07:00
Wilfred Hughes 615c18d63e Clarify cost choices more 2022-03-12 18:03:26 +07:00
Wilfred Hughes 20aa06db5f Remove obsolete comment 2022-03-12 18:01:08 +07:00
Wilfred Hughes fdf5c2f88e Improve bug report template wording
Closes #138
2022-03-12 17:44:54 +07:00
Wilfred Hughes 50a3eb958e Update issue templates 2022-03-12 17:37:14 +07:00
Wilfred Hughes 9439a932e6 Optimise hunk search within matched_lines
This reduces the instruction count from 29B to 22B for the sample
file in #153.
2022-03-12 14:40:49 +07:00
Wilfred Hughes dad463daf5 Use Myers' diff everywhere
The diff crate has a great ergonomic API, but it doesn't implement
Myers' algorithm and performs badly on large inputs.

https://github.com/utkarshkukreti/diff.rs/issues/1

Now that we have a wrapper wu_diff that provides a similar API,
replace the remaining call sites to diff::slice(). These are
relatively cold, so this is a small performance improvement (1%
instruction reduction).
2022-03-12 12:29:34 +07:00
Wilfred Hughes dbf088bd25 cargo fmt 2022-03-12 12:22:38 +07:00
Wilfred Hughes 6210921104 Use Myers' diff for word-level diffing too
This further improves performance on large text files. On the sample
files in #153, this improves performance from 99B instructions to 29B
instructions on my machine.
2022-03-12 12:19:57 +07:00
Wilfred Hughes 5d8af55231 Prefer myers_diff types 2022-03-12 12:17:26 +07:00
Wilfred Hughes edee567e61 Factor out a myers_diff module 2022-03-12 12:15:59 +07:00
Wilfred Hughes dba68d1d2a Don't run syntax highlighting when dumping the tree-sitter output
For large files, tree-sitter syntax highlighting is much more
expensive than the parse itself. We spend most of the runtime
advancing the tree-sitter query cursor.

This doesn't affect runtime of normal usage, but it helps debugging
and makes flamegraphs more readable.

Spotted in #153
2022-03-12 11:53:39 +07:00
Wilfred Hughes 2f8ccd94da Add sample file showing slow perf on large adjacent lists
Part of #156.
2022-03-12 11:28:28 +07:00
Wilfred Hughes afb1b369f4 Switch to wu-diff for textual diffing
In #153 a user reported difftastic never terminated on a 140,000
file. This was due to the diff crate using a very large amount of time
and memory.

The diff crate does not use Myers' algorithm, which has a
divide-and-conquer approach using snakes:

https://blog.jcoglan.com/2017/03/22/myers-diff-in-linear-space-theory/

wu-diff does implement Myers' algorithm and performs much better on
these large files.
2022-03-10 23:12:25 +07:00
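For readers unfamiliar with Myers' algorithm, the sketch below computes the length of the shortest edit script (insertions plus deletions) using the basic greedy form of the algorithm. It is an illustration only: it is not the wu-diff code, it is not the linear-space divide-and-conquer variant mentioned above, and `myers_edit_distance` is a hypothetical name.

```rust
/// Length of the shortest edit script turning `a` into `b`, counting
/// one step per inserted or deleted element (greedy Myers search).
fn myers_edit_distance<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    let n = a.len() as isize;
    let m = b.len() as isize;
    let max = n + m;
    // v[k + offset] holds the furthest x reached on diagonal k = x - y.
    let offset = max + 1;
    let mut v = vec![0isize; (2 * max + 3) as usize];

    for d in 0..=max {
        let mut k = -d;
        while k <= d {
            let idx = (k + offset) as usize;
            // Step down from diagonal k+1 (insertion) or right from
            // diagonal k-1 (deletion), whichever reaches further.
            let mut x = if k == -d || (k != d && v[idx - 1] < v[idx + 1]) {
                v[idx + 1]
            } else {
                v[idx - 1] + 1
            };
            let mut y = x - k;
            // Follow the "snake": consume matching elements for free.
            while x < n && y < m && a[x as usize] == b[y as usize] {
                x += 1;
                y += 1;
            }
            v[idx] = x;
            if x >= n && y >= m {
                return d as usize;
            }
            k += 2;
        }
    }
    max as usize
}
```

The search costs roughly O((N+M)·D) time, where D is the edit distance, so inputs that are mostly similar are handled quickly even when they are very long.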
Wilfred Hughes e8d9ffa61c Remove unwanted debugging 2022-03-10 22:59:28 +07:00
Wilfred Hughes c81911be51 Allow the unchanged tree threshold to be changed with an env var
This is helpful when debugging production use cases. Fixes #155
2022-03-10 21:15:57 +07:00
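Overriding a threshold through an environment variable, as c81911be51 describes, typically looks something like the sketch below. The variable name `EXAMPLE_UNCHANGED_THRESHOLD`, the default value and both function names are assumptions made for illustration; the real name and default are not given in this log.

```rust
use std::env;

/// Hypothetical default, used when the variable is unset or invalid.
const DEFAULT_THRESHOLD: usize = 4;

/// Parse an optional raw value, falling back to the default on
/// missing or malformed input.
fn threshold_from(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse().ok())
        .unwrap_or(DEFAULT_THRESHOLD)
}

/// Read the threshold from the environment.
/// `EXAMPLE_UNCHANGED_THRESHOLD` is a placeholder name.
fn unchanged_threshold() -> usize {
    threshold_from(env::var("EXAMPLE_UNCHANGED_THRESHOLD").ok().as_deref())
}
```

Keeping the parsing in `threshold_from` separate from the environment lookup makes the fallback behaviour easy to test without mutating process state.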
Wilfred Hughes a3a2bfb317 Roll version 2022-03-10 00:13:26 +07:00
Wilfred Hughes 918b6a1ba3 Don't consider Hack files with <?hh headers and .php extension as PHP 2022-03-10 00:03:57 +07:00
Wilfred Hughes a10e91e1cf Add symlinks for building PHP parser 2022-03-09 23:55:18 +07:00
Wilfred Hughes ed0bde6b91 Adding support for PHP 2022-03-09 23:52:31 +07:00
Wilfred Hughes 51a4da6d7c Add 'vendor/tree-sitter-php/' from commit '0ce134234214427b6aeb2735e93a307881c6cd6f'
git-subtree-dir: vendor/tree-sitter-php
git-subtree-mainline: 020983cd85
git-subtree-split: 0ce1342342
2022-03-09 23:36:23 +07:00
Wilfred Hughes 020983cd85 Merge branch 'split_unchanged_regions'
This fixes #116
2022-03-09 23:12:02 +07:00
Wilfred Hughes 0927a2f9e6 Update changelog 2022-03-09 23:11:04 +07:00
Wilfred Hughes d8c16e561d Update regression tests 2022-03-09 23:04:44 +07:00