difftastic

Commit Graph

Author	SHA1	Message	Date
Wilfred Hughes	117d20c527	Add doc comment	2025-10-04 17:14:09 +07:00
Wilfred Hughes	cabe203465	Improve doc comment	2025-07-30 09:40:25 +07:00
Wilfred Hughes	ba45a40f71	Elide lifetimes in more places. Versions of clippy after the MSRV complain about these, and it's fine on our current Rust version too.	2025-03-18 00:27:11 +07:00
Wilfred Hughes	d8b715bd5b	Rename myers_diff to LCS diff as it's not actually Myers algorithm	2025-03-09 23:55:08 +07:00
Wilfred Hughes	ca9b7da43f	Run cargo fmt	2025-03-06 23:03:40 +07:00
Wilfred Hughes	8953c55cf8	Pass String to new_atom. This is a very tiny perf hit, but allows us to pass newly allocated strings to new_atom(), which will be necessary for normalising case-insensitive languages.	2025-02-23 20:08:45 +07:00
Wilfred Hughes	649c557708	Fix some clippy lints	2024-12-19 21:29:31 +07:00
Wilfred Hughes	39e645832e	Fix compilation on older Rust versions	2024-11-15 22:08:15 +07:00
Wilfred Hughes	d5b1e26d70	Add a debug helper for syntax tree as DOT	2024-11-14 22:55:00 +07:00
Wilfred Hughes	819a672df8	Clarify content ID in debug output on Syntax	2024-11-15 00:03:30 +07:00
Wilfred Hughes	549cb483fe	Fix crash due to trailing newlines in string nodes at EOF. Fixes #782	2024-11-15 00:03:30 +07:00
Andreas Deininger	5ecf3c1eb2	Bump GitHub action workflows to their latest versions	2024-09-11 21:22:59 +07:00
Wilfred Hughes	0973998de2	Clarify enum variant NovelLinePart and expand doc comments	2024-07-30 15:33:37 +07:00
Wilfred Hughes	92fa3fb3de	Ensure files with no common content are aligned	2024-07-20 23:43:04 +07:00
Wilfred Hughes	ffe27c575e	Ensure line splitting distinguishes "foo" and "foo\n". We rely on being able to split lines and rejoin them to obtain the original string. `str::lines()` in the Rust stdlib does not have this property. This was causing crashes in word-diffing on textual diffing, where code paths differed on the number of lines they thought a string had. This was broken in `8b842387a1`. Fixes #688.	2024-07-20 16:09:44 +07:00
Wilfred Hughes	03d1f9bf26	Lint against .to_string() on String	2024-05-07 08:39:07 +07:00
Wilfred Hughes	5e38261b77	cargo fmt	2024-02-29 00:56:16 +07:00
Wilfred Hughes	7e8f928926	Add doc comments	2024-02-29 00:10:52 +07:00
Wilfred Hughes	cac80e992a	Avoid `res` locals in favour of more meaningful names	2023-11-28 13:27:27 +07:00
Wilfred Hughes	569f0038d1	Always filter blank lines at start and end in positions. Fixes #595	2023-11-28 12:35:28 +07:00
Wilfred Hughes	d89d057345	Clarify parameter name	2023-11-28 11:57:11 +07:00
Wilfred Hughes	e96c9463a0	Fix typo	2023-11-28 11:15:11 +07:00
Wilfred Hughes	1ec868e1df	Update to latest line-numbers	2023-11-19 13:11:07 +07:00
Wilfred Hughes	f2b3b34bec	Use pub(crate) everywhere for visibility. This isn't strictly necessary since difftastic is a binary-only crate. However, it improves compiler warnings (see next commit) and potentially helps future changes to make difftastic available as a library.	2023-11-18 16:46:13 +07:00
Wilfred Hughes	60d0f61cbd	Define a separate words module	2023-11-18 16:46:13 +07:00
Wilfred Hughes	6dd0c70767	Add TODO	2023-09-12 13:05:05 +07:00
Wilfred Hughes	1e7866b64e	Do word diffing on text too	2023-09-12 13:03:27 +07:00
Wilfred Hughes	243a4a5f48	Group imports consistently. This corresponds to: `$ cargo +nightly fmt -- --config group_imports=StdExternalCrate`. Since this option is only available on nightly, I'm not adding a rustfmt.toml to enforce this, just doing it as a one-off run.	2023-09-12 12:32:51 +07:00
Wilfred Hughes	b78ba2da4b	Use type names from line_numbers directly	2023-08-26 20:36:07 +07:00
Wilfred Hughes	41c9165c79	Use my line_numbers crate for newline position calculations	2023-08-26 16:25:32 +07:00
Wilfred Hughes	f6ceb2aefd	Update unit test for new subword highlighting heuristic	2023-07-12 12:48:45 +07:00
Wilfred Hughes	a814e01d22	Improve word diffing heuristic and add another sample file	2023-07-12 12:12:32 +07:00
Wilfred Hughes	1d3b6836ef	Handle multiline atoms more accurately in split_atom_words	2023-07-12 11:49:39 +07:00
Wilfred Hughes	5824322244	Require some common words to do subword highlighting. This is important when comparing short string literals. This change has improved several cases in sample_files/ but I've added a new example that made the previous unwanted behaviour much more obvious.	2023-07-10 09:03:21 +07:00
Wilfred Hughes	8eb949eb02	Use DftHashMap everywhere. This is a 4% reduction in instructions for typing_before.ml, but a 0.2% increase in instructions for slow_before.rs. This seems like a win overall, and it also keeps the codebase more consistent and simpler.	2023-07-09 15:41:01 +07:00
Wilfred Hughes	27f59c0b3a	Don't treat - as a word constituent. This produces slightly better results with some string replacements.	2023-07-08 17:16:14 +07:00
Zhenge Chen	ffd49d523a	Detect replaced strings. If a string is replaced with another, apply subword highlighting similar to how we handle replaced comments. Co-authored-by: Wilfred Hughes <me@wilfred.me.uk>	2023-07-08 17:16:06 +07:00
Wilfred Hughes	87d27c5598	Only split numbers inside comments. Inside text files, it seems to be better to be conservative and consider abc123def as one word rather than three. This is noticeable when looking at changes to the compare.expected file, which contains hashes. 123c456 and 345c789 don't really have a `c` in common, so subword highlighting is ugly.	2023-07-07 08:40:06 +07:00
Wilfred Hughes	c07e640b24	Remove contiguous penalty. The contiguous penalty was an attempt to fix the slider problem. Old: A B C D. New: A B A B C D. Unwanted diff: A +B+ +A+ B C D. However, it doesn't make sense for Dijkstra, which is stateless. The best route from vertex X is independent of how we got to vertex X. This worked by dumb luck: in some circumstances we terminate early rather than fully executing Dijkstra's algorithm. This cost tweak improved results on a few test files. However, the post-processing slider logic is a proper, general solution. This was added much later. There's no reason to keep the contiguous penalty now. It's confusing, and makes adding new edge costs with consistent 'X costs more than Y' behaviours more difficult. Performance is essentially neutral: a small decrease in typing_before.ml, a small increase in slow_before.rs.	2023-07-06 08:37:02 +07:00
Wilfred Hughes	3730580ca3	Improve word splitting heuristics. This is particularly noticeable when diffing comments with timestamps such as 2000-12-31T23:59:59, where we don't want 31T23 to be a single word.	2023-06-29 08:33:30 +07:00
Wilfred Hughes	87f19f5e10	Don't include trailing newlines in comment nodes. Including them makes constructing hunks harder to reason about. This change doesn't affect output, but helps when debugging, as it makes multiline atoms much less common.	2023-04-30 09:51:39 +07:00
Wilfred Hughes	d521b29c9e	set_prev_sibling should always recurse	2023-04-20 08:42:10 +07:00
Wilfred Hughes	8b842387a1	Don't clean trailing newline before diffing. Difftastic should take the user's input as-is, or it risks performing an incorrect diff in both textual and syntactic diffing. Fixes #499	2023-03-30 08:46:11 +07:00
Wilfred Hughes	c9105ca0ba	cargo fmt	2023-01-15 15:49:24 +07:00
Wilfred Hughes	a488efd63b	Add highlighting for ignored syntactic elements. This finishes --ignore-comment support. Fixes #449.	2023-01-15 14:49:46 +07:00
Wilfred Hughes	0e3c57c64a	Skip unique items before computing Myers diff on text. This substantially improves performance on text files where there are few lines in common. For example, diffing 10,000-line files with no lines in common is more than 10x faster (8.5 seconds to 0.49 seconds on my machine), and sample_files/huge_cpp_before.cpp is nearly 2% faster. Fixes the case mentioned by @quackenbush in #236. This is inspired by the heuristics discussions at https://github.com/mitsuhiko/similar/issues/15	2023-01-15 11:38:02 +07:00
Wilfred Hughes	8a799af0ff	cargo fmt	2023-01-06 18:18:37 +07:00
Wilfred Hughes	d8d4b8c003	Add is_all_whitespace helper function	2023-01-06 08:36:54 +07:00
Wilfred Hughes	0fc1842595	Improve word highlighting heuristics in comments. Previously we highlighted changed whitespace, which led to ugly results if the number of words changed (there was a different number of whitespace characters so some were highlighted). Also treat _ and - as word constituents, as it produces nicer results when people write example CLI invocations in comments.	2023-01-02 16:56:31 +07:00
QuarticCat	2c6972c1b2	Fix more clippy warnings	2022-09-28 05:47:34 +07:00

65 Commits (117d20c5274b6ed95f728a744ef6954738e7557d)