difftastic

Commit Graph

Author	SHA1	Message	Date
Wilfred Hughes	7f5c11c075	cargo fmt	2024-04-28 16:40:00 +07:00
Wilfred Hughes	8655a9464e	Fix unwanted duplicate node in existing vec Broken in previous commit. This is now only a few percentage points performance win, but it's still a net improvement.	2024-04-28 16:35:40 +07:00
Wilfred Hughes	d15d593708	Move to smallvec for seen vertices This is a surprisingly large perf win. On my Thinkpad: typing_before/after.ml: before: 3.038B instructions after: 2.870B instructions slow_before/after.rs: before: 2.381B instructions after: 1.260B instructions (!)	2024-04-28 16:16:47 +07:00
Steinar H. Gunderson	302570591f	Make Stack be allocated on the arena. This fixes another memory leak, and also removes the need for refcounting the Stack objects and the Node objects they point to.	2024-04-28 15:46:23 +07:00
Steinar H. Gunderson	4fb1478817	Fix memory leak in neighbours array. Vertex is allocated on the arena, so it is never dropped; then it cannot contain a Vec allocated on the regular heap without leaking memory. Replace the Vec with a slice allocated on the arena, which seems to fix most of the leaks. (Some may remain; I haven't checked fully.) It should also be slightly more memory-efficient. It's not clear that we actually need the RefCell instead of just putting Option directly into the structure, but I've let it stay. This issue was probably introduced in `a71d6118cf`.	2024-04-28 15:46:23 +07:00
Wilfred Hughes	93ae0e91db	Fix typos	2024-03-12 23:08:39 +07:00
Wilfred Hughes	3d29dc1228	Silence some clippy lints	2024-03-11 22:26:30 +07:00
Wilfred Hughes	cac80e992a	Avoid `res` locals in favour of more meaningful names	2023-11-28 13:27:27 +07:00
Wilfred Hughes	1dbcd08a90	cargo fmt	2023-11-19 13:10:41 +07:00
Wilfred Hughes	f2b3b34bec	Use pub(crate) everywhere for visibility This isn't strictly necessary since difftastic is a binary-only crate. However, it improves compiler warnings (see next commit) and potentially helps future changes to make difftastic available as a library.	2023-11-18 16:46:13 +07:00
Wilfred Hughes	27b14ae4c7	Clarify probably_punctuation	2023-11-11 11:14:49 +07:00
Wilfred Hughes	1e7866b64e	Do word diffing on text too	2023-09-12 13:03:27 +07:00
Wilfred Hughes	243a4a5f48	Group imports consistently This corresponds to: $ cargo +nightly fmt -- --config group_imports=StdExternalCrate Since this option is only available on nightly, I'm not adding a rustfmt.toml to enforce this, just doing it as a one-off run.	2023-09-12 12:32:51 +07:00
Wilfred Hughes	8731a1b908	Fix rustdoc warnings	2023-09-12 12:21:43 +07:00
Wilfred Hughes	11f457b5f9	Fix typo	2023-08-16 21:20:17 +07:00
Wilfred Hughes	c2b7042b80	Do subword highlighting in more cases This is useful when two strings substantially differ, but have the same e.g. end.	2023-07-10 21:26:24 +07:00
Wilfred Hughes	4aca79f220	Use the raw_entry_mut API on hashbrown::HashMap This saves us searching the hash map twice. This is a modest performance improvement: an instruction count reduction of 4% on slow_before.rs, and 1% reduction on typing_before.ml.	2023-07-09 22:49:37 +07:00
Wilfred Hughes	d9911e0b49	Move DftHashMap to a separate file	2023-07-09 15:37:51 +07:00
Wilfred Hughes	f2456a12b2	Use hashbrown for the alloc_if_new data This was intended to allow usage of .entry_ref(), but it's already a performance win without using that API! It's around a 9% reduction in instructions in slow_before.rs, and 2% reduction in typing_before.ml.	2023-07-09 11:11:03 +07:00
Wilfred Hughes	2607d17d73	Fix spelling in comment	2023-07-08 17:16:14 +07:00
Zhenge Chen	ffd49d523a	Detect replaced strings If a string is replaced with another, apply subword highlighting similar to how we handle replaced comments. Co-authored-by: Wilfred Hughes <me@wilfred.me.uk>	2023-07-08 17:16:06 +07:00
Wilfred Hughes	f86ba13abf	Increase punctuation cost to 200	2023-07-08 14:59:47 +07:00
Wilfred Hughes	495dbe5b14	Improve comments in Edge::cost	2023-07-08 14:53:33 +07:00
Wilfred Hughes	53855e415e	Reduce copying further in set_neighbours This saves a remarkable 8.5% of instructions on slow_before.rs.	2023-07-07 23:37:16 +07:00
Wilfred Hughes	a180fd6d24	Don't return the neighbours inside get_set_neighbours This caused unnecessary cloning, costing 0.2% instructions in some cases, and also made the code less readable.	2023-07-07 23:29:51 +07:00
Wilfred Hughes	c07e640b24	Remove contiguous penalty The contiguous penalty was an attempt to fix the slider problem: // Old A B C D // New A B A B C D // Unwanted diff A +B+ +A+ B C D However, it doesn't make sense for Dijkstra, which is stateless. The best route from vertex X is independent of how we got to vertex X. This worked by dumb luck: in some circumstances we terminate early rather than fully executing Dijkstra's algorithm. This cost tweak improved results on a few test files. However, the post-processing slider logic is a proper, general solution. This was added much later. There's no reason to keep the contiguous penalty now. It's confusing, and makes adding new edge costs with consistent 'X costs more than Y' behaviours more difficult. Performance is essentially neutral: a small decrease in typing_before.ml, a small increase in slow_before.rs.	2023-07-06 08:37:02 +07:00
Wilfred Hughes	31df177881	Increase the punctuation penalty This ensures that choosing an unchanged non-punctuation atom with some novel atoms is better than choosing punctuation and some changed comments. This produces better results in general, see comma_and_comment_after.js for an example. This will be more noticeable after the next commit, where costs of novel atoms are in a smaller range of values.	2023-07-06 08:16:24 +07:00
Wilfred Hughes	c3016eca4a	Add TODO	2023-07-06 08:14:03 +07:00
Wilfred Hughes	43c24047b4	Don't track contiguous status on novel delimiter edges This is harder to reason about, and `2e6666041f` did not include a motivating test case. Removing contiguous status is a minor perf improvement (2% reduction in instructions), makes the code simpler, and does not significantly affect diffing results. Of the two sample files that have changed, the erlang_before.erl file has improved and nest_before.rs is neutral.	2023-07-04 23:53:16 +07:00
Wilfred Hughes	1e4d1828c7	Store probably_punctuation on unchanged edges This is equivalent (increased cost on unchanged nodes vs decreased cost on changed nodes), but easier to reason about. Previously we had multiple notions of changed atoms: NovelAtomLHS, NovelAtomRHS, and ReplacedComment. We want to consider punctuation as less desirable even when e.g. comments are replaced.	2023-07-03 19:48:31 +07:00
Wilfred Hughes	c405b58327	Fix cost for ReplacedComment This needs to be 2x novel nodes, or we prefer it far too often.	2023-07-02 23:12:31 +07:00
Wilfred Hughes	8d44e91a06	Improve lifetime names	2023-04-22 15:25:45 +07:00
Wilfred Hughes	29d87a6ac4	Adding TODO	2023-01-08 22:06:58 +07:00
Wilfred Hughes	c310fb34f9	Use u32 for edge cost This is performance neutral (both runtime and memory size) but the code is slightly more readable as there are fewer conversions.	2023-01-08 21:34:49 +07:00
Wilfred Hughes	00ecf36a22	Pop delimiters immediately, rather than having ExitDelimiter* edges @QuarticCat observed that popping delimiters is unnecessary, and saw a speedup in PR #401. This reduces the number of nodes in typical graphs by ~20%, reducing runtime and memory usage. This works because there is only one thing we can do at the end of a list: pop the delimiter. The syntax node on the other side does not give us more options, we have at most one. Popping all the delimiters as soon as possible is equivalent, and produces the same graph route. This change has also slightly changed the output of samples_files/slow_after.rs, producing a better (more minimal) diff. This is probably luck, due to the path-dependent nature of the route solving logic, but it's a positive sign. A huge thanks to @QuarticCat for their contributions, this is a huge speedup. Co-authored-by: QuarticCat <QuarticCat@pm.me>	2022-12-28 02:00:09 +07:00
Wilfred Hughes	57d1f6d449	Reserve the vec inside allocate_if_new Pushing to this vec was showing 2.5% of total compute time in profiles.	2022-12-28 00:30:25 +07:00
Wilfred Hughes	923989d1a8	clippy fixes	2022-11-03 22:18:56 +07:00
QuarticCat	cd5ba54752	Reduce number of branches of Vertex::eq	2022-10-06 22:33:47 +07:00
QuarticCat	887dec7645	Remove field can_pop_either from Vertex	2022-10-06 22:31:48 +07:00
QuarticCat	7a8044696e	Simplify push_{lhs,rhs}_delimiter	2022-10-06 22:31:38 +07:00
QuarticCat	3b0edb43a1	Change a RefCell in Vertex to Cell	2022-09-28 05:56:53 +07:00
QuarticCat	2c6972c1b2	Fix more clippy warnings	2022-09-28 05:47:34 +07:00
QuarticCat	d48ee2dfdb	Use a faster stack impl	2022-09-28 04:08:42 +07:00
Wilfred Hughes	c602503dec	Treat . as punctuation Closes #388	2022-09-21 21:39:07 +07:00
Wilfred Hughes	fe5ef8757d	Give novel punctuation a lower edge cost We'd rather see an unchanged variable name than an unchanged comma. Fixes #366	2022-09-09 09:47:53 +07:00
Wilfred Hughes	c957818514	Explore two graph nodes for each parenthesis position This produces substantially better diff results, and fixes the 'last item in the list shown as changed' problem. This can produce slower diffing. typing_before.ml takes 10% more instructions and slow_before.rs takes 110% more instructions.	2022-08-21 16:34:17 +07:00
Wilfred Hughes	a71d6118cf	Store predecessors and neighbours as mutable fields in graph nodes This is a more traditional graph representation. It is slightly easier to reason about, and it's clearer that graph node creation time dominates graph exploration. This is a slight performance regression, but it enables better exploration of parenthesis nesting (see next commit). typing_before.ml has regressed from 3.75B instructions to 3.85B instructions and slow_before.rs has regressed from 1.73B instructions to 2.15B instructions. This change has also made the diff output for slow_before.rs slightly worse (note the `lhs` variable is now claimed as changed in more cases). It's not clear why, but presumably means that the node visit order has changed slightly. Closes #324	2022-08-21 16:25:54 +07:00
Wilfred Hughes	51ddcef393	Make clippy happier	2022-07-03 11:20:44 +07:00
Wilfred Hughes	d4285bed7c	Move more files into diff/	2022-05-25 09:31:12 +07:00
Wilfred Hughes	c5fe152f25	Define a parse submodule	2022-05-25 09:28:12 +07:00


51 Commits (7f5c11c0758c87cd0d4989fa2eed09fb06dc1c74)