difftastic/sample_files
Wilfred Hughes 0e3c57c64a Skip unique items before computing Myers diff on text
This substantially improves performance on text files where there are
few lines in common.

For example, diffing 10,000-line files with no lines in common is more
than 10x faster (8.5 seconds down to 0.49 seconds on my machine), and
sample_files/huge_cpp_before.cpp is nearly 2% faster.

Fixes the case mentioned by @quackenbush in #236.

This is inspired by the heuristics discussions at
https://github.com/mitsuhiko/similar/issues/15
2023-01-15 11:38:02 +07:00
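
The idea in the commit message above can be illustrated with a minimal sketch. It is not difftastic's implementation (which is in Rust and not shown in this listing). The function name `diff_skipping_unique` is hypothetical, and Python's `difflib.SequenceMatcher` stands in for a Myers diff. The point is only that a line absent from the other file can never be part of a match, so it can be marked as changed up front and left out of the expensive comparison.

```python
from difflib import SequenceMatcher


def diff_skipping_unique(lhs, rhs):
    """Return (lhs_changed, rhs_changed): sorted indices of unmatched lines.

    Hypothetical helper for illustration; difflib is a stand-in for Myers diff.
    """
    lhs_set, rhs_set = set(lhs), set(rhs)

    # Lines that never occur on the other side cannot match anything,
    # so record them as changed without running the diff on them.
    lhs_changed = {i for i, line in enumerate(lhs) if line not in rhs_set}
    rhs_changed = {j for j, line in enumerate(rhs) if line not in lhs_set}

    # Only lines that occur on both sides go through the real diff.
    lhs_kept = [i for i, line in enumerate(lhs) if line in rhs_set]
    rhs_kept = [j for j, line in enumerate(rhs) if line in lhs_set]

    matcher = SequenceMatcher(
        None,
        [lhs[i] for i in lhs_kept],
        [rhs[j] for j in rhs_kept],
        autojunk=False,
    )
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Map positions in the filtered lists back to original indices.
            lhs_changed.update(lhs_kept[i1:i2])
            rhs_changed.update(rhs_kept[j1:j2])

    return sorted(lhs_changed), sorted(rhs_changed)
```

When the two inputs share no lines at all, both filtered lists are empty and the diff step has nothing to do, which is the case the commit message reports as the largest speed-up.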
dir_after Only add colour to the first hunk header 2022-12-01 09:38:36 +07:00
dir_before Only add colour to the first hunk header 2022-12-01 09:38:36 +07:00
Session_after.kt Add Kotlin support 2022-04-14 00:21:29 +07:00
Session_before.kt Add Kotlin support 2022-04-14 00:21:29 +07:00
b2_math_after.h Add a C++ test file 2022-04-17 16:33:20 +07:00
b2_math_before.h Add a C++ test file 2022-04-17 16:33:20 +07:00
bad_combine_after.rs Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
bad_combine_before.rs Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
change_outer_after.el Keep exploring the graph even when we find matched delimiters 2022-03-13 15:41:54 +07:00
change_outer_before.el Keep exploring the graph even when we find matched delimiters 2022-03-13 15:41:54 +07:00
chinese_after.po add sample_files for Chinese (CJK fullwidth characters) 2022-07-04 12:01:49 +07:00
chinese_before.po add sample_files for Chinese (CJK fullwidth characters) 2022-07-04 12:01:49 +07:00
clojure_after.clj Define a highlighting file for Clojure 2022-03-22 21:08:46 +07:00
clojure_before.clj Define a highlighting file for Clojure 2022-03-22 21:08:46 +07:00
comma_after.js Give novel punctuation a lower edge cost 2022-09-09 09:47:53 +07:00
comma_before.js Give novel punctuation a lower edge cost 2022-09-09 09:47:53 +07:00
comments_after.rs Improve word highlighting heuristics in comments 2023-01-02 16:56:31 +07:00
comments_before.rs Improve word highlighting heuristics in comments 2023-01-02 16:56:31 +07:00
compare.expected Skip unique items before computing Myers diff on text 2023-01-15 11:38:02 +07:00
compare_all.sh Unset LC_ALL and LC_COLLATE to stabilize regression test output 2022-09-08 22:37:27 +07:00
context_after.rs Always display all lines in a hunk 2022-03-15 00:11:30 +07:00
context_before.rs Always display all lines in a hunk 2022-03-15 00:11:30 +07:00
contiguous_after.js Add a sample file exercising contiguous item logic 2021-07-31 10:50:02 +07:00
contiguous_before.js Add a sample file exercising contiguous item logic 2021-07-31 10:50:02 +07:00
css_after.css Treat colour values (e.g. `#FFF`) as atoms in CSS 2023-01-08 22:22:46 +07:00
css_before.css Treat float values as atoms in CSS 2021-12-04 18:34:24 +07:00
dart_after.dart Add a Dart sample 2022-03-20 11:07:39 +07:00
dart_before.dart Add a Dart sample 2022-03-20 11:07:39 +07:00
elisp_after.el Lisp sample files 2019-11-18 11:51:45 +07:00
elisp_before.el Lisp sample files 2019-11-18 11:51:45 +07:00
elisp_contiguous_after.el Add another contiguous atoms test file 2022-01-15 18:52:03 +07:00
elisp_contiguous_before.el Add another contiguous atoms test file 2022-01-15 18:52:03 +07:00
elm_after.elm add newline to module exports 2022-04-03 21:37:17 +07:00
elm_before.elm add support for Elm 2022-04-03 20:18:33 +07:00
elvish_after.elv Add Elvish support 2022-05-07 20:12:43 +07:00
elvish_before.elv Add Elvish support 2022-05-07 20:12:43 +07:00
erlang_after.erl Document Erlang support and add test 2022-12-15 23:30:45 +07:00
erlang_before.erl Document Erlang support and add test 2022-12-15 23:30:45 +07:00
hack_after.php Add basic syntax highlighting for Hack 2022-02-02 23:39:00 +07:00
hack_before.php Add basic syntax highlighting for Hack 2022-02-02 23:39:00 +07:00
hare_after.ha Add support for Hare 2022-09-13 23:34:16 +07:00
hare_before.ha Add support for Hare 2022-09-13 23:34:16 +07:00
haskell_after.hs Support @boolean and @character highlighting queries 2022-04-03 22:36:15 +07:00
haskell_before.hs Support @boolean and @character highlighting queries 2022-04-03 22:36:15 +07:00
hcl_after.hcl fix: Add atoms for hcl 2022-04-24 15:57:51 +07:00
hcl_before.hcl fix: Add atoms for hcl 2022-04-24 15:57:51 +07:00
helpful-unit-test-after.el Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
helpful-unit-test-before.el Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
helpful_after.el Add another elisp sample file 2021-07-27 23:48:58 +07:00
helpful_before.el Add another elisp sample file 2021-07-27 23:48:58 +07:00
html_after.html Ensure we use the correct config for sublanguage parsing 2023-01-08 22:24:43 +07:00
html_before.html Add HTML parser 2022-07-01 12:23:20 +07:00
html_simple_after.html Add HTML parser 2022-07-01 12:23:20 +07:00
html_simple_before.html Add HTML parser 2022-07-01 12:23:20 +07:00
huge_cpp_after.cpp Add large files from #293 for test 2022-07-03 22:18:30 +07:00
huge_cpp_before.cpp Add large files from #293 for test 2022-07-03 22:18:30 +07:00
identical_after.scala Clarify wording when a parsed file has no changes at all 2022-03-26 23:31:12 +07:00
identical_before.scala Clarify wording when a parsed file has no changes at all 2022-03-26 23:31:12 +07:00
if_after.py Simplify Python example file 2021-08-30 21:20:17 +07:00
if_before.py Simplify Python example file 2021-08-30 21:20:17 +07:00
janet_after.janet Add sample files and update compare.expected 2022-03-29 14:53:10 +07:00
janet_before.janet Add sample files and update compare.expected 2022-03-29 14:53:10 +07:00
java_after.java Rename Java sample file for consistency 2021-11-14 13:34:46 +07:00
java_before.java Rename Java sample file for consistency 2021-11-14 13:34:46 +07:00
javascript_after.js Making the JS sample file more interesting 2019-11-18 17:59:04 +07:00
javascript_before.js Making the JS sample file more interesting 2019-11-18 17:59:04 +07:00
javascript_simple_after.js Rename JS sample file 2022-02-13 17:18:39 +07:00
javascript_simple_before.js Rename JS sample file 2022-02-13 17:18:39 +07:00
json_after.json Rename JSON files to match sample file naming convention 2021-10-23 16:24:54 +07:00
json_before.json Rename JSON files to match sample file naming convention 2021-10-23 16:24:54 +07:00
jsx_after.jsx Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
jsx_before.jsx Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
julia_after.jl Add Julia sample files 2022-07-04 19:57:00 +07:00
julia_before.jl Add Julia sample files 2022-07-04 19:57:00 +07:00
load_after.js Add a JS test file showing a larger change 2021-07-27 23:48:03 +07:00
load_before.js Add a JS test file showing a larger change 2021-07-27 23:48:03 +07:00
lua_after.lua Add lua support 2022-03-30 06:21:10 +07:00
lua_before.lua Add lua support 2022-03-30 06:21:10 +07:00
makefile_after.mk Replace tabs during display, so parsing sees the original source 2023-01-01 22:44:47 +07:00
makefile_before.mk Replace tabs during display, so parsing sees the original source 2023-01-01 22:44:47 +07:00
metadata_after.clj Add regression test for #181 2022-04-09 12:18:51 +07:00
metadata_before.clj Add regression test for #181 2022-04-09 12:18:51 +07:00
modules_after.ml Add sample file showing slow perf large adjacent lists 2022-03-12 11:28:28 +07:00
modules_before.ml Add sample file showing slow perf large adjacent lists 2022-03-12 11:28:28 +07:00
multibyte_after.py Fix confusion between byte length and codepoint length in styling 2022-03-15 09:50:13 +07:00
multibyte_before.py Fix confusion between byte length and codepoint length in styling 2022-03-15 09:50:13 +07:00
multiline_string_after.ml Ensure that blank lines in multiline strings are shown as changed 2022-03-23 22:47:17 +07:00
multiline_string_before.ml Ensure that blank lines in multiline strings are shown as changed 2022-03-23 22:47:17 +07:00
nest_after.rs Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
nest_before.rs Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
nested_slider_after.rs Fix nested sliders in C-like languages 2022-04-08 09:41:56 +07:00
nested_slider_before.rs Fix nested sliders in C-like languages 2022-04-08 09:41:56 +07:00
nesting_after.el Use depth difference as a heuristic when comparing equal nodes 2022-03-08 09:44:55 +07:00
nesting_before.el Use depth difference as a heuristic when comparing equal nodes 2022-03-08 09:44:55 +07:00
nix_after.nix Add support for Nix 2022-03-29 22:46:09 +07:00
nix_before.nix Add support for Nix 2022-03-29 22:46:09 +07:00
ocaml_after.ml Treat attribute IDs in OCaml as atoms 2022-01-27 20:25:17 +07:00
ocaml_before.ml Treat attribute IDs in OCaml as atoms 2022-01-27 20:25:17 +07:00
outer_delimiter_after.el Preserve the outer delimiter when shrinking 2022-03-16 23:38:05 +07:00
outer_delimiter_before.el Preserve the outer delimiter when shrinking 2022-03-16 23:38:05 +07:00
pascal_after.pascal Add Pascal support 2022-09-13 00:05:23 +07:00
pascal_before.pascal Add Pascal support 2022-09-13 00:05:23 +07:00
perl_after.pl Treat perl regexes as atoms too 2022-04-29 18:28:01 +07:00
perl_before.pl Treat perl regexes as atoms too 2022-04-29 18:28:01 +07:00
prefer_outer_after.el Prefer outer delimiter in lisps 2022-05-11 11:54:02 +07:00
prefer_outer_before.el Prefer outer delimiter in lisps 2022-05-11 11:54:02 +07:00
preprocesor_after.h Discard '\n' nodes in C and C++ 2022-03-27 23:37:23 +07:00
preprocesor_before.h Discard '\n' nodes in C and C++ 2022-03-27 23:37:23 +07:00
qml_after.qml Add support for QML 2022-09-10 11:38:35 +07:00
qml_before.qml Add support for QML 2022-09-10 11:38:35 +07:00
ruby_after.rb Configure atoms for Ruby 2021-11-20 14:46:12 +07:00
ruby_before.rb Add basic Ruby support 2021-11-20 01:08:33 +07:00
scala_after.scala Test comment highlighting in Scala 2022-02-02 23:07:25 +07:00
scala_before.scala Test comment highlighting in Scala 2022-02-02 23:07:25 +07:00
simple_after.js Ensure we always include the first and last hunk line 2022-01-22 18:46:55 +07:00
simple_after.txt Ensure we always include the first and last hunk line 2022-01-22 18:46:55 +07:00
simple_before.js Ensure we always include the first and last hunk line 2022-01-22 18:46:55 +07:00
simple_before.txt Ensure we always include the first and last hunk line 2022-01-22 18:46:55 +07:00
slider_after.rs Add files from #134 to sample file collection 2022-02-12 11:10:23 +07:00
slider_at_end_after.json Fix sliders in a single global pass 2022-09-02 18:10:09 +07:00
slider_at_end_before.json Fix sliders in a single global pass 2022-09-02 18:10:09 +07:00
slider_before.rs Add files from #134 to sample file collection 2022-02-12 11:10:23 +07:00
slow_after.rs Add benchmark file (takes 3-4 seconds today) 2021-09-12 21:01:54 +07:00
slow_before.rs Add benchmark file (takes 3-4 seconds today) 2021-09-12 21:01:54 +07:00
small_after.js Add a small JS sample file 2019-11-18 17:47:59 +07:00
small_before.js Add a small JS sample file 2019-11-18 17:47:59 +07:00
swift_after.swift Add Swift support 2022-04-26 17:08:23 +07:00
swift_before.swift Add Swift support 2022-04-26 17:08:23 +07:00
syntax_error_after.js Ensure sample file is a syntax error for tree-sitter 2021-12-04 18:03:06 +07:00
syntax_error_before.js Ensure sample file is a syntax error for tree-sitter 2021-12-04 18:03:06 +07:00
tab_after.c Allow users to override the tab width 2022-04-28 20:47:04 +07:00
tab_before.c Allow users to override the tab width 2022-04-28 20:47:04 +07:00
tailwind_after.css Treat error nodes as atoms 2022-10-15 22:50:08 +07:00
tailwind_before.css Treat error nodes as atoms 2022-10-15 22:50:08 +07:00
text_after.txt Expand text sample file 2022-01-02 19:18:19 +07:00
text_before.txt Expand text sample file 2022-01-02 19:18:19 +07:00
todomvc_after.gleam add gleam 2022-03-31 14:08:05 +07:00
todomvc_before.gleam add gleam 2022-03-31 14:08:05 +07:00
toml_after.toml Add support for TOML 2022-04-14 21:21:36 +07:00
toml_before.toml Add support for TOML 2022-04-14 21:21:36 +07:00
typescript_after.ts Treat predefined_type as an atom in TypeScript 2023-01-07 22:43:50 +07:00
typescript_before.ts Treat predefined_type as an atom in TypeScript 2023-01-07 22:43:50 +07:00
typing_after.ml Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
typing_before.ml Make sample files naming consistent so they're all used in regression test 2022-02-13 17:21:20 +07:00
utf16_after.py Add sample files missing from b1b3756fa7 2022-09-01 09:21:25 +07:00
utf16_before.py Add sample files missing from b1b3756fa7 2022-09-01 09:21:25 +07:00
whitespace_after.tsx Ensure unchanged MatchedPos have the same number on LHS and RHS 2022-04-09 21:12:31 +07:00
whitespace_before.tsx Ensure unchanged MatchedPos have the same number on LHS and RHS 2022-04-09 21:12:31 +07:00
yaml_after.yaml Improve YAML handling 2022-04-03 22:26:33 +07:00
yaml_before.yaml Fix block scalars in YAML 2022-04-14 18:45:48 +07:00
zig_after.zig Add Zig support 2022-03-30 23:32:48 +07:00
zig_before.zig Fix parsing of built-in Zig identifiers 2022-04-09 19:38:07 +07:00