Prefer 'line-oriented diff' terminology

'Text diff' is confusing, as the input files to difftastic's structural
diffing logic are also text. 'Line-oriented' more accurately captures the
important aspect.
pull/832/head
Wilfred Hughes 2025-04-28 09:03:22 +07:00
parent eff3e16cad
commit 149c040c45
10 changed files with 77 additions and 119 deletions

@ -291,7 +291,7 @@ types.
### Diffing ### Diffing
Fixed an issue where adding or removing blank lines would be ignored Fixed an issue where adding or removing blank lines would be ignored
by the textual diffing logic. by the line-oriented diffing logic.
Directory diffing now respects `.gitignore` files. Directory diffing now respects `.gitignore` files.
@ -512,8 +512,8 @@ UTF-16. Many files can be decoded as UTF-16 without decoding errors
but produce nonsense results, so this heuristic seems to work better. but produce nonsense results, so this heuristic seems to work better.
Fixed an issue where difftastic would discard the last newline in a Fixed an issue where difftastic would discard the last newline in a
file before diffing. This was most noticeable when doing textual diffs file before diffing. This was most noticeable when doing line-oriented
and the last line had changed. diffs and the last line had changed.
### Display ### Display
@ -543,12 +543,12 @@ Added support for Newick and Racket.
### Diffing ### Diffing
Difftastic now uses a textual diff on files that have any parse Difftastic now uses a line-oriented diff on files that have any parse
errors. The parse error limit defaults to 0, but it is configurable errors. The parse error limit defaults to 0, but it is configurable
with `DFT_PARSE_ERROR_LIMIT` or `--parse-error-limit`. with `DFT_PARSE_ERROR_LIMIT` or `--parse-error-limit`.
Textual diffing now respects `--check-only`, consistent with syntactic Line-oriented diffing now respects `--check-only`, consistent with
diffing. structural diffing.
### Display ### Display
@ -603,13 +603,13 @@ Improved CSS parsing and HTML sublanguage parsing.
Added an `--ignore-comments` option. Added an `--ignore-comments` option.
Improved textual diffing performance, particularly when the two files Improved line-oriented diffing performance, particularly when the two
have few lines in common. files have few lines in common.
### Display ### Display
Fixed an issue with unwanted underlines with textual diffing when Fixed an issue with unwanted underlines with line-oriented diffing
DFT_BYTE_LIMIT is reached. when DFT_BYTE_LIMIT is reached.
Fixed a crash in inline display when the file ends with whitespace. Fixed a crash in inline display when the file ends with whitespace.
@ -640,8 +640,8 @@ constituents.
`--display=inline` now respects `--tab-width`. `--display=inline` now respects `--tab-width`.
Fixed an issue with unwanted underlines with textual diffing when Fixed an issue with unwanted underlines with line-oriented diffing
DFT_GRAPH_LIMIT is reached. when DFT_GRAPH_LIMIT is reached.
Improved syntax highlighting for predefined types in TypeScript. Improved syntax highlighting for predefined types in TypeScript.
@ -1241,9 +1241,9 @@ Fixed a crash when line wrapping produced an entirely blank line.
### Diffing ### Diffing
Improved word diffing (in both comments and textual diffs) when source Improved word diffing (in both comments and line-oriented diffs) when
contains Unicode characters. Word splitting now uses the Unicode the source contains Unicode characters. Word splitting now uses the
alphabetic property. Unicode alphabetic property.
Fixed a crash when comments contained multibyte Unicode characters. Fixed a crash when comments contained multibyte Unicode characters.
@ -1308,12 +1308,12 @@ Improved minor display issues when one file is longer than the other.
If given binary files, difftastic will now report if the file contents If given binary files, difftastic will now report if the file contents
are identical. are identical.
Difftastic will now use a text diff for large files, rather than Difftastic will now use a line-oriented diff for large files, rather
trying to use more memory than is available. This threshold is than trying to use more memory than is available. This threshold is
configurable with `--node-limit` and `DFT_NODE_LIMIT`. configurable with `--node-limit` and `DFT_NODE_LIMIT`.
Fixed a bug in the text diff logic where lines weren't shown if they Fixed a bug in the line-oriented diff logic where lines weren't shown
did not have both word additions and word removals. if they did not have both word additions and word removals.
### Command Line Interface ### Command Line Interface
@ -1433,9 +1433,9 @@ Fixed a parsing performance regression introduced in 0.13.
### Diffing ### Diffing
Text diffing now has a standalone implementation rather than reusing Line-oriented diffing now has a standalone implementation rather than
structural diff logic. This is significantly faster and highlighted reusing structural diff logic. This is significantly faster and
better. highlighted better.
Improved performance when diffing two identical files. This is common Improved performance when diffing two identical files. This is common
when diffing directories. when diffing directories.

@ -99,8 +99,8 @@ favourite tool, and I will link it in the README!
### What about parse errors? ### What about parse errors?
By default, difftastic falls back to a line-oriented text diff By default, difftastic falls back to a line-oriented diff whenever
whenever parse errors are encountered. parse errors are encountered.
This is a conservative choice to ensure that difftastic never claims This is a conservative choice to ensure that difftastic never claims
two syntactically different files are the same. two syntactically different files are the same.

@ -26,7 +26,7 @@ Set the background brightness.
Difftastic will prefer brighter colours on dark backgrounds. Difftastic will prefer brighter colours on dark backgrounds.
.TP .TP
\f[B]\-\-byte\-limit\f[R] \f[I]LIMIT\f[R] \f[B]\-\-byte\-limit\f[R] \f[I]LIMIT\f[R]
Use a text diff if either input file exceeds this size. Use a line\-oriented diff if either input file exceeds this size.
.TP .TP
\f[B]\-\-check\-only\f[R] \f[B]\-\-check\-only\f[R]
Report whether there are any changes, but don\[cq]t calculate them. Report whether there are any changes, but don\[cq]t calculate them.
@ -65,8 +65,8 @@ language or binary files), sets the exit code if there are any byte
changes. changes.
.TP .TP
\f[B]\-\-graph\-limit\f[R] \f[I]LIMIT\f[R] \f[B]\-\-graph\-limit\f[R] \f[I]LIMIT\f[R]
Use a text diff if the structural graph exceed this number of nodes in Use a line\-oriented diff if the structural graph exceed this number of
memory. nodes in memory.
.TP .TP
\f[B]\-h, \-\-help\f[R] \f[B]\-h, \-\-help\f[R]
Print help information. Print help information.
@ -108,7 +108,8 @@ When multiple overrides are specified, the first matching override wins.
.RE .RE
.TP .TP
\f[B]\-\-parse\-error\-limit\f[R] \f[I]LIMIT\f[R] \f[B]\-\-parse\-error\-limit\f[R] \f[I]LIMIT\f[R]
Use a text diff if the number of parse errors exceeds this value. Use a line\-oriented diff if the number of parse errors exceeds this
value.
.TP .TP
\f[B]\-\-skip\-unchanged\f[R] \f[B]\-\-skip\-unchanged\f[R]
Don\[cq]t display anything if a file is unchanged. Don\[cq]t display anything if a file is unchanged.

@ -35,7 +35,7 @@ OPTIONS
**\-\-byte-limit** _LIMIT_ **\-\-byte-limit** _LIMIT_
: Use a text diff if either input file exceeds this size. : Use a line-oriented diff if either input file exceeds this size.
**\-\-check-only** **\-\-check-only**
@ -71,7 +71,7 @@ OPTIONS
**\-\-graph-limit** _LIMIT_ **\-\-graph-limit** _LIMIT_
: Use a text diff if the structural graph exceed this number of nodes in memory. : Use a line-oriented diff if the structural graph exceed this number of nodes in memory.
**-h, \-\-help** **-h, \-\-help**
@ -111,7 +111,7 @@ OPTIONS
**\-\-parse-error-limit** _LIMIT_ **\-\-parse-error-limit** _LIMIT_
: Use a text diff if the number of parse errors exceeds this value. : Use a line-oriented diff if the number of parse errors exceeds this value.
**\-\-skip-unchanged** **\-\-skip-unchanged**

@ -1,4 +1,4 @@
<!DOCTYPE html> <!doctype html>
<html lang="en" data-bs-theme="dark"> <html lang="en" data-bs-theme="dark">
<head> <head>
<meta charset="utf-8" /> <meta charset="utf-8" />
@ -94,8 +94,8 @@
Difftastic parses your code with Difftastic parses your code with
<a href="https://tree-sitter.github.io/tree-sitter/" <a href="https://tree-sitter.github.io/tree-sitter/"
>tree-sitter</a >tree-sitter</a
>. Unlike a line-oriented text diff, difftastic understands that >. Unlike a line-oriented diff, difftastic understands that the
the inner expression hasn't changed here. inner expression hasn't changed here.
</p> </p>
</div> </div>
</div> </div>
@ -189,9 +189,7 @@
type="image/svg+xml" type="image/svg+xml"
></object> ></object>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">C++</h3>
C++
</h3>
</div> </div>
<div class="col d-flex align-items-start"> <div class="col d-flex align-items-start">
@ -203,9 +201,7 @@
type="image/svg+xml" type="image/svg+xml"
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">C#</h3>
C#
</h3>
</div> </div>
</div> </div>
@ -219,9 +215,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Clojure</h3>
Clojure
</h3>
</div> </div>
</div> </div>
@ -235,9 +229,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Dart</h3>
Dart
</h3>
</div> </div>
</div> </div>
@ -251,9 +243,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Erlang</h3>
Erlang
</h3>
</div> </div>
</div> </div>
@ -267,9 +257,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Go</h3>
Go
</h3>
</div> </div>
</div> </div>
@ -283,9 +271,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Haskell</h3>
Haskell
</h3>
</div> </div>
</div> </div>
@ -299,9 +285,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Java</h3>
Java
</h3>
</div> </div>
</div> </div>
@ -315,9 +299,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">JavaScript</h3>
JavaScript
</h3>
</div> </div>
</div> </div>
@ -331,9 +313,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Kotlin</h3>
Kotlin
</h3>
</div> </div>
</div> </div>
@ -347,9 +327,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Lisp</h3>
Lisp
</h3>
</div> </div>
</div> </div>
@ -363,9 +341,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Lua</h3>
Lua
</h3>
</div> </div>
</div> </div>
@ -379,9 +355,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">OCaml</h3>
OCaml
</h3>
</div> </div>
</div> </div>
@ -395,9 +369,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">PHP</h3>
PHP
</h3>
</div> </div>
</div> </div>
@ -410,9 +382,7 @@
type="image/svg+xml" type="image/svg+xml"
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Python</h3>
Python
</h3>
</div> </div>
</div> </div>
@ -426,9 +396,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">R</h3>
R
</h3>
</div> </div>
</div> </div>
@ -442,9 +410,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Ruby</h3>
Ruby
</h3>
</div> </div>
</div> </div>
@ -458,9 +424,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Rust</h3>
Rust
</h3>
</div> </div>
</div> </div>
@ -474,9 +438,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">Scala</h3>
Scala
</h3>
</div> </div>
</div> </div>
@ -490,9 +452,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">TypeScript</h3>
TypeScript
</h3>
</div> </div>
</div> </div>
@ -528,9 +488,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">HCL</h3>
HCL
</h3>
</div> </div>
</div> </div>
@ -544,9 +502,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">HTML</h3>
HTML
</h3>
</div> </div>
</div> </div>
@ -560,9 +516,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">JSON</h3>
JSON
</h3>
</div> </div>
</div> </div>
@ -576,9 +530,7 @@
></object> ></object>
<div> <div>
<h3 class="fw-bold mb-0 fs-4"> <h3 class="fw-bold mb-0 fs-4">YAML</h3>
YAML
</h3>
</div> </div>
</div> </div>
<!-- end formats --> <!-- end formats -->
@ -646,7 +598,7 @@
theme: "tango", theme: "tango",
cols: 133, cols: 133,
rows: 24, rows: 24,
} },
); );
</script> </script>
</body> </body>

@ -16,6 +16,10 @@ the hunk.
**LHS**: Left-hand side. Difftastic compares two items, and LHS refers **LHS**: Left-hand side. Difftastic compares two items, and LHS refers
to the first item. See also 'RHS'. to the first item. See also 'RHS'.
**Line-oriented**: A traditional diff that compares which lines have
been added or removed, unlike difftastic. For example, GNU diff or the
diffs displayed on GitHub.
**List**: A list is an item in difftastic's syntax tree structure that **List**: A list is an item in difftastic's syntax tree structure that
has an open delimiter, children, and a close delimiter. It represents has an open delimiter, children, and a close delimiter. It represents
things like expressions and function definitions. See also 'atom'. things like expressions and function definitions. See also 'atom'.

@ -52,15 +52,15 @@ A line-oriented diff does a much worse job here.
</code> </code>
</pre> </pre>
Some textual diff tools also highlight word changes (e.g. GitHub or Some line-oriented diff tools also highlight word changes (e.g. GitHub
git's `--word-diff`). They still don't understand the code or git's `--word-diff`). They still don't understand the code
though. Difftastic will always find matched delimiters: you can see though. Difftastic will always find matched delimiters: you can see
the closing `)` from `or_else` has been highlighted. the closing `)` from `or_else` has been highlighted.
## Fallback Textual Diffing ## Fallback Line-Oriented Diffing
If input files are not in a format that difftastic understands, it If input files are not in a format that difftastic understands, it
uses a conventional line-oriented text diff with word highlighting. uses a conventional line-oriented diff with word highlighting.
Difftastic will also use textual diffing when given extremely large Difftastic will also use line-oriented diffing when given extremely
inputs. large inputs.
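
To make the term concrete, here is a minimal sketch of what a line-oriented diff does. It is written in Python with the standard `difflib` module for brevity; it is not difftastic's implementation, and the function name `line_oriented_diff` is a hypothetical chosen for illustration. It only reports which whole lines were removed or added, so reformatting an expression across several lines shows up as a change even though the tokens are the same.

```python
import difflib


def line_oriented_diff(lhs: str, rhs: str) -> list[tuple[str, str]]:
    """Return ("-", line) for removed lines and ("+", line) for added lines.

    Illustrative only: lines are compared as opaque strings, with no
    knowledge of the syntax they contain.
    """
    lhs_lines = lhs.splitlines()
    rhs_lines = rhs.splitlines()
    matcher = difflib.SequenceMatcher(None, lhs_lines, rhs_lines, autojunk=False)

    changes: list[tuple[str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # For "delete" the rhs slice is empty; for "insert" the lhs slice is.
        changes.extend(("-", line) for line in lhs_lines[i1:i2])
        changes.extend(("+", line) for line in rhs_lines[j1:j2])
    return changes


# The same call expression, reformatted over three lines.
before = "foo(bar)\n"
after = "foo(\n    bar\n)\n"

for sign, line in line_oriented_diff(before, after):
    print(sign, line)
```

Every line is reported as changed here, because no line of `before` is textually equal to a line of `after`. A structural diff compares the parsed syntax instead of the lines.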

@ -29,7 +29,7 @@
//! can change which item is marked as novel (e.g. either `B` in the //! can change which item is marked as novel (e.g. either `B` in the
//! example above) whilst still showing a valid, minimal diff. //! example above) whilst still showing a valid, minimal diff.
//! //!
//! A similar problem exists with line-based textual diffs, see //! A similar problem exists with line-oriented diffs, see
//! [diff-slider-tools](https://github.com/mhagger/diff-slider-tools) //! [diff-slider-tools](https://github.com/mhagger/diff-slider-tools)
//! for a thorough discussion. //! for a thorough discussion.

@ -107,8 +107,9 @@ fn line_len_in_bytes(line: &str) -> usize {
} }
} }
/// Build a vec of MatchedPos, performing a textual diff. Match up /// Build a vec of MatchedPos, performing a line-oriented diff. Match
/// unchanged lines, and match up unchanged words within novel lines. /// up unchanged lines, and match up unchanged words within novel
/// lines.
/// ///
/// The resulting vec only has novel items from the LHS. Callers /// The resulting vec only has novel items from the LHS. Callers
/// should do `change_positions(rhs_src, lhs_src)` to obtain /// should do `change_positions(rhs_src, lhs_src)` to obtain

@ -280,7 +280,7 @@ When multiple overrides are specified, the first matching override wins."))
Arg::new("byte-limit").long("byte-limit") Arg::new("byte-limit").long("byte-limit")
.value_name("LIMIT") .value_name("LIMIT")
.action(ArgAction::Set) .action(ArgAction::Set)
.help("Use a text diff if either input file exceeds this size.") .help("Use a line-oriented diff if either input file exceeds this size.")
.default_value(format!("{}", DEFAULT_BYTE_LIMIT)) .default_value(format!("{}", DEFAULT_BYTE_LIMIT))
.env("DFT_BYTE_LIMIT") .env("DFT_BYTE_LIMIT")
.value_parser(clap::value_parser!(usize)) .value_parser(clap::value_parser!(usize))
@ -289,7 +289,7 @@ When multiple overrides are specified, the first matching override wins."))
.arg( .arg(
Arg::new("graph-limit").long("graph-limit") Arg::new("graph-limit").long("graph-limit")
.value_name("LIMIT") .value_name("LIMIT")
.help("Use a text diff if the internal graph exceeds this number of vertices. This limit controls the worst case runtime and memory usage for difftastic. .help("Use a line-oriented diff if the internal graph exceeds this number of vertices. This limit controls the worst case runtime and memory usage for difftastic.
Higher values will allow difftastic to perform a structural diff in more cases. Higher values will also increase the time before difftastic gives up on structural diffing, and increase peak memory usage.") Higher values will allow difftastic to perform a structural diff in more cases. Higher values will also increase the time before difftastic gives up on structural diffing, and increase peak memory usage.")
.default_value(format!("{}", DEFAULT_GRAPH_LIMIT)) .default_value(format!("{}", DEFAULT_GRAPH_LIMIT))
@ -302,7 +302,7 @@ Higher values will allow difftastic to perform a structural diff in more cases.
Arg::new("parse-error-limit").long("parse-error-limit") Arg::new("parse-error-limit").long("parse-error-limit")
.value_name("LIMIT") .value_name("LIMIT")
.action(ArgAction::Set) .action(ArgAction::Set)
.help("Use a text diff if the number of parse errors exceeds this value.") .help("Use a line-oriented diff if the number of parse errors exceeds this value.")
.default_value(format!("{}", DEFAULT_PARSE_ERROR_LIMIT)) .default_value(format!("{}", DEFAULT_PARSE_ERROR_LIMIT))
.env("DFT_PARSE_ERROR_LIMIT") .env("DFT_PARSE_ERROR_LIMIT")
.value_parser(clap::value_parser!(usize)) .value_parser(clap::value_parser!(usize))
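
The three help strings above describe one fallback rule: use a line-oriented diff when a limit is exceeded, otherwise diff structurally. The sketch below restates that rule in Python. It is an illustration under stated assumptions, not difftastic's code: the names `Limits` and `choose_diff_mode`, the order of the checks, and the numeric limits in the usage lines are all hypothetical. Only the parse error default of 0 comes from the changelog text in this commit.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Hypothetical container for the three limits named in the help text."""

    byte_limit: int
    graph_limit: int
    parse_error_limit: int = 0  # the changelog states this defaults to 0


def choose_diff_mode(
    lhs_bytes: int,
    rhs_bytes: int,
    parse_errors: int,
    graph_nodes: int,
    limits: Limits,
) -> str:
    """Return "line-oriented" if any limit is exceeded, else "structural".

    The order of the checks is an assumption made for this sketch.
    """
    # "if either input file exceeds this size"
    if max(lhs_bytes, rhs_bytes) > limits.byte_limit:
        return "line-oriented"
    # "if the number of parse errors exceeds this value"
    if parse_errors > limits.parse_error_limit:
        return "line-oriented"
    # "if the internal graph exceeds this number of vertices"
    if graph_nodes > limits.graph_limit:
        return "line-oriented"
    return "structural"


# Made-up limits, chosen only to exercise each branch.
limits = Limits(byte_limit=1_000, graph_limit=100)
print(choose_diff_mode(10, 10, 0, 5, limits))
print(choose_diff_mode(10, 10, 1, 5, limits))
```

With a parse error limit of 0, a single parse error is enough to trigger the fallback, which matches the conservative behaviour described in the README hunk.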