Most diff tools today use LCS algorithms. They typically apply to lines, but in some cases they're word-based.
GNU diff
This seems to be Myers algorithm. git-diff originally shelled out to diff directly,
see this source code review.
This is the default diff algorithm in GNU diff and git diff. It finds the longest common subsequence (LCS) and is used on a line-by-line basis.
# Modern diff supports colour, but see also
# https://www.colordiff.org/
$ diff --color=always -u sample_files/css_before.css sample_files/css_after.css
Patience Diff
Myers diff has a problem with sliders:
if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
+if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
$smtp_server = $_;
Instead of:
+if (!$smtp_server) {
+ $smtp_server = $repo->config('sendemail.smtpserver');
+}
if (!$smtp_server) {
foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
if (-x $_) {
Git has a --indent-heuristic that was added to reduce the
likelihood of making a bad
choice. There's
a corpus of test files
where the ideal diff has been chosen by a human, to test different
heuristics.
The patience diff algorithm is an LCS algorithm that aims to do a better job with sliders. It produces great results by doing more work.
# Original behaviour
$ git diff --no-indent-heuristic --no-index sample_files/css_before.css sample_files/css_after.css
# As of git 2.11, this heuristic is enabled by default.
$ git diff --indent-heuristic --no-index sample_files/css_before.css sample_files/css_after.css
# Patience algorithm does a better a job in this example.
$ git diff --patience --no-index sample_files/css_before.css sample_files/css_after.css
Diff Match Patch also has some excellent discussions of diff designs on the author's website (e.g diff strategies).
Histogram Diff
Git 1.7.7+ also has a histogram algorithm, which aims to produce better results than Myers' algorithm but without the slowdown of the patience algorithm.
# Inferior to patience on this example file.
$ git diff --histogram --no-index sample_files/css_before.css sample_files/css_after.css
Smart Hunk Headers
https://tekin.co.uk/2020/10/better-git-diff-output-for-ruby-python-elixir-and-more
@@ -24,7 +24,7 @@ class TicketPdf
ApplicationController.render(
"tickets/index.html.haml",
layout: "tickets",
- assigns: { tickets: tickets }
+ assigns: { tickets: tickets, event_name: event_name }
)
end
Note how class TicketPdf is shown at the beginning.
Side-by-side Diff
$ diff -y --color=always sample_files/css_before.css sample_files/css_after.css
prettydiff
prettydiff does really well out of the box with the sample files here. It implements LCS on words.
delta
delta performs line based diffs, then uses levenshtein distance to highlight parts within the line. It also offers syntax highlighting.
diff-so-fancy
diff-so-fancy consumes normal diff output, so it's line based. It also performs word highlighting within lines, and generally has a prettier set of defaults.
git-split-diffs
git-split-diffs emulates GitHub's diff formatting style in a terminal.