diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7767219c6..5b48941bc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,9 @@
 ## 0.30 (unreleased)
 
+### Parsing
+
+Added support for HTML.
+
 ## 0.29.1 (released 13th June 2022)
 
 Fixed a major memory regression in 0.29 when performing large
diff --git a/build.rs b/build.rs
index 249f27b69..4a508a0c1 100644
--- a/build.rs
+++ b/build.rs
@@ -137,6 +137,11 @@ fn main() {
             src_dir: "vendor/tree-sitter-hcl-src",
             extra_files: vec!["scanner.cc"],
         },
+        TreeSitterParser {
+            name: "tree-sitter-html",
+            src_dir: "vendor/tree-sitter-html-src",
+            extra_files: vec!["scanner.cc"],
+        },
         TreeSitterParser {
             name: "tree-sitter-janet-simple",
             src_dir: "vendor/tree-sitter-janet-simple-src",
diff --git a/manual/src/languages_supported.md b/manual/src/languages_supported.md
index a596059af..6ce42ad35 100644
--- a/manual/src/languages_supported.md
+++ b/manual/src/languages_supported.md
@@ -40,6 +40,7 @@ Difftastic also supports the following structured text formats.
 |----------|-----------------------------------------------------------------------------------|
 | CSS | [tree-sitter/tree-sitter-css](https://github.com/tree-sitter/tree-sitter-css) |
 | HCL | [MichaHoffmann/tree-sitter-hcl](https://github.com/MichaHoffmann/tree-sitter-hcl) |
+| HTML | [tree-sitter/tree-sitter-html](https://github.com/tree-sitter/tree-sitter-html) |
 | JSON | [tree-sitter/tree-sitter-json](https://github.com/tree-sitter/tree-sitter-json) |
 | TOML | [ikatyang/tree-sitter-toml](https://github.com/ikatyang/tree-sitter-toml) |
 | YAML | [ikatyang/tree-sitter-yaml](https://github.com/ikatyang/tree-sitter-yaml) |
diff --git a/sample_files/compare.expected b/sample_files/compare.expected
index d865a1078..fa7107b44 100644
--- a/sample_files/compare.expected
+++ b/sample_files/compare.expected
@@ -52,6 +52,12 @@ bce74573e003cc6b729a63a4bc34c4af -
 sample_files/helpful-unit-test-before.el sample_files/helpful-unit-test-after.el
 79597af48ff80bcf9f5d02d20c51606d -
 
+sample_files/html_before.html sample_files/html_after.html
+949b14014822274f3578636275c8e6d6 -
+
+sample_files/html_simple_before.html sample_files/html_simple_after.html
+13b374996a2b449f79638b2ddcf0c5d8 -
+
 sample_files/identical_before.scala sample_files/identical_after.scala
 9c7319f61833e46a0a8cb6c01cc997c9 -
 
diff --git a/sample_files/html_after.html b/sample_files/html_after.html
new file mode 100644
index 000000000..99b339277
--- /dev/null
+++ b/sample_files/html_after.html
@@ -0,0 +1,58 @@
+
+
+
+
+ This domain is for use in illustrative examples in documents. You may
+ use this domain in literature without prior coordination or asking for
+ permission.
+
+
+This domain is for use in illustrative examples in documents. You may use this
+ domain in literature without prior coordination or asking for permission.
+
+Story about bar.
+
+
diff --git a/sample_files/html_simple_before.html b/sample_files/html_simple_before.html
new file mode 100644
index 000000000..cabc15bb6
--- /dev/null
+++ b/sample_files/html_simple_before.html
@@ -0,0 +1,9 @@
+
+
+Story about foo.
+
+
diff --git a/src/diff/sliders.rs b/src/diff/sliders.rs
index e81123200..e8227b0be 100644
--- a/src/diff/sliders.rs
+++ b/src/diff/sliders.rs
@@ -64,7 +64,7 @@ fn prefer_outer_delimiter(language: guess_language::Language) -> bool {
         // languages have syntax like `foo(bar)` or `foo[bar]` where
         // the inner delimiter is more relevant.
         Bash | C | CPlusPlus | CSharp | Css | Dart | Elixir | Elm | Elvish | Gleam | Go
-        | Haskell | Java | JavaScript | Jsx | Kotlin | Lua | Nix | OCaml | OCamlInterface
+        | Haskell | Html | Java | JavaScript | Jsx | Kotlin | Lua | Nix | OCaml | OCamlInterface
         | Perl | Php | Python | Ruby | Rust | Scala | Swift | Tsx | TypeScript | Yaml | Zig => {
             false
         }
diff --git a/src/parse/guess_language.rs b/src/parse/guess_language.rs
index 45b2b1da5..b1696a934 100644
--- a/src/parse/guess_language.rs
+++ b/src/parse/guess_language.rs
@@ -34,6 +34,7 @@ pub enum Language {
     Go,
     Haskell,
     Hcl,
+    Html,
     Janet,
     Java,
     JavaScript,
@@ -113,6 +114,7 @@ fn from_emacs_mode_header(src: &str) -> Option