diff --git a/manual/src/adding_a_parser.md b/manual/src/adding_a_parser.md
index 8743fd062..6928b42d0 100644
--- a/manual/src/adding_a_parser.md
+++ b/manual/src/adding_a_parser.md
@@ -10,60 +10,44 @@ parsers](https://tree-sitter.github.io/tree-sitter/#available-parsers).
 
 ## Add the source code
 
-Once you've found a parser, add it as a git subtree to
-`vendored_parsers/`. We'll use
-[tree-sitter-json](https://github.com/tree-sitter/tree-sitter-json) as
-an example.
-
-```
-$ git subtree add --prefix=vendored_parsers/tree-sitter-json https://github.com/tree-sitter/tree-sitter-json.git master
-```
-
-## Configure the build
-
-Cargo does not allow packages to include subdirectories that contain a
-`Cargo.toml`. Add a symlink to the `src/` parser subdirectory.
-
-```
-$ cd vendored_parsers
-$ ln -s tree-sitter-json/src tree-sitter-json-src
-```
-
-You can now add the parser to build by including the directory in
-`build.rs`.
-
+Ideally, the parser should be available as a Rust crate on crates.io.
+If that's the case, add it to `Cargo.toml` in the alphabetically sorted list
+of parser dependencies. For instance:
 ```
-TreeSitterParser {
-    name: "tree-sitter-json",
-    src_dir: "vendored_parsers/tree-sitter-json-src",
-    extra_files: vec![],
-},
+tree-sitter-json = "0.24.8"
 ```
-
-If your parser includes custom C or C++ files for lexing (e.g. a
-`scanner.cc`), add them to `extra_files`.
+Otherwise, it is possible to [vendor the parser in difftastic's source code](./parser_vendoring.md),
+but this should only be used as a last resort.
 
 ## Configure parsing
 
 Add an entry to `tree_sitter_parser.rs` for your language.
 
-```
+```rust
 Json => {
-    let language = unsafe { tree_sitter_json() };
+    let language_fn = tree_sitter_json::LANGUAGE;
+    let language = tree_sitter::Language::new(language_fn);
+
     TreeSitterConfig {
         language,
         atom_nodes: vec!["string"].into_iter().collect(),
         delimiter_tokens: vec![("{", "}"), ("[", "]")],
-        highlight_query: ts::Query::new(
-            language,
-            include_str!("../../vendored_parsers/highlights/json.scm"),
-        )
-        .unwrap(),
+        highlight_query: ts::Query::new(language, tree_sitter_json::HIGHLIGHTS_QUERY)
+            .unwrap(),
         sub_languages: vec![],
     }
 }
 ```
 
+If the Rust crate does not include a `HIGHLIGHTS_QUERY`, then you need to include
+it from a file instead, with
+```
+include_str!("../../vendored_parsers/highlights/json.scm")
+```
+Many parser repositories include a highlights query in the repository without
+exposing it in the Rust crate. In that case you can include it as
+`vendored_parsers/highlights/json.scm` in the repository.
+
 `atom_nodes` is a list of tree-sitter node names that should be treated
 as atoms even though the nodes have children. This is common for
 things like string literals or interpolated strings, where the
diff --git a/manual/src/parser_vendoring.md b/manual/src/parser_vendoring.md
index 355b28014..a5b3f9b5d 100644
--- a/manual/src/parser_vendoring.md
+++ b/manual/src/parser_vendoring.md
@@ -2,9 +2,44 @@
 
 ## Git Subtrees
 
-Tree-sitter parsers are sometimes packaged on npm, sometimes packaged
-on crates.io, and have different release frequencies. Difftastic uses
-git subtrees (not git submodules) to track parsers.
+Tree-sitter parsers are sometimes not packaged on crates.io. In that case, Difftastic uses
+git subtrees (not git submodules) to track them.
+
+## Vendoring a parser
+
+Once you've found the source repository for the parser, add it as a git subtree to
+`vendored_parsers/`. We'll use
+[tree-sitter-json](https://github.com/tree-sitter/tree-sitter-json) as
+an example.
+
+```
+$ git subtree add --prefix=vendored_parsers/tree-sitter-json https://github.com/tree-sitter/tree-sitter-json.git master
+```
+
+### Configure the build
+
+Cargo does not allow packages to include subdirectories that contain a
+`Cargo.toml`. Add a symlink to the `src/` parser subdirectory.
+
+```
+$ cd vendored_parsers
+$ ln -s tree-sitter-json/src tree-sitter-json-src
+```
+
+You can now add the parser to build by including the directory in
+`build.rs`.
+
+```
+TreeSitterParser {
+    name: "tree-sitter-json",
+    src_dir: "vendored_parsers/tree-sitter-json-src",
+    extra_files: vec![],
+},
+```
+
+If your parser includes custom C or C++ files for lexing (e.g. a
+`scanner.cc`), add them to `extra_files`.
+
 
 ## Updating a parser
 