Scope of tree-sitter-clojure
TLDR
Only "primitives" (e.g. symbols, lists, etc.) are supported, i.e. no higher level constructs like defn.
The Details
Why
For some background, Clojure (and other Lisps) have runtime extensible "syntax" via macros, but AFAIU tree-sitter's current design assumes a fixed syntax.
Keeping the above in mind, below are some of the factors that influenced the current stance on scope:
- Clojure has no language specification. This means it's unclear what to try to support in the grammar. For example, defn is defined in the clojure.core namespace, but then so are a lot of other things.
- Each additional item added to the grammar increases the chance of a conflict, which in turn may adversely impact correct parsing, but also makes the grammar harder to extend and maintain. In some cases this may lead to degraded performance (though it may be premature to be concerned about this point).
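As a rough illustration of the first point, the sketch below can be pasted into a Clojure REPL to see that defn is an ordinary macro var in clojure.core rather than something the compiler treats specially. The name core-macros is invented for this example and is not part of any library.

```clojure
;; def is a special form known to the compiler...
(special-symbol? 'def)    ; true

;; ...but defn is not. It is a macro that lives in clojure.core.
(special-symbol? 'defn)   ; false
(:macro (meta #'clojure.core/defn)) ; true

;; core-macros is a hypothetical helper name used only in this sketch:
;; the set of public clojure.core symbols whose vars are macros.
(def core-macros
  (->> (ns-publics 'clojure.core)
       (filter (fn [[_ v]] (:macro (meta v))))
       (map key)
       set))

;; defn is just one member among many (when, cond, ->, and so on),
;; so there is no obvious place to draw the line in a grammar.
(contains? core-macros 'defn)
(contains? core-macros 'when)
```

Nothing in clojure.core marks defn as more "syntactic" than the other macros there, which is part of why the choice of what to support would be somewhat arbitrary.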
Alternatives
It is possible to use tree-sitter-clojure as a base to add additional constructs to a "derived" grammar. For example, such a grammar might be specialized to look for "definitions". At least in emacs-tree-sitter, it is technically possible to have multiple grammars be used on a single buffer:
If you want 2 parse trees in the same buffer instead, you would need to define an advice for tree-sitter--do-parse, as well as additional buffer-local variables for the secondary grammar.
Apparently it became possible in September of 2020 for queries to match on any of a node's supertypes. It may be possible to make a list supertype that is "composed of" defn and things that are not defn. tree-sitter-clojure-def is an attempt at realizing this approach.
However, depending on one's goals, it might make more sense to consider leveraging clj-kondo's analysis capabilities as clj-kondo already understands Clojure pretty well. IIUC, clojure-lsp does this.
Miscellaneous Points
- Earlier attempts at adding def and friends resulted in unacceptably high error rates [1]. The tests were conducted against code from Clojars (uncontrived code). FWIW, two of the previous tree-sitter-clojure attempts (by oakmac and Tavistock) also had unacceptably high error rates [2] and they both attempted to support higher level constructs.
- For use cases like structural editing, it seems important to be able to distinguish between the following sorts of cases:
  - defn used for defining a function, and
  - Using the symbol defn within a macro to construct code to define a function

  AFAICT, the approach taken in tree-sitter-clojure-def does not make telling these sorts of things apart possible.
- It doesn't seem possible to support all "defining" macros like defsomething (e.g. efaf35558a/src/clj/com/rpl/specter.cljc (L57-L60)) since a user's Clojure code can define these.
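To make the last two points more concrete, here is a minimal sketch of the cases involved. The names add-one, defgreeter and hello are made up for illustration and do not come from any of the projects mentioned above.

```clojure
;; Case 1: defn used directly to define a function.
(defn add-one [x]
  (inc x))

;; Case 2: the symbol defn appears inside a macro body, where it is only
;; data used to construct a definition. defgreeter is a hypothetical
;; user-defined "defining" macro.
(defmacro defgreeter [fname greeting]
  `(defn ~fname [who#]
     (str ~greeting ", " who#)))

;; Case 3: the user-defined macro in use. It defines a var named hello,
;; but a fixed grammar has no way of knowing that ahead of time.
(defgreeter hello "Hello")

(hello "world") ; "Hello, world"
```

At the level of primitives, all three are just lists that begin with a symbol. Telling them apart depends on what the symbols resolve to and on whether the form sits inside a syntax-quote, which is the kind of information an analysis tool such as clj-kondo is better placed to provide.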
Footnotes
- [1] Author's opinion :)
- [2] Author's opinion again :)