**Problem:** Tree-sitter 0.24.0 introduced a new symbol type to denote
supertype nodes (`TSSymbolTypeSupertype`). Now, `language.inspect()`
(and the query `omnifunc`) return supertype symbols, but with double
quotes around them.
**Solution:** Mark a symbol as "named" based on it *not* being an
anonymous node, rather than checking that it is a regular node (which a
supertype also is not).
**Problems:**
- `vim.treesitter.language.inspect()` returns duplicate
symbol names, sometimes up to 6 of one kind in the case of `markdown`
- The list-like `symbols` table can have holes and is thus not even a
valid msgpack table anyway, mentioned in a test
**Solution:** Return symbols as a map, rather than a list, where field
names are the names of the symbol. The boolean value associated with the
field encodes whether or not the symbol is named.
Note that anonymous nodes are surrounded with double quotes (`"`) to
prevent potential collisions with named counterparts that have the same
identifier.
This commit also marks `child_containing_descendant()` as deprecated
(per upstream's documentation), and uses `child_with_descendant()` in
its place. Minimum required tree-sitter version will now be `0.24`.
Problem: No clear way to check whether parsers are available for a given
language.
Solution: Make `language.add()` return `true` if a parser was
successfully added and `nil` otherwise. Use explicit `assert` instead of
relying on thrown errors.
Problem:
Headings in :help do not stand out visually.
Solution:
Define a non-standard `@markup.heading.1.delimiter` group and
special-case it in `highlight_group.c`.
FUTURE:
This is a cheap workaround until we have #25718 which will enable:
- fully driven by `vimdoc/highlights.scm` instead of using highlight
tricks (`guibg=bg guifg=bg guisp=fg`)
- better support of "cterm" ('notermguicolors')
**Problem:** `is_ancestor()` uses a slow, bottom-up parent lookup which
has performance pitfalls detailed in #28512.
**Solution:** Take `is_ancestor()` from $O(n^2)$ to $O(n)$ by
incorporating the use of the `child_containing_descendant()` function
**Problem:** `vim.treesitter.get_parser` will throw an error if no parser
can be found.
- This means the caller is responsible for wrapping it in a `pcall`,
which is easy to forget
- It also makes it slightly harder to potentially memoize `get_parser`
in the future
- It's a bit unintuitive since many other `get_*` style functions
conventionally return `nil` if no object is found (e.g. `get_node`,
`get_lang`, `query.get`, etc.)
**Solution:** Return `nil` if no parser can be found or created
- This requires a function signature change, and some new assertions in
places where the parser will always (or should always) be found.
- This commit starts by making this change internally, since it is
breaking. Eventually it will be rolled out to the public API.
For context, see https://github.com/neovim/neovim/pull/24738. Before
that PR, Nvim did not correctly handle captures with quantifiers. That
PR made the correct behavior opt-in to minimize breaking changes, with
the intention that the correct behavior would eventually become the
default. Users can still opt-in to the old (incorrect) behavior for now,
but this option will eventually be removed completely.
BREAKING CHANGE: Any plugin which uses `Query:iter_matches()` must
update their call sites to expect an array of nodes in the `match`
table, rather than a single node.
Problem: Installing treesitter parser is hard (harder than
climbing to heaven).
Solution: Add optional support for wasm parsers with `wasmtime`.
Notes:
* Needs to be enabled by setting `ENABLE_WASMTIME` for tree-sitter and
Neovim. Build with
`make CMAKE_EXTRA_FLAGS=-DENABLE_WASMTIME=ON
DEPS_CMAKE_FLAGS=-DENABLE_WASMTIME=ON`
* Adds optional Rust (obviously) and C11 dependencies.
* Wasmtime comes with a lot of features that can negatively affect
Neovim performance due to library and symbol table size. Make sure to
build with minimal features and full LTO.
* To reduce re-compilation times, install `sccache` and build with
`RUSTC_WRAPPER=<path/to/sccache> make ...`
Problem:
Tests have lots of exec_lua calls which input blocks of code
provided as unformatted strings.
Solution:
Teach exec_lua how to handle functions.
This is identical to `named_node_for_range` except that it includes
anonymous nodes. This maintains consistency in the API because we
already have `descendant_for_range` and `named_descendant_for_range`.
Problem: Neovim bundles treesitter parsers for bash and python but does
not use them by default. This dilutes the messaging about the bundled
parsers being required for functionality or reasonable out-of-the-box
experience. It also increases the risk of query incompatibilities for no
gain.
Solution: Stop bundling bash and python parser and queries.
Problem:
Error when calling vim.treesitter.start() and vim.treesitter.stop() in
init.lua.
Solution:
Ensure syntaxset augroup exists after loading synload.vim.
Instead of looping over all captured nodes, just take the end range from
the last node in the list. This uses the fact that nodes returned by
iter_matches are ordered by their range (earlier to later).
Problem: Treesitter highlighter clears the already populated highlight
state when performing spell checking while drawing a
smoothscrolled topline.
Solution: Save and restore the highlight state in the highlighter's
_on_spell_nav callback.
Problem:
`o`-ing on a folded line opens the fold, because the new line gets the
fold level from the above line (level '='), which extends the fold to
the new line. `O` has a similar problem when run on the line below a
fold.
Solution:
Use -1 for the added line to get the lower level from the above/below
line.
Problem:
1. When interacting with multiple :InspectTree and the source buffer
windows there is a high chance of errors due to the window ids not
being updated and validated.
2. Not all InspectTree windows were closed when the source buffer was
closed.
Solution:
1. Update InspectTree window id on `CursorMoved` event and validate
source buffer window id before trying to navigate to it.
2. Close all InspectTree windows
Problem: `has-ancestor?` is O(n²) for the depth of the tree since it iterates over each of the node's ancestors (bottom-up), and each ancestor takes O(n) time.
This happens because tree-sitter's nodes don't store their parent nodes, and the tree is searched (top-down) each time a new parent is requested.
Solution: Make use of new `ts_node_child_containing_descendant()` in tree-sitter v0.22.6 (which is now the minimum required version) to rewrite the `has-ancestor?` predicate in C to become O(n).
For a sample file, decreases the time taken by `has-ancestor?` from 360ms to 6ms.
Instead of painfully messing with timing to determine if queries were
reparsed, we can simply keep a counter next to the call to ts_query_new
Also memoization had a hidden dependency on the garbage collection of
the the key, a hash value which never is kept around in memory. this was
done intentionally as the hash does not capture all relevant state for the
query (external included files) even if actual query objects still
would be reachable in memory. To make the test fully deterministic in
CI, we explicitly control GC.
* fix(treesitter): enforce lowercase language names
Problem: On case-insensitive file systems (e.g., macOS), `has_parser`
will return `true` for uppercase aliases, which will then try to inject
the uppercase language unsuccessfully.
Solution: Enforce and assume parser names to be lowercase when
resolving language names.
Specifically, functions that are run in the context of the test runner
are put in module `test/testutil.lua` while the functions that are run
in the context of the test session are put in
`test/functional/testnvim.lua`.
Closes https://github.com/neovim/neovim/issues/27004.
Move test cases that are more about treesitter query API rather than
parser API or LanguageTree out of "treesitter/parser_spec", and collect
them in another test suite "treesitter/query_spec".
General refactoring, including:
- Improve whitespace and indentation
- Prefix captures with `@`
- Add more comments on `iter_capture()` tests
- Move `test_query` up closer to the fixture source string
No behavioral changes are made.
Tree-sitter queries can add URLs to a capture using the `#set!`
directive, e.g.
(inline_link
(link_text) @text.reference
(link_destination) @text.uri
(#set! @text.reference "url" @text.uri))
The pattern above is included by default in the `markdown_inline`
highlight query so that users with supporting terminals will see
hyperlinks. For now, this creates a hyperlink for *all* Markdown URLs of
the pattern [link text](link url), even if `link url` does not contain
a valid protocol (e.g. if `link url` is a path to a file). We may wish to
change this in the future to only linkify when the URL has a valid
protocol scheme, but for now we delegate handling this to the terminal
emulator.
In order to support directives which reference other nodes, the
highlighter must be updated to use `iter_matches` rather than
`iter_captures`. The former provides the `match` table which maps
capture IDs to nodes. However, this has its own challenges:
- `iter_matches` does not guarantee the order in which patterns are
iterated matches the order in the query file. So we must enforce
ordering manually using "subpriorities" (#27131). The pattern index of
each match dictates the extmark's subpriority.
- When injections are used, the highlighter contains multiple trees. The
pattern indices of each tree must be offset relative to the maximum
pattern index from all previous trees to ensure that extmarks appear
in the correct order.
- The `iter_captures` implementation currently has a bug where the
"match" table is only returned for the first capture within a pattern
(see #27274). This bug means that `#set!` directives in a query
apply only to the first capture within a pattern. Unfortunately, many
queries in the wild have come to depend on this behavior.
`iter_matches` does not share this flaw, so switching to
`iter_matches` exposed bugs in existing highlight queries. These
queries have been updated in this repo, but may still need to be
updated by users. The `#set!` directive applies to the _entire_ query
pattern when used without a capture argument. To make `#set!`
apply only to a single capture, the capture must be given as an
argument.
Query patterns can contain quantifiers (e.g. (foo)+ @bar), so a single
capture can map to multiple nodes. The iter_matches API can not handle
this situation because the match table incorrectly maps capture indices
to a single node instead of to an array of nodes.
The match table should be updated to map capture indices to an array of
nodes. However, this is a massively breaking change, so must be done
with a proper deprecation period.
`iter_matches`, `add_predicate` and `add_directive` must opt-in to the
correct behavior for backward compatibility. This is done with a new
"all" option. This option will become the default and removed after the
0.10 release.
Co-authored-by: Christian Clason <c.clason@uni-graz.at>
Co-authored-by: MDeiml <matthias@deiml.net>
Co-authored-by: Gregory Anders <greg@gpanders.com>
* use `Special` as default for `@markup.*`, especially `@markup.raw` and
`@markup.math` (`@markup` itself is never used)
* use `Structure` for `@markup.environment`
* highlight all of `@markup.link` as Underlined (otherwise concealed
links are invisible)
Problem: Sharing queries with upstream and Helix is difficult due to
different capture names.
Solution: Define and document a new set of standard captures that
matches tree-sitter "standard captures" (where defined) and is closer to
Helix' Atom-style nested groups.
This is a breaking change for colorschemes that defined highlights based
on the old captures. On the other hand, the default colorscheme now
defines links for all standard captures (not just those used in bundled
queries), improving the out-of-the-box experience.