Problem: Parsed language annotations can be random garbage so
`nvim_get_runtime_file` throws an error.
Solution: Validate that `alias` is a valid language name before trying
to find a parser for it.
Improve error messages for `:InspectTree`, when no parsers are available
for the current buffer and filetype. We can show more informative and
helpful error message for users (e.g., which lang was searched for):
```
... No parser available for the given buffer:
+... no parser for 'custom_ft' language, see :help treesitter-parsers
```
Also improve the relevant docs for *treesitter-parsers*.
Problem:
A region managed by an injected parser may shrink after re-running the
injection query. If the updated region goes out of the range to be
parsed, then the corresponding tree will remain outdated, possibly
retaining the nodes that shouldn't exist anymore. This results in
outdated highlights.
Solution:
Re-parse an invalid tree if its region intersects the range to be
parsed.
- Remove some unused fields
- Prefix classes with `vim.`
- Move around some functions so the query stuff is at the top.
- Improve type hints
- Rework how hl_cache is implemented
Problem:
Treesitter highlighter's on_line was iterating all the parsed trees,
which can be quite a lot when injection is used. This may slow down
scrolling and cursor movement in big files with many comment injections
(e.g., lsp/_meta/protocol.lua).
Solution:
In on_win, collect trees inside the visible range, and use them in
on_line.
NOTE:
This optimization depends on the correctness of on_win's botline_guess
parameter (i.e., it's always greater than or equal to the line numbers
passed to on_line). The documentation does not guarantee this, but I
have never noticed a problem so far.
* Collect on_bytes and flush at the invocation of the scheduled callback
to take account of commands that triggers multiple on_bytes.
* More accurately track movement of folds so that foldexpr returns
reasonable values even when the scheduled computation is not run yet.
* Start computing folds from the line above (+ foldminlines) the changed
lines to handle the folds that are removed due to the size limit.
* Shrink folds that end at the line at which another fold starts to
assign proper level to that line.
* Use level '=' for lines that are not computed yet.
Problem:
`LanguageTree:for_each_tree` calls itself for child nodes, so when we
calls `for_each_tree` inside `for_each_tree`, this quickly leads to
exponential tree calls.
Solution:
Use `pairs(child:trees())` directly in this case, as we don't need the
extra callback for each children, this is already handled from the outer
`for_each_tree` call
When first opened, the tree-sitter inspector traverses all of the nodes
in the buffer to calculate an array of nodes. This traversal is done
only once, and _all_ nodes (both named and anonymous) are included.
Toggling anonymous nodes in the inspector only changes how the tree is
drawn in the buffer, but does not affect the underlying data structure
at all.
When the buffer is traversed and the list of nodes is calculated, we
don't know whether or not anonymous nodes will be displayed in the
inspector or not. Thus, we cannot determine during traversal where to
put closing parentheses. Instead, this must be done when drawing.
When we draw, the tree structure has been flatted into a single array,
so we lose parent-child relationships that would otherwise make
determining the number of closing parentheses straightforward. However,
we can instead rely on the fact that a delta between the depth of a node
and the depth of the successive node _must_ mean that more closing
parentheses are required:
(foo
(bar)
(baz) ↑
│
└ (bar) and (baz) have different depths, so (bar) must have an
extra closing parenthesis
This does not depend on whether or not anonymous nodes are displayed and
so works in both cases.
Problem: Only injections under the top level tree are found.
Solution: Iterate through all trees to find injections. When two
injections are contained within the same node in the parent tree, prefer
the injection with the larger byte length.
When parsing with a range, languagetree looks up injections and adds
them if needed. This explicitly invalidates parser, making `is_valid`
report `false` both when including and excluding children.
This is an attempt to describe desired behaviour of `is_valid` in tests,
with what ended up being a single line change to satisfy them.
Problem: Visual highlight is inconsistent on a folded line with
treesitter foldtext.
Solution: Don't added Folded highlight as it is already in background.
This is incorrect in the following scenario:
1. The language tree is Lua > Vim > Lua.
2. An edit simultaneously wipes out the `_regions` of all nodes, while
taking the Vim injection off-screen.
3. The Vim injection is not re-parsed, so the child Lua `_regions` is
still `nil`.
4. The child Lua is assumed, incorrectly, to occupy the whole document.
5. This causes the injections to be parsed again, resulting in Lua > Vim
> Lua > Vim.
6. Now, by the same process, Vim ends up with its range assumed over the
whole document. Now the parse is broken and results in broken
highlighting and poor performance.
It should be fine to instead treat an unparsed node as occupying
nothing (i.e. effectively non-existent). Since, either:
- The parent was just parsed, hence defining `_regions`
- The parent was not just parsed, in which case this node doesn't need
to be parsed either.
Also, the name `has_regions` is confusing; it seems to simply
mean the opposite of "root" or "full_document". However, this PR does
not touch it.
Memoizes a function, using a custom function to hash the arguments.
Private for now until:
- There are other places in the codebase that could benefit from this
(e.g. LSP), but might require other changes to accommodate.
- Invalidation of the cache needs to be controllable. Using weak tables
is an acceptable invalidation policy, but it shouldn't be the only
one.
- I don't think the story around `hash_fn` is completely thought out. We
may be able to have a good default hash_fn by hashing each argument,
so basically a better 'concat'.
Problem:
With incremental injection parsing, injected languages' parsers parse
only the relevant regions and stores the result in _trees with the index
of the corresponding region. Therefore, there can be holes in _trees.
Solution:
* Use generic table functions where appropriate.
* Fix type annotations and docs.
Problem:
It doesn't make much sense to flatten each region (= list of ranges).
This coincidentally worked for region with a single range.
Solution:
Custom function for combining regions.
Problem
---
If a highlighter query returns a significant number of predicate
non-matches, the highlighter will scan well past the end of the window.
Solution
---
In the iterator returned from `iter_captures`, accept an optional
parameter `end_line`. If no parameter provided, the behavior is
unchanged, hence this is a non-invasive tweak.
Fixes: #25113nvim-treesitter/nvim-treesitter#5057
The name for_each_child is misleading and caused bugs.
After #25111, #25115, there are no more usages of `for_each_child` in Nvim.
In the future if we want to restore this functionality we can consider a
generalized vim.traverse(node, key, visitor) function.
Problem:
Folds are opened when the visible range changes even if there are no
modifications to the buffer, e.g, when using zM for the first time. If
the parsed tree was invalid, on_win re-parses and gets empty tree
changes, which triggers fold updates.
Solution:
Don't update folds in on_changedtree if there are no changes.
`LanguageTree:parse` is recursive, and calls
`LanguageTree:for_each_child`, which is also recursive.
That means that, starting from the third level (child of child of root),
nodes will be parsed twice.
Which then means that if the tree is N layers deep, there will be ~2^N
parses even if the branching factor is 1.
Now, why was the tree deepening with each character inserted? And why
did this only regress in #24647? These are mysteries for another time.
Fixes: #25104
Problem:
* The guessed botline might be smaller than the actual botline e.g. when
there are folds and the user is typing in insert mode. This may result
in incorrect treesitter highlights for injections.
* botline can be larger than the last line number of the buffer, which
results in errors when placing extmarks.
Solution:
* Take a more conservative approximation. I am not sure if it is
sufficient to guarantee correctness, but it seems to be good enough
for the case mentioned above.
* Clamp it to the last line number.
Co-authored-by: Lewis Russell <me@lewisr.dev>
Problem:
With treesitter fold, InsertLeave can be slow, because a single session
of insert mode may schedule multiple fold updates in on_bytes and
on_changedtree.
Solution:
Don't create duplicate autocmds.
According to `:h TSNode` docs, there's also `TSNode:sexpr()` and
`TSNode:has_error()` that is part of `TSNode` class, but this wasn't
documented in `treesitter/_meta.lua`.
Adding missing fields in so the types is similar to `:h TSNode`
Problem:
Treesitter highlighting is slow for large files with lots of injections.
Solution:
Only parse injections we are going to render during a redraw cycle.
---
- `LanguageTree:parse()` will no longer parse injections by default and
now requires an explicit range argument to be passed.
- `TSHighlighter` now parses injections incrementally during on_win
callbacks for the line range being rendered.
- Plugins which require certain injections to be parsed must run
`parser:parse({ start_row, end_row })` before using the tree.
* feat(treesitter): add injection language fallback
Problem: injection languages are often specified via aliases (e.g.,
filetype or in upper case), requiring custom directives.
Solution: include lookup logic (try as parser name, then filetype, then
lowercase) in LanguageTree itself and remove `#inject-language`
directive.
Co-authored-by: Lewis Russell <me@lewisr.dev>
Problem: luals returns stricter diagnostics with bundled luarc.json
Solution: Improve some function and type annotations:
* use recognized uv.* types
* disable diagnostic for global `vim` in shared.lua
* docs: don't start comment lines with taglink (otherwise LuaLS will interpret it as a type)
* add type alias for lpeg pattern
* fix return annotation for `vim.secure.trust`
* rename local Range object in vim.version (shadows `Range` in vim.treesitter)
* fix some "missing fields" warnings
* add missing required fields for test functions in eval.lua
* rename lsp meta files for consistency
When an injection has not set include children, make sure not to add
the injection if no ranges are determined.
This could happen when there is an injection with a child that has the
same range as itself. e.g. consider this Makefile snippet
```make
foo:
$(VAR)
```
Line 2 has an injection for bash and a make variable reference. If
include-children isn't set (default), then there is no range on line 2
to inject since the variable reference needs to be excluded.
This caused the language tree to return an empty range, which the parser
now interprets to mean the full buffer. This caused makefiles to have
completely broken highlighting.
* docs(lua): teach lua2dox how to table
* docs(lua): teach gen_vimdoc.py about local functions
No more need to mark local functions with @private
* docs(lua): mention @nodoc and @meta in dev-lua-doc
* fixup!
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
---------
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
Problem: When using treesitter foldexpr,
* :diffput/get open diff folds, and
* folds are not updated in other windows that contain the updated
buffer.
Solution: Update folds in all windows that contain the updated buffer
and use expr foldmethod.
perf(treesitter): cache vim.treesitter.query.get
Problem:
vim.treesitter.query.get searches and reads query files every time it's
called, if user hasn't overridden the query. So this can incur slowdown
when called frequently.
This can happen when using treesitter foldexpr. For example, when using
`:h :range!` in markdown file to format fenced codeblock, on_changedtree
in _fold.lua is triggered many times despite that the tree doesn't have
syntactic changes (might be a bug in LanguageTree). (Incidentally, the
resulting fold is incorrect due to a bug in `:h range!`.) on_changedtree
calls vim.treesitter.query.get for each tree changes. In addition, it
may request folds queries for injected languages without fold queries,
such as markdown_inline.
Solution:
* Cache the result of vim.treesitter.query.get.
* If query file was not found, fail quickly at later calls.
Problem: Treesitter fold is not updated if treesitter hightlight is not
active. More precisely, updating folds requires `LanguageTree:parse()`.
Solution: Call `parse()` before computing folds and compute folds when
lines are added/removed.
This doesn't guarantee correctness of the folds, because some changes
that don't add/remove line won't update the folds even if they should
(e.g. adding pair of braces). But it is good enough for most cases,
while not introducing big overhead.
Also, if highlighting is active, it is likely that
`TSHighlighter._on_buf` already ran `parse()` (or vice versa).
Problem:
"playground" is new jargon that overlaps with existing concepts:
"dev" (`:help dev`) and "view" (also "scratch" `:help scratch-buffer`) .
Solution:
We should consistently use "dev" as the namespace for where "developer
tools" live. For purposes of a "throwaway sandbox object", we can use
the name "view".
- Rename `TSPlayground` => `TSView`
- Rename `playground.lua` => `dev.lua`
- Add bindings to Treesitter ts_parser_set_logger and ts_parser_logger
- Add logfile with path STDPATH('log')/treesitter.c
- Rework existing LanguageTree loggin to use logfile
- Begin implementing log levels for vim.g.__ts_debug
When injections are added or removed make sure to:
- invoke 'changedtree' callbacks for when new trees are added.
- invoke 'changedtree' callbacks for when trees are invalidated
- redraw regions when languagetree children are removed
fix(treesitter playground): wrong range of a node displayed in playground
The call parameters order of the function `get_range_str` is flipped for the last two arguments compared to the declaration.
Problem:
Codebase inconsistently binds vim.api onto a or api.
Solution:
Use api everywhere. a as an identifier is too short to have at the
module level.