Problem:
Treesitter highlighting is slow for large files with lots of injections.
Solution:
Only parse injections we are going to render during a redraw cycle.
---
- `LanguageTree:parse()` will no longer parse injections by default and
now requires an explicit range argument to be passed.
- `TSHighlighter` now parses injections incrementally during on_win
callbacks for the line range being rendered.
- Plugins which require certain injections to be parsed must run
`parser:parse({ start_row, end_row })` before using the tree.
* feat(treesitter): add injection language fallback
Problem: injection languages are often specified via aliases (e.g.,
filetype or in upper case), requiring custom directives.
Solution: include lookup logic (try as parser name, then filetype, then
lowercase) in LanguageTree itself and remove `#inject-language`
directive.
Co-authored-by: Lewis Russell <me@lewisr.dev>
Problem: luals returns stricter diagnostics with bundled luarc.json
Solution: Improve some function and type annotations:
* use recognized uv.* types
* disable diagnostic for global `vim` in shared.lua
* docs: don't start comment lines with taglink (otherwise LuaLS will interpret it as a type)
* add type alias for lpeg pattern
* fix return annotation for `vim.secure.trust`
* rename local Range object in vim.version (shadows `Range` in vim.treesitter)
* fix some "missing fields" warnings
* add missing required fields for test functions in eval.lua
* rename lsp meta files for consistency
When an injection has not set include children, make sure not to add
the injection if no ranges are determined.
This could happen when there is an injection with a child that has the
same range as itself. e.g. consider this Makefile snippet
```make
foo:
$(VAR)
```
Line 2 has an injection for bash and a make variable reference. If
include-children isn't set (default), then there is no range on line 2
to inject since the variable reference needs to be excluded.
This caused the language tree to return an empty range, which the parser
now interprets to mean the full buffer. This caused makefiles to have
completely broken highlighting.
* docs(lua): teach lua2dox how to table
* docs(lua): teach gen_vimdoc.py about local functions
No more need to mark local functions with @private
* docs(lua): mention @nodoc and @meta in dev-lua-doc
* fixup!
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
---------
Co-authored-by: Justin M. Keyes <justinkz@gmail.com>
Problem: When using treesitter foldexpr,
* :diffput/get open diff folds, and
* folds are not updated in other windows that contain the updated
buffer.
Solution: Update folds in all windows that contain the updated buffer
and use expr foldmethod.
perf(treesitter): cache vim.treesitter.query.get
Problem:
vim.treesitter.query.get searches and reads query files every time it's
called, if user hasn't overridden the query. So this can incur slowdown
when called frequently.
This can happen when using treesitter foldexpr. For example, when using
`:h :range!` in markdown file to format fenced codeblock, on_changedtree
in _fold.lua is triggered many times despite that the tree doesn't have
syntactic changes (might be a bug in LanguageTree). (Incidentally, the
resulting fold is incorrect due to a bug in `:h range!`.) on_changedtree
calls vim.treesitter.query.get for each tree changes. In addition, it
may request folds queries for injected languages without fold queries,
such as markdown_inline.
Solution:
* Cache the result of vim.treesitter.query.get.
* If query file was not found, fail quickly at later calls.
Problem: Treesitter fold is not updated if treesitter hightlight is not
active. More precisely, updating folds requires `LanguageTree:parse()`.
Solution: Call `parse()` before computing folds and compute folds when
lines are added/removed.
This doesn't guarantee correctness of the folds, because some changes
that don't add/remove line won't update the folds even if they should
(e.g. adding pair of braces). But it is good enough for most cases,
while not introducing big overhead.
Also, if highlighting is active, it is likely that
`TSHighlighter._on_buf` already ran `parse()` (or vice versa).
Problem:
"playground" is new jargon that overlaps with existing concepts:
"dev" (`:help dev`) and "view" (also "scratch" `:help scratch-buffer`) .
Solution:
We should consistently use "dev" as the namespace for where "developer
tools" live. For purposes of a "throwaway sandbox object", we can use
the name "view".
- Rename `TSPlayground` => `TSView`
- Rename `playground.lua` => `dev.lua`
- Add bindings to Treesitter ts_parser_set_logger and ts_parser_logger
- Add logfile with path STDPATH('log')/treesitter.c
- Rework existing LanguageTree loggin to use logfile
- Begin implementing log levels for vim.g.__ts_debug
When injections are added or removed make sure to:
- invoke 'changedtree' callbacks for when new trees are added.
- invoke 'changedtree' callbacks for when trees are invalidated
- redraw regions when languagetree children are removed
fix(treesitter playground): wrong range of a node displayed in playground
The call parameters order of the function `get_range_str` is flipped for the last two arguments compared to the declaration.
Problem:
Codebase inconsistently binds vim.api onto a or api.
Solution:
Use api everywhere. a as an identifier is too short to have at the
module level.
Never return the changes an only notify them using the `on_changedtree`
callback.
It is not guaranteed for a plugin that it'll be the first one to call
`tree:parse()` and thus get the changes.
Closes#19915
Problem:
gen_vimdoc.py / lua2dox.lua does not support @defgroup or \defgroup
except for "api-foo" modules.
Solution:
Modify `gen_vimdoc.py` to look for section names based on `helptag_fmt`.
TODO:
- Support @module ?
https://github.com/LuaLS/lua-language-server/wiki/Annotations#module
Problem:
Help tags like vim.treesitter.language.add() are confusing because
`vim.treesitter.language` is (thankfully) not a user-facing module.
Solution:
Ignore the "fstem" when generating "treesitter" tags.
Problem:
Treesitter injections are slow because all injected trees are invalidated on every change.
Solution:
Implement smarter invalidation to avoid reparsing injected regions.
- In on_bytes, try and update self._regions as best we can. This PR just offsets any regions after the change.
- Add valid flags for each region in self._regions.
- Call on_bytes recursively for all children.
- We still need to run the query every time for the top level tree. I don't know how to avoid this. However, if the new injection ranges don't change, then we re-use the old trees and avoid reparsing children.
This should result in roughly a 2-3x reduction in tree parsing when the comment injections are enabled.
Problem:
vim.treesitter does not know how to map a specific filetype to a parser.
This creates problems since in a few places (including in vim.treesitter itself), the filetype is incorrectly used in place of lang.
Solution:
Add an API to enable this:
- Add vim.treesitter.language.add() as a replacement for vim.treesitter.language.require_language().
- Optional arguments are now passed via an opts table.
- Also takes a filetype (or list of filetypes) so we can keep track of what filetypes are associated with which langs.
- Deprecated vim.treesitter.language.require_language().
- Add vim.treesitter.language.get_lang() which returns the associated lang for a given filetype.
- Add vim.treesitter.language.register() to associate filetypes to a lang without loading the parser.
- Render node ranges as virtual text
- Set filettype=query. The virtual text is to avoid parsing errors.
- Make sure highlights text is always in view.
Problem: Some injections (like markdown) allow specifying arbitrary
language names for code blocks, which may be lead to errors when
looking for a corresponding parser in runtime path.
Solution: Validate that the language name only contains alphanumeric
characters and `_` (e.g., for `c_sharp`) and error otherwise.
Add a "show_tree" function to view a textual representation of the
nodes in a language tree in a window. Moving the cursor in the
window highlights the corresponding text in the source buffer, and
moving the cursor in the source buffer highlights the corresponding
nodes in the window.
Use the first, not last, query for a language on runtimepath. Typically,
this implies that a user query will override a site plugin query, which
will override a bundled runtime query.
Problem: Treesitter queries for a given language in runtime were merged together,
leading to errors if they targeted different parser versions (e.g., bundled viml queries
and those shipped by nvim-treesitter).
Solution: Runtime queries now work as follows:
* The last query in the rtp without `; extends` in the header will be used as the base query
* All queries (without a specific order) with `; extends` are concatenated with the base query
BREAKING CHANGE: queries need to be updated if they are meant to extend other queries
- Added 'spell' option to extmarks:
Extmarks with this set will have the region spellchecked.
- Added 'noplainbuffer' option to 'spelloptions':
This is used to tell Neovim not to spellcheck the buffer. The old
behaviour was to spell check the whole buffer unless :syntax was set.
- Added spelling support to the treesitter highlighter:
@spell captures in highlights.scm are used to define regions which
should be spell checked.
- Added support for navigating spell errors for extmarks:
Works for both ephemeral and static extmarks
- Added '_on_spell_nav' callback for decoration providers:
Since ephemeral callbacks are only drawn for the visible screen,
providers must implement this callback to instruct Neovim which
regions in the buffer need can be spell checked.
The callback takes a start position and an end position.
Note: this callback is subject to change hence the _ prefix.
- Added spell captures for built-in support languages
Co-authored-by: Lewis Russell <lewis6991@gmail.com>
Co-authored-by: Björn Linse <bjorn.linse@gmail.com>
This removes the support for defining links via
vim.treesitter.highlighter.hl_map (never documented, but plugins did
anyway), or the uppercase-only `@FooGroup.Bar` to `FooGroup` rule.
The fallback is now strictly `@foo.bar.lang` to `@foo.bar` to `@foo`,
and casing is irrelevant (as it already was outside of treesitter)
For compatibility, define default links to builting syntax groups
as defined by pre-existing color schemes
The private 'get_node_range' function from the languagetree module has
been renamed and remains private as it serve a purpose that is only
relevant inside the languagetree module.
The 'get_node_range' upstreamed from nvim-treesitter in the treesitter
module has been made public as it is in itself a utlity function.
As part of the upstream of utility functions from nvim-treesitter, this
option when set to false allows to return a table (downstream behavior).
Effectively making the switch from the downstream to the upstream
function much easier.
Previously the `offset!` directive populated the metadata in such a way
that the new range could be attributed to a specific capture. #14046
made it so the directive simply stored just the new range in the
metadata and information about what capture the range is based from is
lost.
This change reverts that whilst also correcting the docs.
When trying to load a language parser, escape the value of
the language.
With language injection, the language might be picked up from the
buffer. If this value is erroneous it can cause `nvim_get_runtime_file`
to hard error.
E.g., the markdown expression `~~~{` will extract '{' as a language and
then try to get the parser using `parser/{*` as the pattern.
Co-authored-by: Sean Dewar <seandewar@users.noreply.github.com>
Co-authored-by: Gregory Anders <greg@gpanders.com>
Co-authored-by: Sebastian Volland <seb@baunz.net>
Co-authored-by: Lewis Russell <lewis6991@gmail.com>
Co-authored-by: zeertzjq <zeertzjq@outlook.com>
Based on https://github.com/neovim/neovim/pull/14445
This extends `vim.treesitter.query.get_node_text` to return the text
that spans a node's range even if start_row ~= end_row.
The official developer documentation in in :h dev-lua-doc specifies to
use "--@" for special/magic tokens. However, this format is not
consistent with EmmyLua notation (used by some Lua language servers) nor
with the C version of the magic docstring tokens which use three comment
characters.
Further, the code base is currently split between usage of "--@",
"---@", and "--- @". In an effort to remain consistent, change all Lua
magic tokens to use "---@" and update the developer documentation
accordingly.
This changes the behavior of the hl_cache to the old one.
- when the capture exists as a hlgroup -> use it
- when hl_map contains a mapping -> use it
- else do nothing (before: map capture to non-existing capture)
Before also captures `@foo.bar` would intend to use the hlgroup `foo.bar`
which results in a confusing error since hlgroups can't contain dots.
For the case of Clojure and other Lisp syntax highlighting, it is
necessary to create huge regexps consisting of hundreds of symbols with
the pipe (|) character. To make things more difficult, these Lisp
symbols sometimes consists of special characters that are themselves
part of special regexp characters like '*'. In addition to being
difficult to maintain, it's performance is suboptimal.
This patch introduces a new predicate to perform 'source' matching in
amortized constant time. This is accomplished by compiling a hash table
on the first use.