Then we can just load metadata in C as a single msgpack blob. Which also
can be used directly as binarly data, instead of first unpacking all the
functions and ui_events metadata to immediately pack it again, which was
a bit of a silly walk (and one extra usecase of `msgpack_rpc_from_object`
which will get yak shaved in the next PR)
Problems:
- Illegal bytes after valid UTF-8 char cause utf_cp_*_off() to fail.
- When stream isn't NUL-terminated, utf_cp_*_off() may go over the end.
Solution: Don't go over end of the char of end of the string.
Functions like file_open_new() and file_open_fd_new() which just is a
wrapper around the real functions but with an extra xmalloc/xfree around
is an anti-pattern. If the caller really needs to allocate a
FileDescriptor as a heap object, it can do that directly.
FileDescriptor by itself is pretty much a pointer, or rather two:
the OS fd index and a pointer to a buffer. So most of the time an extra
pointer layer is just wasteful.
In the case of scriptin[curscript] in getchar.c, curscript used
to mean in practice:
N+1 open scripts when curscript>0
zero or one open scripts when curscript==0
Which means scriptin[0] had to be compared to NULL to disambiguate the
curscript=0 case.
Instead, use curscript==-1 to mean that are no script,
then all pointer comparisons dissappear and we can just use an array of
structs without extra pointers.
Note: this contains two _temporary_ changes which can be reverted
once the Arena vs no-Arena distinction in API wrappers has been removed.
Both nlua_push_Object and object_to_vim_take_luaref() has been changed
to take the object argument as a pointer. This is not going to be
necessary once these are only used with arena (or not at all) allocated
Objects.
The object_to_vim() variant which leaves luaref untouched might need to
stay for a little longer.
The `get_indent_str_vtab()` function currently calls `tabstop_padding()`
every time a tab is encountered (unless tabstops aren't used).
`tabstop_padding()` either does a division by 'tabstop' If 'vartabstop'
is not set, or iterates through the 'vartabstop' list to find current
tab width.
Since the virtual column only increases, we can keep track of where the
next tabstop would be, and update this information once it was reached.
`get_indent_str_vtab()` also depends on 'listchars' "tab" value from the
current window, even though it may be called for a line from the same
buffer in a different window. In most cases, it is called with tabstops
enabled (last argument was `false`), so I split the function into one
that uses tabstops and the other that doesn't.
I removed `get_indent_str()` since I couldn't find any calls to it.
Currently having two separate memory strategies for API return values is
a bit unnecessary, and mostly a consequence of converting the hot spot
cases which needed it first. But there is really no downside to using
arena everywhere (which implies also directly using strings which are
allocated earlier or even statically, without copy).
There only restriction is we need to know the size of arrays in advance,
but this info can often be passed on from some earlier stage if it is
missing.
This collects some "small" cases. The more complex stuff will get a PR
each.
This expands on the global "don't pay for what you don't use" rules for
these special extmark decorations:
- inline virtual text, which needs to be processed in plines.c when we
calculate the size of text on screen
- virtual lines, which are needed when calculating "filler" lines
- signs, with text and/or highlights, both of which needs to be
processed for the entire line already at the beginning of a line.
This adds a count to each node of the marktree, for how many special
marks of each kind can be found in the subtree for this node. This makes
it possible to quickly skip over these extra checks, when working in
regions of the buffer not containing these kind of marks, instead of
before where this could just be skipped if the entire _buffer_
didn't contain such marks.
A bit big, but practically it was a lot simpler to change over all
fillchars and all listchars at once, to not need to maintain two
parallel implementations.
This is mostly an internal refactor, but it also removes an arbitrary
limitation: that 'fillchars' and 'listchars' values can only be
single-codepoint characters. Now any character which fits into a single
screen cell can be used.
Problem: Many places in the code use `findoption()` to access an option using its name, even if the option index is available. This is very slow because it requires looping through the options array over and over.
Solution: Use option index instead of name wherever possible. Also introduce an `OptIndex` enum which contains the index for every option as enum constants, this eliminates the need to pass static option names as strings.
Problem:
Not all Lua code is checked by stylua. Automating code-style is an
important mechanism for reducing time spent on accidental
(non-essential) complexity.
Solution:
- Enable lintlua for `test/unit/` directory.
- TODO: only `test/functional/` remains unchecked.
previous: 45fe4d11ad
previous: 517f0cc634
Problem: buffer text with composing chars are converted from UTF-8
to an array of up to seven UTF-32 values and then converted back
to UTF-8 strings.
Solution: Convert buffer text directly to UTF-8 based schar_T values.
The limit of the text size is now in schar_T bytes, which is currently
31+1 but easily could be raised as it no longer multiplies the size
of the entire screen grid when not used, the full size is only required
for temporary scratch buffers.
Also does some general cleanup to win_line text handling, which was
unnecessarily complicated due to multibyte rendering being an "opt-in"
feature long ago. Nowadays, a char is just a char, regardless if it consists
of one ASCII byte or multiple bytes.
We already have an extensive suite of static analysis tools we use,
which causes a fair bit of redundancy as we get duplicate warnings. PVS
is also prone to give false warnings which creates a lot of work to
identify and disable.
refactor: use a more idiomatic loop to iterate over the cells
There are two cases in which the following assertion would fail:
```c
assert(g->icell < g->ncells);
```
1. If `g->ncells = 0`. Update this to be legal.
2. If an EOF is reached while parsing `wrap`. In this case, the unpacker
attempts to resume from `cells`, which is a bug. Create a new state
for parsing `wrap`.
Reference: https://neovim.io/doc/user/ui.html#ui-event-grid_line
- Move vimoption_T to option.h
- option_defs.h is for option-related types
- option_vars.h corresponds to Vim's option.h
- option_defs.h and option_vars.h don't include each other
Linux added these types to their userspace headers in [6.5], which
causes unit tests to fail like
```
-------- Running tests from test/unit/api/private_helpers_spec.lua
RUN vim_to_object converts true: 17.00 ms ERR
test/unit/helpers.lua:748: test/unit/helpers.lua:732: (string) '
test/unit/helpers.lua:264: ';' expected near '__s128' at line 194'
exit code: 256
stack traceback:
test/unit/helpers.lua:748: in function 'itp_parent'
test/unit/helpers.lua:784: in function <test/unit/helpers.lua:774>
```
Since we don't use these types, they can be ignored to avoid LuaJIT's C
parser choking on them.
[6.5]: 224d80c584
The removes the previous restriction that nvim_buf_set_extmark()
could not be used to highlight arbitrary multi-line regions
The problem can be summarized as follows: let's assume an extmark with a
hl_group is placed covering the region (5,0) to (50,0) Now, consider
what happens if nvim needs to redraw a window covering the lines 20-30.
It needs to be able to ask the marktree what extmarks cover this region,
even if they don't begin or end here.
Therefore the marktree needs to be augmented with the information covers
a point, not just what marks begin or end there. To do this, we augment
each node with a field "intersect" which is a set the ids of the
marks which overlap this node, but only if it is not part of the set of
any parent. This ensures the number of nodes that need to be explicitly
marked grows only logarithmically with the total number of explicitly
nodes (and thus the number of of overlapping marks).
Thus we can quickly iterate all marks which overlaps any query position
by looking up what leaf node contains that position. Then we only need
to consider all "start" marks within that leaf node, and the "intersect"
set of that node and all its parents.
Now, and the major source of complexity is that the tree restructuring
operations (to ensure that each node has T-1 <= size <= 2*T-1) also need
to update these sets. If a full inner node is split in two, one of the
new parents might start to completely overlap some ranges and its ids
will need to be moved from its children's sets to its own set.
Similarly, if two undersized nodes gets joined into one, it might no
longer completely overlap some ranges, and now the children which do
needs to have the have the ids in its set instead. And then there are
the pivots! Yes the pivot operations when a child gets moved from one
parent to another.
This involves two redesigns of the map.c implementations:
1. Change of macro style and code organization
The old khash.h and map.c implementation used huge #define blocks with a
lot of backslash line continuations.
This instead uses the "implementation file" .c.h pattern. Such a file is
meant to be included multiple times, with different macros set prior to
inclusion as parameters. we already use this pattern e.g. for
eval/typval_encode.c.h to implement different typval encoders reusing a
similar structure.
We can structure this code into two parts. one that only depends on key
type and is enough to implement sets, and one which depends on both key
and value to implement maps (as a wrapper around sets, with an added
value[] array)
2. Separate the main hash buckets from the key / value arrays
Change the hack buckets to only contain an index into separate key /
value arrays
This is a common pattern in modern, state of the art hashmap
implementations. Even though this leads to one more allocated array, it
is this often is a net reduction of memory consumption. Consider
key+value consuming at least 12 bytes per pair. On average, we will have
twice as many buckets per item.
Thus old implementation:
2*12 = 24 bytes per item
New implementation
1*12 + 2*4 = 20 bytes per item
And the difference gets bigger with larger items.
One might think we have pulled a fast one here, as wouldn't the average size of
the new key/value arrays be 1.5 slots per items due to amortized grows?
But remember, these arrays are fully dense, and thus the accessed memory,
measured in _cache lines_, the unit which actually matters, will be the
fully used memory but just rounded up to the nearest cache line
boundary.
This has some other interesting properties, such as an insert-only
set/map will be fully ordered by insert only. Preserving this ordering
in face of deletions is more tricky tho. As we currently don't use
ordered maps, the "delete" operation maintains compactness of the item
arrays in the simplest way by breaking the ordering. It would be
possible to implement an order-preserving delete although at some cost,
like allowing the items array to become non-dense until the next rehash.
Finally, in face of these two major changes, all code used in khash.h
has been integrated into map.c and friends. Given the heavy edits it
makes no sense to "layer" the code into a vendored and a wrapper part.
Rather, the layered cake follows the specialization depth: code shared
for all maps, code specialized to a key type (and its equivalence
relation), and finally code specialized to value+key type.
Problem: Cannot use positional arguments for printf()
Solution: Support positional arguments in string formatting
closes: vim/vim#121400c6181fec4
Co-authored-by: Christ van Willegen <cvwillegen@gmail.com>
Problem: luals returns stricter diagnostics with bundled luarc.json
Solution: Improve some function and type annotations:
* use recognized uv.* types
* disable diagnostic for global `vim` in shared.lua
* docs: don't start comment lines with taglink (otherwise LuaLS will interpret it as a type)
* add type alias for lpeg pattern
* fix return annotation for `vim.secure.trust`
* rename local Range object in vim.version (shadows `Range` in vim.treesitter)
* fix some "missing fields" warnings
* add missing required fields for test functions in eval.lua
* rename lsp meta files for consistency
Adjust relevant Lua tests. Refactor testing logic for tv_get_string_*
functions into test_string_fn().
Note that vim_snprintf(), which is used for stringifying floats, always
calls xfree(tofree), even if tofree is NULL, so we need to expect that
in the alloc log.
Problem: Vim9: builtin function arguments not checked at compile time.
Solution: Add more type checks. (Yegappan Lakshmanan, closesvim/vim#8539)
5b73992d8f
Co-authored-by: Yegappan Lakshmanan <yegappan@yahoo.com>
Here's the headline: when run in sync mode (last argument cb=NULL),
these functions don't actually use the uv_loop_t.
An earlier version of this patch instead replaced fs_loop with
using main_loop.uv on the main thread and luv_loop() on luv worker
threads. However this made the code more complicated for no reason.
Also arbitrarily, half of these functions would attempt to handle
UV_ENOMEM by try_to_free_memory(). This would mostly happen
on windows because it needs to allocate a converted WCHAR buffer.
This should be a quite rare situation. Your system is pretty
much hosed already if you cannot allocate like 50 WCHAR:s.
Therefore, take the liberty of simply removing this fallback.
In addition, we tried to "recover" from ENOMEM in read()/readv()
this way which doesn't make any sense. The read buffer(s) are already
allocated at this point.
This would also be an issue when using these functions on a worker
thread, as try_to_free_memory() is not thread-safe. Currently
os_file_is_readable() and os_is_dir() is used by worker threads
(as part of nvim__get_runtime(), to implement require from 'rtp' in
threads).
In the end, these changes makes _all_ os/fs.c functions thread-safe,
and we thus don't need to document and maintain a thread-safe subset.
Previously, there were three low-level delay entry points
- os_delay(ms, ignoreinput=true): sleep for ms, only break on got_int
- os_delay(ms, ignoreinput=false): sleep for ms, break on any key input
os_microdelay(us, false): equivalent, but in μs (not directly called)
- os_microdelay(us, true): sleep for μs, never break.
The implementation of the latter two both used uv_cond_timedwait()
This could have been for two reasons:
1. allow another thread to "interrupt" the wait
2. uv_cond_timedwait() has higher resolution than uv_sleep()
However we (1) never used the first, even when TUI was a thread, and
(2) nowhere in the codebase are we using μs resolution, it is always a ms
multiplied with 1000.
In addition, os_delay(ms, false) would completely block the thread for
100ms intervals and in between check for input. This is not how event handling
is done alound here.
Therefore:
Replace the implementation of os_delay(ms, false) to use
LOOP_PROCESS_EVENTS_UNTIL which does a proper epoll wait with a timeout,
instead of the 100ms timer panic.
Replace os_microdelay(us, false) with a direct wrapper of uv_sleep.