This introduces a better set of defaults for large databases. I've
experimentally determined that it results in much better throughput in a
couple of scenarios with large databases, but I can't give any
guarantees the values are always optimal. They're probably no worse than
the defaults though.
There was a problem in iterating the sequence index that could result
in missing updates. The issue is that while the index was (correctly)
iterated in a snapshot, the actual file infos were read dirty outside of
the snapshot. This fixes this by doing the reads inside the snapshot,
and also updates a couple of other places that did the same thing more
or less harmfully (I didn't investigate).
To avoid similar issues in the future I did some renaming of the
getFile* methods - the ones in a transaction are just getFile, while the
ones directly on the database are variants of getFileDirty to highlight
what's going on.
The problem here is that we would update the sequence index before
updating the FileInfos, which would result in a high sequence number
pointing to a low-sequence FileInfo. The index sender would pick up the
high sequence number, send the old file, and think everything was good.
On the receiving side the old file is a no-op and ignored. The file
remains out of sync until another update for it happens.
This fixes that by correcting the order of operations in the database
update: first we remove old sequence index entries, then we update the
FileInfos (which now don't have anything pointing to them) and then we
add the sequence indexes (which the index sender can see).
The other option is to add "proper" transactions where required at the
database layer. I actually have a branch for that, but it's literally
thousands of lines of diff and I'm putting that off for another day as
this solves the problem...
Adds a receive only folder type that does not send changes, and where the user can optionally revert local changes. Also changes some of the icons to make the three folder types distinguishable.
We have the invalid bit to indicate that a file isn't good. That's enough for remote devices. For ourselves, it would be good to know sometimes why the file isn't good - because it's an unsupported type, because it matches an ignore pattern, or because we detected the data is bad and we need to rescan it.
Or, and this is the main future reason for the PR, because it's a change detected on a receive only device. We will want something like the invalid flag for those changes, but marking them as invalid today means the scanner will rehash them. Hence something more fine grained is required.
This introduces a LocalFlags fields to the FileInfo where we can stash things that we care about locally. For example,
FlagLocalUnsupported = 1 << 0 // The kind is unsupported, e.g. symlinks on Windows
FlagLocalIgnored = 1 << 1 // Matches local ignore patterns
FlagLocalMustRescan = 1 << 2 // Doesn't match content on disk, must be rechecked fully
The LocalFlags fields isn't sent over the wire; instead the Invalid attribute is calculated based on the flags at index sending time. It's on the FileInfo anyway because that's what we serialize to database etc.
The actual Invalid flag should after this just be considered when building the global state and figuring out availability for remote devices. It is not used for local file index entries.
Instead of walking and unmarshalling the entire db and sorting the resulting
file infos by sequence, add store device keys by sequence number in the
database. Thus only the required file infos need be unmarshalled and are already
sorted by index.
This keeps the data we need about sequence numbers and object counts
persistently in the database. The sizeTracker is expanded into a
metadataTracker than handled multiple folders, and the Counts struct is
made protobuf serializable. It gains a Sequence field to assist in
tracking that as well, and a collection of Counts become a CountsSet
(for serialization purposes).
The initial database scan is also a consistency check of the global
entries. This shouldn't strictly be necessary. Nonetheless I added a
created timestamp to the metadata and set a variable to compare against
that. When the time since the metadata creation is old enough, we drop
the metadata and rebuild from scratch like we used to, while also
consistency checking.
A new environment variable STCHECKDBEVERY can override this interval,
and for example be set to zero to force the check immediately.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4547
LGTM: imsodin
This removes a significant, complex chunk of database code. The
"replace" operation walked both the old and new in lockstep and made the
relevant changes to make the new situation correct. But since delta
indexes we pretty much never need this - we just used replace to drop
the existing data and start over.
This makes that explicit and removes the complexity.
(This is one of those things that would be annoying to make case
insensitive, while the actual "drop and then insert" that we do is
easier.)
This is fairly well unit tested...
The one change to the tests is to cover the fact that previously replace
with something identical didn't bump the sequence number, while
obviously removing everything and re-inserting does. This is not
behavior we depend on anywhere.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/4500
LGTM: imsodin, AudriusButkevicius
This changes the BEP protocol to use protocol buffer serialization
instead of XDR, and therefore also the database format. The local
discovery protocol is also updated to be protocol buffer format.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/3276
LGTM: AudriusButkevicius
This adds a metric for "committed items" to the database instance that I
use in the test code, and a couple of tests that ensure that scans that
don't change anything also don't commit anything.
There was a case in the scanner where we set the invalid bit on files
that are ignored, even though they were already ignored and had the
invalid bit set. I had assumed this would result in an extra database
commit, but it was in fact filtered out by the Set... Anyway, I think we
can save some work on not pushing that change to the Set at all.
GitHub-Pull-Request: https://github.com/syncthing/syncthing/pull/3298
We're going to need the db.Instance to keep some state, and for that to
work we need the same one passed around everywhere. Hence this moves the
leveldb-specific file opening stuff into the db package and exports the
dbInstance type.