4b5f2762ec
Make free space calculation less pessimistic and more realistic, which in turn improves 'statfs()' reports. Now it lies by 10%-20%, instead of 20%-30% (10% more honest).

Results of "freespace" test (120MiB volume, 16KiB LEB size, 512 bytes page size).

Before the change:

freespace: Test 1: fill the space we have 3 times
freespace: was free: 78274560 bytes 74.6 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 18214912 bytes 17.4 MiB, wrote 23.3% more than predicted
freespace: was free: 76754944 bytes 73.2 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 19738624 bytes 18.8 MiB, wrote 25.7% more than predicted
freespace: was free: 76759040 bytes 73.2 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 19730432 bytes 18.8 MiB, wrote 25.7% more than predicted
freespace: Test 1 finished
freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 6977722 bytes 6.7 MiB each time
freespace: was free: 72273920 bytes 68.9 MiB, wrote: 88891392 bytes 84.8 MiB, delta: 16617472 bytes 15.8 MiB, wrote 23.0% more than predicted
freespace: was free: 66154496 bytes 63.1 MiB, wrote: 81506304 bytes 77.7 MiB, delta: 15351808 bytes 14.6 MiB, wrote 23.2% more than predicted
freespace: was free: 58732544 bytes 56.0 MiB, wrote: 72572928 bytes 69.2 MiB, delta: 13840384 bytes 13.2 MiB, wrote 23.6% more than predicted
freespace: was free: 51552256 bytes 49.2 MiB, wrote: 63754240 bytes 60.8 MiB, delta: 12201984 bytes 11.6 MiB, wrote 23.7% more than predicted
freespace: was free: 44404736 bytes 42.3 MiB, wrote: 54943744 bytes 52.4 MiB, delta: 10539008 bytes 10.1 MiB, wrote 23.7% more than predicted
freespace: was free: 37285888 bytes 35.6 MiB, wrote: 46161920 bytes 44.0 MiB, delta: 8876032 bytes 8.5 MiB, wrote 23.8% more than predicted
freespace: was free: 30171136 bytes 28.8 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 7213056 bytes 6.9 MiB, wrote 23.9% more than predicted
freespace: was free: 23048192 bytes 22.0 MiB, wrote: 28606464 bytes 27.3 MiB, delta: 5558272 bytes 5.3 MiB, wrote 24.1% more than predicted
freespace: was free: 15941632 bytes 15.2 MiB, wrote: 19828736 bytes 18.9 MiB, delta: 3887104 bytes 3.7 MiB, wrote 24.4% more than predicted
freespace: was free: 8830976 bytes 8.4 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 2232320 bytes 2.1 MiB, wrote 25.3% more than predicted
freespace: Test 2 finished
freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 6985541 bytes 6.7 MiB each time
freespace: trashing: was free: 76840960 bytes 73.3 MiB, need free: 6985550 bytes 6.7 MiB, files created: 248311, delete 225737 (90.9% of them)
freespace: was free: 65228800 bytes 62.2 MiB, wrote: 82530304 bytes 78.7 MiB, delta: 17301504 bytes 16.5 MiB, wrote 26.5% more than predicted
freespace: trashing: was free: 74485760 bytes 71.0 MiB, need free: 13971091 bytes 13.3 MiB, files created: 248712, delete 202061 (81.2% of them)
freespace: was free: 55025664 bytes 52.5 MiB, wrote: 71925760 bytes 68.6 MiB, delta: 16900096 bytes 16.1 MiB, wrote 30.7% more than predicted
freespace: trashing: was free: 75550720 bytes 72.1 MiB, need free: 20956632 bytes 20.0 MiB, files created: 248849, delete 179822 (72.3% of them)
freespace: was free: 46669824 bytes 44.5 MiB, wrote: 63197184 bytes 60.3 MiB, delta: 16527360 bytes 15.8 MiB, wrote 35.4% more than predicted
freespace: trashing: was free: 76214272 bytes 72.7 MiB, need free: 27942173 bytes 26.6 MiB, files created: 248789, delete 157576 (63.3% of them)
freespace: was free: 39129088 bytes 37.3 MiB, wrote: 55164928 bytes 52.6 MiB, delta: 16035840 bytes 15.3 MiB, wrote 41.0% more than predicted
freespace: trashing: was free: 77398016 bytes 73.8 MiB, need free: 34927714 bytes 33.3 MiB, files created: 248711, delete 136474 (54.9% of them)
freespace: was free: 32325632 bytes 30.8 MiB, wrote: 48234496 bytes 46.0 MiB, delta: 15908864 bytes 15.2 MiB, wrote 49.2% more than predicted
freespace: trashing: was free: 75796480 bytes 72.3 MiB, need free: 41913255 bytes 40.0 MiB, files created: 248674, delete 111164 (44.7% of them)
freespace: was free: 25079808 bytes 23.9 MiB, wrote: 40775680 bytes 38.9 MiB, delta: 15695872 bytes 15.0 MiB, wrote 62.6% more than predicted
freespace: trashing: was free: 78209024 bytes 74.6 MiB, need free: 48898796 bytes 46.6 MiB, files created: 248708, delete 93207 (37.5% of them)
freespace: was free: 20582400 bytes 19.6 MiB, wrote: 34844672 bytes 33.2 MiB, delta: 14262272 bytes 13.6 MiB, wrote 69.3% more than predicted
freespace: trashing: was free: 77328384 bytes 73.7 MiB, need free: 55884337 bytes 53.3 MiB, files created: 248644, delete 68951 (27.7% of them)
freespace: was free: 14368768 bytes 13.7 MiB, wrote: 28278784 bytes 27.0 MiB, delta: 13910016 bytes 13.3 MiB, wrote 96.8% more than predicted
freespace: trashing: was free: 77434880 bytes 73.8 MiB, need free: 62869878 bytes 60.0 MiB, files created: 248640, delete 46767 (18.8% of them)
freespace: was free: 8286208 bytes 7.9 MiB, wrote: 21811200 bytes 20.8 MiB, delta: 13524992 bytes 12.9 MiB, wrote 163.2% more than predicted
freespace: trashing: was free: 77856768 bytes 74.2 MiB, need free: 69855419 bytes 66.6 MiB, files created: 248576, delete 25546 (10.3% of them)
freespace: was free: 5570560 bytes 5.3 MiB, wrote: 8187904 bytes 7.8 MiB, delta: 2617344 bytes 2.5 MiB, wrote 47.0% more than predicted
freespace: Test 3 finished
freespace: finished successfully

After the change:

freespace: Test 1: fill the space we have 3 times
freespace: was free: 85204992 bytes 81.3 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 11284480 bytes 10.8 MiB, wrote 13.2% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96489472 bytes 92.0 MiB, delta: 12935168 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: was free: 83554304 bytes 79.7 MiB, wrote: 96493568 bytes 92.0 MiB, delta: 12939264 bytes 12.3 MiB, wrote 15.5% more than predicted
freespace: Test 1 finished
freespace: Test 2: gradually lessen amount of free space and fill the FS
freespace: do 10 steps, lessen free space by 7596218 bytes 7.2 MiB each time
freespace: was free: 78675968 bytes 75.0 MiB, wrote: 88903680 bytes 84.8 MiB, delta: 10227712 bytes 9.8 MiB, wrote 13.0% more than predicted
freespace: was free: 72015872 bytes 68.7 MiB, wrote: 81514496 bytes 77.7 MiB, delta: 9498624 bytes 9.1 MiB, wrote 13.2% more than predicted
freespace: was free: 63938560 bytes 61.0 MiB, wrote: 72589312 bytes 69.2 MiB, delta: 8650752 bytes 8.2 MiB, wrote 13.5% more than predicted
freespace: was free: 56127488 bytes 53.5 MiB, wrote: 63762432 bytes 60.8 MiB, delta: 7634944 bytes 7.3 MiB, wrote 13.6% more than predicted
freespace: was free: 48336896 bytes 46.1 MiB, wrote: 54935552 bytes 52.4 MiB, delta: 6598656 bytes 6.3 MiB, wrote 13.7% more than predicted
freespace: was free: 40587264 bytes 38.7 MiB, wrote: 46157824 bytes 44.0 MiB, delta: 5570560 bytes 5.3 MiB, wrote 13.7% more than predicted
freespace: was free: 32841728 bytes 31.3 MiB, wrote: 37384192 bytes 35.7 MiB, delta: 4542464 bytes 4.3 MiB, wrote 13.8% more than predicted
freespace: was free: 25100288 bytes 23.9 MiB, wrote: 28618752 bytes 27.3 MiB, delta: 3518464 bytes 3.4 MiB, wrote 14.0% more than predicted
freespace: was free: 17342464 bytes 16.5 MiB, wrote: 19841024 bytes 18.9 MiB, delta: 2498560 bytes 2.4 MiB, wrote 14.4% more than predicted
freespace: was free: 9605120 bytes 9.2 MiB, wrote: 11063296 bytes 10.6 MiB, delta: 1458176 bytes 1.4 MiB, wrote 15.2% more than predicted
freespace: Test 2 finished
freespace: Test 3: gradually lessen amount of free space by trashing and fill the FS
freespace: do 10 steps, lessen free space by 7606272 bytes 7.3 MiB each time
freespace: trashing: was free: 83668992 bytes 79.8 MiB, need free: 7606272 bytes 7.3 MiB, files created: 248297, delete 225724 (90.9% of them)
freespace: was free: 70803456 bytes 67.5 MiB, wrote: 82485248 bytes 78.7 MiB, delta: 11681792 bytes 11.1 MiB, wrote 16.5% more than predicted
freespace: trashing: was free: 81080320 bytes 77.3 MiB, need free: 15212544 bytes 14.5 MiB, files created: 248711, delete 202047 (81.2% of them)
freespace: was free: 59867136 bytes 57.1 MiB, wrote: 71897088 bytes 68.6 MiB, delta: 12029952 bytes 11.5 MiB, wrote 20.1% more than predicted
freespace: trashing: was free: 82243584 bytes 78.4 MiB, need free: 22818816 bytes 21.8 MiB, files created: 248866, delete 179817 (72.3% of them)
freespace: was free: 50905088 bytes 48.5 MiB, wrote: 63168512 bytes 60.2 MiB, delta: 12263424 bytes 11.7 MiB, wrote 24.1% more than predicted
freespace: trashing: was free: 83402752 bytes 79.5 MiB, need free: 30425088 bytes 29.0 MiB, files created: 248920, delete 158114 (63.5% of them)
freespace: was free: 42651648 bytes 40.7 MiB, wrote: 55406592 bytes 52.8 MiB, delta: 12754944 bytes 12.2 MiB, wrote 29.9% more than predicted
freespace: trashing: was free: 84402176 bytes 80.5 MiB, need free: 38031360 bytes 36.3 MiB, files created: 248709, delete 136641 (54.9% of them)
freespace: was free: 35233792 bytes 33.6 MiB, wrote: 48250880 bytes 46.0 MiB, delta: 13017088 bytes 12.4 MiB, wrote 36.9% more than predicted
freespace: trashing: was free: 82530304 bytes 78.7 MiB, need free: 45637632 bytes 43.5 MiB, files created: 248778, delete 111208 (44.7% of them)
freespace: was free: 27287552 bytes 26.0 MiB, wrote: 40267776 bytes 38.4 MiB, delta: 12980224 bytes 12.4 MiB, wrote 47.6% more than predicted
freespace: trashing: was free: 85114880 bytes 81.2 MiB, need free: 53243904 bytes 50.8 MiB, files created: 248508, delete 93052 (37.4% of them)
freespace: was free: 22437888 bytes 21.4 MiB, wrote: 35328000 bytes 33.7 MiB, delta: 12890112 bytes 12.3 MiB, wrote 57.4% more than predicted
freespace: trashing: was free: 84103168 bytes 80.2 MiB, need free: 60850176 bytes 58.0 MiB, files created: 248637, delete 68743 (27.6% of them)
freespace: was free: 15536128 bytes 14.8 MiB, wrote: 28319744 bytes 27.0 MiB, delta: 12783616 bytes 12.2 MiB, wrote 82.3% more than predicted
freespace: trashing: was free: 84357120 bytes 80.4 MiB, need free: 68456448 bytes 65.3 MiB, files created: 248567, delete 46852 (18.8% of them)
freespace: was free: 9015296 bytes 8.6 MiB, wrote: 22044672 bytes 21.0 MiB, delta: 13029376 bytes 12.4 MiB, wrote 144.5% more than predicted
freespace: trashing: was free: 84942848 bytes 81.0 MiB, need free: 76062720 bytes 72.5 MiB, files created: 248636, delete 25993 (10.5% of them)
freespace: was free: 6086656 bytes 5.8 MiB, wrote: 8331264 bytes 7.9 MiB, delta: 2244608 bytes 2.1 MiB, wrote 36.9% more than predicted
freespace: Test 3 finished
freespace: finished successfully

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
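
The "wrote N% more than predicted" figures in the test output can be reproduced from the two byte counts on each line: the delta is the amount written minus the amount reported free, and the percentage is that delta relative to the reported free space. The sketch below is an illustration only. The helper name is hypothetical, and rounding to the nearest tenth of a percent is an assumption about how the test prints the value, not something the log confirms.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the "freespace" test itself.
 * Returns how much more was written than 'was_free' predicted, expressed in
 * tenths of a percent and rounded to the nearest tenth (assumed rounding).
 */
static int64_t over_prediction_tenths(int64_t was_free, int64_t wrote)
{
	int64_t delta;

	assert(was_free > 0 && wrote >= was_free);

	delta = wrote - was_free;
	/* Scale by 1000 for tenths of a percent; add half the divisor to round */
	return (delta * 1000 + was_free / 2) / was_free;
}
```

With the first Test 1 line before the change (78274560 bytes free, 96489472 bytes written) this gives 233 tenths, which matches the logged 23.3%. The first Test 1 line after the change (85204992 bytes free) gives 132 tenths, matching 13.2%.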
782 lines
25 KiB
C
/*
 * This file is part of UBIFS.
 *
 * Copyright (C) 2006-2008 Nokia Corporation.
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 as published by
 * the Free Software Foundation.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
 * more details.
 *
 * You should have received a copy of the GNU General Public License along with
 * this program; if not, write to the Free Software Foundation, Inc., 51
 * Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
 *
 * Authors: Adrian Hunter
 *          Artem Bityutskiy (Битюцкий Артём)
 */

/*
 * This file implements the budgeting sub-system which is responsible for UBIFS
 * space management.
 *
 * Factors such as compression, wasted space at the ends of LEBs, space in other
 * journal heads, the effect of updates on the index, and so on, make it
 * impossible to accurately predict the amount of space needed. Consequently
 * approximations are used.
 */

#include "ubifs.h"
#include <linux/writeback.h>
#include <asm/div64.h>

/*
 * When pessimistic budget calculations say that there is not enough space,
 * UBIFS starts writing back dirty inodes and pages, doing garbage collection,
 * or committing. The constants below define the maximum number of times UBIFS
 * repeats the operations.
 */
#define MAX_SHRINK_RETRIES 8
#define MAX_GC_RETRIES 4
#define MAX_CMT_RETRIES 2
#define MAX_NOSPC_RETRIES 1

/*
 * The constant below defines the number of dirty pages which should be written
 * back when trying to shrink the liability.
 */
#define NR_TO_WRITE 16

/**
 * struct retries_info - information about re-tries while making free space.
 * @prev_liability: previous liability
 * @shrink_cnt: how many times the liability was shrunk
 * @shrink_retries: count of liability shrink re-tries (increased when
 *                  liability does not shrink)
 * @try_gc: GC should be tried first
 * @gc_retries: how many times GC was run
 * @cmt_retries: how many times commit has been done
 * @nospc_retries: how many times GC returned %-ENOSPC
 *
 * Since we consider budgeting to be the fast-path, and this structure has to
 * be allocated on stack and zeroed out, we make it smaller using bit-fields.
 */
struct retries_info {
	long long prev_liability;
	unsigned int shrink_cnt;
	unsigned int shrink_retries:5;
	unsigned int try_gc:1;
	unsigned int gc_retries:4;
	unsigned int cmt_retries:3;
	unsigned int nospc_retries:1;
};

/**
 * shrink_liability - write-back some dirty pages/inodes.
 * @c: UBIFS file-system description object
 * @nr_to_write: how many dirty pages to write-back
 *
 * This function shrinks UBIFS liability by means of writing back some amount
 * of dirty inodes and their pages. Returns the number of pages which were
 * written back. The returned value does not include dirty inodes which were
 * synchronized.
 *
 * Note, this function synchronizes even VFS inodes which are locked
 * (@i_mutex) by the caller of the budgeting function, because write-back does
 * not touch @i_mutex.
 */
static int shrink_liability(struct ubifs_info *c, int nr_to_write)
{
	int nr_written;
	struct writeback_control wbc = {
		.sync_mode = WB_SYNC_NONE,
		.range_end = LLONG_MAX,
		.nr_to_write = nr_to_write,
	};

	generic_sync_sb_inodes(c->vfs_sb, &wbc);
	nr_written = nr_to_write - wbc.nr_to_write;

	if (!nr_written) {
		/*
		 * Re-try, but this time wait on pages/inodes which are being
		 * written-back concurrently (e.g., by pdflush).
		 */
		memset(&wbc, 0, sizeof(struct writeback_control));
		wbc.sync_mode = WB_SYNC_ALL;
		wbc.range_end = LLONG_MAX;
		wbc.nr_to_write = nr_to_write;
		generic_sync_sb_inodes(c->vfs_sb, &wbc);
		nr_written = nr_to_write - wbc.nr_to_write;
	}

	dbg_budg("%d pages were written back", nr_written);
	return nr_written;
}

/**
 * run_gc - run garbage collector.
 * @c: UBIFS file-system description object
 *
 * This function runs garbage collector to make some more free space. Returns
 * zero if a free LEB has been produced, %-EAGAIN if commit is required, and a
 * negative error code in case of failure.
 */
static int run_gc(struct ubifs_info *c)
{
	int err, lnum;

	/* Make some free space by garbage-collecting dirty space */
	down_read(&c->commit_sem);
	lnum = ubifs_garbage_collect(c, 1);
	up_read(&c->commit_sem);
	if (lnum < 0)
		return lnum;

	/* GC freed one LEB, return it to lprops */
	dbg_budg("GC freed LEB %d", lnum);
	err = ubifs_return_leb(c, lnum);
	if (err)
		return err;
	return 0;
}

/**
 * make_free_space - make more free space on the file-system.
 * @c: UBIFS file-system description object
 * @ri: information about previous invocations of this function
 *
 * This function is called when an operation cannot be budgeted because there
 * is supposedly no free space. But in most cases there is some free space:
 *   o budgeting is pessimistic, so it always budgets more than is actually
 *     needed, so shrinking the liability is one way to make free space - the
 *     cached data will take less space than it was budgeted for;
 *   o GC may turn some dark space into free space (budgeting treats dark space
 *     as not available);
 *   o commit may free some LEB, i.e., turn freeable LEBs into free LEBs.
 *
 * So this function tries to do the above. Returns %-EAGAIN if some free space
 * was presumably made and the caller has to re-try budgeting the operation.
 * Returns %-ENOSPC if it could not make more free space, and other negative
 * error codes on failures.
 */
static int make_free_space(struct ubifs_info *c, struct retries_info *ri)
{
	int err;

	/*
	 * If we have some dirty pages and inodes (liability), try to write
	 * them back unless this was tried too many times without effect
	 * already.
	 */
	if (ri->shrink_retries < MAX_SHRINK_RETRIES && !ri->try_gc) {
		long long liability;

		spin_lock(&c->space_lock);
		liability = c->budg_idx_growth + c->budg_data_growth +
			    c->budg_dd_growth;
		spin_unlock(&c->space_lock);

		if (ri->prev_liability >= liability) {
			/* Liability does not shrink, next time try GC then */
			ri->shrink_retries += 1;
			if (ri->gc_retries < MAX_GC_RETRIES)
				ri->try_gc = 1;
			dbg_budg("liability did not shrink: retries %d of %d",
				 ri->shrink_retries, MAX_SHRINK_RETRIES);
		}

		dbg_budg("force write-back (count %d)", ri->shrink_cnt);
		shrink_liability(c, NR_TO_WRITE + ri->shrink_cnt);

		ri->prev_liability = liability;
		ri->shrink_cnt += 1;
		return -EAGAIN;
	}

	/*
	 * Try to run garbage collector unless it was already tried too many
	 * times.
	 */
	if (ri->gc_retries < MAX_GC_RETRIES) {
		ri->gc_retries += 1;
		dbg_budg("run GC, retries %d of %d",
			 ri->gc_retries, MAX_GC_RETRIES);

		ri->try_gc = 0;
		err = run_gc(c);
		if (!err)
			return -EAGAIN;

		if (err == -EAGAIN) {
			dbg_budg("GC asked to commit");
			err = ubifs_run_commit(c);
			if (err)
				return err;
			return -EAGAIN;
		}

		if (err != -ENOSPC)
			return err;

		/*
		 * GC could not make any progress. If this is the first time,
		 * then it makes sense to try to commit, because it might make
		 * some dirty space.
		 */
		dbg_budg("GC returned -ENOSPC, retries %d",
			 ri->nospc_retries);
		if (ri->nospc_retries >= MAX_NOSPC_RETRIES)
			return err;
		ri->nospc_retries += 1;
	}

	/* Neither GC nor write-back helped, try to commit */
	if (ri->cmt_retries < MAX_CMT_RETRIES) {
		ri->cmt_retries += 1;
		dbg_budg("run commit, retries %d of %d",
			 ri->cmt_retries, MAX_CMT_RETRIES);
		err = ubifs_run_commit(c);
		if (err)
			return err;
		return -EAGAIN;
	}
	return -ENOSPC;
}

/**
 * ubifs_calc_min_idx_lebs - calculate amount of eraseblocks for the index.
 * @c: UBIFS file-system description object
 *
 * This function calculates and returns the number of eraseblocks which should
 * be kept for index usage.
 */
int ubifs_calc_min_idx_lebs(struct ubifs_info *c)
{
	int ret;
	uint64_t idx_size;

	idx_size = c->old_idx_sz + c->budg_idx_growth + c->budg_uncommitted_idx;

	/* And make sure we have thrice the index size of space reserved */
	idx_size = idx_size + (idx_size << 1);

	/*
	 * We do not maintain 'old_idx_size' as 'old_idx_lebs'/'old_idx_bytes'
	 * pair, nor similarly the two variables for the new index size, so we
	 * have to do this costly 64-bit division on fast-path.
	 */
	if (do_div(idx_size, c->leb_size - c->max_idx_node_sz))
		ret = idx_size + 1;
	else
		ret = idx_size;
	/*
	 * The index head is not available for the in-the-gaps method, so add an
	 * extra LEB to compensate.
	 */
	ret += 1;
	/*
	 * At present the index needs at least 2 LEBs: one for the index head
	 * and one for in-the-gaps method (which currently does not cater for
	 * the index head and so excludes it from consideration).
	 */
	if (ret < 2)
		ret = 2;
	return ret;
}
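
/*
 * Illustration only, not part of UBIFS: a minimal user-space sketch of the
 * arithmetic in 'ubifs_calc_min_idx_lebs()'. The kernel's 'do_div()' leaves
 * the quotient in its first argument and returns the remainder, so the code
 * above is a round-up division. The sketch spells that out with plain '/' and
 * '%'. The function name and the flat parameter list are hypothetical; in the
 * kernel these values come from 'struct ubifs_info'.
 */

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version: 'idx_size' is the sum of the old index
 * size, the budgeted index growth and the uncommitted index budget, in bytes.
 */
static int calc_min_idx_lebs(uint64_t idx_size, int leb_size,
			     int max_idx_node_sz)
{
	uint64_t usable, reserved;
	int lebs;

	assert(leb_size > max_idx_node_sz);

	/* Bytes of one LEB that are counted as usable for index nodes */
	usable = (uint64_t)(leb_size - max_idx_node_sz);

	/* Reserve three times the index size */
	reserved = idx_size * 3;

	/* Round the LEB count up, as the do_div() remainder check does */
	lebs = (int)(reserved / usable);
	if (reserved % usable)
		lebs += 1;

	/* One extra LEB because the index head is not available */
	lebs += 1;

	/* Never report fewer than 2 LEBs */
	if (lebs < 2)
		lebs = 2;
	return lebs;
}
```

/*
 * With a 16384-byte LEB and an assumed maximum index node size of 192 bytes,
 * an empty index still yields 2 LEBs, an index of exactly 16192 bytes yields
 * 3 + 1 = 4 LEBs, and one byte more pushes the result to 5.
 */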

/**
 * ubifs_calc_available - calculate available FS space.
 * @c: UBIFS file-system description object
 * @min_idx_lebs: minimum number of LEBs reserved for the index
 *
 * This function calculates and returns amount of FS space available for use.
 */
long long ubifs_calc_available(const struct ubifs_info *c, int min_idx_lebs)
{
	int subtract_lebs;
	long long available;

	available = c->main_bytes - c->lst.total_used;

	/*
	 * Now 'available' contains theoretically available flash space
	 * assuming there is no index, so we have to subtract the space which
	 * is reserved for the index.
	 */
	subtract_lebs = min_idx_lebs;

	/* Take into account that GC reserves one LEB for its own needs */
	subtract_lebs += 1;

	/*
	 * The GC journal head LEB is not really accessible. And since
	 * different write types go to different heads, we may count only on
	 * one head's space.
	 */
	subtract_lebs += c->jhead_cnt - 1;

	/* We also reserve one LEB for deletions, which bypass budgeting */
	subtract_lebs += 1;

	available -= (long long)subtract_lebs * c->leb_size;

	/* Subtract the dead space which is not available for use */
	available -= c->lst.total_dead;

	/*
	 * Subtract dark space, which might or might not be usable - it depends
	 * on the data which we have on the media and which will be written. If
	 * this is a lot of uncompressed or not-compressible data, the dark
	 * space cannot be used.
	 */
	available -= c->lst.total_dark;

	/*
	 * However, there is more dark space. The index may be bigger than
	 * @min_idx_lebs. Those extra LEBs are assumed to be available, but
	 * their dark space is not included in total_dark, so it is subtracted
	 * here.
	 */
	if (c->lst.idx_lebs > min_idx_lebs) {
		subtract_lebs = c->lst.idx_lebs - min_idx_lebs;
		available -= subtract_lebs * c->dark_wm;
	}

	/* The calculations are rough and may end up with a negative number */
	return available > 0 ? available : 0;
}
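
/*
 * Illustration only, not part of UBIFS: a user-space sketch that follows the
 * same subtraction steps as 'ubifs_calc_available()' so the estimate can be
 * tried with concrete numbers. 'struct space_stats' is a hypothetical flat
 * stand-in for the fields read from 'struct ubifs_info' and its 'lst' member.
 * The numbers in the note below are made up for the example.
 */

```c
#include <assert.h>

/* Hypothetical container for the inputs of the calculation */
struct space_stats {
	long long main_bytes;	/* size of the main area in bytes */
	long long total_used;	/* bytes occupied by live nodes */
	long long total_dead;	/* dead space, never usable */
	long long total_dark;	/* dark space, possibly unusable */
	int idx_lebs;		/* LEBs the index currently occupies */
	int leb_size;		/* LEB size in bytes */
	int dark_wm;		/* dark space watermark per LEB */
	int jhead_cnt;		/* number of journal heads */
};

static long long calc_available(const struct space_stats *s, int min_idx_lebs)
{
	long long available = s->main_bytes - s->total_used;
	int subtract_lebs = min_idx_lebs;	/* reserved for the index */

	assert(s->leb_size > 0 && s->jhead_cnt >= 1);

	subtract_lebs += 1;			/* LEB reserved by GC */
	subtract_lebs += s->jhead_cnt - 1;	/* count on one head only */
	subtract_lebs += 1;			/* LEB reserved for deletions */
	available -= (long long)subtract_lebs * s->leb_size;

	available -= s->total_dead;
	available -= s->total_dark;

	/* Dark space of index LEBs beyond the reserved minimum */
	if (s->idx_lebs > min_idx_lebs)
		available -= (long long)(s->idx_lebs - min_idx_lebs) *
			     s->dark_wm;

	/* The estimate is rough, so clamp negative results to zero */
	return available > 0 ? available : 0;
}
```

/*
 * For a 100-LEB main area of 16384-byte LEBs with 400000 bytes used, 1000
 * bytes dead, 2000 bytes dark, 3 journal heads, 5 index LEBs, a 512-byte dark
 * watermark and min_idx_lebs of 2, six LEBs are set aside and the sketch
 * returns 1135560 bytes. A completely used main area clamps to 0.
 */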

/**
 * can_use_rp - check whether the user is allowed to use reserved pool.
 * @c: UBIFS file-system description object
 *
 * UBIFS has a so-called "reserved pool", which is flash space reserved for the
 * superuser and for users whose UID/GID is recorded in the UBIFS superblock.
 * This function checks whether the current user is allowed to use the reserved
 * pool. Returns %1 if the current user is allowed to use the reserved pool and
 * %0 otherwise.
 */
static int can_use_rp(struct ubifs_info *c)
{
	if (current->fsuid == c->rp_uid || capable(CAP_SYS_RESOURCE) ||
	    (c->rp_gid != 0 && in_group_p(c->rp_gid)))
		return 1;
	return 0;
}

/**
 * do_budget_space - reserve flash space for index and data growth.
 * @c: UBIFS file-system description object
 *
 * This function makes sure UBIFS has enough free eraseblocks for index growth
 * and data.
 *
 * When budgeting index space, UBIFS reserves thrice as many LEBs as the index
 * would take if it was consolidated and written to the flash. This guarantees
 * that the "in-the-gaps" commit method always succeeds and UBIFS will always
 * be able to commit dirty index. So this function basically adds amount of
 * budgeted index space to the size of the current index, multiplies this by 3,
 * and makes sure this does not exceed the amount of free eraseblocks.
 *
 * Notes about @c->min_idx_lebs and @c->lst.idx_lebs variables:
 * o @c->lst.idx_lebs is the number of LEBs the index currently uses. It might
 *   be large, because UBIFS does not do any index consolidation as long as
 *   there is free space. IOW, the index may take a lot of LEBs, but the LEBs
 *   will contain a lot of dirt.
 * o @c->min_idx_lebs is the number of LEBs the index presumably takes. IOW,
 *   the index may be consolidated to take up to @c->min_idx_lebs LEBs.
 *
 * This function returns zero in case of success, and %-ENOSPC in case of
 * failure.
 */
static int do_budget_space(struct ubifs_info *c)
{
	long long outstanding, available;
	int lebs, rsvd_idx_lebs, min_idx_lebs;

	/* First budget index space */
	min_idx_lebs = ubifs_calc_min_idx_lebs(c);

	/* Now 'min_idx_lebs' contains number of LEBs to reserve */
	if (min_idx_lebs > c->lst.idx_lebs)
		rsvd_idx_lebs = min_idx_lebs - c->lst.idx_lebs;
	else
		rsvd_idx_lebs = 0;

	/*
	 * The number of LEBs that are available to be used by the index is:
	 *
	 *    @c->lst.empty_lebs + @c->freeable_cnt + @c->idx_gc_cnt -
	 *    @c->lst.taken_empty_lebs
	 *
	 * @empty_lebs are available because they are empty. @freeable_cnt are
	 * available because they contain only free and dirty space and the
	 * index allocation always occurs after wbufs are synch'ed.
	 * @idx_gc_cnt are available because they are index LEBs that have been
	 * garbage collected (including trivial GC) and are awaiting the commit
	 * before they can be unmapped - note that the in-the-gaps method will
	 * grab these if it needs them. @taken_empty_lebs are empty_lebs that
	 * have already been allocated for some purpose (also includes those
	 * LEBs on the @idx_gc list).
	 *
	 * Note, @taken_empty_lebs may temporarily be higher by one because of
	 * the way we serialize LEB allocations and budgeting. See a comment in
	 * 'ubifs_find_free_space()'.
	 */
	lebs = c->lst.empty_lebs + c->freeable_cnt + c->idx_gc_cnt -
	       c->lst.taken_empty_lebs;
	if (unlikely(rsvd_idx_lebs > lebs)) {
		dbg_budg("out of indexing space: min_idx_lebs %d (old %d), "
			 "rsvd_idx_lebs %d", min_idx_lebs, c->min_idx_lebs,
			 rsvd_idx_lebs);
		return -ENOSPC;
	}

	available = ubifs_calc_available(c, min_idx_lebs);
	outstanding = c->budg_data_growth + c->budg_dd_growth;

	if (unlikely(available < outstanding)) {
		dbg_budg("out of data space: available %lld, outstanding %lld",
			 available, outstanding);
		return -ENOSPC;
	}

	if (available - outstanding <= c->rp_size && !can_use_rp(c))
		return -ENOSPC;

	c->min_idx_lebs = min_idx_lebs;
	return 0;
}

/**
 * calc_idx_growth - calculate approximate index growth from budgeting request.
 * @c: UBIFS file-system description object
 * @req: budgeting request
 *
 * For now we assume each new node adds one znode. This is a rather poor
 * approximation, though.
 */
static int calc_idx_growth(const struct ubifs_info *c,
			   const struct ubifs_budget_req *req)
{
	int znodes;

	znodes = req->new_ino + (req->new_page << UBIFS_BLOCKS_PER_PAGE_SHIFT) +
		 req->new_dent;
	return znodes * c->max_idx_node_sz;
}

/**
 * calc_data_growth - calculate approximate amount of new data from budgeting
 * request.
 * @c: UBIFS file-system description object
 * @req: budgeting request
 */
static int calc_data_growth(const struct ubifs_info *c,
			    const struct ubifs_budget_req *req)
{
	int data_growth;

	data_growth = req->new_ino ? c->inode_budget : 0;
	if (req->new_page)
		data_growth += c->page_budget;
	if (req->new_dent)
		data_growth += c->dent_budget;
	data_growth += req->new_ino_d;
	return data_growth;
}

/**
 * calc_dd_growth - calculate approximate amount of data which makes other data
 * dirty from budgeting request.
 * @c: UBIFS file-system description object
 * @req: budgeting request
 */
static int calc_dd_growth(const struct ubifs_info *c,
			  const struct ubifs_budget_req *req)
{
	int dd_growth;

	dd_growth = req->dirtied_page ? c->page_budget : 0;

	if (req->dirtied_ino)
		dd_growth += c->inode_budget << (req->dirtied_ino - 1);
	if (req->mod_dent)
		dd_growth += c->dent_budget;
	dd_growth += req->dirtied_ino_d;
	return dd_growth;
}
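
/*
 * Illustration only, not part of UBIFS: 'calc_dd_growth()' charges dirtied
 * inodes with a left shift, so the charge doubles with each additional inode
 * (1x, 2x, 4x, 8x the inode budget for 1 to 4 inodes) rather than growing
 * linearly. The sketch below isolates that one term. The function name is
 * hypothetical, and the upper limit of 4 mirrors the 'ubifs_assert()' on
 * 'req->dirtied_ino' in 'ubifs_budget_space()'.
 */

```c
#include <assert.h>

/* Hypothetical helper isolating the dirtied-inode term of the dd growth */
static int dd_growth_for_inodes(int inode_budget, int dirtied_ino)
{
	assert(inode_budget >= 0);
	assert(dirtied_ino >= 0 && dirtied_ino <= 4);

	if (!dirtied_ino)
		return 0;
	/* Same expression as in calc_dd_growth(): doubles per extra inode */
	return inode_budget << (dirtied_ino - 1);
}
```

/*
 * With an assumed inode budget of 160 bytes, 1, 2, 3 and 4 dirtied inodes are
 * charged 160, 320, 640 and 1280 bytes respectively.
 */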

/**
 * ubifs_budget_space - ensure there is enough space to complete an operation.
 * @c: UBIFS file-system description object
 * @req: budget request
 *
 * This function allocates budget for an operation. It uses pessimistic
 * approximation of how much flash space the operation needs. The goal of this
 * function is to make sure UBIFS always has flash space to flush all dirty
 * pages, dirty inodes, and dirty znodes (liability). This function may force
 * commit, garbage-collection or write-back. Returns zero in case of success,
 * %-ENOSPC if there is no free space and other negative error codes in case of
 * failures.
 */
int ubifs_budget_space(struct ubifs_info *c, struct ubifs_budget_req *req)
{
	int uninitialized_var(cmt_retries), uninitialized_var(wb_retries);
	int err, idx_growth, data_growth, dd_growth;
	struct retries_info ri;

	ubifs_assert(req->new_page <= 1);
	ubifs_assert(req->dirtied_page <= 1);
	ubifs_assert(req->new_dent <= 1);
	ubifs_assert(req->mod_dent <= 1);
	ubifs_assert(req->new_ino <= 1);
	ubifs_assert(req->new_ino_d <= UBIFS_MAX_INO_DATA);
	ubifs_assert(req->dirtied_ino <= 4);
	ubifs_assert(req->dirtied_ino_d <= UBIFS_MAX_INO_DATA * 4);
	ubifs_assert(!(req->new_ino_d & 7));
	ubifs_assert(!(req->dirtied_ino_d & 7));

	data_growth = calc_data_growth(c, req);
	dd_growth = calc_dd_growth(c, req);
	if (!data_growth && !dd_growth)
		return 0;
	idx_growth = calc_idx_growth(c, req);
	memset(&ri, 0, sizeof(struct retries_info));

again:
	spin_lock(&c->space_lock);
	ubifs_assert(c->budg_idx_growth >= 0);
	ubifs_assert(c->budg_data_growth >= 0);
	ubifs_assert(c->budg_dd_growth >= 0);

	if (unlikely(c->nospace) && (c->nospace_rp || !can_use_rp(c))) {
		dbg_budg("no space");
		spin_unlock(&c->space_lock);
		return -ENOSPC;
	}

	c->budg_idx_growth += idx_growth;
	c->budg_data_growth += data_growth;
	c->budg_dd_growth += dd_growth;

	err = do_budget_space(c);
	if (likely(!err)) {
		req->idx_growth = idx_growth;
		req->data_growth = data_growth;
		req->dd_growth = dd_growth;
		spin_unlock(&c->space_lock);
		return 0;
	}

	/* Restore the old values */
	c->budg_idx_growth -= idx_growth;
	c->budg_data_growth -= data_growth;
	c->budg_dd_growth -= dd_growth;
	spin_unlock(&c->space_lock);

	if (req->fast) {
		dbg_budg("no space for fast budgeting");
		return err;
	}

	err = make_free_space(c, &ri);
	if (err == -EAGAIN) {
		dbg_budg("try again");
		cond_resched();
		goto again;
	} else if (err == -ENOSPC) {
		dbg_budg("FS is full, -ENOSPC");
		c->nospace = 1;
		if (can_use_rp(c) || c->rp_size == 0)
			c->nospace_rp = 1;
		smp_wmb();
	} else
		ubifs_err("cannot budget space, error %d", err);
	return err;
}

/**
 * ubifs_release_budget - release budgeted free space.
 * @c: UBIFS file-system description object
 * @req: budget request
 *
 * This function releases the space budgeted by 'ubifs_budget_space()'. Note,
 * since the index changes (which were budgeted for in @req->idx_growth) will
 * only be written to the media on commit, this function moves the index budget
 * from @c->budg_idx_growth to @c->budg_uncommitted_idx. The latter will be
 * zeroed by the commit operation.
 */
void ubifs_release_budget(struct ubifs_info *c, struct ubifs_budget_req *req)
{
	ubifs_assert(req->new_page <= 1);
	ubifs_assert(req->dirtied_page <= 1);
	ubifs_assert(req->new_dent <= 1);
	ubifs_assert(req->mod_dent <= 1);
	ubifs_assert(req->new_ino <= 1);
	ubifs_assert(req->new_ino_d <= UBIFS_MAX_INO_DATA);
	ubifs_assert(req->dirtied_ino <= 4);
	ubifs_assert(req->dirtied_ino_d <= UBIFS_MAX_INO_DATA * 4);
	ubifs_assert(!(req->new_ino_d & 7));
	ubifs_assert(!(req->dirtied_ino_d & 7));
	if (!req->recalculate) {
		ubifs_assert(req->idx_growth >= 0);
		ubifs_assert(req->data_growth >= 0);
		ubifs_assert(req->dd_growth >= 0);
	}

	if (req->recalculate) {
		req->data_growth = calc_data_growth(c, req);
		req->dd_growth = calc_dd_growth(c, req);
		req->idx_growth = calc_idx_growth(c, req);
	}

	if (!req->data_growth && !req->dd_growth)
		return;

	c->nospace = c->nospace_rp = 0;
	smp_wmb();

	spin_lock(&c->space_lock);
	c->budg_idx_growth -= req->idx_growth;
	c->budg_uncommitted_idx += req->idx_growth;
	c->budg_data_growth -= req->data_growth;
	c->budg_dd_growth -= req->dd_growth;
	c->min_idx_lebs = ubifs_calc_min_idx_lebs(c);

	ubifs_assert(c->budg_idx_growth >= 0);
	ubifs_assert(c->budg_data_growth >= 0);
	ubifs_assert(c->budg_dd_growth >= 0);
	ubifs_assert(c->min_idx_lebs < c->main_lebs);
	ubifs_assert(!(c->budg_idx_growth & 7));
	ubifs_assert(!(c->budg_data_growth & 7));
	ubifs_assert(!(c->budg_dd_growth & 7));
	spin_unlock(&c->space_lock);
}
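
/*
 * Illustration only, not part of UBIFS: a minimal single-threaded sketch of
 * the reserve / roll back / release accounting performed by
 * 'ubifs_budget_space()' and 'ubifs_release_budget()'. All 'toy_*' names are
 * hypothetical, and the fixed 'capacity' check is an assumed stand-in for
 * 'do_budget_space()', whose real logic is not shown here. Locking, memory
 * barriers and the 'nospace' flags are left out on purpose.
 */

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the budgeting counters of 'struct ubifs_info' */
struct toy_space {
	long long capacity;		/* assumed fixed pool of bytes */
	long long budg_idx_growth;
	long long budg_data_growth;
	long long budg_dd_growth;
	long long budg_uncommitted_idx;
};

/* Hypothetical stand-in for the growth fields of 'struct ubifs_budget_req' */
struct toy_req {
	long long idx_growth;
	long long data_growth;
	long long dd_growth;
};

/* Sum of everything that is currently reserved */
static long long toy_reserved(const struct toy_space *s)
{
	return s->budg_idx_growth + s->budg_data_growth +
	       s->budg_dd_growth + s->budg_uncommitted_idx;
}

/* Reserve space optimistically and roll back if the pool is exceeded */
static int toy_budget(struct toy_space *s, struct toy_req *req,
		      long long idx, long long data, long long dd)
{
	assert(idx >= 0 && data >= 0 && dd >= 0);

	s->budg_idx_growth += idx;
	s->budg_data_growth += data;
	s->budg_dd_growth += dd;

	if (toy_reserved(s) <= s->capacity) {
		/* Remember what was reserved so that it can be released */
		req->idx_growth = idx;
		req->data_growth = data;
		req->dd_growth = dd;
		return 0;
	}

	/* Restore the old values */
	s->budg_idx_growth -= idx;
	s->budg_data_growth -= data;
	s->budg_dd_growth -= dd;
	return -ENOSPC;
}

/* Release a request; the index part moves to the uncommitted counter */
static void toy_release(struct toy_space *s, const struct toy_req *req)
{
	s->budg_idx_growth -= req->idx_growth;
	s->budg_uncommitted_idx += req->idx_growth;
	s->budg_data_growth -= req->data_growth;
	s->budg_dd_growth -= req->dd_growth;
}
```

/*
 * In this sketch a released index reservation still counts against the pool
 * until something zeroes 'budg_uncommitted_idx', which mirrors the role the
 * comment above assigns to the commit operation.
 */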

/**
 * ubifs_convert_page_budget - convert budget of a new page.
 * @c: UBIFS file-system description object
 *
 * This function converts budget which was allocated for a new page of data to
 * the budget of changing an existing page of data. The latter is smaller than
 * the former, so this function only does simple re-calculation and does not
 * involve any write-back.
 */
void ubifs_convert_page_budget(struct ubifs_info *c)
{
	spin_lock(&c->space_lock);
	/* Release the index growth reservation */
	c->budg_idx_growth -= c->max_idx_node_sz << UBIFS_BLOCKS_PER_PAGE_SHIFT;
	/* Release the data growth reservation */
	c->budg_data_growth -= c->page_budget;
	/* Increase the dirty data growth reservation instead */
	c->budg_dd_growth += c->page_budget;
	/* And re-calculate the indexing space reservation */
	c->min_idx_lebs = ubifs_calc_min_idx_lebs(c);
	spin_unlock(&c->space_lock);
}

/**
 * ubifs_release_dirty_inode_budget - release dirty inode budget.
 * @c: UBIFS file-system description object
 * @ui: UBIFS inode to release the budget for
 *
 * This function releases budget corresponding to a dirty inode. It is usually
 * called after the inode has been written to the media and marked as
 * clean.
 */
void ubifs_release_dirty_inode_budget(struct ubifs_info *c,
				      struct ubifs_inode *ui)
{
	struct ubifs_budget_req req;

	memset(&req, 0, sizeof(struct ubifs_budget_req));
	req.dd_growth = c->inode_budget + ALIGN(ui->data_len, 8);
	ubifs_release_budget(c, &req);
}

/**
 * ubifs_reported_space - calculate reported free space.
 * @c: the UBIFS file-system description object
 * @free: amount of free space
 *
 * This function calculates amount of free space which will be reported to
 * user-space. User-space applications tend to expect that if the file-system
 * (e.g., via the 'statfs()' call) reports that it has N bytes available, they
 * are able to write a file of size N. UBIFS attaches node headers to each data
 * node and it has to write indexing nodes as well. This introduces additional
 * overhead, and UBIFS has to report slightly less free space to meet the
 * above expectation.
 *
 * This function assumes free space is made up of uncompressed data nodes and
 * full index nodes (one per data node, tripled because we always allow enough
 * space to write the index thrice).
 *
 * Note, the calculation is pessimistic, which means that most of the time
 * UBIFS reports less space than it actually has.
 */
long long ubifs_reported_space(const struct ubifs_info *c, uint64_t free)
{
	int divisor, factor;

	/*
	 * Reported space size is @free * X, where X is UBIFS block size
	 * divided by UBIFS block size + all overhead one data block
	 * introduces. The overhead is the node header + indexing overhead.
	 *
	 * Indexing overhead calculations are based on the following
	 * formula: I = N/(f - 1) + 1, where I - number of indexing nodes, N -
	 * number of data nodes, f - fanout. Because effective UBIFS fanout is
	 * half the maximum fanout, we assume that each data node
	 * introduces 3 * @c->max_idx_node_sz / (@c->fanout/2 - 1) bytes.
	 * Note, the multiplier 3 is because UBIFS reserves three times as much
	 * space for the index.
	 */
	factor = UBIFS_BLOCK_SIZE;
	divisor = UBIFS_MAX_DATA_NODE_SZ;
	divisor += (c->max_idx_node_sz * 3) / ((c->fanout >> 1) - 1);
	free *= factor;
	do_div(free, divisor);
	return free;
}
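
/*
 * Illustration only, not part of UBIFS: a user-space sketch of the scaling
 * done by 'ubifs_reported_space()', so the formula can be tried with concrete
 * numbers. The 'TOY_*' constants and the function name are hypothetical; the
 * values 4096 and 4144 are assumptions made for this example and should be
 * checked against the real UBIFS headers. Plain 64-bit division replaces
 * 'do_div()'.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the real constants live in UBIFS headers */
#define TOY_BLOCK_SIZE		4096
#define TOY_MAX_DATA_NODE_SZ	4144	/* assumed: block size + 48-byte header */

/*
 * Scale @free_bytes by block size / (data node size + index overhead), where
 * the index overhead per data node is 3 * max_idx_node_sz / (fanout/2 - 1).
 * Like the original, this does not guard against overflow of the product.
 */
static uint64_t toy_reported_space(uint64_t free_bytes, int max_idx_node_sz,
				   int fanout)
{
	int divisor;

	/* (fanout/2 - 1) has to be positive to avoid division by zero */
	assert(fanout >= 4);

	divisor = TOY_MAX_DATA_NODE_SZ;
	divisor += (max_idx_node_sz * 3) / ((fanout >> 1) - 1);
	return free_bytes * TOY_BLOCK_SIZE / divisor;
}
```

/*
 * For example, assuming a maximum index node size of 192 bytes and a fanout
 * of 8, the index overhead is 3 * 192 / 3 = 192 bytes per data node and the
 * divisor is 4144 + 192 = 4336. 'toy_reported_space(4336, 192, 8)' then
 * yields 4096, i.e. about 94.5% of the raw free space is reported.
 */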

/**
 * ubifs_budg_get_free_space - return amount of free space.
 * @c: UBIFS file-system description object
 *
 * This function returns amount of free space on the file-system.
 */
long long ubifs_budg_get_free_space(struct ubifs_info *c)
{
	int min_idx_lebs;
	long long available, outstanding, free;

	spin_lock(&c->space_lock);
	min_idx_lebs = ubifs_calc_min_idx_lebs(c);
	outstanding = c->budg_data_growth + c->budg_dd_growth;

	/*
	 * Force the amount available to the total size reported if the used
	 * space is zero.
	 */
	if (c->lst.total_used <= UBIFS_INO_NODE_SZ && !outstanding) {
		spin_unlock(&c->space_lock);
		return (long long)c->block_cnt << UBIFS_BLOCK_SHIFT;
	}

	available = ubifs_calc_available(c, min_idx_lebs);
	spin_unlock(&c->space_lock);

	if (available > outstanding)
		free = ubifs_reported_space(c, available - outstanding);
	else
		free = 0;
	return free;
}