0dcd5a10d9
The current implementation of xlog_assign_tail_lsn() assumes that when the AIL is empty, the log tail matches the LSN of the last written commit record. This is recorded in xlog_state_set_callback() as log->l_last_sync_lsn when the iclog state changes to XLOG_STATE_CALLBACK. This change is then immediately followed by running the callbacks on the iclog which then insert the log items into the AIL at the "commit lsn" of that checkpoint. The AIL tracks log items via the start record LSN of the checkpoint, not the commit record LSN. This is because we can pipeline multiple checkpoints, and so the start record of checkpoint N+1 can be written before the commit record of checkpoint N. i.e: start N commit N +-------------+------------+----------------+ start N+1 commit N+1 The tail of the log cannot be moved to the LSN of commit N when all the items of that checkpoint are written back, because then the start record for N+1 is no longer in the active portion of the log and recovery will fail/corrupt the filesystem. Hence when all the log items in checkpoint N are written back, the tail of the log most now only move as far forwards as the start LSN of checkpoint N+1. Hence we cannot use the maximum start record LSN the AIL sees as a replacement the pointer to the current head of the on-disk log records. However, we currently only use the l_last_sync_lsn when the AIL is empty - when there is no start LSN remaining, the tail of the log moves to the LSN of the last commit record as this is where recovery needs to start searching for recoverable records. THe next checkpoint will have a start record LSN that is higher than l_last_sync_lsn, and so everything still works correctly when new checkpoints are written to an otherwise empty log. l_last_sync_lsn is an atomic variable because it is currently updated when an iclog with callbacks attached moves to the CALLBACK state. While we hold the icloglock at this point, we don't hold the AIL lock. When we assign the log tail, we hold the AIL lock, not the icloglock because we have to look up the AIL. Hence it is an atomic variable so it's not bound to a specific lock context. However, the iclog callbacks are only used for CIL checkpoints. We don't use callbacks with unmount record writes, so the l_last_sync_lsn variable only gets updated when we are processing CIL checkpoint callbacks. And those callbacks run under AIL lock contexts, not icloglock context. The CIL checkpoint already knows what the LSN of the iclog the commit record was written to (obtained when written into the iclog before submission) and so we can update the l_last_sync_lsn under the AIL lock in this callback. No other iclog callbacks will run until the currently executing one completes, and hence we can update the l_last_sync_lsn under the AIL lock safely. This means l_last_sync_lsn can move to the AIL as the "ail_head_lsn" and it can be used to replace the atomic l_last_sync_lsn in the iclog code. This makes tracking the log tail belong entirely to the AIL, rather than being smeared across log, iclog and AIL state and locking. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
54 lines
1.2 KiB
C
54 lines
1.2 KiB
C
// SPDX-License-Identifier: GPL-2.0
|
|
/*
|
|
* Copyright (c) 2009, Christoph Hellwig
|
|
* All Rights Reserved.
|
|
*/
|
|
#include "xfs.h"
|
|
#include "xfs_fs.h"
|
|
#include "xfs_shared.h"
|
|
#include "xfs_bit.h"
|
|
#include "xfs_format.h"
|
|
#include "xfs_log_format.h"
|
|
#include "xfs_trans_resv.h"
|
|
#include "xfs_mount.h"
|
|
#include "xfs_defer.h"
|
|
#include "xfs_da_format.h"
|
|
#include "xfs_inode.h"
|
|
#include "xfs_btree.h"
|
|
#include "xfs_da_btree.h"
|
|
#include "xfs_alloc.h"
|
|
#include "xfs_bmap.h"
|
|
#include "xfs_attr.h"
|
|
#include "xfs_trans.h"
|
|
#include "xfs_log.h"
|
|
#include "xfs_log_priv.h"
|
|
#include "xfs_trans_priv.h"
|
|
#include "xfs_buf_item.h"
|
|
#include "xfs_quota.h"
|
|
#include "xfs_dquot_item.h"
|
|
#include "xfs_dquot.h"
|
|
#include "xfs_log_recover.h"
|
|
#include "xfs_filestream.h"
|
|
#include "xfs_fsmap.h"
|
|
#include "xfs_btree_staging.h"
|
|
#include "xfs_icache.h"
|
|
#include "xfs_ag.h"
|
|
#include "xfs_ag_resv.h"
|
|
#include "xfs_error.h"
|
|
#include <linux/iomap.h>
|
|
#include "xfs_iomap.h"
|
|
#include "xfs_buf_mem.h"
|
|
#include "xfs_btree_mem.h"
|
|
#include "xfs_exchmaps.h"
|
|
#include "xfs_exchrange.h"
|
|
#include "xfs_parent.h"
|
|
#include "xfs_rmap.h"
|
|
#include "xfs_refcount.h"
|
|
|
|
/*
|
|
* We include this last to have the helpers above available for the trace
|
|
* event implementations.
|
|
*/
|
|
#define CREATE_TRACE_POINTS
|
|
#include "xfs_trace.h"
|