linux/fs
Toshiyuki Okajima 0c9169ccad ext4: fix potential infinite loop in ext4_da_writepages()
On linux-2.6.36-rc2, the following script can hang the system when it
runs the /bin/sync command:

========================================================================
#!/bin/sh

echo -n "HANG UP TEST: "
/bin/dd if=/dev/zero of=/tmp/img bs=1k count=1 seek=1M 2> /dev/null
/sbin/mkfs.ext4 -Fq /tmp/img
/bin/mount -o loop -t ext4 /tmp/img /mnt
/bin/dd if=/dev/zero of=/mnt/file bs=1 count=1 \
seek=$((16*1024*1024*1024*1024-4096)) 2> /dev/null
/bin/sync
/bin/umount /mnt
echo "DONE"
exit 0
========================================================================
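
The seek offset in the script is chosen so that the single byte lands in
the last block addressable by a 32-bit logical block number. A minimal
sketch of that arithmetic, assuming a 4 KiB filesystem block size (the
script does not state the block size); the helper names are hypothetical
and are not kernel functions:

```c
#include <stdint.h>

/* Assumption: the filesystem uses 4 KiB blocks. */
#define BLOCK_SIZE_BYTES 4096ULL

/* Byte offset passed to dd via seek= with bs=1: 16 TiB minus one block. */
static uint64_t repro_seek_offset(void)
{
	return 16ULL * 1024 * 1024 * 1024 * 1024 - 4096;
}

/* Logical block index that contains the given byte offset. */
static uint64_t logical_block_of(uint64_t byte_offset)
{
	return byte_offset / BLOCK_SIZE_BYTES;
}
```

Under that assumption, logical_block_of(repro_seek_offset()) is
4294967295, the largest value a 32-bit unsigned block number can hold.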

A kdump taken while the system is hung shows the following backtrace:

======================================================================
kthread()
=> bdi_writeback_thread()
   => wb_do_writeback()
      => wb_writeback()
         => writeback_inodes_wb()
            => writeback_sb_inodes()
               => writeback_single_inode()
                  => ext4_da_writepages()  ---+ 
                                ^ infinite    |
                                |   loop      |
                                +-------------+
======================================================================

The hang happens as follows:
1) We write the last extent block of a file whose size is the maximum
   the filesystem supports.
2) The "BH_Delay" flag is set on the buffer_head of that block.
3) - the member "m_lblk" of struct ext4_map_blocks is 4294967295 (UINT_MAX)
   - the member "m_len" of struct ext4_map_blocks is 1
   mpage_put_bnr_to_bhs(), which is called via ext4_da_writepages(),
   cannot clear the "BH_Delay" flag of the buffer_head: m_lblk has type
   ext4_lblk_t, a 32-bit unsigned type, so m_lblk + m_len overflows and
   wraps to 0.

   ext4_da_writepages() therefore loops forever, because it cannot
   write the page (which corresponds to the block) while the
   "BH_Delay" flag remains set.
----------------------------------------------------------------------
static void mpage_put_bnr_to_bhs(struct mpage_da_data *mpd,
				struct ext4_map_blocks *map)
{
...
	int blocks = map->m_len;
...
		do {
			// cur_logical = 4294967295
			// map->m_lblk = 4294967295
			// blocks = 1
			// *** map->m_lblk + blocks == 0 (OVERFLOW!) ***
			// (cur_logical >= map->m_lblk + blocks) => true
			if (cur_logical >= map->m_lblk + blocks)
				break;
----------------------------------------------------------------------
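
The wraparound in the check above can be reproduced outside the kernel.
The following is a minimal user-space sketch, not the patch itself: it
assumes the block number is a 32-bit unsigned type, and lblk_t and both
helper functions are hypothetical names introduced for illustration.
Widening the addition to 64 bits is shown only as one possible way to
avoid the wrap; it is not claimed to be how this commit fixes it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stands in for ext4_lblk_t, a 32-bit unsigned block number. */
typedef uint32_t lblk_t;

/*
 * Same shape as the check in the excerpt. The sum is computed in 32 bits,
 * so start == UINT32_MAX and len == 1 wrap to 0 and the test is true for
 * every cur, including the block that still has to be mapped.
 */
static bool past_end_wrapping(lblk_t cur, lblk_t start, lblk_t len)
{
	return cur >= start + len;
}

/*
 * Hypothetical alternative: do the addition in 64 bits so the end of the
 * range cannot wrap.
 */
static bool past_end_widened(lblk_t cur, lblk_t start, lblk_t len)
{
	return (uint64_t)cur >= (uint64_t)start + len;
}
```

With cur = 4294967295, start = 4294967295 and len = 1, the wrapping
version reports that the loop is already past the end of the range, so
the loop breaks before the buffer is updated, while the widened version
reports that the block is still inside the range.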

NOTE: Mounting with the nodelalloc option avoids this code path, and
thus avoids this hang.

Signed-off-by: Toshiyuki Okajima <toshi.okajima@jp.fujitsu.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2010-10-27 21:30:07 -04:00
..
9p fs/9p: Don't use dotl version of mknod for dotu inode operations 2010-09-13 08:13:03 -05:00
adfs check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
affs AFFS: wait for sb synchronization when needed 2010-08-09 16:48:51 -04:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 2010-08-13 10:37:30 -07:00
autofs autofs/autofs4: Move compat_ioctl handling into fs 2010-08-09 00:13:34 +02:00
autofs4 autofs4: remove unneeded null check in try_to_fill_dentry() 2010-08-11 08:59:06 -07:00
befs
bfs BFS: clean up the superblock usage 2010-08-09 16:48:53 -04:00
btrfs Merge branch 'for-2.6.36' of git://git.kernel.dk/linux-2.6-block 2010-08-10 15:22:42 -07:00
cachefiles Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
ceph ceph: select CRYPTO 2010-09-17 12:30:31 -07:00
cifs cifs: fix potential double put of TCP session reference 2010-09-14 23:21:03 +00:00
coda Coda: mount hangs because of missed REQ_WRITE rename 2010-09-19 11:03:09 -07:00
configfs
cramfs cramfs: only unlock new inodes 2010-08-18 01:01:33 -04:00
debugfs
devpts
dlm fs/dlm: Drop unnecessary null test 2010-08-05 14:23:45 -05:00
ecryptfs eCryptfs: Fix encrypted file name lookup regression 2010-08-27 10:50:53 -05:00
efs
exofs Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd 2010-08-11 09:19:43 -07:00
exportfs
ext2 mbcache: Remove unused features 2010-08-09 16:48:45 -04:00
ext3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ext4 ext4: fix potential infinite loop in ext4_da_writepages() 2010-10-27 21:30:07 -04:00
fat remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
freevxfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
fscache Add a dummy printk function for the maintenance of unused printks 2010-08-12 09:51:35 -07:00
fuse fuse: fix lock annotations 2010-09-07 13:42:41 +02:00
gfs2 GFS2: gfs2_logd should be using interruptible waits 2010-09-17 14:00:10 +01:00
hfs convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
hfsplus convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
hostfs hostfs ->follow_link() braino 2010-08-18 06:21:10 -04:00
hpfs switch hpfs to ->evict_inode() 2010-08-09 16:48:17 -04:00
hppfs switch hppfs to ->evict_inode() 2010-08-09 16:48:16 -04:00
hugetlbfs new helper: end_writeback() 2010-08-09 16:47:49 -04:00
isofs isofs: Fix lseek() to position beyond 4 GB 2010-08-11 00:29:47 -04:00
jbd remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
jbd2 jbd2: Add sanity check for attempts to start handle during umount 2010-10-27 21:30:04 -04:00
jffs2 Merge git://git.infradead.org/mtd-2.6 2010-08-10 11:49:21 -07:00
jfs jfs: don't allow os2 xattr namespace overlap with others 2010-08-10 15:33:09 -07:00
lockd
logfs logfs: kill BKL 2010-08-14 00:24:24 +02:00
minix minix: fix regression in minix_mkdir() 2010-09-09 18:57:25 -07:00
ncpfs Merge branch 'bkl/ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing 2010-08-10 13:58:28 -07:00
nfs SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies 2010-09-12 19:57:50 -04:00
nfs_common
nfsd SUNRPC: Fix the NFSv4 and RPCSEC_GSS Kconfig dependencies 2010-09-12 19:57:50 -04:00
nilfs2 nilfs2: fix leak of shadow dat inode in error path of load_nilfs 2010-08-30 10:18:03 +09:00
nls
notify fsnotify: drop two useless bools in the fnsotify main loop 2010-08-27 21:42:11 -04:00
ntfs convert remaining ->clear_inode() to ->evict_inode() 2010-08-09 16:48:37 -04:00
ocfs2 o2dlm: force free mles during dlm exit 2010-09-23 14:16:53 -07:00
omfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/bcopeland/omfs 2010-08-10 11:47:36 -07:00
openpromfs
partitions [S390] partitions: fix build error in ibm partition detection code 2010-08-13 10:06:55 +02:00
proc /proc/pid/smaps: fix dirty pages accounting 2010-09-22 17:22:39 -07:00
qnx4 get rid of cont_write_begin_newtrunc 2010-08-09 16:47:31 -04:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ramfs check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
reiserfs remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
romfs
smbfs switch smbfs to evict_inode() 2010-08-09 16:48:00 -04:00
squashfs Squashfs: fix checkpatch.pl warnings 2010-08-08 22:29:33 +00:00
sysfs sysfs: checking for NULL instead of ERR_PTR 2010-09-03 17:26:28 -07:00
sysv fs/sysv/super.c: add support for non-PDP11 v7 filesystems 2010-08-11 08:59:23 -07:00
ubifs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
udf Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 2010-08-10 11:26:52 -07:00
ufs remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
xfs xfs: log IO completion workqueue is a high priority queue 2010-09-10 10:16:54 -05:00
aio.c aio: do not return ERESTARTSYS as a result of AIO 2010-09-22 17:22:39 -07:00
anon_inodes.c
attr.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
bad_inode.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
binfmt_aout.c
binfmt_elf_fdpic.c
binfmt_elf.c
binfmt_em86.c
binfmt_flat.c
binfmt_misc.c binfmt_misc: fix binfmt_misc priority 2010-09-09 18:57:24 -07:00
binfmt_script.c Make do_execve() take a const filename pointer 2010-08-17 18:07:43 -07:00
binfmt_som.c
bio-integrity.c fs/bio-integrity.c: return -ENOMEM on kmalloc failure 2010-08-23 13:36:59 +02:00
bio.c block: unify flags for struct bio and struct request 2010-08-07 18:20:39 +02:00
block_dev.c blkdev: cgroup whitelist permission fix 2010-08-11 08:59:18 -07:00
buffer.c remove SWRITE* I/O types 2010-08-18 01:09:01 -04:00
char_dev.c char: Mark /dev/zero and /dev/kmem as not capable of writeback 2010-09-22 09:48:47 +02:00
compat_binfmt_elf.c
compat_ioctl.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
compat.c Prevent freeing uninitialized pointer in compat_do_readv_writev 2010-09-22 17:22:38 -07:00
dcache.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
dcookies.c
direct-io.c O_DIRECT: fix the splitting up of contiguous I/O 2010-09-09 18:57:22 -07:00
drop_caches.c simplify checks for I_CLEAR/I_FREEING 2010-08-09 16:47:44 -04:00
eventfd.c
eventpoll.c
exec.c execve: make responsive to SIGKILL with large arguments 2010-09-10 08:10:26 -07:00
fcntl.c vfs: take O_NONBLOCK out of the O_* uniqueness test 2010-09-09 18:57:25 -07:00
fifo.c
file_table.c fs: scale files_lock 2010-08-18 08:35:48 -04:00
file.c vfs: use kmalloc() to allocate fdmem if possible 2010-08-11 08:59:02 -07:00
filesystems.c
fs_struct.c fs: fs_struct rwlock to spinlock 2010-08-18 08:35:46 -04:00
fs-writeback.c bdi: Fix warnings in __mark_inode_dirty for /dev/zero and friends 2010-09-22 09:48:47 +02:00
generic_acl.c vfs: update ctime when changing the file's permission by setfacl 2010-08-18 01:04:22 -04:00
inode.c Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify 2010-08-10 11:39:13 -07:00
internal.h fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
ioctl.c bkl: Remove locked .ioctl file operation 2010-08-14 00:24:24 +02:00
ioprio.c
Kconfig
Kconfig.binfmt
libfs.c check ATTR_SIZE contraints in inode_change_ok 2010-08-09 16:47:39 -04:00
locks.c
Makefile
mbcache.c mbcache: Limit the maximum number of cache entries 2010-08-18 06:24:41 -04:00
mpage.c
namei.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
namespace.c VFS: Sanity check mount flags passed to change_mnt_propagation() 2010-09-07 13:46:20 -07:00
nfsctl.c
no-block.c
open.c fs: cleanup files_lock locking 2010-08-18 08:35:47 -04:00
pipe.c
pnode.c fs: brlock vfsmount_lock 2010-08-18 08:35:48 -04:00
pnode.h
posix_acl.c
read_write.c
read_write.h
readdir.c vfs: fix warning: 'dirent' is used uninitialized in this function 2010-08-09 20:45:05 -07:00
select.c
seq_file.c
signalfd.c signalfd: fill in ssi_int for posix timers and message queues 2010-08-11 08:59:20 -07:00
splice.c splice: fix misuse of SPLICE_F_NONBLOCK 2010-08-07 18:52:56 +02:00
stack.c
stat.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
statfs.c add f_flags to struct statfs(64) 2010-08-09 16:48:44 -04:00
super.c fs: scale files_lock 2010-08-18 08:35:48 -04:00
sync.c get rid of file_fsync() 2010-08-09 16:47:43 -04:00
timerfd.c
utimes.c Mark arguments to certain syscalls as being const 2010-08-13 16:53:13 -07:00
xattr_acl.c
xattr.c