linux

Author	SHA1	Message	Date
Dave Chinner	2c761270d5	lib: Introduce generic list_sort function There are two copies of list_sort() in the tree already, one in the DRM code, another in ubifs. Now XFS needs this as well. Create a generic list_sort() function from the ubifs version and convert existing users to it so we don't end up with yet another copy in the tree. Signed-off-by: Dave Chinner <david@fromorbit.com> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Artem Bityutskiy <dedekind@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-12 21:02:00 -08:00
Linus Torvalds	1b4d40a517	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: Ensure we force all busy extents in range to disk xfs: Don't flush stale inodes xfs: fix timestamp handling in xfs_setattr xfs: use DECLARE_EVENT_CLASS	2010-01-11 09:48:48 -08:00
Linus Torvalds	79ecb043ea	Merge git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes * git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-2.6-fixes: GFS2: Use MAX_LFS_FILESIZE for meta inode size GFS2: Fix gfs2_xattr_acl_chmod() GFS2: Fix locking bug in rename GFS2: Ensure uptodate inode size when using O_APPEND	2010-01-11 09:48:29 -08:00
Linus Torvalds	db1fc95744	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6: quota: Fix dquot_transfer for filesystems different from ext4	2010-01-11 09:48:14 -08:00
Minchan Kim	7f53a09ed4	smaps: fix wrong rss count A long time ago we regarded zero page as file_rss and vm_normal_page doesn't return NULL. But now, we reinstated ZERO_PAGE and vm_normal_page's implementation can return NULL in case of zero page. Also we don't count it with file_rss any more. Then, RSS and PSS can't be matched. For consistency, Let's ignore zero page in smaps_pte_range. Signed-off-by: Minchan Kim <minchan.kim@gmail.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk> Acked-by: Matt Mackall <mpm@selenic.com> Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-11 09:34:07 -08:00
KOSAKI Motohiro	1306d603fc	proc: partially revert "procfs: provide stack information for threads" Commit `d899bf7b` (procfs: provide stack information for threads) introduced to show stack information in /proc/{pid}/status. But it cause large performance regression. Unfortunately /proc/{pid}/status is used ps command too and ps is one of most important component. Because both to take mmap_sem and page table walk are heavily operation. If many process run, the ps performance is, [before `d899bf7b`] % perf stat ps >/dev/null Performance counter stats for 'ps': 4090.435806 task-clock-msecs # 0.032 CPUs 229 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 234 page-faults # 0.000 M/sec 8587565207 cycles # 2099.425 M/sec 9866662403 instructions # 1.149 IPC 3789415411 cache-references # 926.409 M/sec 30419509 cache-misses # 7.437 M/sec 128.859521955 seconds time elapsed [after `d899bf7b`] % perf stat ps > /dev/null Performance counter stats for 'ps': 4305.081146 task-clock-msecs # 0.028 CPUs 480 context-switches # 0.000 M/sec 2 CPU-migrations # 0.000 M/sec 237 page-faults # 0.000 M/sec 9021211334 cycles # 2095.480 M/sec 10605887536 instructions # 1.176 IPC 3612650999 cache-references # 839.160 M/sec 23917502 cache-misses # 5.556 M/sec 152.277819582 seconds time elapsed Thus, this patch revert it. Fortunately /proc/{pid}/task/{tid}/smaps provide almost same information. we can use it. Commit `d899bf7b` introduced two features: 1) Add the annotattion of [thread stack: xxxx] mark to /proc/{pid}/task/{tid}/maps. 2) Add StackUsage field to /proc/{pid}/status. I only revert (2), because I haven't seen (1) cause regression. Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Stefani Seibold <stefani@seibold.net> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-11 09:34:06 -08:00
Jan Kara	05b5d89823	quota: Fix dquot_transfer for filesystems different from ext4 Commit `fd8fbfc1` modified the way we find amount of reserved space belonging to an inode. The amount of reserved space is checked from dquot_transfer and thus inode_reserved_space gets called even for filesystems that don't provide get_reserved_space callback which results in a BUG. Fix the problem by checking get_reserved_space callback and return 0 if the filesystem does not provide it. CC: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2010-01-11 13:06:41 +01:00
Steven Whitehouse	ba198098a2	GFS2: Use MAX_LFS_FILESIZE for meta inode size Using ~0ULL was cauing sign issues in filemap_fdatawrite_range, so use MAX_LFS_FILESIZE instead. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-01-11 08:57:55 +00:00
Dave Chinner	fd45e47841	xfs: Ensure we force all busy extents in range to disk When we search for and find a busy extent during allocation we force the log out to ensure the extent free transaction is on disk before the allocation transaction. The current implementation has a subtle bug in it--it does not handle multiple overlapping ranges. That is, if we free lots of little extents into a single contiguous extent, then allocate the contiguous extent, the busy search code stops searching at the first extent it finds that overlaps the allocated range. It then uses the commit LSN of the transaction to force the log out to. Unfortunately, the other busy ranges might have more recent commit LSNs than the first busy extent that is found, and this results in xfs_alloc_search_busy() returning before all the extent free transactions are on disk for the range being allocated. This can lead to potential metadata corruption or stale data exposure after a crash because log replay won't replay all the extent free transactions that cover the allocation range. Modified-by: Alex Elder <aelder@sgi.com> (Dropped the "found" argument from the xfs_alloc_busysearch trace event.) Signed-off-by: Dave Chinner <david@fromorbit.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-01-10 12:22:02 -06:00
Dave Chinner	44e08c45cc	xfs: Don't flush stale inodes Because inodes remain in cache much longer than inode buffers do under memory pressure, we can get the situation where we have stale, dirty inodes being reclaimed but the backing storage has been freed. Hence we should never, ever flush XFS_ISTALE inodes to disk as there is no guarantee that the backing buffer is in cache and still marked stale when the flush occurs. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-01-10 12:22:00 -06:00
Christoph Hellwig	d6d59bada3	xfs: fix timestamp handling in xfs_setattr We currently have some rather odd code in xfs_setattr for updating the a/c/mtime timestamps: - first we do a non-transaction update if all three are updated together - second we implicitly update the ctime for various changes instead of relying on the ATTR_CTIME flag - third we set the timestamps to the current time instead of the arguments in the iattr structure in many cases. This patch makes sure we update it in a consistent way: - always transactional - ctime is only updated if ATTR_CTIME is set or we do a size update, which is a special case - always to the times passed in from the caller instead of the current time The only non-size caller of xfs_setattr that doesn't come from the VFS is updated to set ATTR_CTIME and pass in a valid ctime value. Reported-by: Eric Blake <ebb9@byu.net> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-01-10 12:21:58 -06:00
Christoph Hellwig	ea9a48881e	xfs: use DECLARE_EVENT_CLASS Using DECLARE_EVENT_CLASS allows us to to use trace event code instead of duplicating it in the binary. This was not available before 2.6.33 so it had to be done as a separate step once the prerequisite was merged. This only requires changes to xfs_trace.h and the results are rather impressive: hch@brick:~/work/linux-2.6/obj-kvm$ size fs/xfs/xfs.o* text data bss dec hex filename 607732 41884 3616 653232 9f7b0 fs/xfs/xfs.o 1026732 41884 3808 1072424 105d28 fs/xfs/xfs.o.old Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-01-10 12:21:56 -06:00
Linus Torvalds	82062e7b50	Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing * 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks reiserfs: Fix unreachable statement reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock reiserfs: Relax lock on xattr removing reiserfs: Relax the lock before truncating pages reiserfs: Fix recursive lock on lchown reiserfs: Fix mistake in down_write() conversion	2010-01-08 14:03:55 -08:00
Linus Torvalds	dbd6a7cfea	Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs * 'for-linus' of git://oss.sgi.com/xfs/xfs: xfs: kill some warnings on i386 builds	2010-01-08 13:57:32 -08:00
Linus Torvalds	e2b6d02cca	Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: nfs: fix oops in nfs_rename() sunrpc: fix build-time warning sunrpc: on successful gss error pipe write, don't return error SUNRPC: Fix the return value in gss_import_sec_context() SUNRPC: Fix up an error return value in gss_import_sec_context_kerberos()	2010-01-08 13:55:14 -08:00
Dave Chinner	a539bd8c86	xfs: kill some warnings on i386 builds Randy Dunlap Reported printk() format-related warnings reported on i386 builds in his environment. Dave Chinner provided this patch to eliminate them. Signed-off by: Dave Chinner <david@fromorbit.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Alex Elder <aelder@sgi.com>	2010-01-08 13:32:29 -06:00
Steven Whitehouse	e412bdb126	GFS2: Fix gfs2_xattr_acl_chmod() The ref counting for the bh returned by gfs2_ea_find() was wrong. This patch ensures that we always drop the ref count to that bh correctly. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-01-08 13:42:59 +00:00
Steven Whitehouse	24b977b5fd	GFS2: Fix locking bug in rename The rename code was taking a resource group lock in cases where it wasn't actually needed, this caused problems if the rename was resulting in an inode being unlinked. The patch ensures that we only take the rgrp lock early if it is really needed. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-01-08 13:42:42 +00:00
Steven Whitehouse	56aa616a03	GFS2: Ensure uptodate inode size when using O_APPEND The VFS reads the inode size during generic_file_aio_write() but with no locking around it. In order to get the expected result from O_APPEND opens, this patch updated the inode size before calling generic_file_aio_write() There is of course still a race here, in that there is nothing to prevent another node coming in and extending the file in the mean time. On the other hand, when used with file locking this will ensure that the expected results are obtained. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>	2010-01-08 13:42:27 +00:00
Frederic Weisbecker	31370f62ba	reiserfs: Relax reiserfs_xattr_set_handle() while acquiring xattr locks Fix remaining xattr locks acquired in reiserfs_xattr_set_handle() while we are holding the reiserfs lock to avoid lock inversions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-07 16:02:53 +01:00
Jiri Slaby	e0baec1b63	reiserfs: Fix unreachable statement Stanse found an unreachable statement in reiserfs_ioctl. There is a if followed by error assignment and `break' with no braces. Add the braces so that we don't break every time, but only in error case, so that REISERFS_IOC_SETVERSION actually works when it returns no error. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Reiserfs <reiserfs-devel@vger.kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-01-07 14:03:18 +01:00
Frederic Weisbecker	6c28705418	reiserfs: Don't call reiserfs_get_acl() with the reiserfs lock reiserfs_get_acl is usually not called under the reiserfs lock, as it doesn't need it. But it happens when it is called by reiserfs_acl_chmod(), which creates a dependency inversion against the private xattr inodes mutexes for the given inode. We need to call it without the reiserfs lock, especially since it's unnecessary. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-07 13:46:48 +01:00
Mike Frysinger	04e4f2b18c	FDPIC: Respect PT_GNU_STACK exec protection markings when creating NOMMU stack The current code will load the stack size and protection markings, but then only use the markings in the MMU code path. The NOMMU code path always passes PROT_EXEC to the mmap() call. While this doesn't matter to most people whilst the code is running, it will cause a pointless icache flush when starting every FDPIC application. Typically this icache flush will be of a region on the order of 128KB in size, or may be the entire icache, depending on the facilities available on the CPU. In the case where the arch default behaviour seems to be desired (EXSTACK_DEFAULT), we probe VM_STACK_FLAGS for VM_EXEC to determine whether we should be setting PROT_EXEC or not. For arches that support an MPU (Memory Protection Unit - an MMU without the virtual mapping capability), setting PROT_EXEC or not will make an important difference. It should be noted that this change also affects the executability of the brk region, since ELF-FDPIC has that share with the stack. However, this is probably irrelevant as NOMMU programs aren't likely to use the brk region, preferring instead allocation via mmap(). Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-06 18:16:02 -08:00
Linus Torvalds	93939f4e5d	Merge branch 'for-2.6.33' of git://linux-nfs.org/~bfields/linux * 'for-2.6.33' of git://linux-nfs.org/~bfields/linux: sunrpc: fix peername failed on closed listener nfsd: make sure data is on disk before calling ->fsync nfsd: fix "insecure" export option	2010-01-06 18:10:15 -08:00
OGAWA Hirofumi	56335936de	nfs: fix oops in nfs_rename() Recent change is missing to update "rehash". With that change, it will become the cause of adding dentry to hash twice. This explains the reason of Oops (dereference the freed dentry in __d_lookup()) on my machine. Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Reported-by: Marvin <marvin24@gmx.de> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-01-06 18:48:26 -05:00
Christoph Hellwig	7211a4e859	nfsd: make sure data is on disk before calling ->fsync nfsd is not using vfs_fsync, so I missed it when changing the calling convention during the 2.6.32 window. This patch fixes it to not only start the data writeout, but also wait for it to complete before calling into ->fsync. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: stable@kernel.org Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2010-01-06 17:37:26 -05:00
Linus Torvalds	c6f7afaeed	Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd * 'for-linus' of git://git.open-osd.org/linux-open-osd: exofs: simple_write_end does not mark_inode_dirty exofs: fix pnfs_osd re-definitions in pre-pnfs trees	2010-01-06 01:41:07 -08:00
Linus Torvalds	6307daad84	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Handle O_DIRECT when writing to a refcounted cluster.	2010-01-05 16:01:04 -08:00
Boaz Harrosh	efd124b999	exofs: simple_write_end does not mark_inode_dirty exofs uses simple_write_end() for it's .write_end handler. But it is not enough because simple_write_end() does not call mark_inode_dirty() when it extends i_size. So even if we do call mark_inode_dirty at beginning of write out, with a very long IO and a saturated system we might get the .write_inode() called while still extend-writing to file and miss out on the last i_size updates. So override .write_end, call simple_write_end(), and afterwords if i_size was changed call mark_inode_dirty(). It stands to logic that since simple_write_end() was the one extending i_size it should also call mark_inode_dirty(). But it looks like all users of simple_write_end() are memory-bound pseudo filesystems, who could careless about mark_inode_dirty(). I might submit a warning-comment patch to simple_write_end() in future. CC: Stable <stable@kernel.org> Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2010-01-05 09:14:32 +02:00
Boaz Harrosh	89be503021	exofs: fix pnfs_osd re-definitions in pre-pnfs trees Some on disk exofs constants and types are defined in the pnfs_osd_xdr.h file. Since we needed these types before the pnfs-objects code was accepted to mainline we duplicated the minimal needed definitions into an exofs local header. The definitions where conditionally included depending on !CONFIG_PNFS defined. So if PNFS was present in the tree definitions are taken from there and if not they are defined locally. That was all good but, the CONFIG_PNFS is planed to be included upstream before the pnfs-objects is also included. (The first pnfs batch might be pnfs-files only) So condition exofs local definitions on the absence of pnfs_osd_xdr.h inclusion (__PNFS_OSD_XDR_H__ not defined). User code must make sure that in future pnfs_osd_xdr.h will be included before fs/exofs/pnfs.h, which happens to be so in current code. Once pnfs-objects hits mainline, exofs's local header will be removed. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>	2010-01-05 09:14:32 +02:00
Frederic Weisbecker	4f3be1b5a9	reiserfs: Relax lock on xattr removing When we remove an xattr, we call lookup_and_delete_xattr() that takes some private xattr inodes mutexes. But we hold the reiserfs lock at this time, which leads to dependency inversions. We can safely call lookup_and_delete_xattr() without the reiserfs lock, where xattr inodes lookups only need the xattr inodes mutexes. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-05 08:00:50 +01:00
Frederic Weisbecker	108d3943c0	reiserfs: Relax the lock before truncating pages While truncating a file, reiserfs_setattr() calls inode_setattr() that will truncate the mapping for the given inode, but for that it needs the pages locks. In order to release these, the owners need the reiserfs lock to complete their jobs. But they can't, as we don't release it before calling inode_setattr(). We need to do that to fix the following softlockups: INFO: task flush-8:0:2149 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. flush-8:0 D f51af998 0 2149 2 0x00000000 f51af9ac 00000092 00000002 f51af998 c2803304 00000000 c1894ad0 010f3000 f51af9cc c1462604 c189ef80 f51af974 c1710304 f715b450 f715b5ec c2807c40 00000000 0005bb00 c2803320 c102c55b c1710304 c2807c50 c2803304 00000246 Call Trace: [<c1462604>] ? schedule+0x434/0xb20 [<c102c55b>] ? resched_task+0x4b/0x70 [<c106fa22>] ? mark_held_locks+0x62/0x80 [<c146414d>] ? mutex_lock_nested+0x1fd/0x350 [<c14640b9>] mutex_lock_nested+0x169/0x350 [<c1178cde>] ? reiserfs_write_lock+0x2e/0x40 [<c1178cde>] reiserfs_write_lock+0x2e/0x40 [<c11719a2>] do_journal_end+0xc2/0xe70 [<c1172912>] journal_end+0xb2/0x120 [<c11686b3>] ? pathrelse+0x33/0xb0 [<c11729e4>] reiserfs_end_persistent_transaction+0x64/0x70 [<c1153caa>] reiserfs_get_block+0x12ba/0x15f0 [<c106fa22>] ? mark_held_locks+0x62/0x80 [<c1154b24>] reiserfs_writepage+0xa74/0xe80 [<c1465a27>] ? _raw_spin_unlock_irq+0x27/0x50 [<c11f3d25>] ? radix_tree_gang_lookup_tag_slot+0x95/0xc0 [<c10b5377>] ? find_get_pages_tag+0x127/0x1a0 [<c106fa22>] ? mark_held_locks+0x62/0x80 [<c106fcd4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10bc1e0>] __writepage+0x10/0x40 [<c10bc9ab>] write_cache_pages+0x16b/0x320 [<c10bc1d0>] ? __writepage+0x0/0x40 [<c10bcb88>] generic_writepages+0x28/0x40 [<c10bcbd5>] do_writepages+0x35/0x40 [<c11059f7>] writeback_single_inode+0xc7/0x330 [<c11067b2>] writeback_inodes_wb+0x2c2/0x490 [<c1106a86>] wb_writeback+0x106/0x1b0 [<c1106cf6>] wb_do_writeback+0x106/0x1e0 [<c1106c18>] ? wb_do_writeback+0x28/0x1e0 [<c1106e0a>] bdi_writeback_task+0x3a/0xb0 [<c10cbb13>] bdi_start_fn+0x63/0xc0 [<c10cbab0>] ? bdi_start_fn+0x0/0xc0 [<c105d1f4>] kthread+0x74/0x80 [<c105d180>] ? kthread+0x0/0x80 [<c100327a>] kernel_thread_helper+0x6/0x10 3 locks held by flush-8:0/2149: #0: (&type->s_umount_key#30){+++++.}, at: [<c110676f>] writeback_inodes_wb+0x27f/0x490 #1: (&journal->j_mutex){+.+...}, at: [<c117199a>] do_journal_end+0xba/0xe70 #2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178cde>] reiserfs_write_lock+0x2e/0x40 INFO: task fstest:3813 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. fstest D 00000002 0 3813 3812 0x00000000 f5103c94 00000082 f5103c40 00000002 f5ad5450 00000007 f5103c28 011f3000 00000006 f5ad5450 c10bb005 00000480 c1710304 f5ad5450 f5ad55ec c2907c40 00000001 f5ad5450 f5103c74 00000046 00000002 f5ad5450 00000007 f5103c6c Call Trace: [<c10bb005>] ? free_hot_cold_page+0x1d5/0x280 [<c1462d64>] io_schedule+0x74/0xc0 [<c10b5a45>] sync_page+0x35/0x60 [<c146325a>] __wait_on_bit_lock+0x4a/0x90 [<c10b5a10>] ? sync_page+0x0/0x60 [<c10b59e5>] __lock_page+0x85/0x90 [<c105d660>] ? wake_bit_function+0x0/0x60 [<c10bf654>] truncate_inode_pages_range+0x1e4/0x2d0 [<c10bf75f>] truncate_inode_pages+0x1f/0x30 [<c10bf7cf>] truncate_pagecache+0x5f/0xa0 [<c10bf86a>] vmtruncate+0x5a/0x70 [<c10fdb7d>] inode_setattr+0x5d/0x190 [<c1150117>] reiserfs_setattr+0x1f7/0x2f0 [<c1464569>] ? down_write+0x49/0x70 [<c10fde01>] notify_change+0x151/0x330 [<c10e6f3d>] do_truncate+0x6d/0xa0 [<c10f4ce2>] do_filp_open+0x9a2/0xcf0 [<c1465aec>] ? _raw_spin_unlock+0x2c/0x50 [<c10fec50>] ? alloc_fd+0xe0/0x100 [<c10e602d>] do_sys_open+0x6d/0x130 [<c1002cfb>] ? sysenter_exit+0xf/0x16 [<c10e615e>] sys_open+0x2e/0x40 [<c1002ccc>] sysenter_do_call+0x12/0x32 3 locks held by fstest/3813: #0: (&sb->s_type->i_mutex_key#4){+.+.+.}, at: [<c10e6f33>] do_truncate+0x63/0xa0 #1: (&sb->s_type->i_alloc_sem_key#3){+.+.+.}, at: [<c10fdf07>] notify_change+0x257/0x330 #2: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1178c8e>] reiserfs_write_lock_once+0x2e/0x50 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-05 08:00:29 +01:00
Frederic Weisbecker	5fe1533fda	reiserfs: Fix recursive lock on lchown On chown, reiserfs will call reiserfs_setattr() to change the owner of the given inode, but it may also recursively call reiserfs_setattr() to propagate the owner change to the private xattr files for this inode. Hence, the reiserfs lock may be acquired twice which is not wanted as reiserfs_setattr() calls journal_begin() that is going to try to relax the lock in order to safely acquire the journal mutex. Using reiserfs_write_lock_once() from reiserfs_setattr() solves the problem. This fixes the following warning, that precedes a lockdep report. WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3f/0x50() Hardware name: MS-7418 Unwanted recursive reiserfs lock! Pid: 4189, comm: fsstress Not tainted 2.6.33-rc2-tip-atom+ #195 Call Trace: [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50 [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50 [<c103f7ac>] warn_slowpath_common+0x6c/0xc0 [<c1178bff>] ? reiserfs_lock_check_recursive+0x3f/0x50 [<c103f84b>] warn_slowpath_fmt+0x2b/0x30 [<c1178bff>] reiserfs_lock_check_recursive+0x3f/0x50 [<c1172ae3>] do_journal_begin_r+0x83/0x350 [<c1172f2d>] journal_begin+0x7d/0x140 [<c106509a>] ? in_group_p+0x2a/0x30 [<c10fda71>] ? inode_change_ok+0x91/0x140 [<c115007d>] reiserfs_setattr+0x15d/0x2e0 [<c10f9bf3>] ? dput+0xe3/0x140 [<c1465adc>] ? _raw_spin_unlock+0x2c/0x50 [<c117831d>] chown_one_xattr+0xd/0x10 [<c11780a3>] reiserfs_for_each_xattr+0x113/0x2c0 [<c1178310>] ? chown_one_xattr+0x0/0x10 [<c14641e9>] ? mutex_lock_nested+0x2a9/0x350 [<c117826f>] reiserfs_chown_xattrs+0x1f/0x60 [<c106509a>] ? in_group_p+0x2a/0x30 [<c10fda71>] ? inode_change_ok+0x91/0x140 [<c1150046>] reiserfs_setattr+0x126/0x2e0 [<c1177c20>] ? reiserfs_getxattr+0x0/0x90 [<c11b0d57>] ? cap_inode_need_killpriv+0x37/0x50 [<c10fde01>] notify_change+0x151/0x330 [<c10e659f>] chown_common+0x6f/0x90 [<c10e67bd>] sys_lchown+0x6d/0x80 [<c1002ccc>] sysenter_do_call+0x12/0x32 ---[ end trace 7c2b77224c1442fc ]--- Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-05 07:59:38 +01:00
Eric W. Biederman	846f99749a	sysfs: Add lockdep annotations for the sysfs active reference Holding locks over device_del -> kobject_del -> sysfs_deactivate can cause deadlocks if those same locks are grabbed in sysfs show or store methods. The I model s_active count + completion as a sleeping read/write lock. I describe to lockdep sysfs_get_active as a read_trylock, sysfs_put_active as a read_unlock, and sysfs_deactivate as a write_lock and write_unlock pair. This seems to capture the essence for purposes of finding deadlocks, and in my testing gives finds real issues and ignores non-issues. This brings us back to holding locks over kobject_del is a problem that ideally we should find a way of addressing, but at least lockdep can tell us about the problems instead of requiring developers to debug rare strange system deadlocks, that happen when sysfs files are removed while being written to. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-04 12:34:46 -08:00
Linus Torvalds	d4d3b19212	Merge branch 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6 * 'sh/for-2.6.33' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/sh-2.6: binfmt_elf_fdpic: Fix build breakage introduced by coredump changes. sh: update defconfigs. sh: Don't default enable PMB support. sh: Disable PMB for SH4AL-DSP CPUs. sh: Only provide a PCLK definition for legacy CPG CPUs.	2010-01-04 12:32:09 -08:00
Linus Torvalds	e43c259777	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Calculate metadata requirements more accurately ext4: Fix accounting of reserved metadata blocks	2010-01-04 12:31:52 -08:00
Linus Torvalds	a87da40875	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ryusuke/nilfs2: nilfs2: update mailing list address nilfs2: Storage class should be before const qualifier nilfs2: trivial coding style fix	2010-01-04 12:28:26 -08:00
Daisuke HATAYAMA	2f48912d14	binfmt_elf_fdpic: Fix build breakage introduced by coredump changes. Commit `f6151dfea2` introduces build breakage, so this patch fixes it together with some printk formatting cleanup. Signed-off-by: Daisuke HATAYAMA <d.hatayama@jp.fujitsu.com> Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2010-01-04 15:42:14 +09:00
Frederic Weisbecker	f3e22f48f3	reiserfs: Fix mistake in down_write() conversion Fix a mistake in commit `0719d34347` (reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion) that has converted a down_write() into a down_read() accidentally. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-03 03:44:53 +01:00
Linus Torvalds	45d28b0972	Merge branch 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing * 'reiserfs/kill-bkl' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: reiserfs: Safely acquire i_mutex from xattr_rmdir reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr reiserfs: Fix journal mutex <-> inode mutex lock inversion reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() reiserfs: Relax reiserfs lock while freeing the journal reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr reiserfs: Warn on lock relax if taken recursively reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion reiserfs: Fix reiserfs lock <-> inode mutex dependency inversion reiserfs: Fix reiserfs lock and journal lock inversion dependency reiserfs: Fix possible recursive lock	2010-01-02 11:17:05 -08:00
Jaswinder Singh Rajput	4b6764fa9e	writeback: add missing kernel-doc notation Fix the following htmldocs warning: Warning(fs/fs-writeback.c:255): No description found for parameter 'sb' Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com> Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Acked-by: Wu Fengguang <fengguang.wu@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jan Kara <jack@suse.cz> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-02 10:09:44 -08:00
Frederic Weisbecker	835d5247d9	reiserfs: Safely acquire i_mutex from xattr_rmdir Relax the reiserfs lock before taking the inode mutex from xattr_rmdir() to avoid the usual reiserfs lock <-> inode mutex bad dependency. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:59:48 +01:00
Frederic Weisbecker	8b513f56d4	reiserfs: Safely acquire i_mutex from reiserfs_for_each_xattr Relax the reiserfs lock before taking the inode mutex from reiserfs_for_each_xattr() to avoid the usual bad dependencies: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #179 ------------------------------------------------------- rm/3242 is trying to acquire lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290 but task is already holding lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&REISERFS_SB(s)->lock){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401aab>] mutex_lock_nested+0x5b/0x340 [<c1143339>] reiserfs_write_lock_once+0x29/0x50 [<c1117022>] reiserfs_lookup+0x62/0x140 [<c10bd85f>] __lookup_hash+0xef/0x110 [<c10bf21d>] lookup_one_len+0x8d/0xc0 [<c1141e3a>] open_xa_dir+0xea/0x1b0 [<c1142720>] reiserfs_for_each_xattr+0x70/0x290 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #0 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401aab>] mutex_lock_nested+0x5b/0x340 [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 other info that might help us debug this: 1 lock held by rm/3242: #0: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143389>] reiserfs_write_lock+0x29/0x40 stack backtrace: Pid: 3242, comm: rm Not tainted 2.6.32-atom #179 Call Trace: [<c13ffa13>] ? printk+0x18/0x1a [<c105d33a>] print_circular_bug+0xca/0xd0 [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105c932>] ? mark_held_locks+0x62/0x80 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10 [<c1401098>] ? mutex_unlock+0x8/0x10 [<c105f2c8>] lock_acquire+0x68/0x90 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290 [<c1401aab>] mutex_lock_nested+0x5b/0x340 [<c11428ef>] ? reiserfs_for_each_xattr+0x23f/0x290 [<c11428ef>] reiserfs_for_each_xattr+0x23f/0x290 [<c1143180>] ? delete_one_xattr+0x0/0x100 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c1143339>] ? reiserfs_write_lock_once+0x29/0x50 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c11b0d4f>] ? _atomic_dec_and_lock+0x4f/0x70 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c1401098>] ? mutex_unlock+0x8/0x10 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0 [<c10c3af0>] ? filldir64+0x0/0xf0 [<c1002ef3>] ? sysenter_exit+0xf/0x16 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:59:14 +01:00
Frederic Weisbecker	4dd859697f	reiserfs: Fix journal mutex <-> inode mutex lock inversion We need to relax the reiserfs lock before locking the inode mutex from xattr_unlink(), otherwise we'll face the usual bad dependencies: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #178 ------------------------------------------------------- rm/3202 is trying to acquire lock: (&journal->j_mutex){+.+...}, at: [<c113c234>] do_journal_begin_r+0x94/0x360 but task is already holding lock: (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&sb->s_type->i_mutex_key#4/2){+.+...}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a7b>] mutex_lock_nested+0x5b/0x340 [<c1142a67>] xattr_unlink+0x57/0xb0 [<c1143179>] delete_one_xattr+0x29/0x100 [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #1 (&REISERFS_SB(s)->lock){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a7b>] mutex_lock_nested+0x5b/0x340 [<c1143359>] reiserfs_write_lock+0x29/0x40 [<c113c23c>] do_journal_begin_r+0x9c/0x360 [<c113c680>] journal_begin+0x80/0x130 [<c1127363>] reiserfs_remount+0x223/0x4e0 [<c10b6dd6>] do_remount_sb+0xa6/0x140 [<c10ce6a0>] do_mount+0x560/0x750 [<c10ce914>] sys_mount+0x84/0xb0 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #0 (&journal->j_mutex){+.+...}: [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a7b>] mutex_lock_nested+0x5b/0x340 [<c113c234>] do_journal_begin_r+0x94/0x360 [<c113c680>] journal_begin+0x80/0x130 [<c1116d63>] reiserfs_unlink+0x83/0x2e0 [<c1142a74>] xattr_unlink+0x64/0xb0 [<c1143179>] delete_one_xattr+0x29/0x100 [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by rm/3202: #0: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c114274b>] reiserfs_for_each_xattr+0x9b/0x290 #1: (&sb->s_type->i_mutex_key#4/2){+.+...}, at: [<c1142a67>] xattr_unlink+0x57/0xb0 stack backtrace: Pid: 3202, comm: rm Not tainted 2.6.32-atom #178 Call Trace: [<c13ff9e3>] ? printk+0x18/0x1a [<c105d33a>] print_circular_bug+0xca/0xd0 [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c1142a67>] ? xattr_unlink+0x57/0xb0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c113c234>] ? do_journal_begin_r+0x94/0x360 [<c113c234>] ? do_journal_begin_r+0x94/0x360 [<c1401a7b>] mutex_lock_nested+0x5b/0x340 [<c113c234>] ? do_journal_begin_r+0x94/0x360 [<c113c234>] do_journal_begin_r+0x94/0x360 [<c10411b6>] ? run_timer_softirq+0x1a6/0x220 [<c103cb00>] ? __do_softirq+0x50/0x140 [<c113c680>] journal_begin+0x80/0x130 [<c103cba2>] ? __do_softirq+0xf2/0x140 [<c104f72f>] ? hrtimer_interrupt+0xdf/0x220 [<c1116d63>] reiserfs_unlink+0x83/0x2e0 [<c105c932>] ? mark_held_locks+0x62/0x80 [<c11b8d08>] ? trace_hardirqs_on_thunk+0xc/0x10 [<c1002fd8>] ? restore_all_notrace+0x0/0x18 [<c1142a67>] ? xattr_unlink+0x57/0xb0 [<c1142a74>] xattr_unlink+0x64/0xb0 [<c1143179>] delete_one_xattr+0x29/0x100 [<c11427bb>] reiserfs_for_each_xattr+0x10b/0x290 [<c1143150>] ? delete_one_xattr+0x0/0x100 [<c1401cb9>] ? mutex_lock_nested+0x299/0x340 [<c11429ba>] reiserfs_delete_xattrs+0x1a/0x60 [<c1143309>] ? reiserfs_write_lock_once+0x29/0x50 [<c111ea2f>] reiserfs_delete_inode+0x9f/0x150 [<c11b0d1f>] ? _atomic_dec_and_lock+0x4f/0x70 [<c111e990>] ? reiserfs_delete_inode+0x0/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c1401068>] ? mutex_unlock+0x8/0x10 [<c10c3e0d>] ? vfs_readdir+0x7d/0xb0 [<c10c3af0>] ? filldir64+0x0/0xf0 [<c1002ef3>] ? sysenter_exit+0xf/0x16 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10c0b13>] sys_unlinkat+0x23/0x40 [<c1002ec4>] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:58:32 +01:00
Frederic Weisbecker	c674905ca7	reiserfs: Fix unwanted recursive reiserfs lock in reiserfs_unlink() reiserfs_unlink() may or may not be called under the reiserfs lock. But it also takes the reiserfs lock and can then acquire it recursively which leads to do_journal_begin_r() that fails to relax the reiserfs lock before grabbing the journal mutex, creating an unexpected lock inversion. We need to ensure reiserfs_unlink() won't get the reiserfs lock recursively using reiserfs_write_lock_once(). This fixes the following warning that precedes a lock inversion report (reiserfs lock <-> journal mutex). ------------[ cut here ]------------ WARNING: at fs/reiserfs/lock.c:95 reiserfs_lock_check_recursive+0x3a/0x50() Hardware name: MS-7418 Unwanted recursive reiserfs lock! Pid: 3208, comm: dbench Not tainted 2.6.32-atom #177 Call Trace: [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50 [<c10373a7>] warn_slowpath_common+0x67/0xc0 [<c114327a>] ? reiserfs_lock_check_recursive+0x3a/0x50 [<c1037446>] warn_slowpath_fmt+0x26/0x30 [<c114327a>] reiserfs_lock_check_recursive+0x3a/0x50 [<c113c213>] do_journal_begin_r+0x83/0x360 [<c105eb16>] ? __lock_acquire+0x1296/0x19e0 [<c1142a57>] ? xattr_unlink+0x57/0xb0 [<c113c670>] journal_begin+0x80/0x130 [<c1116d5d>] reiserfs_unlink+0x7d/0x2d0 [<c1142a57>] ? xattr_unlink+0x57/0xb0 [<c1142a57>] ? xattr_unlink+0x57/0xb0 [<c1142a57>] ? xattr_unlink+0x57/0xb0 [<c1142a64>] xattr_unlink+0x64/0xb0 [<c1143169>] delete_one_xattr+0x29/0x100 [<c11427ab>] reiserfs_for_each_xattr+0x10b/0x290 [<c1143140>] ? delete_one_xattr+0x0/0x100 [<c1401ca9>] ? mutex_lock_nested+0x299/0x340 [<c11429aa>] reiserfs_delete_xattrs+0x1a/0x60 [<c11432f9>] ? reiserfs_write_lock_once+0x29/0x50 [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150 [<c11b0d0f>] ? _atomic_dec_and_lock+0x4f/0x70 [<c111e980>] ? reiserfs_delete_inode+0x0/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10505c6>] ? up_read+0x16/0x30 [<c1022ab7>] ? do_page_fault+0x187/0x330 [<c1002fd8>] ? restore_all_notrace+0x0/0x18 [<c1022930>] ? do_page_fault+0x0/0x330 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10c0a00>] sys_unlink+0x10/0x20 [<c1002ec4>] sysenter_do_call+0x12/0x32 ---[ end trace 2e35d71a6cc69d0c ]--- Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:57:32 +01:00
Frederic Weisbecker	3f14fea6bb	reiserfs: Relax lock before open xattr dir in reiserfs_xattr_set_handle() We call xattr_lookup() from reiserfs_xattr_get(). We then hold the reiserfs lock when we grab the i_mutex. But later, we may relax the reiserfs lock, creating dependency inversion between both locks. The lookups and creation jobs ar already protected by the inode mutex, so we can safely relax the reiserfs lock, dropping the unwanted reiserfs lock -> i_mutex dependency, as shown in the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #173 ------------------------------------------------------- cp/3204 is trying to acquire lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 but task is already holding lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c1141d83>] open_xa_dir+0x43/0x1b0 [<c1142722>] reiserfs_for_each_xattr+0x62/0x260 [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0a00>] sys_unlink+0x10/0x20 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 [<c1117012>] reiserfs_lookup+0x62/0x140 [<c10bd85f>] __lookup_hash+0xef/0x110 [<c10bf21d>] lookup_one_len+0x8d/0xc0 [<c1141e2a>] open_xa_dir+0xea/0x1b0 [<c1141fe5>] xattr_lookup+0x15/0x160 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0 [<c1144042>] reiserfs_get_acl+0xa2/0x360 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0 [<c10bea96>] vfs_mkdir+0xd6/0x180 [<c10c0c10>] sys_mkdirat+0xc0/0xd0 [<c10c0c40>] sys_mkdir+0x20/0x30 [<c1002ec4>] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by cp/3204: #0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0 #1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0 stack backtrace: Pid: 3204, comm: cp Not tainted 2.6.32-atom #173 Call Trace: [<c13ff993>] ? printk+0x18/0x1a [<c105d33a>] print_circular_bug+0xca/0xd0 [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105d3aa>] ? check_usage+0x6a/0x460 [<c105f2c8>] lock_acquire+0x68/0x90 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 [<c1117012>] reiserfs_lookup+0x62/0x140 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10bd85f>] __lookup_hash+0xef/0x110 [<c10bf21d>] lookup_one_len+0x8d/0xc0 [<c1141e2a>] open_xa_dir+0xea/0x1b0 [<c1141fe5>] xattr_lookup+0x15/0x160 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0 [<c1144042>] reiserfs_get_acl+0xa2/0x360 [<c10ca2e7>] ? new_inode+0x27/0xa0 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160 [<c1402eb7>] ? _spin_unlock+0x27/0x40 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0 [<c10c7cb8>] ? __d_lookup+0x108/0x190 [<c105c932>] ? mark_held_locks+0x62/0x80 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340 [<c10bd17a>] ? generic_permission+0x1a/0xa0 [<c11788fe>] ? security_inode_permission+0x1e/0x20 [<c10bea96>] vfs_mkdir+0xd6/0x180 [<c10c0c10>] sys_mkdirat+0xc0/0xd0 [<c10505c6>] ? up_read+0x16/0x30 [<c1002fd8>] ? restore_all_notrace+0x0/0x18 [<c10c0c40>] sys_mkdir+0x20/0x30 [<c1002ec4>] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:57:01 +01:00
Frederic Weisbecker	0523676d3f	reiserfs: Relax reiserfs lock while freeing the journal Keeping the reiserfs lock while freeing the journal on umount path triggers a lock inversion between bdev->bd_mutex and the reiserfs lock. We don't need the reiserfs lock at this stage. The filesystem is not usable anymore, and there are no more pending commits, everything got flushed (even this operation was done in parallel and didn't required the reiserfs lock from the current process). This fixes the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #172 ------------------------------------------------------- umount/3904 is trying to acquire lock: (&bdev->bd_mutex){+.+.+.}, at: [<c10de2c2>] __blkdev_put+0x22/0x160 but task is already holding lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&REISERFS_SB(s)->lock){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c140199b>] mutex_lock_nested+0x5b/0x340 [<c1143229>] reiserfs_write_lock_once+0x29/0x50 [<c111c485>] reiserfs_get_block+0x85/0x1620 [<c10e1040>] do_mpage_readpage+0x1f0/0x6d0 [<c10e1640>] mpage_readpages+0xc0/0x100 [<c1119b89>] reiserfs_readpages+0x19/0x20 [<c108f1ec>] __do_page_cache_readahead+0x1bc/0x260 [<c108f2b8>] ra_submit+0x28/0x40 [<c1087e3e>] filemap_fault+0x40e/0x420 [<c109b5fd>] __do_fault+0x3d/0x430 [<c109d47e>] handle_mm_fault+0x12e/0x790 [<c1022a65>] do_page_fault+0x135/0x330 [<c1403663>] error_code+0x6b/0x70 [<c10ef9ca>] load_elf_binary+0x82a/0x1a10 [<c10ba130>] search_binary_handler+0x90/0x1d0 [<c10bb70f>] do_execve+0x1df/0x250 [<c1001746>] sys_execve+0x46/0x70 [<c1002fa5>] syscall_call+0x7/0xb -> #2 (&mm->mmap_sem){++++++}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c109b1ab>] might_fault+0x8b/0xb0 [<c11b8f52>] copy_to_user+0x32/0x70 [<c10c3b94>] filldir64+0xa4/0xf0 [<c1109116>] sysfs_readdir+0x116/0x210 [<c10c3e1d>] vfs_readdir+0x8d/0xb0 [<c10c3ea9>] sys_getdents64+0x69/0xb0 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #1 (sysfs_mutex){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c140199b>] mutex_lock_nested+0x5b/0x340 [<c110951c>] sysfs_addrm_start+0x2c/0xb0 [<c1109aa0>] create_dir+0x40/0x90 [<c1109b1b>] sysfs_create_dir+0x2b/0x50 [<c11b2352>] kobject_add_internal+0xc2/0x1b0 [<c11b2531>] kobject_add_varg+0x31/0x50 [<c11b25ac>] kobject_add+0x2c/0x60 [<c1258294>] device_add+0x94/0x560 [<c11036ea>] add_partition+0x18a/0x2a0 [<c110418a>] rescan_partitions+0x33a/0x450 [<c10de5bf>] __blkdev_get+0x12f/0x2d0 [<c10de76a>] blkdev_get+0xa/0x10 [<c11034b8>] register_disk+0x108/0x130 [<c11a87a9>] add_disk+0xd9/0x130 [<c12998e5>] sd_probe_async+0x105/0x1d0 [<c10528af>] async_thread+0xcf/0x230 [<c104bfd4>] kthread+0x74/0x80 [<c1003aab>] kernel_thread_helper+0x7/0x3c -> #0 (&bdev->bd_mutex){+.+.+.}: [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c140199b>] mutex_lock_nested+0x5b/0x340 [<c10de2c2>] __blkdev_put+0x22/0x160 [<c10de40a>] blkdev_put+0xa/0x10 [<c113ce22>] free_journal_ram+0xd2/0x130 [<c113ea18>] do_journal_release+0x98/0x190 [<c113eb2a>] journal_release+0xa/0x10 [<c1128eb6>] reiserfs_put_super+0x36/0x130 [<c10b776f>] generic_shutdown_super+0x4f/0xe0 [<c10b7825>] kill_block_super+0x25/0x40 [<c11255df>] reiserfs_kill_sb+0x7f/0x90 [<c10b7f4a>] deactivate_super+0x7a/0x90 [<c10cccd8>] mntput_no_expire+0x98/0xd0 [<c10ccfcc>] sys_umount+0x4c/0x310 [<c10cd2a9>] sys_oldumount+0x19/0x20 [<c1002ec4>] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by umount/3904: #0: (&type->s_umount_key#30){+++++.}, at: [<c10b7f45>] deactivate_super+0x75/0x90 #1: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c1143279>] reiserfs_write_lock+0x29/0x40 stack backtrace: Pid: 3904, comm: umount Not tainted 2.6.32-atom #172 Call Trace: [<c13ff903>] ? printk+0x18/0x1a [<c105d33a>] print_circular_bug+0xca/0xd0 [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c108b66f>] ? free_pcppages_bulk+0x1f/0x250 [<c105f2c8>] lock_acquire+0x68/0x90 [<c10de2c2>] ? __blkdev_put+0x22/0x160 [<c10de2c2>] ? __blkdev_put+0x22/0x160 [<c140199b>] mutex_lock_nested+0x5b/0x340 [<c10de2c2>] ? __blkdev_put+0x22/0x160 [<c105c932>] ? mark_held_locks+0x62/0x80 [<c10afe12>] ? kfree+0x92/0xd0 [<c10de2c2>] __blkdev_put+0x22/0x160 [<c105cc3b>] ? trace_hardirqs_on+0xb/0x10 [<c10de40a>] blkdev_put+0xa/0x10 [<c113ce22>] free_journal_ram+0xd2/0x130 [<c113ea18>] do_journal_release+0x98/0x190 [<c113eb2a>] journal_release+0xa/0x10 [<c1128eb6>] reiserfs_put_super+0x36/0x130 [<c1050596>] ? up_write+0x16/0x30 [<c10b776f>] generic_shutdown_super+0x4f/0xe0 [<c10b7825>] kill_block_super+0x25/0x40 [<c10f41e0>] ? vfs_quota_off+0x0/0x20 [<c11255df>] reiserfs_kill_sb+0x7f/0x90 [<c10b7f4a>] deactivate_super+0x7a/0x90 [<c10cccd8>] mntput_no_expire+0x98/0xd0 [<c10ccfcc>] sys_umount+0x4c/0x310 [<c10cd2a9>] sys_oldumount+0x19/0x20 [<c1002ec4>] sysenter_do_call+0x12/0x32 Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:56:54 +01:00
Frederic Weisbecker	27026a05bb	reiserfs: Fix reiserfs lock <-> i_mutex dependency inversion on xattr While deleting the xattrs of an inode, we hold the reiserfs lock and grab the inode->i_mutex of the targeted inode and the root private xattr directory. Later on, we may relax the reiserfs lock for various reasons, this creates inverted dependencies. We can remove the reiserfs lock -> i_mutex dependency by relaxing the former before calling open_xa_dir(). This is fine because the lookup and creation of xattr private directories done in open_xa_dir() are covered by the targeted inode mutexes. And deeper operations in the tree are still done under the write lock. This fixes the following lockdep report: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-atom #173 ------------------------------------------------------- cp/3204 is trying to acquire lock: (&REISERFS_SB(s)->lock){+.+.+.}, at: [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 but task is already holding lock: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&sb->s_type->i_mutex_key#4/3){+.+.+.}: [<c105ea7f>] __lock_acquire+0x11ff/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c1141d83>] open_xa_dir+0x43/0x1b0 [<c1142722>] reiserfs_for_each_xattr+0x62/0x260 [<c114299a>] reiserfs_delete_xattrs+0x1a/0x60 [<c111ea1f>] reiserfs_delete_inode+0x9f/0x150 [<c10c9c32>] generic_delete_inode+0xa2/0x170 [<c10c9d4f>] generic_drop_inode+0x4f/0x70 [<c10c8b07>] iput+0x47/0x50 [<c10c0965>] do_unlinkat+0xd5/0x160 [<c10c0a00>] sys_unlink+0x10/0x20 [<c1002ec4>] sysenter_do_call+0x12/0x32 -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105f2c8>] lock_acquire+0x68/0x90 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 [<c1117012>] reiserfs_lookup+0x62/0x140 [<c10bd85f>] __lookup_hash+0xef/0x110 [<c10bf21d>] lookup_one_len+0x8d/0xc0 [<c1141e2a>] open_xa_dir+0xea/0x1b0 [<c1141fe5>] xattr_lookup+0x15/0x160 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0 [<c1144042>] reiserfs_get_acl+0xa2/0x360 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0 [<c10bea96>] vfs_mkdir+0xd6/0x180 [<c10c0c10>] sys_mkdirat+0xc0/0xd0 [<c10c0c40>] sys_mkdir+0x20/0x30 [<c1002ec4>] sysenter_do_call+0x12/0x32 other info that might help us debug this: 2 locks held by cp/3204: #0: (&sb->s_type->i_mutex_key#4/1){+.+.+.}, at: [<c10bd8d6>] lookup_create+0x26/0xa0 #1: (&sb->s_type->i_mutex_key#4/3){+.+.+.}, at: [<c1141e18>] open_xa_dir+0xd8/0x1b0 stack backtrace: Pid: 3204, comm: cp Not tainted 2.6.32-atom #173 Call Trace: [<c13ff993>] ? printk+0x18/0x1a [<c105d33a>] print_circular_bug+0xca/0xd0 [<c105f176>] __lock_acquire+0x18f6/0x19e0 [<c105d3aa>] ? check_usage+0x6a/0x460 [<c105f2c8>] lock_acquire+0x68/0x90 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c1401a2b>] mutex_lock_nested+0x5b/0x340 [<c11432b9>] ? reiserfs_write_lock_once+0x29/0x50 [<c11432b9>] reiserfs_write_lock_once+0x29/0x50 [<c1117012>] reiserfs_lookup+0x62/0x140 [<c105ccca>] ? debug_check_no_locks_freed+0x8a/0x140 [<c105cbe4>] ? trace_hardirqs_on_caller+0x124/0x170 [<c10bd85f>] __lookup_hash+0xef/0x110 [<c10bf21d>] lookup_one_len+0x8d/0xc0 [<c1141e2a>] open_xa_dir+0xea/0x1b0 [<c1141fe5>] xattr_lookup+0x15/0x160 [<c1142476>] reiserfs_xattr_get+0x56/0x2a0 [<c1144042>] reiserfs_get_acl+0xa2/0x360 [<c10ca2e7>] ? new_inode+0x27/0xa0 [<c114461a>] reiserfs_cache_default_acl+0x3a/0x160 [<c1402eb7>] ? _spin_unlock+0x27/0x40 [<c111789c>] reiserfs_mkdir+0x6c/0x2c0 [<c10c7cb8>] ? __d_lookup+0x108/0x190 [<c105c932>] ? mark_held_locks+0x62/0x80 [<c1401c8d>] ? mutex_lock_nested+0x2bd/0x340 [<c10bd17a>] ? generic_permission+0x1a/0xa0 [<c11788fe>] ? security_inode_permission+0x1e/0x20 [<c10bea96>] vfs_mkdir+0xd6/0x180 [<c10c0c10>] sys_mkdirat+0xc0/0xd0 [<c10505c6>] ? up_read+0x16/0x30 [<c1002fd8>] ? restore_all_notrace+0x0/0x18 [<c10c0c40>] sys_mkdir+0x20/0x30 [<c1002ec4>] sysenter_do_call+0x12/0x32 v2: Don't drop reiserfs_mutex_lock_nested_safe() as we'll still need it later Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Tested-by: Christian Kujau <lists@nerdbynature.de> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:54:47 +01:00
Frederic Weisbecker	c4a62ca362	reiserfs: Warn on lock relax if taken recursively When we relax the reiserfs lock to avoid creating unwanted dependencies against others locks while grabbing these, we want to ensure it has not been taken recursively, otherwise the lock won't be really relaxed. Only its depth will be decreased. The unwanted dependency would then actually happen. To prevent from that, add a reiserfs_lock_check_recursive() call in the places that need it. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:54:37 +01:00
Frederic Weisbecker	0719d34347	reiserfs: Fix reiserfs lock <-> i_xattr_sem dependency inversion i_xattr_sem depends on the reiserfs lock. But after we grab i_xattr_sem, we may relax/relock the reiserfs lock while waiting on a freezed filesystem, creating a dependency inversion between the two locks. In order to avoid the i_xattr_sem -> reiserfs lock dependency, let's create a reiserfs_down_read_safe() that acts like reiserfs_mutex_lock_safe(): relax the reiserfs lock while grabbing another lock to avoid undesired dependencies induced by the heivyweight reiserfs lock. This fixes the following warning: [ 990.005931] ======================================================= [ 990.012373] [ INFO: possible circular locking dependency detected ] [ 990.013233] 2.6.33-rc1 #1 [ 990.013233] ------------------------------------------------------- [ 990.013233] dbench/1891 is trying to acquire lock: [ 990.013233] (&REISERFS_SB(s)->lock){+.+.+.}, at: [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50 [ 990.013233] [ 990.013233] but task is already holding lock: [ 990.013233] (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [ 990.013233] which lock already depends on the new lock. [ 990.013233] [ 990.013233] [ 990.013233] the existing dependency chain (in reverse order) is: [ 990.013233] [ 990.013233] -> #1 (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}: [ 990.013233] [<ffffffff81063afc>] __lock_acquire+0xf9c/0x1560 [ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0 [ 990.013233] [<ffffffff814ac194>] down_write+0x44/0x80 [ 990.013233] [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90 [ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0 [ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150 [ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b [ 990.013233] [ 990.013233] -> #0 (&REISERFS_SB(s)->lock){+.+.+.}: [ 990.013233] [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560 [ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0 [ 990.013233] [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0 [ 990.013233] [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50 [ 990.013233] [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50 [ 990.013233] [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180 [ 990.013233] [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470 [ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90 [ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0 [ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150 [ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b [ 990.013233] [ 990.013233] other info that might help us debug this: [ 990.013233] [ 990.013233] 2 locks held by dbench/1891: [ 990.013233] #0: (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff810e2678>] vfs_setxattr+0x78/0xc0 [ 990.013233] #1: (&REISERFS_I(inode)->i_xattr_sem){+.+.+.}, at: [<ffffffff8115899a>] reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [ 990.013233] stack backtrace: [ 990.013233] Pid: 1891, comm: dbench Not tainted 2.6.33-rc1 #1 [ 990.013233] Call Trace: [ 990.013233] [<ffffffff81061639>] print_circular_bug+0xe9/0xf0 [ 990.013233] [<ffffffff81063e30>] __lock_acquire+0x12d0/0x1560 [ 990.013233] [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [<ffffffff8106414f>] lock_acquire+0x8f/0xb0 [ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [<ffffffff8115899a>] ? reiserfs_xattr_set_handle+0x8a/0x470 [ 990.013233] [<ffffffff814aba77>] __mutex_lock_common+0x47/0x3b0 [ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [<ffffffff81159505>] ? reiserfs_write_lock+0x35/0x50 [ 990.013233] [<ffffffff81062592>] ? mark_held_locks+0x72/0xa0 [ 990.013233] [<ffffffff814ab81d>] ? __mutex_unlock_slowpath+0xbd/0x140 [ 990.013233] [<ffffffff810628ad>] ? trace_hardirqs_on_caller+0x14d/0x1a0 [ 990.013233] [<ffffffff814abebe>] mutex_lock_nested+0x3e/0x50 [ 990.013233] [<ffffffff81159505>] reiserfs_write_lock+0x35/0x50 [ 990.013233] [<ffffffff811340e5>] reiserfs_prepare_write+0x45/0x180 [ 990.013233] [<ffffffff81158bb6>] reiserfs_xattr_set_handle+0x2a6/0x470 [ 990.013233] [<ffffffff81158e30>] reiserfs_xattr_set+0xb0/0x150 [ 990.013233] [<ffffffff814abcb4>] ? __mutex_lock_common+0x284/0x3b0 [ 990.013233] [<ffffffff8115a6aa>] user_set+0x8a/0x90 [ 990.013233] [<ffffffff8115901a>] reiserfs_setxattr+0xaa/0xb0 [ 990.013233] [<ffffffff810e2596>] __vfs_setxattr_noperm+0x36/0xa0 [ 990.013233] [<ffffffff810e26bc>] vfs_setxattr+0xbc/0xc0 [ 990.013233] [<ffffffff810e2780>] setxattr+0xc0/0x150 [ 990.013233] [<ffffffff81056018>] ? sched_clock_cpu+0xb8/0x100 [ 990.013233] [<ffffffff8105eded>] ? trace_hardirqs_off+0xd/0x10 [ 990.013233] [<ffffffff810560a3>] ? cpu_clock+0x43/0x50 [ 990.013233] [<ffffffff810c6820>] ? fget+0xb0/0x110 [ 990.013233] [<ffffffff810c6770>] ? fget+0x0/0x110 [ 990.013233] [<ffffffff81002ddc>] ? sysret_check+0x27/0x62 [ 990.013233] [<ffffffff810e289d>] sys_fsetxattr+0x8d/0xa0 [ 990.013233] [<ffffffff81002dab>] system_call_fastpath+0x16/0x1b Reported-and-tested-by: Christian Kujau <lists@nerdbynature.de> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alexander Beregalov <a.beregalov@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2010-01-02 01:54:04 +01:00
Theodore Ts'o	9d0be50230	ext4: Calculate metadata requirements more accurately In the past, ext4_calc_metadata_amount(), and its sub-functions ext4_ext_calc_metadata_amount() and ext4_indirect_calc_metadata_amount() badly over-estimated the number of metadata blocks that might be required for delayed allocation blocks. This didn't matter as much when functions which managed the reserved metadata blocks were more aggressive about dropping reserved metadata blocks as delayed allocation blocks were written, but unfortunately they were too aggressive. This was fixed in commit `0637c6f`, but as a result the over-estimation by ext4_calc_metadata_amount() would lead to reserving 2-3 times the number of pending delayed allocation blocks as potentially required metadata blocks. So if there are 1 megabytes of blocks which have been not yet been allocation, up to 3 megabytes of space would get reserved out of the user's quota and from the file system free space pool until all of the inode's data blocks have been allocated. This commit addresses this problem by much more accurately estimating the number of metadata blocks that will be required. It will still somewhat over-estimate the number of blocks needed, since it must make a worst case estimate not knowing which physical blocks will be needed, but it is much more accurate than before. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-01-01 02:41:30 -05:00
Theodore Ts'o	ee5f4d9cdf	ext4: Fix accounting of reserved metadata blocks Commit `0637c6f` had a typo which caused the reserved metadata blocks to not be released correctly. Fix this. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2010-01-01 02:36:15 -05:00
Linus Torvalds	c03f6bfc9f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6: [CIFS] Enable mmap on forcedirectio mounts cifs: NULL out tcon, pSesInfo, and srvTcp pointers when chasing DFS referrals	2009-12-31 15:17:26 -08:00
Tao Ma	86470e98cc	ocfs2: Handle O_DIRECT when writing to a refcounted cluster. In case of writing to a refcounted cluster with O_DIRECT, we need to fall back to buffer write. And when it is finished, we need to flush the page and the journal as we did for other O_DIRECT writes. This patch fix oss bug 1191. http://oss.oracle.com/bugzilla/show_bug.cgi?id=1191 Signed-off-by: Tao Ma <tao.ma@oracle.com> Tested-by: Tristan Ye <tristan.ye@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-30 19:53:35 -08:00
Linus Torvalds	1f11abc966	Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: Patch up how we claim metadata blocks for quota purposes ext4: Ensure zeroout blocks have no dirty metadata ext4: return correct wbc.nr_to_write in ext4_da_writepages ext4: Update documentation to correct the inode_readahead_blks option name jbd2: don't use __GFP_NOFAIL in journal_init_common() ext4: flush delalloc blocks when space is low fs-writeback: Add helper function to start writeback if idle ext4: Eliminate potential double free on error path ext4: fix unsigned long long printk warning in super.c ext4, jbd2: Add barriers for file systems with exernal journals ext4: replace BUG() with return -EIO in ext4_ext_get_blocks ext4: add module aliases for ext2 and ext3 ext4: Don't ask about supporting ext2/3 in ext4 if ext4 is not configured ext4: remove unused #include <linux/version.h>	2009-12-30 13:25:56 -08:00
Serge E. Hallyn	7ea6600148	generic_permission: MAY_OPEN is not write access generic_permission was refusing CAP_DAC_READ_SEARCH-enabled processes from opening DAC-protected files read-only, because do_filp_open adds MAY_OPEN to the open mask. Ignore MAY_OPEN. After this patch, CAP_DAC_READ_SEARCH is again sufficient to open(fname, O_RDONLY) on a file to which DAC otherwise refuses us read permission. Reported-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Tested-by: Mike Kazantsev <mk.fraggod@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-30 12:35:44 -08:00
Theodore Ts'o	0637c6f413	ext4: Patch up how we claim metadata blocks for quota purposes As reported in Kernel Bugzilla #14936, commit `d21cd8f` triggered a BUG in the function ext4_da_update_reserve_space() found in fs/ext4/inode.c. The root cause of this BUG() was caused by the fact that ext4_calc_metadata_amount() can severely over-estimate how many metadata blocks will be needed, especially when using direct block-mapped files. In addition, it can also badly under estimate how much space is needed, since ext4_calc_metadata_amount() assumes that the blocks are contiguous, and this is not always true. If the application is writing blocks to a sparse file, the number of metadata blocks necessary can be severly underestimated by the functions ext4_da_reserve_space(), ext4_da_update_reserve_space() and ext4_da_release_space(). This was the cause of the dq_claim_space reports found on kerneloops.org. Unfortunately, doing this right means that we need to massively over-estimate the amount of free space needed. So in some cases we may need to force the inode to be written to disk asynchronously in to avoid spurious quota failures. http://bugzilla.kernel.org/show_bug.cgi?id=14936 Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-30 14:20:45 -05:00
Aneesh Kumar K.V	515f41c33a	ext4: Ensure zeroout blocks have no dirty metadata This fixes a bug (found by Curt Wohlgemuth) in which new blocks returned from an extent created with ext4_ext_zeroout() can have dirty metadata still associated with them. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Curt Wohlgemuth <curtw@google.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-29 23:39:06 -05:00
Frederic Weisbecker	98ea3f50bc	reiserfs: Fix remaining in-reclaim-fs <-> reclaim-fs-on locking inversion Commit `500f5a0bf5` (reiserfs: Fix possible recursive lock) fixed a vmalloc under reiserfs lock that triggered a lockdep warning because of a IN-FS-RECLAIM <-> RECLAIM-FS-ON locking dependency inversion. But this patch has ommitted another vmalloc call in the same path that allocates the journal. Relax the lock for this one too. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Chris Mason <chris.mason@oracle.com> Cc: Ingo Molnar <mingo@elte.hu>	2009-12-29 22:34:59 +01:00
Richard Kennedy	2faf2e19dd	ext4: return correct wbc.nr_to_write in ext4_da_writepages When ext4_da_writepages increases the nr_to_write in writeback_control then it must always re-base the return value. Originally there was a (misguided) attempt prevent wbc.nr_to_write from going negative. In fact, it's necessary to allow nr_to_write to be negative so that wb_writeback() can correctly calculate how many pages were actually written. Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-25 15:46:07 -05:00
Tobias Klauser	33e189bd57	nilfs2: Storage class should be before const qualifier The C99 specification states in section 6.11.5: The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-12-25 13:01:50 +09:00
Jiro SEKIBA	5ee5814832	nilfs2: trivial coding style fix This is a trivial style fix patch to mend errors/warnings reported by "checkpatch.pl --file". Signed-off-by: Jiro SEKIBA <jir@unicus.jp> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>	2009-12-25 13:01:50 +09:00
Linus Torvalds	45e62974fb	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c ocfs2/trivial: Use proper mask for 2 places in hearbeat.c Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink. Ocfs2: Should ocfs2 support fiemap for S_IFDIR inode? ocfs2: Use FIEMAP_EXTENT_SHARED fiemap: Add new extent flag FIEMAP_EXTENT_SHARED ocfs2: replace u8 by __u8 in ocfs2_fs.h ocfs2: explicit declare uninitialized var in user_cluster_connect() ocfs2-devel: remove redundant OCFS2_MOUNT_POSIX_ACL check in ocfs2_get_acl_nolock() ocfs2: return -EAGAIN instead of EAGAIN in dlm ocfs2/cluster: Make fence method configurable - v2 ocfs2: Set MS_POSIXACL on remount ocfs2: Make acl use the default ocfs2: Always include ACL support	2009-12-24 12:59:11 -08:00
Tao Ma	8ff6af881d	ocfs2/trivial: Use le16_to_cpu for a disk value in xattr.c In ocfs2_value_metas_in_xattr_header, we should Use le16_to_cpu for ocfs2_extent_list.l_next_free_rec. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-23 17:52:16 -08:00
Tao Ma	b31d308ddc	ocfs2/trivial: Use proper mask for 2 places in hearbeat.c I just noticed today that there are 2 places of "mlog(0,...)" in fs/ocfs2/cluster/heartbeat.c, but actually have no default mask prefix in that file. So change them to mlog(ML_HEARTBEAT,...). Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-23 17:52:13 -08:00
Tristan Ye	86239d59e2	Ocfs2: Let ocfs2 support fiemap for symlink and fast symlink. For fast symlink, it can be treated the same as inlined files since the data extent we want to return of both case all were stored in metadata block. For symlink, it can be simply treated the same as we did for regular files. Signed-off-by: Tristan Ye <tristan.ye@oracle.com> Acked-by: Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-23 17:52:09 -08:00
Linus Torvalds	f793067eb9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: devtmpfs: unlock mutex in case of string allocation error Driver core: export platform_device_register_data as a GPL symbol driver core: Prevent reference to freed memory on error path Driver-core: Fix bogus 0 error return in device_add() Driver core: driver_attribute parameters can often be const* Driver core: bin_attribute parameters can often be const* Driver core: device_attribute parameters can often be const* Doc/stable rules: add new cherry-pick logic vfs: get_sb_single() - do not pass options twice devtmpfs: Convert dirlock to a mutex	2009-12-23 13:35:03 -08:00
Linus Torvalds	849254e9bb	Merge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2 * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: ocfs2: Set i_nlink properly during reflink. ocfs2: Add reflinked file's inode to inode hash eariler. ocfs2: refcounttree.c cleanup. ocfs2: Find proper end cpos for a leaf refcount block.	2009-12-23 13:33:07 -08:00
Phil Carmody	66ecb92be9	Driver core: bin_attribute parameters can often be const* Many struct bin_attribute descriptors are purely read-only structures, and there's no need to change them. Therefore make the promise not to, which will let those descriptors be put in a ro section. Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-12-23 11:23:43 -08:00
Kay Sievers	9329d1beae	vfs: get_sb_single() - do not pass options twice Filesystem code usually destroys the option buffer while parsing it. This leads to errors when the same buffer is passed twice. In case we fill a new superblock do not call remount. This is needed to quite a warning that the debugfs code causes every boot. Cc: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2009-12-23 11:23:43 -08:00
Andrew Morton	3ebfdf885a	jbd2: don't use __GFP_NOFAIL in journal_init_common() It triggers the warning in get_page_from_freelist(), and it isn't appropriate to use __GFP_NOFAIL here anyway. Addresses http://bugzilla.kernel.org/show_bug.cgi?id=14843 Reported-by: Christian Casteyde <casteyde.christian@free.fr> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 08:05:15 -05:00
Eric Sandeen	c8afb44682	ext4: flush delalloc blocks when space is low Creating many small files in rapid succession on a small filesystem can lead to spurious ENOSPC; on a 104MB filesystem: for i in `seq 1 22500`; do echo -n > $SCRATCH_MNT/$i echo XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > $SCRATCH_MNT/$i done leads to ENOSPC even though after a sync, 40% of the fs is free again. This is because we reserve worst-case metadata for delalloc writes, and when data is allocated that worst-case reservation is not usually needed. When freespace is low, kicking off an async writeback will start converting that worst-case space usage into something more realistic, almost always freeing up space to continue. This resolves the testcase for me, and survives all 4 generic ENOSPC tests in xfstests. We'll still need a hard synchronous sync to squeeze out the last bit, but this fixes things up to a large degree. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 07:58:12 -05:00
Eric Sandeen	17bd55d037	fs-writeback: Add helper function to start writeback if idle ext4, at least, would like to start pushing on writeback if it starts to get close to ENOSPC when reserving worst-case blocks for delalloc writes. Writing out delalloc data will convert those worst-case predictions into usually smaller actual usage, freeing up space before we hit ENOSPC based on this speculation. Thanks to Jens for the suggestion for the helper function, & the naming help. I've made the helper return status on whether writeback was started even though I don't plan to use it in the ext4 patch; it seems like it would be potentially useful to test this in some cases. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Acked-by: Jan Kara <jack@suse.cz>	2009-12-23 07:57:07 -05:00
Julia Lawall	d3533d72e7	ext4: Eliminate potential double free on error path b_entry_name and buffer are initially NULL, are initialized within a loop to the result of calling kmalloc, and are freed at the bottom of this loop. The loop contains gotos to cleanup, which also frees b_entry_name and buffer. Some of these gotos are before the reinitializations of b_entry_name and buffer. To maintain the invariant that b_entry_name and buffer are NULL at the top of the loop, and thus acceptable arguments to kfree, these variables are now set to NULL after the kfrees. This seems to be the simplest solution. A more complicated solution would be to introduce more labels in the error handling code at the end of the function. A simplified version of the semantic match that finds this problem is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ identifier E; expression E1; iterator I; statement S; @@ kfree(E); ... when != E = E1 when != I(E,...) S when != &E kfree(E); // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 07:52:31 -05:00
Andrew Morton	a6b43e3825	ext4: fix unsigned long long printk warning in super.c sparc64 allmodconfig: fs/ext4/super.c: In function `lifetime_write_kbytes_show': fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4) fs/ext4/super.c:2174: warning: long long unsigned int format, long unsigned int arg (arg 4) Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 07:48:08 -05:00
Jan Kara	869835dfad	quota: Improve checking of quota file header When we are asked for vfsv0 quota format and the file is in vfsv1 format (or vice versa), refuse to use the quota file. Also return with error when we don't like the header of quota file. Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:13 +01:00
Yin Kangkai	765f836190	jbd: jbd-debug and jbd2-debug should be writable jbd-debug and jbd2-debug is currently read-only (S_IRUGO), which is not correct. Make it writable so that we can start debuging. Signed-off-by: Yin Kangkai <kangkai.yin@intel.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:13 +01:00
Dmitry Monakhov	39bc680a81	ext4: fix sleep inside spinlock issue with quota and dealloc (#14739 ) Unlock i_block_reservation_lock before vfs_dq_reserve_block(). This patch fixes http://bugzilla.kernel.org/show_bug.cgi?id=14739 CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:12 +01:00
Dmitry Monakhov	d21cd8f163	ext4: Fix potential quota deadlock We have to delay vfs_dq_claim_space() until allocation context destruction. Currently we have following call-trace: ext4_mb_new_blocks() /* task is already holding ac->alloc_semp / ->ext4_mb_mark_diskspace_used ->vfs_dq_claim_space() / acquire dqptr_sem here. Possible deadlock / ->ext4_mb_release_context() / drop ac->alloc_semp here */ Let's move quota claiming to ext4_da_update_reserve_space() ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.32-rc7 #18 ------------------------------------------------------- write-truncate-/3465 is trying to acquire lock: (&s->s_dquot.dqptr_sem){++++..}, at: [<c025e73b>] dquot_claim_space+0x3b/0x1b0 but task is already holding lock: (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&meta_group_info[i]->alloc_sem){++++..}: [<c017d04b>] __lock_acquire+0xd7b/0x1260 [<c017d5ea>] lock_acquire+0xba/0xd0 [<c0527191>] down_read+0x51/0x90 [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370 [<c02d0c1c>] ext4_mb_free_blocks+0x46c/0x870 [<c029c9d3>] ext4_free_blocks+0x73/0x130 [<c02c8cfc>] ext4_ext_truncate+0x76c/0x8d0 [<c02a8087>] ext4_truncate+0x187/0x5e0 [<c01e0f7b>] vmtruncate+0x6b/0x70 [<c022ec02>] inode_setattr+0x62/0x190 [<c02a2d7a>] ext4_setattr+0x25a/0x370 [<c022ee81>] notify_change+0x151/0x340 [<c021349d>] do_truncate+0x6d/0xa0 [<c0221034>] may_open+0x1d4/0x200 [<c022412b>] do_filp_open+0x1eb/0x910 [<c021244d>] do_sys_open+0x6d/0x140 [<c021258e>] sys_open+0x2e/0x40 [<c0103100>] sysenter_do_call+0x12/0x32 -> #2 (&ei->i_data_sem){++++..}: [<c017d04b>] __lock_acquire+0xd7b/0x1260 [<c017d5ea>] lock_acquire+0xba/0xd0 [<c0527191>] down_read+0x51/0x90 [<c02a5787>] ext4_get_blocks+0x47/0x450 [<c02a74c1>] ext4_getblk+0x61/0x1d0 [<c02a7a7f>] ext4_bread+0x1f/0xa0 [<c02bcddc>] ext4_quota_write+0x12c/0x310 [<c0262d23>] qtree_write_dquot+0x93/0x120 [<c0261708>] v2_write_dquot+0x28/0x30 [<c025d3fb>] dquot_commit+0xab/0xf0 [<c02be977>] ext4_write_dquot+0x77/0x90 [<c02be9bf>] ext4_mark_dquot_dirty+0x2f/0x50 [<c025e321>] dquot_alloc_inode+0x101/0x180 [<c029fec2>] ext4_new_inode+0x602/0xf00 [<c02ad789>] ext4_create+0x89/0x150 [<c0221ff2>] vfs_create+0xa2/0xc0 [<c02246e7>] do_filp_open+0x7a7/0x910 [<c021244d>] do_sys_open+0x6d/0x140 [<c021258e>] sys_open+0x2e/0x40 [<c0103100>] sysenter_do_call+0x12/0x32 -> #1 (&sb->s_type->i_mutex_key#7/4){+.+...}: [<c017d04b>] __lock_acquire+0xd7b/0x1260 [<c017d5ea>] lock_acquire+0xba/0xd0 [<c0526505>] mutex_lock_nested+0x65/0x2d0 [<c0260c9d>] vfs_load_quota_inode+0x4bd/0x5a0 [<c02610af>] vfs_quota_on_path+0x5f/0x70 [<c02bc812>] ext4_quota_on+0x112/0x190 [<c026345a>] sys_quotactl+0x44a/0x8a0 [<c0103100>] sysenter_do_call+0x12/0x32 -> #0 (&s->s_dquot.dqptr_sem){++++..}: [<c017d361>] __lock_acquire+0x1091/0x1260 [<c017d5ea>] lock_acquire+0xba/0xd0 [<c0527191>] down_read+0x51/0x90 [<c025e73b>] dquot_claim_space+0x3b/0x1b0 [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380 [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530 [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0 [<c02a5966>] ext4_get_blocks+0x226/0x450 [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0 [<c02a6ed6>] ext4_da_writepages+0x506/0x790 [<c01de272>] do_writepages+0x22/0x50 [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80 [<c01d7b9b>] filemap_flush+0x2b/0x30 [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60 [<c029e595>] ext4_release_file+0x75/0xb0 [<c0216b59>] __fput+0xf9/0x210 [<c0216c97>] fput+0x27/0x30 [<c02122dc>] filp_close+0x4c/0x80 [<c014510e>] put_files_struct+0x6e/0xd0 [<c01451b7>] exit_files+0x47/0x60 [<c0146a24>] do_exit+0x144/0x710 [<c0147028>] do_group_exit+0x38/0xa0 [<c0159abc>] get_signal_to_deliver+0x2ac/0x410 [<c0102849>] do_notify_resume+0xb9/0x890 [<c01032d2>] work_notifysig+0x13/0x21 other info that might help us debug this: 3 locks held by write-truncate-/3465: #0: (jbd2_handle){+.+...}, at: [<c02e1f8f>] start_this_handle+0x38f/0x5c0 #1: (&ei->i_data_sem){++++..}, at: [<c02a57f6>] ext4_get_blocks+0xb6/0x450 #2: (&meta_group_info[i]->alloc_sem){++++..}, at: [<c02ce962>] ext4_mb_load_buddy+0xb2/0x370 stack backtrace: Pid: 3465, comm: write-truncate- Not tainted 2.6.32-rc7 #18 Call Trace: [<c0524cb3>] ? printk+0x1d/0x22 [<c017ac9a>] print_circular_bug+0xca/0xd0 [<c017d361>] __lock_acquire+0x1091/0x1260 [<c016bca2>] ? sched_clock_local+0xd2/0x170 [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0 [<c017d5ea>] lock_acquire+0xba/0xd0 [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0 [<c0527191>] down_read+0x51/0x90 [<c025e73b>] ? dquot_claim_space+0x3b/0x1b0 [<c025e73b>] dquot_claim_space+0x3b/0x1b0 [<c02cb95f>] ext4_mb_mark_diskspace_used+0x36f/0x380 [<c02d210a>] ext4_mb_new_blocks+0x34a/0x530 [<c02c601d>] ? ext4_ext_find_extent+0x25d/0x280 [<c02c83fb>] ext4_ext_get_blocks+0x122b/0x13c0 [<c016bca2>] ? sched_clock_local+0xd2/0x170 [<c016be60>] ? sched_clock_cpu+0x120/0x160 [<c016beef>] ? cpu_clock+0x4f/0x60 [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0 [<c052712c>] ? down_write+0x8c/0xa0 [<c02a5966>] ext4_get_blocks+0x226/0x450 [<c016be60>] ? sched_clock_cpu+0x120/0x160 [<c016beef>] ? cpu_clock+0x4f/0x60 [<c017908b>] ? trace_hardirqs_off+0xb/0x10 [<c02a5ff3>] mpage_da_map_blocks+0xc3/0xaa0 [<c01d69cc>] ? find_get_pages_tag+0x16c/0x180 [<c01d6860>] ? find_get_pages_tag+0x0/0x180 [<c02a73bd>] ? __mpage_da_writepage+0x16d/0x1a0 [<c01dfc4e>] ? pagevec_lookup_tag+0x2e/0x40 [<c01ddf1b>] ? write_cache_pages+0xdb/0x3d0 [<c02a7250>] ? __mpage_da_writepage+0x0/0x1a0 [<c02a6ed6>] ext4_da_writepages+0x506/0x790 [<c016beef>] ? cpu_clock+0x4f/0x60 [<c016bca2>] ? sched_clock_local+0xd2/0x170 [<c016be60>] ? sched_clock_cpu+0x120/0x160 [<c016be60>] ? sched_clock_cpu+0x120/0x160 [<c02a69d0>] ? ext4_da_writepages+0x0/0x790 [<c01de272>] do_writepages+0x22/0x50 [<c01d766d>] __filemap_fdatawrite_range+0x6d/0x80 [<c01d7b9b>] filemap_flush+0x2b/0x30 [<c02a40ac>] ext4_alloc_da_blocks+0x5c/0x60 [<c029e595>] ext4_release_file+0x75/0xb0 [<c0216b59>] __fput+0xf9/0x210 [<c0216c97>] fput+0x27/0x30 [<c02122dc>] filp_close+0x4c/0x80 [<c014510e>] put_files_struct+0x6e/0xd0 [<c01451b7>] exit_files+0x47/0x60 [<c0146a24>] do_exit+0x144/0x710 [<c017b163>] ? lock_release_holdtime+0x33/0x210 [<c0528137>] ? _spin_unlock_irq+0x27/0x30 [<c0147028>] do_group_exit+0x38/0xa0 [<c017babb>] ? trace_hardirqs_on+0xb/0x10 [<c0159abc>] get_signal_to_deliver+0x2ac/0x410 [<c0102849>] do_notify_resume+0xb9/0x890 [<c0178fd0>] ? trace_hardirqs_off_caller+0x20/0xd0 [<c017b163>] ? lock_release_holdtime+0x33/0x210 [<c0165b50>] ? autoremove_wake_function+0x0/0x50 [<c017ba54>] ? trace_hardirqs_on_caller+0x134/0x190 [<c017babb>] ? trace_hardirqs_on+0xb/0x10 [<c0300ba4>] ? security_file_permission+0x14/0x20 [<c0215761>] ? vfs_write+0x131/0x190 [<c0214f50>] ? do_sync_write+0x0/0x120 [<c0103115>] ? sysenter_do_call+0x27/0x32 [<c01032d2>] work_notifysig+0x13/0x21 CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:12 +01:00
Jan Kara	82fdfa928c	quota: Fix 64-bit limits setting on 32-bit archs Fix warnings: fs/quota/quota_v2.c: In function ‘v2_read_file_info’: fs/quota/quota_v2.c:123: warning: integer constant is too large for ‘long’ type fs/quota/quota_v2.c:124: warning: integer constant is too large for ‘long’ type Reported-by: Jerry Leo <jerryleo860202@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:12 +01:00
Eric Sandeen	96d2a495c2	ext3: Replace lock/unlock_super() with an explicit lock for resizing Use a separate lock to protect s_groups_count and the other block group descriptors which get changed via an on-line resize operation, so we can stop overloading the use of lock_super(). Port of ext4 commit `32ed5058ce` by Theodore Ts'o <tytso@mit.edu>. CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:12 +01:00
Eric Sandeen	b8a052d016	ext3: Replace lock/unlock_super() with an explicit lock for the orphan list Use a separate lock to protect the orphan list, so we can stop overloading the use of lock_super(). Port of ext4 commit `3b9d4ed266` by Theodore Ts'o <tytso@mit.edu>. CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:11 +01:00
Eric Sandeen	4854a5f0cb	ext3: ext3_mark_recovery_complete() doesn't need to use lock_super The function ext3_mark_recovery_complete() is called from two call paths: either (a) while mounting the filesystem, in which case there's no danger of any other CPU calling write_super() until the mount is completed, and (b) while remounting the filesystem read-write, in which case the fs core has already locked the superblock. This also allows us to take out a very vile unlock_super()/lock_super() pair in ext3_remount(). Port of ext4 commit `a63c9eb2ce` by Theodore Ts'o <tytso@mit.edu>. CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:44:11 +01:00
Eric Sandeen	ed505ee454	ext3: Remove outdated comment about lock_super() ext3_fill_super() is no longer called by read_super(), and it is no longer called with the superblock locked. The unlock_super()/lock_super() is no longer present, so this comment is entirely superfluous. Port of ext4 commit `32ed5058ce` by Theodore Ts'o <tytso@mit.edu>. CC: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:43:50 +01:00
Dmitry Monakhov	dc52dd3a3a	quota: Move duplicated code to separate functions - for(..) { mark_dquot_dirty(); } -> mark_all_dquot_dirty() - for(..) { dput(); } -> dqput_all() Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:55 +01:00
Dmitry Monakhov	a9e7f44720	ext4: Convert to generic reserved quota's space management. This patch also fixes write vs chown race condition. Acked-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:55 +01:00
Dmitry Monakhov	fd8fbfc170	quota: decouple fs reserved space from quota reservation Currently inode_reservation is managed by fs itself and this reservation is transfered on dquot_transfer(). This means what inode_reservation must always be in sync with dquot->dq_dqb.dqb_rsvspace. Otherwise dquot_transfer() will result in incorrect quota(WARN_ON in dquot_claim_reserved_space() will be triggered) This is not easy because of complex locking order issues for example http://bugzilla.kernel.org/show_bug.cgi?id=14739 The patch introduce quota reservation field for each fs-inode (fs specific inode is used in order to prevent bloating generic vfs inode). This reservation is managed by quota code internally similar to i_blocks/i_bytes and may not be always in sync with internal fs reservation. Also perform some code rearrangement: - Unify dquot_reserve_space() and dquot_reserve_space() - Unify dquot_release_reserved_space() and dquot_free_space() - Also this patch add missing warning update to release_rsv() dquot_release_reserved_space() must call flush_warnings() as dquot_free_space() does. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Dmitry Monakhov	b462707e7c	Add unlocked version of inode_add_bytes() function Quota code requires unlocked version of this function. Off course we can just copy-paste the code, but copy-pasting is always an evil. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Dmitry Monakhov	c459001fa4	ext3: quota macros cleanup [V2] Currently all quota block reservation macros contains hardcoded "2" aka MAXQUOTAS value. This is no good because in some places it is not obvious to understand what does this digit represent. Let's introduce new macro with self descriptive name. Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org> Signed-off-by: Jan Kara <jack@suse.cz>	2009-12-23 13:33:54 +01:00
Theodore Ts'o	cc3e1bea5d	ext4, jbd2: Add barriers for file systems with exernal journals This is a bit complicated because we are trying to optimize when we send barriers to the fs data disk. We could just throw in an extra barrier to the data disk whenever we send a barrier to the journal disk, but that's not always strictly necessary. We only need to send a barrier during a commit when there are data blocks which are must be written out due to an inode written in ordered mode, or if fsync() depends on the commit to force data blocks to disk. Finally, before we drop transactions from the beginning of the journal during a checkpoint operation, we need to guarantee that any blocks that were flushed out to the data disk are firmly on the rust platter before we drop the transaction from the journal. Thanks to Oleg Drokin for pointing out this flaw in ext3/ext4. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>	2009-12-23 06:52:08 -05:00
Alan Cox	28ba0ec64c	jfs: Fix 32bit build warning loff_t is a type that isn't entirely dependant upon 32 v 64bit choice Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:35 -05:00
Al Viro	5300990c03	Sanitize f_flags helpers * pull ACC_MODE to fs.h; we have several copies all over the place * nightmarish expression calculating f_mode by f_flags deserves a helper too (OPEN_FMODE(flags)) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Al Viro	482928d59d	Fix f_flags/f_mode in case of lookup_instantiate_filp() from open(pathname, 3) Just set f_flags when shoving struct file into nameidata; don't postpone that until __dentry_open(). do_filp_open() has correct value; lookup_instantiate_filp() doesn't - we lose the difference between O_RDWR and 3 by that point. We still set .intent.open.flags, so no fs code needs to be changed. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Roland Dreier	628ff7c1d8	anonfd: Allow making anon files read-only It seems a couple places such as arch/ia64/kernel/perfmon.c and drivers/infiniband/core/uverbs_main.c could use anon_inode_getfile() instead of a private pseudo-fs + alloc_file(), if only there were a way to get a read-only file. So provide this by having anon_inode_getfile() create a read-only file if we pass O_RDONLY in flags. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:34 -05:00
Arnd Bergmann	ed2617585f	fs/compat_ioctl.c: fix build error when !BLOCK No driver uses SG_SET_TRANSFORM any more in Linux, since the ide-scsi driver was removed in 2.6.29. The compat-ioctl cleanup series moved the handling for this around, which broke building without CONFIG_BLOCK. Just remove the code handling it for compat mode. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:33 -05:00
Roland Dreier	385e3ed4f0	alloc_file(): simplify handling of mnt_clone_write() errors When alloc_file() and init_file() were combined, the error handling of mnt_clone_write() was taken into alloc_file() in a somewhat obfuscated way. Since we don't use the error code for anything except warning, we might as well warn directly without an extra variable. Signed-off-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2009-12-22 12:27:33 -05:00
J. Bruce Fields	3d354cbc43	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-20 20:19:51 -08:00
J. Bruce Fields	f69ac2f5a3	nfsd: fix "insecure" export option A typo in `12045a6ee9` "nfsd: let "insecure" flag vary by pseudoflavor" reversed the sense of the "insecure" flag. Reported-by: Michael Guntsche <mike@it-loops.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>	2009-12-20 10:22:58 -05:00
Linus Torvalds	aac3d39693	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix broken assertion sched: Assert task state bits at build time sched: Update task_state_arraypwith new states sched: Add missing state chars to TASK_STATE_TO_CHAR_STR sched: Move TASK_STATE_TO_CHAR_STR near the TASK_state bits sched: Teach might_sleep() about preemptible RCU sched: Make warning less noisy sched: Simplify set_task_cpu() sched: Remove the cfs_rq dependency from set_task_cpu() sched: Add pre and post wakeup hooks sched: Move kthread_bind() back to kthread.c sched: Fix select_task_rq() vs hotplug issues sched: Fix sched_exec() balancing sched: Ensure set_task_cpu() is never called on blocked tasks sched: Use TASK_WAKING for fork wakups sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE sched: Fix task_hot() test order sched: Fix set_cpu_active() in cpu_down() sched: Mark boot-cpu active before smp_init() sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK ...	2009-12-19 09:47:49 -08:00
Tao Ma	10cf1a02f4	ocfs2: Set i_nlink properly during reflink. We create a file in orphan dir for reflink so that if there is any error, we don't create any wrong dentry in the dir. But actually the file in orphan dir should be i_nlink = 0 so that it can be replayed and freed successfully. This patch first set i_nlink to 0 when creating the file in orphan dir and then set it to 1(reflink now only works for regular file) when we move it to the dest dir. Signed-off-by: Tao Ma <tao.ma@oracle.com> Signed-off-by: Joel Becker <joel.becker@oracle.com>	2009-12-18 13:32:28 -08:00

1 2 3 4 5 ...

16494 Commits