linux

Author	SHA1	Message	Date
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Trond Myklebust	9f2fa46638	Merge branch 'master' of /home/trondmy/kernel/linux-2.6/	2006-06-28 23:27:48 -04:00
Christoph Hellwig	f5e54d6e53	[PATCH] mark address_space_operations const Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-28 14:59:04 -07:00
Trond Myklebust	607f31e80b	Revert "Merge branch 'odirect'" This reverts `ccf01ef7aa` commit. No idea how git managed this one: when I asked it to merge the odirect topic branch it actually generated a patch which reverted the change. Reverting the 'merge' will once again reveal Chuck's recent NFS/O_DIRECT work to the world. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-28 16:52:45 -04:00
David Brownell	266bee8869	[PATCH] fix static linking of NFS Builds on ARM report link problems with common configurations like statically linked NFS (for nfsroot). The symptom is that __init section code references __exit section code; that won't work since the exit sections are discarded (since they can never be called). The best fix for these particular cases would be an "__init_or_exit" section annotation. Signed-off-by: David Brownell <dbrownell@users.sourceforge.net> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-27 14:07:19 -07:00
Linus Torvalds	f36f44de72	Fix NFS2 compile error Trond had apparently merged the same patch twice, causing a duplicate include of the "internal.h" file, with resulting obvious confusion. Tssk. I'm the only one allowed to send out trees that don't even compile! Who does this Trond guy think he is? Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-25 12:30:33 -07:00
Alexey Dobriyan	9bf2aa129a	nfs: remove nfs_put_link() Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-25 06:39:35 -04:00
Andrew Morton	6ab86aa130	nfs-build-fix-99 fs/built-in.o:(__param+0x20): undefined reference to `nfs_idmap_cache_timeout' fs/built-in.o:(__param+0x48): undefined reference to `nfs_callback_set_tcpport' Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andreas Gruenbacher <agruen@suse.de> Cc: Andy Adamson <andros@citi.umich.edu> Cc: Chuck Lever <cel@netapp.com> Cc: David Howells <dhowells@redhat.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Manoj Naik <manoj@almaden.ibm.com> Cc: Marc Eshel <eshel@almaden.ibm.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-25 06:38:47 -04:00
Andrew Morton	d75d54147d	git-nfs-build-fixes Fix various problems with nfs4 disabled. And various other things. In file included from fs/nfs/inode.c:50: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' fs/nfs/inode.c: In function 'init_once': fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'open_states' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'delegation_state' fs/nfs/inode.c:1116: error: 'struct nfs_inode' has no member named 'rwsem' distcc[26452] ERROR: compile fs/nfs/inode.c on g5/64 failed make[1]: *** [fs/nfs/inode.o] Error 1 make: *** [fs/nfs/inode.o] Error 2 make: *** Waiting for unfinished jobs....
In file included from fs/nfs/nfs3xdr.c:26: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' distcc[26486] ERROR: compile fs/nfs/nfs3xdr.c on g5/64 failed make[1]: *** [fs/nfs/nfs3xdr.o] Error 1 make: *** [fs/nfs/nfs3xdr.o] Error 2 In file included from fs/nfs/nfs3proc.c:24: fs/nfs/internal.h:24: error: static declaration of 'nfs_do_refmount' follows non-static declaration include/linux/nfs_fs.h:320: error: previous declaration of 'nfs_do_refmount' was here fs/nfs/internal.h:65: warning: 'struct nfs4_fs_locations' declared inside parameter list fs/nfs/internal.h:65: warning: its scope is only this definition or declaration, which is probably not what you want fs/nfs/internal.h: In function 'nfs4_path': fs/nfs/internal.h:97: error: 'struct nfs_server' has no member named 'mnt_path' distcc[26469] ERROR: compile fs/nfs/nfs3proc.c on bix/32 failed make[1]: *** [fs/nfs/nfs3proc.o] Error 1 make: *** [fs/nfs/nfs3proc.o] Error 2 FAILED** Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andreas Gruenbacher <agruen@suse.de> Cc: Andy Adamson <andros@citi.umich.edu> Cc: Chuck Lever <cel@netapp.com> Cc: David Howells <dhowells@redhat.com> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Manoj Naik <manoj@almaden.ibm.com> Cc: Marc Eshel <eshel@almaden.ibm.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-25 06:38:11 -04:00
Trond Myklebust	ccf01ef7aa	Merge branch 'odirect'	2006-06-25 06:27:31 -04:00
Chuck Lever	82b145c5a5	NFS: alloc nfs_read/write_data as direct I/O is scheduled Re-arrange the logic in the NFS direct I/O path so that nfs_read/write_data structs are allocated just before they are scheduled, rather than allocating them all at once before we start scheduling requests. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:39 -04:00
Chuck Lever	06cf6f2ed0	NFS: Eliminate nfs_get_user_pages() Neil Brown observed that the kmalloc() in nfs_get_user_pages() is more likely to fail if the I/O is large enough to require the allocation of more than a single page to keep track of all the pinned pages in the user's buffer. Instead of tracking one large page array per dreq/iocb, track pages per nfs_read/write_data, just like the cached I/O path does. An array for pages is already allocated for us by nfs_readdata_alloc() (and the write and commit equivalents). This is also required for adding support for vectored I/O to the NFS direct I/O path. The original reason to pin the user buffer and allocate all the NFS data structures before trying to schedule I/O was to ensure all needed resources are allocated on the client before starting to send requests. This reduces the chance that resource exhaustion on the client will cause a short read or write. On the other hand, for an application making very large application I/O requests, this means that it will be nearly impossible for the application to make forward progress on a resource-limited client. Thus, moving the buffer pinning functionality into the I/O scheduling loops should be good for scalability. The next patch will do the same for NFS data structure allocation. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:39 -04:00
Chuck Lever	9c93ab7dff	NFS: refactor nfs_direct_free_user_pages Clean-up and fix a minor bug: the logic was dirtying page cache pages on both read and write operations. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:39 -04:00
Chuck Lever	51a7bc6cae	NFS: remove user_addr, user_count, and pos from nfs_direct_req Make the user_addr, user_count, and pos parameters explicit to the scheduler routines, and remove the fields from nfs_direct_req. The iovec API will be passing in a series of these, not just one set. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:39 -04:00
Chuck Lever	fedb595c66	NFS: "open code" the NFS direct write rescheduler An NFSv3/v4 client must reschedule on-the-wire writes if the writes are UNSTABLE, and the server reboots before the client can complete a subsequent COMMIT request. To support direct asynchronous scatter-gather writes, the write rescheduler in fs/nfs/direct.c must not depend on the I/O parameters in the controlling nfs_direct_req structure. iovecs can be somewhat arbitrarily complex, so there could be an unbounded amount of information to save for a rarely encountered requirement. Refactor the direct write rescheduler so it uses information from each nfs_write_data structure to reschedule writes, instead of caching that information in the controlling nfs_direct_req structure. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:38 -04:00
Chuck Lever	b1c5921c5b	NFS: Separate functions for counting outstanding NFS direct I/Os Factor out the logic that increments and decrements the outstanding I/O count. This will be a commonly used bit of code in upcoming patches. Also make this an atomic_t again, since it will be very often manipulated outside dreq->spin lock. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-24 13:11:38 -04:00
Trond Myklebust	816724e65c	Merge branch 'master' of /home/trondmy/kernel/linux-2.6/ Conflicts: fs/nfs/inode.c fs/super.c Fix conflicts between patch 'NFS: Split fs/nfs/inode.c' and patch 'VFS: Permit filesystem to override root dentry on mount'	2006-06-24 13:07:53 -04:00
Miklos Szeredi	75e1fcc0b1	[PATCH] vfs: add lock owner argument to flush operation Pass the POSIX lock owner ID to the flush operation. This is useful for filesystems which don't want to store any locking state in inode->i_flock but want to handle locking/unlocking POSIX locks internally. FUSE is one such filesystem but I think it possible that some network filesystems would need this also. Also add a flag to indicate that a POSIX locking request was generated by close(), so filesystems using the above feature won't send an extra locking request in this case. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:43:02 -07:00
David Howells	726c334223	[PATCH] VFS: Permit filesystem to perform statfs with a known root dentry Give the statfs superblock operation a dentry pointer rather than a superblock pointer. This complements the get_sb() patch. That reduced the significance of sb->s_root, allowing NFS to place a fake root there. However, NFS does require a dentry to use as a target for the statfs operation. This permits the root in the vfsmount to be used instead. linux/mount.h has been added where necessary to make allyesconfig build successfully. Interest has also been expressed for use with the FUSE and XFS filesystems. Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
David Howells	454e2398be	[PATCH] VFS: Permit filesystem to override root dentry on mount Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. (*) If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled.
However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries with child trees. [*] Anonymous until discovered from another tree. (*) The documentation has been adjusted, including the additional bit of changing ext2_* into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
Trond Myklebust	81039f1f20	NFS: Display the chosen RPCSEC_GSS security flavour in /proc/mounts Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:34 -04:00
David Howells	f7b422b17e	NFS: Split fs/nfs/inode.c As fs/nfs/inode.c is rather large, heterogenous and unwieldy, the attached patch splits it up into a number of files: (*) fs/nfs/inode.c Strictly inode specific functions. (*) fs/nfs/super.c Superblock management functions for NFS and NFS4, normal access, clones and referrals. The NFS4 superblock functions _could_ move out into a separate conditionally compiled file, but it's probably not worth it as there're so many common bits. (*) fs/nfs/namespace.c Some namespace-specific functions have been moved here. (*) fs/nfs/nfs4namespace.c NFS4-specific namespace functions (this could be merged into the previous file). This file is conditionally compiled. (*) fs/nfs/internal.h Inter-file declarations, plus a few simple utility functions moved from fs/nfs/inode.c. Additionally, all the in-.c-file externs have been moved here, and those files they were moved from now includes this file. For the most part, the functions have not been changed, only some multiplexor functions have changed significantly. I've also: (*) Added some extra banner comments above some functions. (*) Rearranged the function order within the files to be more logical and better grouped (IMO), though someone may prefer a different order. (*) Reduced the number of #ifdefs in .c files. (*) Added missing __init and __exit directives. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-06-09 09:34:33 -04:00
Trond Myklebust	4e5ccf60c5	NFS: Fix typo in nfs_do_clone_mount() Doh! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:32 -04:00
Trond Myklebust	860de07139	NFS: Fix compile errors introduced by referrals patches Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:31 -04:00
Trond Myklebust	87e4ba1a62	NFSv4: Ensure that referral mounts bind to a reserved port Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:30 -04:00
Andy Adamson	33a43f2802	NFSv4: A root pathname is sent as a zero component4 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:30 -04:00
Manoj Naik	6b97fd3da1	NFSv4: Follow a referral Respond to a moved error on NFS lookup by setting up the referral. Note: We don't actually follow the referral during lookup/getattr, but later when we detect fsid mismatch in inode revalidation (similar to the processing done for cloning submounts). Referrals will have fake attributes until they are actually followed or traversed. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:29 -04:00
Manoj Naik	9cdb3883c3	NFSv4: Ensure client submounts when following a referral Set up mountpoint when hitting a referral on moved error by getting fs_locations. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:28 -04:00
Manoj Naik	61f5164cab	NFS: Expand clone mounts to include other servers Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:27 -04:00
Manoj Naik	c818ba43f9	NFSv4: Create NFSv4 transport and client Move existing code into a separate function so that it can be also used by referral code. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:26 -04:00
Manoj Naik	830b8e33fe	NFSv4: Define an fs_locations bitmap This is (similar to getattr bitmap) but includes fs_locations and mounted_on_fileid attributes. Use this bitmap for encoding in fs_locations requests. Note: We can probably do better by requesting locations as part of fsinfo itself. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:25 -04:00
Manoj Naik	361e624f6d	NFSv4: GETATTR attributes on referral Per referral draft, only fs_locations, fsid, and mounted_on_fileid can be requested in a GETATTR on referrals. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:24 -04:00
Manoj Naik	99baf625d3	NFSv4: Decode mounted_on_fileid attribute in getattr. It is ignored if fileid is also requested. This will be used on referrals (fs_locations). Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:24 -04:00
Manoj Naik	7aaa0b3bd4	NFSv4: convert fs-locations-components to conform to RFC3530 Use component4-style formats for decoding list of servers and pathnames in fs_locations. Signed-off-by: Manoj Naik <manoj@almaden.ibm.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:23 -04:00
Trond Myklebust	683b57b435	NFSv4: Implement the fs_locations function call NFSv4 allows for the fact that filesystems may be replicated across several servers or that they may be migrated to a backup server in case of failure of the primary server. fs_locations is an NFSv4 operation for retrieving information about the location of migrated and/or replicated filesystems. Based on an initial implementation by Jiaying Zhang <jiayingz@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:22 -04:00
Trond Myklebust	51d8fa6a10	NFS: Add timeout to submounts Make automounted partitions expire using the mark_mounts_for_expiry() function. The timeout is controlled via a sysctl. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:20 -04:00
Trond Myklebust	55a975937d	NFS: Ensure the client submounts, when it crosses a server mountpoint. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:19 -04:00
Trond Myklebust	8b4bdcf899	NFS: Store the file system "fsid" value in the NFS super block. This should enable us to detect if we are crossing a mountpoint in the case where the server is exporting "nohide" mounts. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:19 -04:00
Trond Myklebust	8b512d9a88	VFS: Remove dependency of ->umount_begin() call on MNT_FORCE Allow filesystems to decide to perform pre-umount processing whether or not MNT_FORCE is set. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:18 -04:00
Trond Myklebust	da6d503aa0	NFS: Remove nfs_delete_inode() Now that we have a real nfs_invalidate_page() to ensure that truncate_inode_pages() does the right thing when there are pending dirty pages, we can get rid of nfs_delete_inode(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:14 -04:00
Trond Myklebust	d2ccddf042	NFS: Flesh out nfs_invalidate_page() In the case of a call to truncate_inode_pages(), we should really try to cancel any pending writes on the page. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:14 -04:00
J. Bruce Fields	c04871e634	NFSv4: remove obviously bogus comparison from decode_getacl We just set *acl_len to zero, and attrlen is unsigned, so this comparison is clearly bogus. I have no idea what I was thinking. Fixes a bug that caused getacl to fail over krb5p. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:13 -04:00
Alexey Dobriyan	3873bc50e2	NFSv4: really return status from decode_recall_args() Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:12 -04:00
Andreas Gruenbacher	4814f56d19	NFSv3: Client-side nfsacl caching fix Fix two errors in the client-side acl cache: First, when nfs3_proc_getacl requests only the default acl of a file and the access acl is not cached already, a NULL access acl entry is cached instead of ERR_PTR(-EAGAIN) ("not cached"). Second, update the cached acls in nfs3_proc_setacls: nfs_refresh_inode does not always invalidate the cached acls, and when it does not, the cached acls get out of sync. Signed-off-by: Andreas Gruenbacher <agruen@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:11 -04:00
Trond Myklebust	1842bfb447	NFS: Fix up inode revalidation accounting Currently, we are accounting for all calls to nfs_revalidate_inode(), but not to nfs_revalidate_mapping(), or nfs_lookup_verify_inode(), etc... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:10 -04:00
Trond Myklebust	44b11874ff	NFS: Separate metadata and page cache revalidation mechanisms Separate out the function of revalidating the inode metadata, and revalidating the mapping. The former may be called by lookup(), and only really needs to check that permissions, ctime, etc haven't changed whereas the latter needs only done when we want to read data from the page cache, and may need to sync and then invalidate the mapping. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:09 -04:00
Trond Myklebust	38478b24e3	NFS: More page cache revalidation fixups Whenever the directory changes, we want to make sure that we always invalidate its page cache. Fix up update_changeattr() and nfs_mark_for_revalidate() so that they do so. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:09 -04:00
Trond Myklebust	f1bb0b92ba	NFS: Fix page cache revalidation Fix up a bug in the handling of NFS_INO_REVAL_PAGECACHE: make sure that nfs_update_inode() clears it when we're sure we're not racing with other updates. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:08 -04:00
Chuck Lever	0d0b5cb36f	NFS: Optimize allocation of nfs_read/write_data structures Clean up use of page_array, and fix an off-by-one error noticed by Tom Talpey which causes kmalloc calls in cases where using the page_array is sufficient. Test plan: Normal client functional testing with r/wsize=32768. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:07 -04:00
Trond Myklebust	73a3d07c10	NFS: Clean up inode metadata updates Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:04 -04:00
Trond Myklebust	9d1e923222	NFSv4: Some NFSv4 servers have broken behaviour for the change attribute The Linux NFSv4 server violates RFC3530 in that the change attribute is not guaranteed to be updated for every change to the inode. Our optimisation for checking whether or not the inode metadata has changed or not is broken too. Grr.... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:04 -04:00
Trond Myklebust	1de3fc12ea	NFS: Clean up and fix page zeroing when we have short reads The code that is supposed to zero the uninitialised partial pages when the server returns a short read is currently broken: it looks at the nfs_page wb_pgbase and wb_bytes fields instead of the equivalent nfs_read_data values when deciding where to start truncating the page. Also ensure that we are more careful about setting PG_uptodate before retrying a short read: the retry will change the nfs_read_data args.pgbase and args.count. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-06-09 09:34:03 -04:00
Carsten Otte	7451c4f0ee	NFS: remove needless check in nfs_opendir() Local variable res was initialized to 0 - no check needed here. Signed-off-by: Carsten Otte <cotte@de.ibm.com> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-04-19 13:06:37 -04:00
John Hawkes	b9d9506d94	NFS: nfs_show_stats; for_each_possible_cpu(), not NR_CPUS Convert a for-loop that explicitly references "NR_CPUS" into the potentially more efficient for_each_possible_cpu() construct. Signed-off-by: John Hawkes <hawkes@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-04-19 13:06:20 -04:00
Trond Myklebust	e99170ff3b	NFS,SUNRPC: Fix compiler warnings if CONFIG_PROC_FS & CONFIG_SYSCTL are unset Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-04-19 12:43:47 -04:00
Trond Myklebust	95cf959b24	VFS: Fix another open intent Oops If the call to nfs_intent_set_file() fails to open a file in nfs4_proc_create(), we should return an error. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-04-19 12:43:46 -04:00
Arjan van de Ven	4b6f5d20b0	[PATCH] Make most file operations structs in fs/ const This is a conversion to make the various file_operations structs in fs/ const. Basically a regexp job, with a few manual fixups The goal is both to increase correctness (harder to accidentally write to shared datastructures) and reducing the false sharing of cachelines with things that get dirty in .data (while .rodata is nicely read only and thus cache clean) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-28 09:16:06 -08:00
Matthew Dobson	93d2341c75	[PATCH] mempool: use mempool_create_slab_pool() Modify well over a dozen mempool users to call mempool_create_slab_pool() rather than calling mempool_create() with extra arguments, saving about 30 lines of code and increasing readability. Signed-off-by: Matthew Dobson <colpatch@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:57:00 -08:00
NeilBrown	2ff28e22bd	[PATCH] Make address_space_operations->invalidatepage return void The return value of this function is never used, so let's be honest and declare it as void. Some places where invalidatepage returned 0, I have inserted comments suggesting a BUG_ON. [akpm@osdl.org: JBD BUG fix] [akpm@osdl.org: rework for git-nfs] [akpm@osdl.org: don't go BUG in block_invalidate_page()] Signed-off-by: Neil Brown <neilb@suse.de> Acked-by: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:56:55 -08:00
Ingo Molnar	353ab6e97b	[PATCH] sem2mutex: fs/ Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Ingo Molnar <mingo@elte.hu> Cc: Eric Van Hensbergen <ericvh@ericvh.myip.org> Cc: Robert Love <rml@tech9.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-26 08:56:55 -08:00
Linus Torvalds	53846a21c1	Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 * git://git.linux-nfs.org/pub/linux/nfs-2.6: (103 commits) SUNRPC,RPCSEC_GSS: spkm3--fix config dependencies SUNRPC,RPCSEC_GSS: spkm3: import contexts using NID_cast5_cbc LOCKD: Make nlmsvc_traverse_shares return void LOCKD: nlmsvc_traverse_blocks return is unused SUNRPC,RPCSEC_GSS: fix krb5 sequence numbers. NFSv4: Dont list system.nfs4_acl for filesystems that don't support it. SUNRPC,RPCSEC_GSS: remove unnecessary kmalloc of a checksum SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release() SUNRPC: Fix memory barriers for req->rq_received NFS: Fix a race in nfs_sync_inode() NFS: Clean up nfs_flush_list() NFS: Fix a race with PG_private and nfs_release_page() NFSv4: Ensure the callback daemon flushes signals SUNRPC: Fix a 'Busy inodes' error in rpc_pipefs NFS, NLM: Allow blocking locks to respect signals NFS: Make nfs_fhget() return appropriate error values NFSv4: Fix an oops in nfs4_fill_super lockd: blocks should hold a reference to the nlm_file NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE NFSv4: Send the delegation stateid for SETATTR calls ...	2006-03-25 09:18:27 -08:00
Paul Jackson	fffb60f93c	[PATCH] cpuset memory spread: slab cache format Rewrap the overly long source code lines resulting from the previous patch's addition of the slab cache flag SLAB_MEM_SPREAD. This patch contains only formatting changes, and no function change. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-24 07:33:23 -08:00
Paul Jackson	4b6a9316fa	[PATCH] cpuset memory spread: slab cache filesystems Mark file system inode and similar slab caches subject to SLAB_MEM_SPREAD memory spreading. If a slab cache is marked SLAB_MEM_SPREAD, then anytime that a task that's in a cpuset with the 'memory_spread_slab' option enabled goes to allocate from such a slab cache, the allocations are spread evenly over all the memory nodes (task->mems_allowed) allowed to that task, instead of favoring allocation on the node local to the current cpu. The following inode and similar caches are marked SLAB_MEM_SPREAD: file cache ==== ===== fs/adfs/super.c adfs_inode_cache fs/affs/super.c affs_inode_cache fs/befs/linuxvfs.c befs_inode_cache fs/bfs/inode.c bfs_inode_cache fs/block_dev.c bdev_cache fs/cifs/cifsfs.c cifs_inode_cache fs/coda/inode.c coda_inode_cache fs/dquot.c dquot fs/efs/super.c efs_inode_cache fs/ext2/super.c ext2_inode_cache fs/ext2/xattr.c (fs/mbcache.c) ext2_xattr fs/ext3/super.c ext3_inode_cache fs/ext3/xattr.c (fs/mbcache.c) ext3_xattr fs/fat/cache.c fat_cache fs/fat/inode.c fat_inode_cache fs/freevxfs/vxfs_super.c vxfs_inode fs/hpfs/super.c hpfs_inode_cache fs/isofs/inode.c isofs_inode_cache fs/jffs/inode-v23.c jffs_fm fs/jffs2/super.c jffs2_i fs/jfs/super.c jfs_ip fs/minix/inode.c minix_inode_cache fs/ncpfs/inode.c ncp_inode_cache fs/nfs/direct.c nfs_direct_cache fs/nfs/inode.c nfs_inode_cache fs/ntfs/super.c ntfs_big_inode_cache_name fs/ntfs/super.c ntfs_inode_cache fs/ocfs2/dlm/dlmfs.c dlmfs_inode_cache fs/ocfs2/super.c ocfs2_inode_cache fs/proc/inode.c proc_inode_cache fs/qnx4/inode.c qnx4_inode_cache fs/reiserfs/super.c reiser_inode_cache fs/romfs/inode.c romfs_inode_cache fs/smbfs/inode.c smb_inode_cache fs/sysv/inode.c sysv_inode_cache fs/udf/super.c udf_inode_cache fs/ufs/super.c ufs_inode_cache net/socket.c sock_inode_cache net/sunrpc/rpc_pipe.c rpc_inode_cache The choice of which slab caches to so mark was quite simple. 
I marked those already marked SLAB_RECLAIM_ACCOUNT, except for fs/xfs, dentry_cache, inode_cache, and buffer_head, which were marked in a previous patch. Even though SLAB_RECLAIM_ACCOUNT is for a different purpose, it marks the same potentially large file system i/o related slab caches as we need for memory spreading. Given that the rule now becomes "wherever you would have used a SLAB_RECLAIM_ACCOUNT slab cache flag before (usually the inode cache), use the SLAB_MEM_SPREAD flag too", this should be easy enough to maintain. Future file system writers will just copy one of the existing file system slab cache setups and tend to get it right without thinking. Signed-off-by: Paul Jackson <pj@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-24 07:33:23 -08:00
Tobias Klauser	e8c96f8c29	[PATCH] fs: Use ARRAY_SIZE macro Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) and remove a duplicate of ARRAY_SIZE. Some trailing whitespaces are also deleted. Signed-off-by: Tobias Klauser <tklauser@nuerscht.ch> Cc: David Howells <dhowells@redhat.com> Cc: Dave Kleikamp <shaggy@austin.ibm.com> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Chris Mason <mason@suse.com> Cc: Jeff Mahoney <jeffm@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-24 07:33:19 -08:00
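The idiom this commit adopts can be sketched in user space as follows. The macro here is a simplified stand-in for the kernel's `ARRAY_SIZE`, not its exact definition, and the `fs_names` table and both helper functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's ARRAY_SIZE macro. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, used only for illustration. */
static const char *const fs_names[] = { "ext2", "ext3", "nfs", "jfs" };

/* Number of entries, computed at compile time from the array itself. */
static size_t fs_name_count(void)
{
	return ARRAY_SIZE(fs_names);
}

/* Bounds-checked lookup; the limit tracks the array automatically. */
static const char *fs_name(size_t i)
{
	assert(i < ARRAY_SIZE(fs_names));
	return fs_names[i];
}
```

Because the element count is derived from the array, adding or removing an entry needs no matching change elsewhere, which is the maintenance benefit over an open-coded `sizeof(x)/sizeof(x[0])` repeated at each use.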
Theodore Ts'o	9b04c997b1	[PATCH] vfs: MS_VERBOSE should be MS_SILENT The meaning of MS_VERBOSE is backwards; if the bit is set, it really means, "don't be verbose". This is confusing and counter-intuitive. In addition, there is also no way to set the MS_VERBOSE flag in the mount(8) program in util-linux, but interesting, it does define options which would do the right thing if MS_SILENT were defined, which unfortunately we do not: #ifdef MS_SILENT { "quiet", 0, 0, MS_SILENT }, /* be quiet */ { "loud", 0, 1, MS_SILENT }, /* print out messages. */ #endif So the obvious fix is to deprecate the use of MS_VERBOSE and replace it with MS_SILENT. Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-24 07:33:15 -08:00
J. Bruce Fields	096455a22a	NFSv4: Dont list system.nfs4_acl for filesystems that don't support it. Thanks to Frank Filz for pointing out that we list system.nfs4_acl extended attribute even on filesystems where we don't actually support nfs4_acl. This is inconsistent with the e.g. ext3 POSIX ACL behaviour, and seems to annoy cp. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 23:23:42 -05:00
Trond Myklebust	7a1218a277	SUNRPC: Ensure rpc_call_async() always calls tk_ops->rpc_release() Currently this will not happen if we exit before rpc_new_task() was called. Also fix up rpc_run_task() to do the same (for consistency). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 18:11:10 -05:00
Trond Myklebust	c42de9dd67	NFS: Fix a race in nfs_sync_inode() Kudos to Neil Brown for spotting the problem: "in nfs_sync_inode, there is effectively the sequence: nfs_wait_on_requests nfs_flush_inode nfs_commit_inode This seems a bit racy to me as if the only requests are on the ->commit list, and nfs_commit_inode is called separately after nfs_wait_on_requests completes, and before nfs_commit_inode start (say: by nfs_write_inode) then none of these function will return >0, yet there will be some pending request that aren't waited for." The solution is to search for requests to wait upon, search for dirty requests, and search for uncommitted requests while holding the nfsi->req_lock The patch also cleans up nfs_sync_inode(), getting rid of the redundant FLUSH_WAIT flag. It turns out that we were always setting it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:51 -05:00
Trond Myklebust	7d46a49f51	NFS: Clean up nfs_flush_list() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:50 -05:00
Trond Myklebust	deb7d63826	NFS: Fix a race with PG_private and nfs_release_page() We don't need to set PG_private for readahead pages, since they never get unlocked while I/O is in progress. However there is a small race in nfs_readpage_release() whereby the page may be unlocked, and have PG_private set. Fix is to have PG_private set only for the case of writes... Also fix a bug in nfs_clear_page_writeback(): Don't attempt to clear the radix_tree tag if we've already deleted the radix tree entry. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:50 -05:00
Trond Myklebust	1dd761e907	NFSv4: Ensure the callback daemon flushes signals If the callback daemon is signalled, but is unable to exit because it still has users, then we need to flush signals. If not, then svc_recv() can never sleep, and so we hang. If we flush signals, then we also have to be prepared to resend them when we want the thread to exit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:49 -05:00
Trond Myklebust	a9a801787a	NFS, NLM: Allow blocking locks to respect signals Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:48 -05:00
Trond Myklebust	03f28e3a20	NFS: Make nfs_fhget() return appropriate error values Currently it returns NULL, which usually gets interpreted as ENOMEM. In fact it can mean a host of issues. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:48 -05:00
Trond Myklebust	01d0ae8bea	NFSv4: Fix an oops in nfs4_fill_super The mount statistics patches introduced a call to nfs_free_iostats that is not only redundant, but actually causes an oops. Also fix a memory leak due to the lack of a call to nfs_free_iostats on unmount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:48 -05:00
Trond Myklebust	51581f3bf9	NFSv4: SETCLIENTID_CONFIRM should handle NFS4ERR_DELAY/NFS4ERR_RESOURCE Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:47 -05:00
Trond Myklebust	3e4f6290ca	NFSv4: Send the delegation stateid for SETATTR calls In the case where we hold a delegation stateid, use that in for inside SETATTR calls. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:46 -05:00
Trond Myklebust	f25bc34967	NFSv4: Ensure nfs_callback_down() calls svc_destroy() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:46 -05:00
Trond Myklebust	e4cd038a45	NLM: Fix nlmclnt_test to not copy private part of locks The struct file_lock does not carry a properly initialised lock, so don't copy it as if it were. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:44 -05:00
Trond Myklebust	d72b7a6b26	NFS: O_DIRECT needs to use a completion Now that we have aio writes, it is possible for dreq->outstanding to be zero, but for the I/O not to have completed. Convert struct nfs_direct_req to use a completion to signal when the I/O is done. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:43 -05:00
Trond Myklebust	6b45d858ed	NFS: Clean up nfs_get_user_pages Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:43 -05:00
Chuck Lever	606bbba06b	NFS: fix compiler warnings on 64-bit platforms Introduced by NFS aio+dio patches. Test plan: Compile kernel with CONFIG_NFS enabled on 64-bit hardware. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:42 -05:00
Trond Myklebust	3feb2d4939	NFS: Uninline nfs_writedata_(alloc\|free) and nfs_readdata_(alloc\|free) Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:37 -05:00
Trond Myklebust	5db3a7b2ca	NFS: Debugging code for nfs_direct_(read\|write)_schedule() Make sure that we're doing our list accounting correctly. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:37 -05:00
Trond Myklebust	a8881f5a5c	NFS: O_DIRECT async IO may lose context The struct nfs_direct_req currently keeps a pointer to the file descriptor without referencing it. This may cause problems if the parent process is killed. The nfs_open_context should normally have all the information that we're currently using the filp for, and unlike fput(), is safe to release from an rpciod process context. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:36 -05:00
Trond Myklebust	fad6149041	nfs: Use UNSTABLE + COMMIT for NFS O_DIRECT writes Currently NFS O_DIRECT writes use FILE_SYNC so that a COMMIT is not necessary. This simplifies the internal logic, but this could be a difficult workload for some servers. Instead, let's send UNSTABLE writes, and after they all complete, send a COMMIT for the dirty range. After the COMMIT returns successfully, then do the wake_up or fire off aio_complete(). Test plan: Async direct I/O tests against Solaris (or any server that requires committed unstable writes). Reboot server during test. Based on an earlier patch by Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:36 -05:00
Trond Myklebust	e17b1fc4b3	NFS: Make nfs_commit_alloc() extern We need to use nfs_commit_alloc() in fs/nfs/direct.c. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:35 -05:00
Chuck Lever	a37ec012d7	NFS: fix data_update accounting in NFS direct I/O path ^C against "iozone -I" is hitting the assertion in nfs_clear_inode(). Test plan: "iozone -i0 -I -a -c" against a slow server, then control C. This should not cause an oops. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:35 -05:00
Chuck Lever	15ce4a0c1c	NFS: Replace atomic_t variables in nfs_direct_req with a single spin lock Three atomic_t variables cause a lot of bus locking. Because they are all used in the same places in the code, just use a single spin lock. Now that the atomic_t variables are gone, we can remove the request size limitation since the code no longer depends on the limited width of atomic_t on some platforms. Test plan: Compile with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx operations, iozone, OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:34 -05:00
Chuck Lever	88467055f7	NFS: clean up comments and tab damage in direct.c Clean up tab damage and comments. Replace "file_offset" with more commonly used "pos". Test plan: Compile with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:34 -05:00
Chuck Lever	9eafa8cc52	NFS: support EIOCBQUEUED return in direct write path For async iocb's, the NFS direct write path now returns EIOCBQUEUED, and calls aio_complete when all the requested writes are finished. The synchronous part of the NFS direct write path behaves exactly as it was before. Shared mapped NFS files will have some coherency difficulties when accessed concurrently with aio+dio. Will need to explore how this is handled in the local file system case. Test plan: aio-stress with "-O". OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:33 -05:00
Chuck Lever	c89f2ee5f9	NFS: make iocb available everywhere in direct write path Pass the iocb argument all the way down to the direct write request scheduler, and make it available in nfs_direct_write_result. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:33 -05:00
Chuck Lever	47989d7454	NFS: remove support for multi-segment iovs in the direct write path Eliminate the persistent use of automatic storage in all parts of the NFS client's direct write path to pave the way for introducing support for aio against files opened with the O_DIRECT flag. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:32 -05:00
Chuck Lever	462d5b3296	NFS: make direct write path generate write requests concurrently Duplicate infrastructure from direct read path that will allow write path to generate multiple write requests concurrently. This will enable us to add support for aio in this path. Temporarily we will lose the ability to do UNSTABLE writes followed by a COMMIT in the direct write path. However, all applications I am aware of that use NFS O_DIRECT currently write in relatively small chunks, so this should not be inconvenient in any way. Test plan: Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:32 -05:00
Chuck Lever	63ab46abc7	NFS: create common routine for handling direct I/O completion Factor out the common piece of completing an NFS direct I/O request. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:31 -05:00
Chuck Lever	93619e5989	NFS: create common routine for allocating nfs_direct_req Factor out a small common piece of the path that allocate nfs_direct_req structures. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:31 -05:00
Chuck Lever	bc0fb201b3	NFS: create common routine for waiting for direct I/O to complete We're about to add asynchrony to the NFS direct write path. Begin by abstracting out the common pieces in the read path. The first piece is nfs_direct_read_wait, which works the same whether the process is waiting for a read or a write. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:31 -05:00
Chuck Lever	487b83723e	NFS: support EIOCBQUEUED return in direct read path For async iocb's, the NFS direct read path should return EIOCBQUEUED and call aio_complete when all the requested reads are finished. The synchronous part of the NFS direct read path behaves exactly as it was before. Test plan: aio-stress with "-O". OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:30 -05:00
Chuck Lever	99514f8fdd	NFS: make iocb available everywhere in direct read path Pass the iocb argument all the way down to the direct read request scheduler, and make it available in nfs_direct_read_result. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:30 -05:00
Chuck Lever	0cdd80d07f	NFS: remove support for multi-segment iovs in the direct read path Eliminate the persistent use of automatic storage in all parts of the NFS client's direct read path to pave the way for introducing support for aio against files opened with the O_DIRECT flag. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:29 -05:00
Chuck Lever	5dd602f206	NFS: use size_t type for holding rsize bytes in NFS O_DIRECT read path size_t is used for holding byte counts, so use it for variables storing rsize. Note that the write path will be updated as we add support for async O_DIRECT writes. Test plan: Need to verify that existing comparisons against new size_t variables behave correctly. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:29 -05:00
Chuck Lever	d4cc948ba9	NFS: update comments and function definitions in fs/nfs/direct.c Update to latest coding style standards. Remove block comments on statically defined functions, and place function definitions all on one line. Test plan: Compile kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:28 -05:00
Chuck Lever	b8a32e2b8b	NFS: clean up NFS client's a_ops->direct_IO method The NFS client's a_ops->direct_IO method, nfs_direct_IO, is required to be present to allow NFS files to be opened with O_DIRECT, but is never called because the NFS client shunts reads and writes to files opened with O_DIRECT directly to its own routines. Gut the nfs_direct_IO function. This eliminates the only part of the NFS client's direct I/O path that requires support for multi-segment iovs, allowing further simplification in subsequent patches. Test plan: Compile the kernel with CONFIG_NFS and CONFIG_NFS_DIRECTIO enabled. Millions of fsx-odirect ops. OraSim. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:28 -05:00
Trond Myklebust	ec06c096ed	NFS: Cleanup of NFS read code Same callback hierarchy inversion as for the NFS write calls. This patch is not strictly speaking needed by the O_DIRECT code, but avoids confusing differences between the asynchronous read and write code. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:27 -05:00
Trond Myklebust	788e7a89a0	NFS: Cleanup of NFS write code in preparation for asynchronous o_direct This patch inverts the callback hierarchy for NFS write calls. Instead of having the NFSv2/v3/v4-specific code set up the RPC callback ops, we allow the original caller to do so. This allows for more flexibility w.r.t. how to set up and tear down the nfs_write_data structure while still allowing the NFSv3/v4 code to perform error handling. The greater flexibility is needed by the asynchronous O_DIRECT code, which wants to be able to hold on to the original nfs_write_data structures after the WRITE RPC call has completed in order to be able to replay them if the COMMIT call determines that the server has rebooted. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:27 -05:00
Andy Adamson	8dc7c3115b	locks,lockd: fix race in nlmsvc_testlock posix_test_lock() returns a pointer to a struct file_lock which is unprotected and can be removed while in use by the caller. Move the conflicting lock from the return to a parameter, and copy the conflicting lock. In most cases the caller ends up putting the copy of the conflicting lock on the stack. On i386, sizeof(struct file_lock) appears to be about 100 bytes. We're assuming that's reasonable. Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:26 -05:00
Chuck Lever	1e7cb3dc12	NFS: directory trace messages Reuse NFSDBG_DIRCACHE and NFSDBG_LOOKUPCACHE to provide additional diagnostic messages that trace the operation of the NFS client's directory behavior. A few new messages are now generated when NFSDBG_VFS is active, as well, to trace normal VFS activity. This compromise provides better trace debugging for those who use pre-built kernels, without adding a lot of extra noise to the standard debug settings. Test-plan: Enable NFS trace debugging with flags 1, 2, or 4. You should be able to see different types of trace messages with each flag setting. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:24 -05:00
Chuck Lever	dead28da8e	SUNRPC: eliminate rpc_call() Clean-up: replace rpc_call() helper with direct call to rpc_call_sync. This makes NFSv2 and NFSv3 synchronous calls more computationally efficient, and reduces stack consumption in functions that used to invoke rpc_call more than once. Test plan: Compile kernel with CONFIG_NFS enabled. Connectathon on NFS version 2, version 3, and version 4 mount points. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:23 -05:00
Chuck Lever	cc0175c1dc	SUNRPC: display human-readable procedure name in rpc_iostats output Add fields to the rpc_procinfo struct that allow the display of a human-readable name for each procedure in the rpc_iostats output. Also fix it so that the NFSv4 stats are broken up correctly by sub-procedure number. NFSv4 uses only two real RPC procedures: NULL, and COMPOUND. Test plan: Mount with NFSv2, NFSv3, and NFSv4, and do "cat /proc/self/mountstats". Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:22 -05:00
Chuck Lever	4ece3a2d18	NFS: add RPC I/O statistics to /proc/self/mountstats NFS client now shows various RPC I/O metrics in /proc/self/mountstats. Test plan: Mount/umount while doing "cat /proc/self/mountstats", multiple iterations of connectathon locking suite. Test with NFS version 2, 3, and 4. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:22 -05:00
Chuck Lever	67ec9f46b8	NFS: report how long an NFS file system has been mounted Add a field in nfs_server to record a timestamp when a mount succeeds. Report the number of seconds the file system has been mounted via nfs_show_stats(). Test plan: Mount an NFS file system, watch the mountstats reports and compare with clock time. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:15 -05:00
Chuck Lever	006ea73e5f	NFS: add hooks to account for NFSERR_JUKEBOX errors Make an inode or an nfs_server struct available in the logic that handles JUKEBOX/DELAY type errors so the NFS client can account for them. This patch is split out from the main nfs iostat patch to highlight minor architectural changes required to support this statistic. Test plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:14 -05:00
Chuck Lever	91d5b47023	NFS: add I/O performance counters Invoke the byte and event counter macros where we want to count bytes and events. Clean-up: fix a possible NULL dereference in nfs_lock, and simplify nfs_file_open. Test-plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:14 -05:00
Chuck Lever	d9ef5a8c26	NFS: introduce mechanism for tracking NFS client metrics Add a per-superblock performance counter facility to the NFS client. This facility mimics the counters available for block devices and for networking. Expose these new counters via the new /proc/self/mountstats interface. Thanks to Andrew Morton and Trond Myklebust for their review and comments. Test plan: fsx and iozone on UP and SMP systems, with and without pre-emption. Watch for memory overwrite bugs, and performance loss (significantly more CPU required per op). Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:13 -05:00
Chuck Lever	c8bded96aa	NFS: clean up some mount options Get rid of "lock" and "posix", and spell out "vers=". Test plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:13 -05:00
Chuck Lever	7a480e250c	NFS: show retransmit settings when displaying mount options Sometimes it's important to know the exact RPC retransmit settings the kernel is using for an NFS mount point. Add this facility to the NFS client's show_options method. Test plan: Set various retransmit settings via the mount command, and check that the settings are reflected in /proc/mounts. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:12 -05:00
Ingo Molnar	c9d5128a10	NFS: sem2mutex idmap.c semaphore to mutex conversion. the conversion was generated via scripts, and the result was validated automatically via a script as well. build and boot tested. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:11 -05:00
Eric Sesterhenn	bd6475454c	NFS: kzalloc conversion in fs/nfs this converts fs/nfs to kzalloc() usage. compile tested with make allyesconfig Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:10 -05:00
Trond Myklebust	a162a6b804	NFSv4: Kill braindead gcc warnings nfs4_open_revalidate: 'res' may be used uninitialized nfs4_callback_compound: ‘hdr_res.nops’ may be used uninitialized ‘op_nr’ may be used uninitialized encode_getattr_res: ‘savep’ may be used uninitialized Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:10 -05:00
Trond Myklebust	967b928136	NFSv4: Do not call rpciod_down() before call to destroy_nfsv4_state() The reason is that the idmapper cleanup may call flush_workqueue() on rpciod_workqueue. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:09 -05:00
Trond Myklebust	12de3b35ea	SUNRPC: Ensure that rpc_mkpipe returns a refcounted dentry If not, we cannot guarantee that idmap->idmap_dentry, gss_auth->dentry and clnt->cl_dentry are valid dentries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:09 -05:00
Trond Myklebust	fb374d24f2	NFS: reduce the number of false cache invalidations. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:08 -05:00
Jesper Juhl	c8d149f3db	NFS: "const static" vs "static const" in nfs4 My previous "const static" vs "static const" cleanup missed a single case, patch below takes care of it. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:07 -05:00
Trond Myklebust	ca62b9c3f7	NFSv4: Don't invalidate cached attributes if change attribute is unchanged Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:07 -05:00
Trond Myklebust	755c1e20cd	NFS: writes should not clobber utimes() calls Ensure that we flush out writes in the case when someone calls utimes() in order to set the file times. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:06 -05:00
Neil Brown	1dd594b21b	NFS: Fix buglet in fs/nfs/write.c I've been reading through fs/nfs/write.c trying to track down a bug that seems to be related to pages losing a refcount and getting freed too early (you interested in detail??) and I spotted a little bug which the following patch should fix. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:04 -05:00
Trond Myklebust	cd52ed3553	NFS: Avoid races between writebacks and truncation Currently, there is no serialisation between NFS asynchronous writebacks and truncation at the page level due to the fact that nfs_sync_inode() cannot lock the pages that it is about to write out. This means that it is possible to be flushing out data (and calling something like set_page_writeback()) while the page cache is busy evicting the page. Oops... Use the hooks provided in try_to_release_page() to ensure that dirty pages are always written back to storage before we evict them. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:04 -05:00
Trond Myklebust	b92dccf65b	NFS: Fix a busy inodes issue... The nfs_open_context may live longer than the file descriptor that spawned it, so it needs to carry a reference to the vfsmount. If not, then generic_shutdown_super() may end up being called before reads and writes have been flushed out. Make a couple of functions static while we're at it... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-03-20 13:44:03 -05:00
Trond Myklebust	c12e87f465	[PATCH] NFSv4: fix mount segfault on errors returned that are < -1000 It turns out that nfs4_proc_get_root() may return raw NFSv4 errors instead of mapping them to kernel errors. Problem spotted by Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-14 07:57:18 -08:00
Trond Myklebust	143f412eb4	[PATCH] NFS: Fix a potential panic in O_DIRECT Based on an original patch by Mike O'Connor and Greg Banks of SGI. Mike states: A normal user can panic an NFS client and cause a local DoS with 'judicious'(?) use of O_DIRECT. Any O_DIRECT write to an NFS file where the user buffer starts with a valid mapped page and contains an unmapped page, will crash in this way. I haven't followed the code, but O_DIRECT reads with similar user buffers will probably also crash albeit in different ways. Details: when nfs_get_user_pages() calls get_user_pages(), it detects and correctly handles get_user_pages() returning an error, which happens if the first page covered by the user buffer's address range is unmapped. However, if the first page is mapped but some subsequent page isn't, get_user_pages() will return a positive number which is less than the number of pages requested (this behaviour is sort of analogous to a short write() call and appears to be intentional). nfs_get_user_pages() doesn't detect this and hands off the array of pages (whose last few elements are random rubbish from the newly allocated array memory) to its caller, whence they go to nfs_direct_write_seg(), which then totally ignores the nr_pages it's given, and calculates its own idea of how many pages are in the array from the user buffer length. Needless to say, when it comes to transmit those uninitialised page* pointers, we see a crash in the network stack. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-03-14 07:57:17 -08:00
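The failure mode described in this commit message, a pinning call that returns fewer pages than requested, can be illustrated with a user-space sketch. This is not the code from the patch: `fake_pin_pages()`, `fake_release_pages()` and `pin_all_or_fail()` are hypothetical stand-ins, and the sketch only shows the general pattern of treating a short count as a failure and releasing what was pinned.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-in for get_user_pages(): "pins" pages until it meets
 * an unmapped one. Returns the number pinned, or -EFAULT if none could be.
 */
static int fake_pin_pages(const int *mapped, int nr, const void **pages)
{
	int i;

	for (i = 0; i < nr; i++) {
		if (!mapped[i])
			break;
		pages[i] = &mapped[i];
	}
	return i > 0 ? i : -EFAULT;
}

/* Number of pages handed back, tracked so the sketch can be checked. */
static int released;

static void fake_release_pages(const void **pages, int nr)
{
	(void)pages;
	released += nr;
}

/* Pin exactly nr pages, or release whatever was pinned and fail. */
static int pin_all_or_fail(const int *mapped, int nr, const void **pages)
{
	int got = fake_pin_pages(mapped, nr, pages);

	if (got < 0)
		return got;
	if (got < nr) {
		/* Short count: entries past 'got' were never filled in. */
		fake_release_pages(pages, got);
		return -EFAULT;
	}
	return got;
}

/* Usage: the third page is unmapped, so the request fails cleanly. */
static void demo_short_pin(void)
{
	const int mapped[4] = { 1, 1, 0, 1 };
	const void *pages[4];

	assert(pin_all_or_fail(mapped, 4, pages) == -EFAULT);
	assert(released == 2);
}
```

The point of the pattern is that the caller never walks the array past the count that was actually returned, so uninitialised entries cannot reach later stages.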
Al Viro	8854eddbdb	[PATCH] nfsroot port= parameter fix [backport of 2.4 fix] Direct backport of 2.4 fix that didn't get propagated to 2.6; original comment follows: <quote> When I specify the NFS port for nfsroot (e.g., nfsroot=<dir>,port=2049), the kernel uses the wrong port. In my case it tries to use 264 (0x108) instead of 2049 (0x801). This patch adds the missing htons(). Eric </quote> Patch got applied in 2.4.21-pre6. Author: Eric Lammerts (<eric@lammerts.org>, AFAICS). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2006-02-07 21:00:42 -05:00
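The numbers quoted in this message follow directly from byte order: 2049 is 0x0801, and swapping its two bytes gives 0x0108, which is 264. The sketch below is a user-space illustration only, not the nfsroot code; `port_on_wire()`, `swap16()` and `demo_port()` are hypothetical helper names, while `htons()` is the standard POSIX function.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit field the way the network does: most significant byte first. */
static uint16_t port_on_wire(uint16_t field)
{
	unsigned char bytes[2];

	memcpy(bytes, &field, sizeof(bytes));
	return (uint16_t)((bytes[0] << 8) | bytes[1]);
}

/* Unconditional byte swap, showing where the value 264 comes from. */
static uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Usage: a field filled in with htons() carries the intended port. */
static void demo_port(void)
{
	assert(port_on_wire(htons(2049)) == 2049);
	assert(swap16(2049) == 264);
}
```

On a little-endian host, storing 2049 into the field without `htons()` would put the bytes on the wire in swapped order, which is how the peer ends up seeing port 264.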
Dirk Mueller	1935245655	NFSv3: fix sync_retry in direct i/o NFS Only do a sync_retry if the memcmp failed. Signed-off-by: Dirk Mueller <dmueller@suse.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-02-01 12:52:25 -05:00
Christoph Hellwig	fc33a7bb9c	[PATCH] per-mountpoint noatime/nodiratime Turn noatime and nodiratime into per-mount instead of per-sb flags. After all the preparations this is a rather trivial patch. The mount code needs to treat the two options as per-mount instead of per-superblock, and touch_atime needs to be changed to check the new MNT_ flags in addition to the MS_ flags that are kept for filesystems that are always noatime/nodiratime but not user settable anymore. Besides that core code only nfs needed an update because it's leaving atime updates to the server and thus sets the S_NOATIME flag on every inode, but needs to know whether it's a real noatime mount for an getattr optimization. While we're at it I've killed the IS_NOATIME/IS_NODIRATIME macros that were only used by touch_atime. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-10 08:01:34 -08:00
Jes Sorensen	1b1dcc1b57	[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different. Modified-by: Ingo Molnar <mingo@elte.hu> (finished the conversion) Signed-off-by: Jes Sorensen <jes@sgi.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2006-01-09 15:59:24 -08:00
Jorn Dreyer	21b6bf143d	[PATCH] nfsroot: do not silently stop parsing on an unknown option It would be helpful if the kernel did not silently stop parsing nfs options, but instead warned about any he does not recognize. The attached patch adds one printk to do just that. It took me a couple of hours to find my configuration mistake. Cc: "David S. Miller" <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:57 -08:00
OGAWA Hirofumi	28fd129827	[PATCH] Fix and add EXPORT_SYMBOL(filemap_write_and_wait) This patch add EXPORT_SYMBOL(filemap_write_and_wait) and use it. See mm/filemap.c: And changes the filemap_write_and_wait() and filemap_write_and_wait_range(). Current filemap_write_and_wait() doesn't wait if filemap_fdatawrite() returns error. However, even if filemap_fdatawrite() returned an error, it may have submitted the partially data pages to the device. (e.g. in the case of -ENOSPC) <quotation> Andrew Morton writes, If filemap_fdatawrite() returns an error, this might be due to some I/O problem: dead disk, unplugged cable, etc. Given the generally crappy quality of the kernel's handling of such exceptions, there's a good chance that the filemap_fdatawait() will get stuck in D state forever. </quotation> So, this patch doesn't wait if filemap_fdatawrite() returns the -EIO. Trond, could you please review the nfs part? Especially I'm not sure, nfs must use the "filemap_fdatawrite(inode->i_mapping) == 0", or not. Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-01-08 20:13:47 -08:00
Trond Myklebust	26c78e156b	NFSv4: Fix an Oops in nfs_do_expire_all_delegations If the loop errors, we need to exit. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:58 -05:00
Trond Myklebust	58df095b73	NFSv4: Allow entries in the idmap cache to expire If someone changes the uid/gid mapping in userland, then we do eventually want those changes to be propagated to the kernel. Currently the kernel assumes that it may cache entries forever. Add an expiration time + garbage collector for idmap entries. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:58 -05:00
Trond Myklebust	eadb8c1471	NFS: get rid of some needless code obfuscation in xdr_encode_sattr(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:57 -05:00
Trond Myklebust	cf3fff54a4	NFS: Send valid mode bits to the server inode->i_mode contains a lot more than just the mode bits. Make sure that we mask away this extra stuff in SETATTR calls to the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:57 -05:00
Chuck Lever	f518e35aec	SUNRPC: get rid of cl_chatty Clean up: Every ULP that uses the in-kernel RPC client, except the NLM client, sets cl_chatty. There's no reason why NLM shouldn't set it, so just get rid of cl_chatty and always be verbose. Test-plan: Compile with CONFIG_NFS enabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:56 -05:00
J. Bruce Fields	03c2173393	NFSv3: try get_root user-supplied security_flavor Thanks to Ed Keizer for bug and root cause. He says: "... we could only mount the top-level Solaris share. We could not mount deeper into the tree. Investigation showed that Solaris allows UNIX authenticated FSINFO only on the top level of the share. This is a problem because we share/export our home directories one level higher than we mount them. I.e. we share the partition and not the individual home directories. This prevented access to home directories." We still may need to try auth_sys for the case where the client doesn't have appropriate credentials. Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:55 -05:00
Trond Myklebust	a72b44222d	NFSv4: Allow user to set the port used by the NFSv4 callback channel Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:52 -05:00
Trond Myklebust	a895b4a198	NFS: Clean up weak cache consistency code ...and ensure that nfs_update_inode() respects wcc Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:52 -05:00
Trond Myklebust	fa178f29c0	NFSv4: Ensure DELEGRETURN returns attributes Upon return of a write delegation, the server will almost always bump the change attribute. Ensure that we pick up that change so that we don't invalidate our data cache unnecessarily. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:51 -05:00
Trond Myklebust	beb2a5ec38	NFSv4: Ensure change attribute returned by GETATTR callback conforms to spec According to RFC3530 we're supposed to cache the change attribute at the time the client receives a write delegation. If the inode is clean, a CB_GETATTR callback by the server to the client is supposed to return the cached change attribute. If, OTOH, the inode is dirty, the client should bump the cached change attribute by 1. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:51 -05:00
Trond Myklebust	566dd6064e	NFS: Make directIO aware of compound pages... ...and avoid calling set_page_dirty on them Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:50 -05:00
Trond Myklebust	70b9ecbdb9	NFS: Make stat() return updated mtimes after a write() The SuS states that a call to write() will cause mtime to be updated on the file. In order to satisfy that requirement, we need to flush out any cached writes in nfs_getattr(). Speed things up slightly by not committing the writes. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:50 -05:00
Trond Myklebust	24174119c7	NFSv4: Ensure that we return the delegation on the target of a rename too. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:50 -05:00
Chuck Lever	40859d7ee6	NFS: support large reads and writes on the wire Most NFS server implementations allow up to 64KB reads and writes on the wire. The Solaris NFS server allows up to a megabyte, for instance. Now the Linux NFS client supports transfer sizes up to 1MB, too. This will help reduce protocol and context switch overhead on read/write intensive NFS workloads, and support larger atomic read and write operations on servers that support them. Test-plan: Connectathon and iozone on mount point with wsize=rsize>32768 over TCP. Tests with NFS over UDP to verify the maximum RPC payload size cap. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:49 -05:00
Chuck Lever	325cfed9ae	NFS: make "inode number mismatch" message more useful To help NFS users and server developers, make the "inode number mismatch" message display more useful information. Test-plan: None. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:49 -05:00
Chuck Lever	dc20f80390	NFS: get rid of useless kernel log message nfs_statfs() generates a log message when GETATTR returns an error. This is usually a useless message. Make it a dprintk. Test plan: None Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:48 -05:00
Chuck Lever	6b59a75460	NFS: Fix error recovery code in fs/nfs/inode.c:__init_nfs() Red Hat found a problem in the error recovery logic in __init_nfs. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:48 -05:00
Chuck Lever	ce1a8e6796	NFS: use generic_write_checks() to sanity check direct writes Replace ad hoc write parameter sanity checking in nfs_file_direct_write() with a call to generic_write_checks(). This should make the proper checks modulo the O_LARGEFILE flag, and should catch NFSv2-specific limitations by virtue of i_sb->s_maxbytes. Test plan: Posix compliance testing with both NFSv2 and NFSv3. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:47 -05:00
Trond Myklebust	286d7d6a0c	NFSv4: Remove requirement for machine creds for the "setclientid" operation Use a cred from the nfs4_client->cl_state_owners list. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:47 -05:00
Trond Myklebust	b4454fe1a7	NFSv4: Remove requirement for machine creds for the "renew" operation In RFC3530, the RENEW operation is allowed to use either the same principal, RPC security flavour and (if RPCSEC_GSS), the same mechanism and service that was used for SETCLIENTID_CONFIRM OR Any principal, RPC security flavour and service combination that currently has an OPEN file on the server. Choose the latter since that doesn't require us to keep credentials for the same principal for the entire duration of the mount. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:47 -05:00
Trond Myklebust	58d9714a44	NFSv4: Send RENEW requests to the server only when we're holding state Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:46 -05:00
Trond Myklebust	5043e900f5	NFS: Convert instances of kernel_thread() to kthread() Convert private implementations in NFSv4 state recovery and delegation code to use kthreads. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:46 -05:00
Trond Myklebust	433fbe4c88	NFSv4: State recovery cleanup Use wait_on_bit() when waiting for state recovery to complete. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:45 -05:00
Trond Myklebust	26e976a884	NFSv4: OPEN/LOCK/LOCKU/CLOSE will automatically renew the NFSv4 lease Cut down on the number of unnecessary RENEW requests on the wire. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:45 -05:00
Trond Myklebust	fe650407a8	NFSv4: Make DELEGRETURN an interruptible operation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:44 -05:00
Trond Myklebust	a5d16a4d09	NFSv4: Convert LOCK rpc call into an asynchronous RPC call In order to allow users to interrupt/cancel it. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:44 -05:00
Trond Myklebust	911d1aaf26	NFSv4: locking XDR cleanup Get rid of some unnecessary intermediate structures Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:44 -05:00
Trond Myklebust	864472e9b8	NFSv4: Make open recovery track O_RDWR, O_RDONLY and O_WRONLY correctly When recovering from a delegation recall or a network partition, we need to replay open(O_RDWR), open(O_RDONLY) and open(O_WRONLY) separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:43 -05:00
Trond Myklebust	e761692381	NFSv4: Make nfs4_state track O_RDWR, O_RDONLY and O_WRONLY separately A closer reading of RFC3530 reveals that OPEN_DOWNGRADE must always specify a access modes that have been the argument of a previous OPEN operation. IOW: doing OPEN(O_RDWR) and then OPEN_DOWNGRADE(O_WRONLY) is forbidden unless the user called OPEN(O_WRONLY) In order to fix that, we really need to track the three possible open states separately. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:43 -05:00
Trond Myklebust	cdd4e68b5f	NFSv4: Make open_confirm() asynchronous too Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:42 -05:00
Trond Myklebust	24ac23ab88	NFSv4: Convert open() into an asynchronous RPC call OPEN is a stateful operation, so we must ensure that it always completes. In order to allow users to interrupt the operation, we need to make the RPC call asynchronous, and then wait on completion (or cancel). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:42 -05:00
Trond Myklebust	e56e0b78eb	NFSv4: Allocate OPEN call RPC arguments using kmalloc() Cleanup in preparation for making OPEN calls interruptible by the user. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:41 -05:00
Trond Myklebust	06f814a3ad	NFSv4: Make locku use the new RPC "wait on completion" interface. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:40 -05:00
Trond Myklebust	44c288732f	NFSv4: stateful NFSv4 RPC call interface The NFSv4 model requires us to complete all RPC calls that might establish state on the server whether or not the user wants to interrupt it. We may also need to schedule new work (including new RPC calls) in order to cancel the new state. The asynchronous RPC model will allow us to ensure that RPC calls always complete, but in order to allow for "synchronous" RPC, we want to add the ability to wait for completion. The waits are, of course, interruptible. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:40 -05:00
Trond Myklebust	4ce70ada1f	SUNRPC: Further cleanups Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:40 -05:00
Trond Myklebust	963d8fe533	RPC: Clean up RPC task structure Shrink the RPC task structure. Instead of storing separate pointers for task->tk_exit and task->tk_release, put them in a structure. Also pass the user data pointer as a parameter instead of passing it via task->tk_calldata. This enables us to nest callbacks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:39 -05:00
Trond Myklebust	abd3e641d5	NFS: Work correctly with single-page ->writepage() calls Ensure that we always initiate flushing of data before we exit a single-page ->writepage() call. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2006-01-06 14:58:39 -05:00
Arnaldo Carvalho de Melo	14c850212e	[INET_SOCK]: Move struct inet_sock & helper functions to net/inet_sock.h To help in reducing the number of include dependencies, several files were touched as they were getting needed headers indirectly for stuff they use. Thanks also to Alan Menegotto for pointing out that net/dccp/proto.c had linux/dccp.h include twice. Signed-off-by: Arnaldo Carvalho de Melo <acme@mandriva.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-01-03 13:11:21 -08:00
ASANO Masahiro	0800c5f7a4	[PATCH] fix posix lock on NFS NFS client prevents mandatory lock, but there is a flaw on it; Locks are possibly left if the mode is changed while locking. This permits unlocking even if the mandatory lock bits are set. Signed-off-by: ASANO Masahiro <masano@tnes.nec.co.jp> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-12-22 09:24:05 -08:00
Trond Myklebust	29884df0d8	NFS: Fix another O_DIRECT race Ensure we call unmap_mapping_range() and sync dirty pages to disk before doing an NFS direct write. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-19 23:12:09 -05:00
Trond Myklebust	3b6efee923	NFSv4: Fix an Oops in the synchronous write path - Missing initialisation of attribute bitmask in _nfs4_proc_write() - On success, _nfs4_proc_write() must return number of bytes written. - Missing post_op_update_inode() in _nfs4_proc_write() - Missing initialisation of attribute bitmask in _nfs4_proc_commit() - Missing post_op_update_inode() in _nfs4_proc_commit() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-03 15:20:21 -05:00
Trond Myklebust	5ba7cc4801	NFS: Fix post-op attribute revalidation... - Missing nfs_mark_for_revalidate in nfs_proc_link() - Missing nfs_mark_for_revalidate in nfs_rename() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-03 15:20:17 -05:00
Trond Myklebust	bb713d6d38	NFS: use set_page_writeback() in the appropriate places Ensure that we use set_page_writeback() in the appropriate places to help the VM in keeping its page radix_tree in sync. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-03 15:20:14 -05:00
Trond Myklebust	24aa1fe677	NFS: Fix a few further cache consistency regressions Steve Dickson writes: Doing the following: 1. On server: $ mkdir ~/t $ echo Hello > ~/t/tmp 2. On client, wait for a string to appear in this file: $ until grep -q foo t/tmp ; do echo -n . ; sleep 1 ; done 3. On server, create a new file with the same name containing that string: $ mv ~/t/tmp ~/t/tmp.old; echo foo > ~/t/tmp will show how the client will never (and I mean never ;-) ) see the updated file. The problem is that we do not update nfsi->cache_change_attribute when the file changes on the server (we only update it when our client makes the changes). This again means that functions like nfs_check_verifier() will fail to register when the parent directory has changed and should trigger a dentry lookup revalidation. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-03 15:20:07 -05:00
Steve Dickson	223db122bf	NFS: Fix cache consistency regression Make sure cache_change_attribute is initialized to jiffies so when the mtime changes on directory, the directory will be refreshed. Signed-off by: Steve Dickson <steved@redhat.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-12-03 15:20:03 -05:00
Trond Myklebust	b37b03b705	NFS: Fix a spinlock recursion inside nfs_update_inode() In cases where the server has gone insane, nfs_update_inode() may end up calling nfs_invalidate_inode(), which again calls stuff that takes the inode->i_lock that we're already holding. In addition, given the sort of things we have in NFS these days that need to be cleaned up on inode release, I'm not sure we should ever be calling make_bad_inode(). Fix up spinlock recursion, and limit nfs_invalidate_inode() to clearing the caches, and marking the inode as being stale. Thanks to Steve Dickson <SteveD@redhat.com> for spotting this. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-25 17:11:29 -05:00
Trond Myklebust	ff6040667a	NFSv4: Fix typo in lock caching When caching locks due to holding a file delegation, we must always check against local locks before sending anything to the server. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-25 17:11:29 -05:00
Trond Myklebust	36f20c6df7	NFSv4: Fix buggy nfs_wait_on_sequence() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-25 17:11:28 -05:00
Jesper Juhl	f99d49adf5	[PATCH] kfree cleanup: fs This is the fs/ part of the big kfree cleanup patch. Remove pointless checks for NULL prior to calling kfree() in fs/. Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-11-07 07:54:06 -08:00
Chuck Lever	0bbacc402e	NFS,SUNRPC,NLM: fix unused variable warnings when CONFIG_SYSCTL is disabled Fix some dprintk's so that NLM, NFS client, and RPC client compile cleanly if CONFIG_SYSCTL is disabled. Test plan: Compile kernel with CONFIG_NFS enabled and CONFIG_SYSCTL disabled. Signed-off-by: Chuck Lever <cel@netapp.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:39:48 -05:00
Trond Myklebust	6bfc93ef98	NFSv4: Teach NFSv4 to cache locks when we hold a delegation Now that we have a method of dealing with delegation recalls, actually enable the caching of posix and BSD locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:39:36 -05:00
Trond Myklebust	888e694c16	NFSv4: Recover locks too when returning a delegation Delegations allow us to cache posix and BSD locks, however when the delegation is recalled, we need to "flush the cache" and send the cached LOCK requests to the server. This patch sets up the mechanism for doing so. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:38:11 -05:00
Trond Myklebust	43b2a33aa8	NFSv4: Fix recovery of flock() locks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:35:30 -05:00
Trond Myklebust	34ea818846	NFSv4: Return any delegations before sillyrenaming the file I missed this one... Any form of rename will result in a delegation recall, so it is more efficient to return the one we hold before trying the rename. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:35:02 -05:00
Trond Myklebust	2c56617d76	NFSv4: Fix the handling of the error NFS4ERR_OLD_STATEID Ensure that we retry the failed operation... Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:33:50 -05:00
Trond Myklebust	d530838bfa	NFSv4: Fix problem with OPEN_DOWNGRADE RFC 3530 states that for OPEN_DOWNGRADE "The share_access and share_deny bits specified must be exactly equal to the union of the share_access and share_deny bits specified for some subset of the OPENs in effect for current openowner on the current file. Setattr is currently violating the NFSv4 rules for OPEN_DOWNGRADE in that it may cause a downgrade from OPEN4_SHARE_ACCESS_BOTH to OPEN4_SHARE_ACCESS_WRITE despite the fact that there exists no open file with O_WRONLY access mode. Fix the problem by replacing nfs4_find_state() with a modified version of nfs_find_open_context(). Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:33:38 -05:00
Trond Myklebust	4cecb76ff8	NFSv4: Fix a race between open() and close() We must not remove the nfs4_state structure from the inode open lists before we are in sequence lock. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-11-04 15:32:58 -05:00
Trond Myklebust	d3f8cf4899	[PATCH] NFS: Remove unbalanced spin_unlock() calls from nfs_refresh_inode() Doh! Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2005-10-30 14:46:47 -08:00
Trond Myklebust	bec273b491	NFS: Allow files that are open for write to invalidate caches Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:45 -04:00
Trond Myklebust	16c32b71bc	NFSv4: Convert unnecessary XDR warning messages into dprintk() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:45 -04:00
Trond Myklebust	4f9838c7ec	NFSv4: Add post-op attributes to NFSv4 write and commit callbacks. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:44 -04:00
Trond Myklebust	16e429596d	NFSv4: Add post-op attributes to nfs4_proc_remove() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:44 -04:00
Trond Myklebust	6caf2c8276	NFSv4: Add post-op attributes to nfs4_proc_rename() Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:43 -04:00
Trond Myklebust	91ba2eeec5	NFSv4: Add post-op attributes to nfs4_proc_link() Optimise attribute revalidation when hardlinking. Add post-op attributes for the directory and the original inode. Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:42 -04:00
Trond Myklebust	cf80955614	NFS: Ensure that nfs_link() instantiates the dentry correctly Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2005-10-27 22:12:42 -04:00
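
The message for commit 143f412eb4 above describes a return convention that is easy to mishandle: get_user_pages() returns an error when the first page is unmapped, but a positive count smaller than the request when a later page is unmapped. The sketch below is a toy Python model of that convention only. The names `model_get_user_pages` and `pin_all_or_fail` are hypothetical, and treating a short count as `-EFAULT` is an assumption made for illustration, not a description of the actual kernel patch.

```python
from typing import Sequence

EFAULT = 14  # errno value for "bad address" on Linux


def model_get_user_pages(mapped: Sequence[bool]) -> int:
    """Toy model of the return convention described in the commit message.

    `mapped[i]` says whether page i of the user buffer is mapped.
    Returns a negative errno if the first page is unmapped, otherwise the
    number of leading pages that could be pinned (possibly a short count).
    """
    if not mapped or not mapped[0]:
        return -EFAULT
    pinned = 0
    for is_mapped in mapped:
        if not is_mapped:
            break  # stop at the first unmapped page: short return
        pinned += 1
    return pinned


def pin_all_or_fail(mapped: Sequence[bool]) -> int:
    """Hypothetical caller that refuses to continue on a short count."""
    requested = len(mapped)
    got = model_get_user_pages(mapped)
    if got < 0:
        return got  # outright error, already handled in the original code
    if got < requested:
        # Assumption for this sketch: a short count is treated as a fault,
        # so the caller never touches the uninitialised tail of the array.
        return -EFAULT
    return got


print(pin_all_or_fail([True, True]))               # 2
print(pin_all_or_fail([False, True]))              # -14
print(pin_all_or_fail([True, True, False, True]))  # -14
```

The point of the model is the second check in `pin_all_or_fail`: comparing the returned count against the requested count, rather than testing only for a negative value.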

442 Commits
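
The numbers quoted in commit 8854eddbdb (port 2049 becoming 264) follow directly from a missing byte-order conversion on a little-endian host. The short Python sketch below reproduces that arithmetic. The helper `port_seen_on_wire` is a hypothetical name used only for illustration and is not kernel code.

```python
import struct

NFS_PORT = 2049  # 0x0801, the port named in the commit message


def port_seen_on_wire(port: int, use_htons: bool) -> int:
    """Return the port a peer reads when a little-endian host sends `port`.

    Hypothetical helper for illustration only.
    """
    # Network byte order is big-endian. Without htons(), a little-endian
    # host stores the raw 16-bit value low byte first.
    fmt = ">H" if use_htons else "<H"
    raw = struct.pack(fmt, port)
    # The receiver always interprets the two bytes as big-endian.
    return struct.unpack(">H", raw)[0]


print(port_seen_on_wire(NFS_PORT, use_htons=False))  # 264
print(port_seen_on_wire(NFS_PORT, use_htons=True))   # 2049
```

Swapping the two bytes of 0x0801 gives 0x0108, which is 264 in decimal and matches the symptom reported in the quoted message.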