3449 Commits

Author SHA1 Message Date
Adrian Bunk
66f37509fc [PATCH] fs/nfs/: make code static
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:20 -07:00
Martin Bligh
b7b52630de [PATCH] add newline to nfs dprintk
Add missing \n to dprintk

Signed-off-by: Martin Bligh <mbligh@google.com>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Eric W. Biederman
c18258c6f0 [PATCH] pid: Implement transfer_pid and use it to simplify de_thread
In de_thread we move pids from one process to another, a rather ugly case.
The function transfer_pid makes it clear what we are doing, and makes the
action atomic.  This is useful if we ever want to atomically traverse the
process group and session lists in an RCU-safe manner.

Even without the atomic properties, this change should be a win, as transfer_pid
should be less code to execute than executing both attach_pid and
detach_pid, and this should make de_thread slightly smaller as only a
single function call needs to be emitted.  The only downside is that the
code might be slower to execute as the odds are against transfer_pid being
in cache.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Eric W. Biederman
b89a81712f [PATCH] sysctl: Allow /proc/sys without sys_sysctl
Since sys_sysctl is deprecated, start allowing it to be compiled out.  This
should catch any remaining user space code that cares, and paves the way
for further sysctl cleanups.

[akpm@osdl.org: If sys_sysctl() is not compiled-in, emit a warning]
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Andrew Morton
8b0e330b77 [PATCH] alloc_fdtable() cleanup
free_fdset(NULL, ...) is legal.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:19 -07:00
Adrian Bunk
36b756f2b5 [PATCH] reiserfs: warn about the useless nolargeio option
Since the nolargeio option no longer has any effect, print a warning
instead of setting a write-only variable.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Cc: Jeff Mahoney <jeffm@suse.com>
Cc: Chris Mason <mason@suse.com>
Cc: Hans Reiser <reiser@namesys.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Theodore Ts'o
ba52de123d [PATCH] inode-diet: Eliminate i_blksize from the inode structure
This eliminates the i_blksize field from struct inode.  Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.

Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.

[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:18 -07:00
Theodore Ts'o
577c4eb09d [PATCH] inode-diet: Move i_cdev into a union
Move the i_cdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o
eaf796e7ef [PATCH] inode-diet: Move i_bdev into a union
Move the i_bdev pointer in struct inode into a union.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Theodore Ts'o
8e18e2941c [PATCH] inode_diet: Replace inode.u.generic_ip with inode.i_private
The following patches reduce the size of the VFS inode structure by 28 bytes
on a UP x86.  (It would be more on an x86_64 system).  This is a 10% reduction
in the inode size on a UP kernel that is configured in a production mode
(i.e., with no spinlock or other debugging functions enabled; if you want to
save memory taken up by in-core inodes, the first thing you should do is
disable the debugging options; they are responsible for a huge amount of bloat
in the VFS inode structure).

This patch:

The filesystem or device-specific pointer in the inode is inside a union,
which is pretty pointless given that all 30+ users of this field have been
using the void pointer.  Get rid of the union and rename it to i_private, with
a comment to explain who is allowed to use the void pointer.  This is just a
cleanup, but it allows us to reuse the union 'u' for something where
the union will actually be used.

[judith@osdl.org: powerpc build fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Judith Lebzelter <judith@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
OGAWA Hirofumi
6a1d9805ec [PATCH] fat: cleanup fat_get_block(s)
get_blocks() was removed.  So, this removes it on fat, and will take
advantage of the multi block mapping.

Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
Ian Kent
bcdc5e019d [PATCH] autofs4 needs to force fail return revalidate
For a long time now I have had a problem with not being able to return a
lookup failure on an existing directory.  In autofs this corresponds to a
mount failure on an autofs-managed mount entry that is browsable (and so the
mount point directory exists).

While this problem has been present for a long time I've avoided resolving
it because it was not very visible.  But now that autofs v5 has "mount and
expire on demand" of nested multiple mounts, such as is found when mounting
an export list from a server, solving the problem cannot be avoided any
longer.

I've tried very hard to find a way to do this entirely within the autofs4
module but have not been able to find a satisfactory way to achieve it.

So, I need to propose a change to the VFS.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:17 -07:00
David Howells
f269fdd182 [PATCH] NOMMU: move the fallback arch_vma_name() to a sensible place
Move the fallback arch_vma_name() to a sensible place (kernel/signal.c).

Currently it's in fs/proc/task_mmu.c, a file that is dependent on both
CONFIG_PROC_FS and CONFIG_MMU being enabled, but it is called
unconditionally from kernel/signal.c.

[akpm@osdl.org: build fix]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:15 -07:00
David Howells
dbf8685c8e [PATCH] NOMMU: Implement /proc/pid/maps for NOMMU
Implement /proc/pid/maps for NOMMU by reading the vm_area_list attached to
current->mm->context.vmlist.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
David Howells
5da6185bca [PATCH] NOMMU: Set BDI capabilities for /dev/mem and /dev/kmem
Set the backing device info capabilities for /dev/mem and /dev/kmem to
permit direct sharing under no-MMU conditions and full mapping capabilities
under MMU conditions.  Make the BDI used by these available to all directly
mappable character devices.

Also comment the capabilities for /dev/zero.

[akpm@osdl.org: ifdef reductions]
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:14 -07:00
Alexey Dobriyan
1a1d92c10d [PATCH] Really ignore kmem_cache_destroy return value
* Roughly half of the callers already do it by not checking the return value
* Code in drivers/acpi/osl.c does the following to be sure:

	(void)kmem_cache_destroy(cache);

* Those who check it printk something; however, slab_error already printed
  the name of the failed cache.
* XFS BUGs on a failed kmem_cache_destroy, which is not a decision a
  low-level filesystem driver should make. Converted to ignore.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Panagiotis Issaris
f52720ca5f [PATCH] fs: Removing useless casts
* Removing useless casts
* Removing useless wrapper
* Conversion from kmalloc+memset to kzalloc

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Acked-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Panagiotis Issaris
f8314dc60c [PATCH] fs: Conversions from kmalloc+memset to k(z|c)alloc
Conversions from kmalloc+memset to kzalloc.

Signed-off-by: Panagiotis Issaris <takis@issaris.org>
Jffs2-bit-acked-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Eric Sandeen
32c2d2bc4b [PATCH] more ext3 16T overflow fixes
Some of the changes in balloc.c are just cosmetic, as Andreas pointed out -
if they overflow they'll then underflow and things are fine.

5th hunk actually fixes an overflow problem.

Also check for potential overflows in inode & block counts when resizing.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Dave Kleikamp
a4e4de36dc [PATCH] ext3: Fix sparse warnings
Fixing up some endian-ness warnings in preparation to clone ext4 from ext3.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Dave Kleikamp
e9ad5620bf [PATCH] ext3: More whitespace cleanups
More white space cleanups in preparation of cloning ext4 from ext3.
Removing spaces that precede a tab.

Signed-off-by: Dave Kleikamp <shaggy@austin.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:10 -07:00
Vasily Averin
7543fc7b3a [PATCH] ext3: wrong error behavior
SWsoft Virtuozzo/OpenVZ Linux kernel team has discovered that ext3 error
behavior was broken in linux kernels since 2.5.x versions by the following
patch:

2002/10/31 02:15:26-05:00 tytso@snap.thunk.org
Default mount options from superblock for ext2/3 filesystems
http://linux.bkbits.net:8080/linux-2.6/gnupatch@3dc0d88eKbV9ivV4ptRNM8fBuA3JBQ

If an ext3 file system is mounted with errors=continue
(EXT3_ERRORS_CONTINUE), errors should be ignored when possible.  However, at
present, on any error the kernel aborts the journal and remounts the filesystem
read-only.  This behavior was hit a number of times and noted to differ
from that of 2.4.x kernels.

This patch fixes this:
- do nothing in case of EXT3_ERRORS_CONTINUE,
- set EXT3_MOUNT_ABORT and call journal_abort() in all other cases
- panic() should be called after ext3_commit_super() to save
 sb marked as EXT3_ERROR_FS

Signed-off-by: Vasily Averin <vvs@sw.ru>
Acked-by: Kirill Korotaev <dev@sw.ru>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: "Stephen C. Tweedie" <sct@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao
36faadc144 [PATCH] ext3: more comments about block allocation/reservation code
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao
321fb9e818 [PATCH] ext3: turn on reservation dump on block allocation errors
In the past there were a few kernel panics related to block reservation
tree operations failure (insert/remove etc).  It would be very useful to
get the block allocation reservation map info when such an error happens.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen
37ed322290 [PATCH] JBD: 16T fixes
These are a few places I've found in jbd that look like they may not be
16T-safe, or consistent with the use of unsigned longs for block
containers.  Problems here would be somewhat hard to hit, would require
journal blocks past the 8T boundary, which would not be terribly common.
Still, should fix.

(some of these have come from the ext4 work on jbd as well).

I think there's one more possibility that the wrap() function may not be
safe IF your last block in the journal butts right up against the 2^32 block
boundary, but that seems like a VERY remote possibility, and I'm not
worrying about it at this point.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen
eee194e76c [PATCH] ext3: inode numbers are unsigned long
This is primarily format string fixes, with changes to ialloc.c where large
inode counts could overflow, and also pass around journal_inum as an
unsigned long, just to be pedantic about it....

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen
41f04d852e [PATCH] ext2: fix mounts at 16T
Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Eric Sandeen
855565e81a [PATCH] fix ext3 mounts at 16T
I need to do some actual IO testing now, but this gets things mounting for
a 16T ext3 filesystem.  (patched up e2fsprogs is needed too, I'll send that
off the kernel list)

This patch fixes these issues in the kernel:

o sbi->s_groups_count overflows in ext3_fill_super()

	sbi->s_groups_count = (le32_to_cpu(es->s_blocks_count) -
			       le32_to_cpu(es->s_first_data_block) +
			       EXT3_BLOCKS_PER_GROUP(sb) - 1) /
			      EXT3_BLOCKS_PER_GROUP(sb);

  at 16T, s_blocks_count is already maxed out; adding
  EXT3_BLOCKS_PER_GROUP(sb) overflows it and groups_count comes out to 0.
  Not really what we want, and causes a failed mount.

  Feel free to check my math (actually, please do!), but changing it this
  way should work & avoid the overflow:

  (A + B - 1)/B changed to: ((A - 1)/B) + 1

o ext3_check_descriptors() overflows range checks

  ext3_check_descriptors() iterates over all block groups making sure
  that various bits are within the right block ranges...  on the last pass
  through, it is checking the error case

   [item] >= block + EXT3_BLOCKS_PER_GROUP(sb)

  where "block" is the first block in the last block group.  The last
  block in this group (and the last one that will fit in 32 bits) is block
  + EXT3_BLOCKS_PER_GROUP(sb)- 1.  block + EXT3_BLOCKS_PER_GROUP(sb) wraps
  back around to 0.

  so, make things clearer with "first_block" and "last_block" where those
  are first and last, inclusive, and use <, > rather than <, >=.

  Finally, the last block group may be smaller than the rest, so account
  for this on the last pass through: last_block = sb->s_blocks_count - 1;

(a similar patch could be done for ext2; does anyone in their right mind
use ext2 at 16T?  I'll send an ext2 patch doing the same thing if that's
warranted)

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Cc: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Alexey Dobriyan
2aed348469 [PATCH] jbd: use BUILD_BUG_ON in journal init
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Stephen Tweedie <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Mingming Cao
ae6ddcc5f2 [PATCH] ext3 and jbd cleanup: remove whitespace
Remove whitespace from ext3 and jbd, before we clone ext4.

Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:09 -07:00
Josh Triplett
e7ab8d6505 [PATCH] jbd: add lock annotation to jbd_sync_bh
jbd_sync_bh releases journal->j_list_lock.  Add a lock annotation to this
function so that sparse can check callers for lock pairing, and so that
sparse will not complain about this function since it intentionally uses
the lock in this manner.

Signed-off-by: Josh Triplett <josh@freedesktop.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-27 08:26:08 -07:00
Linus Torvalds
b278240839 Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
  [PATCH] Don't set calgary iommu as default y
  [PATCH] i386/x86-64: New Intel feature flags
  [PATCH] x86: Add a cumulative thermal throttle event counter.
  [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
  [PATCH] x86: Refactor thermal throttle processing
  [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
  [PATCH] Fix unwinder warning in traps.c
  [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
  [PATCH] x86: Move direct PCI scanning functions out of line
  [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
  [PATCH] Don't leak NT bit into next task
  [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
  [PATCH] Fix some broken white space in ia32_signal.c
  [PATCH] Initialize argument registers for 32bit signal handlers.
  [PATCH] Remove all traces of signal number conversion
  [PATCH] Don't synchronize time reading on single core AMD systems
  [PATCH] Remove outdated comment in x86-64 mmconfig code
  [PATCH] Use string instructions for Core2 copy/clear
  [PATCH] x86: - restore i8259A eoi status on resume
  [PATCH] i386: Split multi-line printk in oops output.
  ...
2006-09-26 13:07:55 -07:00
Linus Torvalds
dd77a4ee0f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (47 commits)
  Driver core: Don't call put methods while holding a spinlock
  Driver core: Remove unneeded routines from driver core
  Driver core: Fix potential deadlock in driver core
  PCI: enable driver multi-threaded probe
  Driver Core: add ability for drivers to do a threaded probe
  sysfs: add proper sysfs_init() prototype
  drivers/base: check errors
  drivers/base: Platform notify needs to occur before drivers attach to the device
  v4l-dev2: handle __must_check
  add CONFIG_ENABLE_MUST_CHECK
  add __must_check to device management code
  Driver core: fixed add_bind_files() definition
  Driver core: fix comments in drivers/base/power/resume.c
  sysfs_remove_bin_file: no return value, dump_stack on error
  kobject: must_check fixes
  Driver core: add ability for devices to create and remove bin files
  Class: add support for class interfaces for devices
  Driver core: create devices/virtual/ tree
  Driver core: add device_rename function
  Driver core: add ability for classes to handle devices properly
  ...
2006-09-26 11:49:46 -07:00
Andrew Morton
8d6b5eeea5 [PATCH] binfmt_elf: consistently use loff_t
As David Howells <dhowells@redhat.com> points out, binfmt_elf sometimes uses
off_t, sometimes uses loff_t.  Use loff_t throughout.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:53 -07:00
Christoph Lameter
972d1a7b14 [PATCH] ZVC: Support NR_SLAB_RECLAIMABLE / NR_SLAB_UNRECLAIMABLE
Remove the atomic counter for slab_reclaim_pages and replace the counter
and NR_SLAB with two ZVC counters that account for unreclaimable and
reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE.  The
intent seems to be to check for slab pages that could be freed.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:51 -07:00
Christoph Lameter
182e8e2373 [PATCH] reduce MAX_NR_ZONES: make display of highmem counters conditional on CONFIG_HIGHMEM
Do not display HIGHMEM memory sizes if CONFIG_HIGHMEM is not set.

Make HIGHMEM-dependent texts and the display of highmem counters optional.

Some texts depend on CONFIG_HIGHMEM.

Remove those strings and remove the display of highmem counter values if
CONFIG_HIGHMEM is not set.

[akpm@osdl.org: remove some ifdefs]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:46 -07:00
Peter Zijlstra
d08b3851da [PATCH] mm: tracking shared dirty pages
Tracking of dirty pages in shared writeable mmap()s.

The idea is simple: write protect clean shared writeable pages, catch the
write-fault, make writeable and set dirty.  On page write-back clean all the
PTE dirty bits and write protect them once again.

The implementation is a tad harder, mainly because the default
backing_dev_info capabilities were too loosely maintained.  Hence it is not
enough to test the backing_dev_info for cap_account_dirty.

The current heuristic is as follows, a VMA is eligible when:
 - it's shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
 - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
 - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
 - f_op->mmap() didn't change the default page protection

Pages from remap_pfn_range() are explicitly excluded because their COW
semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
because they don't have a backing store anyway.

mprotect() is taught about the new behaviour as well.  However it overrides
the last condition.

Cleaning the pages on write-back is done with page_mkclean(), a new rmap call.
It can be called on any page, but is currently only implemented for mapped
pages; if the page is found to be part of a VMA that accounts dirty pages, it will
also wrprotect the PTE.

Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty() from
under ->private_lock.  This seems to be safe, since ->private_lock is used to
serialize access to the buffers, not the page itself.  This is needed because
clear_page_dirty() will call into page_mkclean() and would thereby violate
locking order.

[dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:44 -07:00
Jan Kara
3998b9301d [PATCH] jbd: fix commit of ordered data buffers
The original commit code assumes that when a buffer on the BJ_SyncData list is
locked, it is being written to disk.  But this is not true and hence it can
lead to a potential data loss on crash.  Also the code didn't account for
the fact that journal_dirty_data() can steal buffers from committing
transaction and hence could write buffers that no longer belong to the
committing transaction.  Finally it could possibly happen that we tried
writing out one buffer several times.

The patch below tries to solve these problems by a complete rewrite of the
data commit code.  We go through buffers on t_sync_datalist, lock buffers
needing write out and store them in an array.  Buffers are also immediately
refiled to BJ_Locked list or unfiled (if the write out is completed).  When
the array is full or we have to block on buffer lock, we submit all
accumulated buffers for IO.

[suitable for 2.6.18.x around the 2.6.19-rc2 timeframe]

Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-26 08:48:44 -07:00
Andi Kleen
758333458a [PATCH] Check return value of copy_to_user in compat_sys_pselect7
Fix

linux/fs/compat.c: In function compat_sys_pselect7
linux/fs/compat.c:1869: warning: ignoring return value of copy_to_user, declared with attribute warn_unused_result

To make it easier to handle I changed the semantics to not try to
write out a timespec if an error occurred. I hope that's ok.

Cc: dwmw2@infradead.org

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:39 +02:00
Andi Kleen
c16b63e09d [PATCH] i386/x86-64: Don't randomize stack top when no randomization personality is set
Based on patch from Frank van Maarseveen <frankvm@frankvm.com>, but
extended.

Signed-off-by: Andi Kleen <ak@suse.de>
2006-09-26 10:52:28 +02:00
Andrew Morton
f20a9ead0d sysfs: add proper sysfs_init() prototype
Don't be crufty.  Mark it __must_check too.

Cc: "Randy.Dunlap" <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Randy.Dunlap
995982ca79 sysfs_remove_bin_file: no return value, dump_stack on error
Make sysfs_remove_bin_file() void.  If it detects an error,
printk the file name and call dump_stack().

sysfs_hash_and_remove() now returns an error code indicating
its success or failure so that sysfs_remove_bin_file() can
know success/failure.

Convert the only driver that checked the return value of
sysfs_remove_bin_file().

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:39 -07:00
Greg Kroah-Hartman
ceeee1fb28 SYSFS: allow sysfs_create_link to create symlinks in the root of sysfs
This is needed to make the compatible link for /sys/block in the future.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Randy Dunlap
6468b3afa7 Debugfs: kernel-doc fixes for debugfs
Fix kernel-doc and typos/spellos in fs/debugfs/.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Juha Yrjölä
eea3f8911f sysfs: Make poll behaviour consistent
When no events have been reported by sysfs_notify(), sd->s_events
was previously set to zero.  The initial value for new readers is
also zero, so poll was blocking, regardless of whether the attribute
was read by the process or not.

Make poll behave consistently by setting the initial value of
sd->s_events to non-zero.

Signed-off-by: Juha Yrjola <juha.yrjola@solidboot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-25 21:08:36 -07:00
Ian Kent
c0ba7e5147 [PATCH] autofs4: zero timeout prevents shutdown
If the timeout of an autofs mount is set to zero then umounts are disabled.
This works fine, however the kernel module checks the expire timeout and
goes no further if it is zero.  This is not the right thing to do at
shutdown as the module is passed an option to expire mounts regardless of
their timeout setting.

This patch allows autofs to honor the force expire option.

Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-09-25 17:38:35 -07:00
Mark Fasheh
0d5dc6c2dd ocfs2: Teach ocfs2_drop_lock() to use ->set_lvb() callback
With this, we don't need to pass an additional struct with function pointer.

Now that the callbacks are fully used, comment the remaining API.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00
Mark Fasheh
b5e500e23e ocfs2: Remove ->unblock lockres operation
Have ocfs2_process_blocked_lock() call ocfs2_generic_unblock_lock(), which
gets to be ocfs2_unblock_lock() now that it's the only possible unblock
function.

Remove the ->unblock() callback from the structure, and all lock type
specific unblock functions.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00
Mark Fasheh
cc567d89b3 ocfs2: move downconvert worker to lockres ops
This way lock types don't have to manually pass it to
ocfs2_generic_unblock_lock().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00
Mark Fasheh
08280f11de ocfs2: Remove unused dlmglue functions
The meta data unblocking code no longer needs ocfs2_do_unblock_meta() or
ocfs2_can_downconvert_meta_lock(), so remove them.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:48 -07:00
Mark Fasheh
810d5aeba1 ocfs2: Have the metadata lock use generic dlmglue functions
Fill in the ->check_downconvert and ->set_lvb callbacks with meta data
specific operations and switch ocfs2_unblock_meta() to call
ocfs2_generic_unblock_lock()

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:47 -07:00
Mark Fasheh
5ef0d4ea08 ocfs2: Add ->set_lvb callback in dlmglue
This allows a lock type to set the value block before downconvert.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:47 -07:00
Mark Fasheh
16d5b9567a ocfs2: Add ->check_downconvert callback in dlmglue
This will allow lock types to force a requeue of a lock downconvert.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:47 -07:00
Mark Fasheh
f7fbfdd1fc ocfs2: Check for refreshing locks in generic unblock function
Tidy up the exit path a bit too.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:47 -07:00
Mark Fasheh
b80fc012e0 ocfs2: don't unconditionally pass LVB flags
Allow a lock type to specify whether it makes use of the LVB. The only type
which does this right now is the meta data lock. This should save us some
space on network messages since they won't have to needlessly transmit value
blocks.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:47 -07:00
Mark Fasheh
aa2623ad80 ocfs2: combine inode and generic blocking AST functions
There is extremely little difference between the two now. We can remove the
callback from ocfs2_lock_res_ops as well.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
54a7e7552e ocfs2: Add ->get_osb() dlmglue locking operation
Will be used to find the ocfs2_super structure from a given lockres.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
2a45f2d13e ocfs2: remove ->unlock_ast() callback from ocfs2_lock_res_ops
This was always defined to the same function in all locks, so clean things
up by removing and passing ocfs2_unlock_ast() directly to the DLM.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
e92d57df27 ocfs2: combine inode and generic AST functions
There is extremely little difference between the two now. We can remove the
callback from ocfs2_lock_res_ops as well.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
f625c9793b ocfs2: Clean up lock resource refresh flags
Use of the refresh mechanism is lock-type wide, so move knowledge of that to
the ocfs2_lock_res_ops structure.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
24c19ef404 ocfs2: Remove i_generation from inode lock names
OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
Typically, i_generation is encoded in the lock name so that a deleted inode
and a new one in the same block don't share the same lvb.

Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
is potentially thrown away as soon as the meta data lock is taken - we
cannot encode the lock name without first knowing i_generation, which
requires a disk read.

This patch encodes i_generation in the inode meta data lvb, and removes the
value from the inode meta data lock name. This way, the read can be covered
by a lock, and at the same time we can distinguish between an up to date and
a stale LVB.

This will help cold-cache stat(2) performance in particular.

Since this patch changes the protocol version, we take the opportunity to do
a minor re-organization of two of the LVB fields.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:46 -07:00
Mark Fasheh
f9e2d82e63 ocfs2: Encode i_generation in the meta data lvb
When i_generation is removed from the lockname, this will help us determine
whether a meta data lvb has information that is in sync with the local
struct inode.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:45 -07:00
Mark Fasheh
4d3b83f736 ocfs2: Free up some space in the lvb
lvb_version doesn't need to be a whole 32 bits. Make it an 8 bit field to
free up some space. This should be backwards compatible until we use one of
the fields, in which case we'd bump the lvb version anyway.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:45 -07:00
Mark Fasheh
0027dd5bc2 ocfs2: Remove special casing for inode creation in ocfs2_dentry_attach_lock()
We can't use LKM_LOCAL for new dentry locks because an unlink and subsequent
re-create of a name/inode pair may result in the lock still being mastered
somewhere in the cluster.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:45 -07:00
Mark Fasheh
1ba9da2ffa ocfs2: manually d_move() during ocfs2_rename()
Make use of FS_RENAME_DOES_D_MOVE to avoid a race condition that can occur
during ->rename() if we d_move() outside of the parent directory cluster
locks, and another node discovers the new name (created during the rename)
and unlinks it. d_move() will unconditionally rehash a dentry - which will
leave stale data in the system.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:45 -07:00
Mark Fasheh
349457ccf2 [PATCH] Allow file systems to manually d_move() inside of ->rename()
Some file systems want to manually d_move() the dentries involved in a
rename.  We can do this by making use of the FS_ODD_RENAME flag if we just
have nfs_rename() unconditionally do the d_move().  While there, we rename
the flag to be more descriptive.

OCFS2 uses this to protect that part of the rename operation with a cluster
lock.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
2006-09-24 13:50:45 -07:00
Mark Fasheh
1390334b4c ocfs2: Remove the dentry vote
This is unused now.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:43 -07:00
Mark Fasheh
379dfe9d0d ocfs2: Hook rest of the file system into dentry locking API
Actually replace the vote calls with the new dentry operations. Make any
necessary adjustments to get the scheme to work.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:43 -07:00
Mark Fasheh
80c05846f6 ocfs2: Add dentry tracking API
Replace the dentry vote mechanism with a cluster lock which covers a set
of dentries. This allows us to force d_delete() only on nodes which actually
care about an unlink.

Every node that does a ->lookup() gets a read only lock on the dentry, until
an unlink, during which the unlinking node will request an exclusive lock,
forcing the other nodes who care about that dentry to d_delete() it. The
effect is that we retain a very lightweight ->d_revalidate(), and at the
same time get to make large improvements to the average case performance of
the ocfs2 unlink and rename operations.

This patch adds the higher level API and the dentry manipulation code.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:43 -07:00
Mark Fasheh
d680efe9d8 ocfs2: Add new cluster lock type
Replace the dentry vote mechanism with a cluster lock which covers a set
of dentries. This allows us to force d_delete() only on nodes which actually
care about an unlink.

Every node that does a ->lookup() gets a read only lock on the dentry, until
an unlink, during which the unlinking node will request an exclusive lock,
forcing the other nodes who care about that dentry to d_delete() it. The
effect is that we retain a very lightweight ->d_revalidate(), and at the
same time get to make large improvements to the average case performance of
the ocfs2 unlink and rename operations.

This patch adds the cluster lock type which OCFS2 can attach to
dentries.  A small number of fs/ocfs2/dcache.c functions are stubbed
out so that this change can compile.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:42 -07:00
Mark Fasheh
f0681062b8 ocfs2: Update dlmglue for new dlmlock() API
File system lock names are very regular right now, so we really only need to
pass an extra parameter to dlmlock().

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:42 -07:00
Mark Fasheh
ea5b3a187e ocfs2: Update dlmfs for new dlmlock() API
We just need to add a namelen field to the user_lock_res structure, and
update a few debug prints. Instead of updating all debug prints, I took the
opportunity to remove a few that are likely unnecessary these days.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:42 -07:00
Mark Fasheh
3384f3df5e ocfs2: Allow binary names in the DLM
The OCFS2 DLM uses strlen() to determine lock name length, which excludes
the possibility of putting binary values in the name string. Fix this by
requiring that string length be passed in as a parameter.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:42 -07:00
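The strlen() limitation described in 3384f3df5e can be illustrated with a minimal sketch. This is not the OCFS2 DLM code; the function names `name_len_implicit` and `name_equal_explicit` are hypothetical and exist only to show why a name containing zero bytes needs an explicit length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length as a strlen()-based interface would see it: counting stops at the
 * first zero byte, so any binary data after that byte is ignored. */
size_t name_len_implicit(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}

/* Compare two names using caller-supplied lengths (hypothetical helper).
 * Embedded zero bytes are treated as ordinary data. Returns 1 if equal. */
int name_equal_explicit(const char *a, size_t alen,
                        const char *b, size_t blen)
{
    assert(a != NULL && b != NULL);
    return alen == blen && memcmp(a, b, alen) == 0;
}
```

With two five-byte names that differ only after an embedded zero byte, the implicit length is 1 for both, so a strlen()-based lookup could not tell them apart, while the explicit-length comparison can.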
Mark Fasheh
e2c73698af ocfs2: Silence dlm error print
An AST can be delivered via the network after a lock has been removed, so no
need to print an error when we see that.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
2006-09-24 13:50:41 -07:00
Jeff Garzik
e18fa700c9 Move several *_SUPER_MAGIC symbols to include/linux/magic.h.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
2006-09-24 11:13:19 -04:00
Chuck Lever
026ed5c918 NFS: unmark NFS direct I/O as experimental
Remove the EXPERIMENTAL flag from the NFS_DIRECTIO option.

Test plan:
Unset the EXPERIMENTAL kernel build option and check to see that the NFS
direct I/O option is still available.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:06 -04:00
Chuck Lever
f551e44ff1 NFS: add comments clarifying the use of nfs_post_op_update()
Comments-only change to clarify a detail of the NFS protocol and how it is
implemented in Linux.

Test plan:
None.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:05 -04:00
Josef 'Jeff' Sipek
aec5e17528 NFS: Use SEEK_END instead of hardcoded value
Signed-off-by: Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:04 -04:00
Trond Myklebust
51b6ded4d9 NFSv4: When mounting with a port=0 argument, substitute port=2049
RFC3530 states that the registered port 2049 for the NFS protocol should be
the default configuration in order to allow clients not to use the RPC
binding protocols.
If the mount program sends us a port=0, we therefore substitute port=2049.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:04 -04:00
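The substitution described in 51b6ded4d9 amounts to a one-line default. The sketch below is illustrative only; `effective_nfs4_port` and `NFS_DEFAULT_PORT` are hypothetical names, not the identifiers used in the kernel.

```c
/* Registered NFS port, as cited from RFC3530 in the commit message. */
#define NFS_DEFAULT_PORT 2049

/* Hypothetical helper: a requested port of 0 means "not specified",
 * so fall back to the registered port; any other value is kept. */
unsigned short effective_nfs4_port(unsigned short requested)
{
    return requested == 0 ? NFS_DEFAULT_PORT : requested;
}
```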
Trond Myklebust
2066fe89b4 NFSv4: Poll more aggressively when handling NFS4ERR_DELAY
Change the initial retry delay from 1s to 0.1s (and then back off
exponentially).

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:04 -04:00
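The retry policy described in 2066fe89b4 (start at 0.1s, then back off exponentially) can be sketched as below. This is a userspace illustration, not the kernel implementation: the function name `next_retry_delay_ms` is hypothetical, delays are in milliseconds rather than jiffies, and the 15 second cap is an assumption that the commit message does not state.

```c
#include <assert.h>
#include <stddef.h>

#define RETRY_MIN_MS 100     /* initial delay from the commit message: 0.1s */
#define RETRY_MAX_MS 15000   /* assumed upper bound, not from the commit */

/* Returns the delay to wait before the next retry and doubles the stored
 * timeout for the attempt after that. A timeout of 0 (or less) means
 * "first retry" and is initialised to RETRY_MIN_MS. */
long next_retry_delay_ms(long *timeout_ms)
{
    long delay;

    assert(timeout_ms != NULL);
    if (*timeout_ms <= 0)
        *timeout_ms = RETRY_MIN_MS;
    if (*timeout_ms > RETRY_MAX_MS)
        *timeout_ms = RETRY_MAX_MS;

    delay = *timeout_ms;
    *timeout_ms *= 2;   /* exponential backoff for the following attempt */
    return delay;
}
```

A caller would keep one `long` per operation, initialised to 0, and pass its address on every retry; successive calls then yield 100, 200, 400 ms and so on until the assumed cap is reached.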
Trond Myklebust
c514983d8d NFSv4: Handle the condition NFS4ERR_FILE_OPEN
Retry a few times before we give up: the error is usually due to ordering
issues with asynchronous RPC calls.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Trond Myklebust
6b30954ebb NFSv4: Retry lease recovery if it failed during a synchronous operation.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Trond Myklebust
97db8f4179 NFS: Don't invalidate the symlink we just stuffed into the cache
And slight optimisation of nfs_end_data_update(): directories never have
delegations anyway.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:03 -04:00
Trond Myklebust
5f004cf2aa NFS: Make read() return an ESTALE if the file has been deleted
Currently, a read() request will return EIO even if the file has been
deleted on the server, simply because that is what the VM will return
if the call to readpage() fails to update the page.

Ensure that readpage() marks the inode as stale if it receives an ESTALE.
Then return that error to userland.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:02 -04:00
J. Bruce Fields
2dec51466a NFSv4: It's perfectly legal for clp to be NULL here....
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:02 -04:00
Trond Myklebust
fd6840714d NFS: nfs_lookup - don't hash dentry when optimising away the lookup
If the open intents tell us that a given lookup is going to result in an
exclusive create, we currently optimize away the lookup call itself. The
reason is that the lookup would not be atomic with the create RPC call, so
why do it in the first place?

A problem occurs, however, if the VFS aborts the exclusive create operation
after the lookup, but before the call to create the file/directory: in this
case we will end up with a hashed negative dentry in the dcache that has
never been looked up.
Fix this by only actually hashing the dentry once the create operation has
been successfully completed.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:25:01 -04:00
andros@citi.umich.edu
297de4f656 Fix a referral error Oops
Fix an oops when the referral server is not responding.
Check the error return from nfs4_set_client() in nfs4_create_referral_server.

Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:56 -04:00
Chuck Lever
058ad9cbf1 NFS: NFS_ROOT should use the new rpc_create API
Teach NFS_ROOT to use the new rpc_create API instead of the old two-call
API for creating an RPC transport.

Test plan:
Compile the kernel with the NFS client built-in, and set CONFIG_NFS_ROOT.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:55 -04:00
David Howells
6daabf1b04 NFS: Fix up compiler warnings on 64-bit platforms in client.c
Fix up warnings from compiling on ppc64.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:55 -04:00
Trond Myklebust
158998b6fe SUNRPC: Make rpc_mkpipe() take the parent dentry as an argument
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:54 -04:00
Trond Myklebust
5dd3177ae5 NFSv4: Fix a use-after-free issue with the nfs server.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:54 -04:00
Trond Myklebust
275a082fe9 Add a real API for dealing with blk_congestion_wait()
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:54 -04:00
Chuck Lever
94a6d75320 NFS: Use cached page as buffer for NFS symlink requests
Now that we have a copy of the symlink path in the page cache, we can pass
a struct page down to the XDR routines instead of a string buffer.

Test plan:
Connectathon, all NFS versions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:53 -04:00
Chuck Lever
873101b337 NFS: copy symlinks into page cache before sending NFS SYMLINK request
Currently the NFS client does not cache symlinks it creates.  They get
cached only when the NFS client reads them back from the server.

Copy the symlink into the page cache before sending it.

Test plan:
Connectathon, all NFS versions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:53 -04:00
Chuck Lever
4f390c152b NFS: Fix double d_drop in nfs_instantiate() error path
If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a
d_drop before returning.  But some callers already do a d_drop in the case
of an error return.  Make certain we do only one d_drop in all error paths.

This issue was introduced because over time, the symlink proc API diverged
slightly from the create/mkdir/mknod proc API.  To prevent other coding
mistakes of this type, change the symlink proc API to be more like
create/mkdir/mknod and move the nfs_instantiate call into the symlink proc
routines so it is used in exactly the same way for create, mkdir, mknod,
and symlink.

Test plan:
Connectathon, all versions of NFS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:52 -04:00
Chuck Lever
d3db90e270 NFS: remove a no-longer-needed error check in nfs_symlink()
In the early days of NFS, there was no duplicate reply cache on the server.
Thus retransmitted non-idempotent requests often found that the request had
already completed on the server.  To avoid passing an unanticipated return
code to unsuspecting applications, NFS clients would often shunt error
codes that implied the request had been retried but already completed.

Thanks to NFS over TCP, duplicate reply caches on the server, and network
performance and reliability improvements, it is safe to remove such checks.

Test plan:
None.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:52 -04:00
Chuck Lever
ae5c79476f NFSD: Convert NFS server callback logic to use new rpc_create API
Convert the xprt_create_proto/rpc_create_client calls in the NFS server
callback functions to use the new rpc_create() API.

Test plan:
NFSv4 delegation functionality tests.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:50 -04:00
Chuck Lever
41877d207c NFS: Convert NFS client to use new rpc_create() API
Convert NFS client mount logic to use rpc_create() instead of the old
xprt_create_proto/rpc_create_client API.

Test plan:
Mount stress tests.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:50 -04:00
Chuck Lever
e1ec78928b LOCKD: Convert to use new rpc_create() API
Replace xprt_create_proto/rpc_create_client with new rpc_create()
interface in the Network Lock Manager.

Note that the semantics of NLM transports is now "hard" instead of "soft"
to provide a better guarantee that lock requests will get to the server.

Test plan:
Repeated runs of Connectathon locking suite.  Check network trace to ensure
NLM requests are working correctly.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:50 -04:00
Chuck Lever
6ca9482387 SUNRPC: Clean-up after previous patches.
Remove some unused macros related to accessing an RPC peer address.

Test plan:
Compile kernel with CONFIG_NFS option enabled.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2006-09-22 23:24:49 -04:00