1
Commit Graph

16621 Commits

Author SHA1 Message Date
Andy Whitcroft
161599ff39 [PATCH] sparsemem: provide pfn_to_nid
Before SPARSEMEM is initialised we cannot provide an efficient pfn_to_nid()
implmentation; before initialisation is complete we use early_pfn_to_nid()
to provide location information.  Until recently there was no non-init user
of this functionality.  Provide a post init pfn_to_nid() implementation.

Note that this implmentation assumes that the pfn passed has been validated
with pfn_valid().  The current single user of this function already has
this check.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:24 -08:00
Andy Whitcroft
2bdaf115b1 [PATCH] flatmem split out memory model
There are three places we define pfn_to_nid().  Two in linux/mmzone.h and one
in asm/mmzone.h.  These in essence represent the three memory models.  The
definition in linux/mmzone.h under !NEED_MULTIPLE_NODES is both the FLATMEM
definition and the optimisation for single NUMA nodes; the one under SPARSEMEM
is the NUMA sparsemem one; the one in asm/mmzone.h under DISCONTIGMEM is the
discontigmem one.  This is not in the least bit obvious, particularly the
connection between the non-NUMA optimisations and the memory models.

Two patches:

flatmem-split-out-memory-model: simplifies the selection of pfn_to_nid()
implementations.  The selection is based primarily off the memory model
selected.  Optimisations for non-NUMA are applied where needed.

sparse-provide-pfn_to_nid: implement pfn_to_nid() for SPARSEMEM

This patch:

pfn_to_nid is memory model specific

The pfn_to_nid() call is memory model specific.  It represents the locality
identifier for the memory passed.  Classically this would be a NUMA node,
but not a chunk of memory under DISCONTIGMEM.

The SPARSEMEM and FLATMEM memory model non-NUMA versions of pfn_to_nid()
are folded together under NEED_MULTIPLE_NODES, while DISCONTIGMEM has its
own optimisation.  This is all very confusing.

This patch splits out each implementation of pfn_to_nid() so that we can
see them and the optimisations to each.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:24 -08:00
Russell King
03b00ebcc8 [PATCH] Shut up warnings in ipc/shm.c
Fix two warnings in ipc/shm.c

ipc/shm.c:122: warning: statement with no effect
ipc/shm.c:560: warning: statement with no effect

by converting the macros to empty inline functions.  For safety, let's do
all three.  This also has the advantage that typechecking gets performed
even without CONFIG_SHMEM enabled.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:24 -08:00
Mike Kravetz
a94b3ab7ea [PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES
The NODES_SPAN_OTHER_NODES config option was created so that DISCONTIGMEM
could handle pSeries numa layouts.  However, support for DISCONTIGMEM has
been replaced by SPARSEMEM on powerpc.  As a result, this config option and
supporting code is no longer needed.

I have already sent a patch to Paul that removes the option from powerpc
specific code.  This removes the arch independent piece.  Doesn't really
matter which is applied first.

Signed-off-by: Mike Kravetz <kravetz@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:24 -08:00
Andy Whitcroft
d5afa6dcf7 [PATCH] mm: pfn_to_pgdat not used in common code
pfn_to_pgdat() isn't used in common code.  Remove definition.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:24 -08:00
Andy Whitcroft
9f3fd602ae [PATCH] mm: kvaddr_to_nid not used in common code
kvaddr_to_nid() isn't used in common code nor in i386 code.  Remove these
definitions.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
Christoph Lameter
6bda666a03 [PATCH] hugepages: fold find_or_alloc_pages into huge_no_page()
The number of parameters for find_or_alloc_page increases significantly after
policy support is added to huge pages.  Simplify the code by folding
find_or_alloc_huge_page() into hugetlb_no_page().

Adam Litke objected to this piece in an earlier patch but I think this is a
good simplification.  Diffstat shows that we can get rid of almost half of the
lines of find_or_alloc_page().  If we can find no consensus then lets simply
drop this patch.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
Christoph Lameter
21abb1478a [PATCH] Remove old node based policy interface from mempolicy.c
mempolicy.c contains provisional interface for huge page allocation based on
node numbers.  This is in use in SLES9 but was never used (AFAIK) in upstream
versions of Linux.

Huge page allocations now use zonelists to figure out where to allocate pages.
 The use of zonelists allows us to find the closest hugepage which was the
consideration of the NUMA distance for huge page allocations.

Remove the obsolete functions.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Acked-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
Christoph Lameter
5da7ca8607 [PATCH] Add NUMA policy support for huge pages.
The huge_zonelist() function in the memory policy layer provides an list of
zones ordered by NUMA distance.  The hugetlb layer will walk that list looking
for a zone that has available huge pages but is also in the nodeset of the
current cpuset.

This patch does not contain the folding of find_or_alloc_huge_page() that was
controversial in the earlier discussion.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@muc.de>
Acked-by: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
Christoph Lameter
96df9333c9 [PATCH] mm: dequeue a huge page near to this node
This was discussed at
http://marc.theaimsgroup.com/?l=linux-kernel&m=113166526217117&w=2

This patch changes the dequeueing to select a huge page near the node
executing instead of always beginning to check for free nodes from node 0.
This will result in a placement of the huge pages near the executing
processor improving performance.

The existing implementation can place the huge pages far away from the
executing processor causing significant degradation of performance.  The
search starting from zero also means that the lower zones quickly run out
of memory.  Selecting a huge page near the process distributed the huge
pages better.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Adam Litke <agl@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
David Gibson
1e8f889b10 [PATCH] Hugetlb: Copy on Write support
Implement copy-on-write support for hugetlb mappings so MAP_PRIVATE can be
supported.  This helps us to safely use hugetlb pages in many more
applications.  The patch makes the following changes.  If needed, I also have
it broken out according to the following paragraphs.

1. Add a pair of functions to set/clear write access on huge ptes.  The
   writable check in make_huge_pte is moved out to the caller for use by COW
   later.

2. Hugetlb copy-on-write requires special case handling in the following
   situations:

   - copy_hugetlb_page_range() - Copied pages must be write protected so
     a COW fault will be triggered (if necessary) if those pages are written
     to.

   - find_or_alloc_huge_page() - Only MAP_SHARED pages are added to the
     page cache.  MAP_PRIVATE pages still need to be locked however.

3. Provide hugetlb_cow() and calls from hugetlb_fault() and
   hugetlb_no_page() which handles the COW fault by making the actual copy.

4. Remove the check in hugetlbfs_file_map() so that MAP_PRIVATE mmaps
   will be allowed.  Make MAP_HUGETLB exempt from the depricated VM_RESERVED
   mapping check.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:23 -08:00
Adam Litke
86e5216f8d [PATCH] Hugetlb: Reorganize hugetlb_fault to prepare for COW
This patch splits the "no_page()" type activity into its own function,
hugetlb_no_page().  hugetlb_fault() becomes the entry point for hugetlb faults
and delegates to the appropriate handler depending on the type of fault.
Right now we still have only hugetlb_no_page() but a later patch introduces a
COW fault.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Adam Litke
85ef47f74a [PATCH] Hugetlb: Rename find_lock_page to find_or_alloc_huge_page
find_lock_huge_page() isn't a great name, since it does extra things not
analagous to find_lock_page().  Rename it find_or_alloc_huge_page() which is
closer to the mark.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Adam Litke
f0916794f0 [PATCH] Hugetlb: Remove duplicate i_size check
cleanup

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Adam Litke <agl@us.ibm.com>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "Seth, Rohit" <rohit.seth@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Badari Pulavarty
f6b3ec238d [PATCH] madvise(MADV_REMOVE): remove pages from tmpfs shm backing store
Here is the patch to implement madvise(MADV_REMOVE) - which frees up a
given range of pages & its associated backing store.  Current
implementation supports only shmfs/tmpfs and other filesystems return
-ENOSYS.

"Some app allocates large tmpfs files, then when some task quits and some
client disconnect, some memory can be released.  However the only way to
release tmpfs-swap is to MADV_REMOVE". - Andrea Arcangeli

Databases want to use this feature to drop a section of their bufferpool
(shared memory segments) - without writing back to disk/swap space.

This feature is also useful for supporting hot-plug memory on UML.

Concerns raised by Andrew Morton:

- "We have no plan for holepunching!  If we _do_ have such a plan (or
  might in the future) then what would the API look like?  I think
  sys_holepunch(fd, start, len), so we should start out with that."

- Using madvise is very weird, because people will ask "why do I need to
  mmap my file before I can stick a hole in it?"

- None of the other madvise operations call into the filesystem in this
  manner.  A broad question is: is this capability an MM operation or a
  filesytem operation?  truncate, for example, is a filesystem operation
  which sometimes has MM side-effects.  madvise is an mm operation and with
  this patch, it gains FS side-effects, only they're really, really
  significant ones."

Comments:

- Andrea suggested the fs operation too but then it's more efficient to
  have it as a mm operation with fs side effects, because they don't
  immediatly know fd and physical offset of the range.  It's possible to
  fixup in userland and to use the fs operation but it's more expensive,
  the vmas are already in the kernel and we can use them.

Short term plan &  Future Direction:

- We seem to need this interface only for shmfs/tmpfs files in the short
  term.  We have to add hooks into the filesystem for correctness and
  completeness.  This is what this patch does.

- In the future, plan is to support both fs and mmap apis also.  This
  also involves (other) filesystem specific functions to be implemented.

- Current patch doesn't support VM_NONLINEAR - which can be addressed in
  the future.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Hugh Dickins <hugh@veritas.com>
Cc: Andrea Arcangeli <andrea@suse.de>
Cc: Michael Kerrisk <mtk-manpages@gmx.net>
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Hans Reiser
d7339071f6 [PATCH] reiser4: vfs: add truncate_inode_pages_range()
This patch makes truncate_inode_pages_range from truncate_inode_pages.
truncate_inode_pages became a one-liner call to truncate_inode_pages_range.

Reiser4 needs truncate_inode_pages_ranges because it tries to keep
correspondence between existences of metadata pointing to data pages and pages
to which those metadata point to.  So, when metadata of certain part of file
is removed from filesystem tree, only pages of corresponding range are to be
truncated.

(Needed by the madvise(MADV_REMOVE) patch)

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Andy Whitcroft
900b2b463d [PATCH] memhotplug: register_memory should be global
register_memory is global and declared so in linux/memory.h.  Update the
HOTPLUG specific definition to match.  This fixes a compile warning when
HOTPLUG is enabled.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:22 -08:00
Andy Whitcroft
98a38ebdda [PATCH] memhotplug: register_ and unregister_memory_notifier should be global
Both register_memory_notifer and unregister_memory_notifier are global and
declared so in linux/memory.h.  Update the HOTPLUG specific definitions to
match.  This fixes a compile warning when HOTPLUG is enabled.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Andy Whitcroft
5ac24eefd1 [PATCH] memhotplug: __add_section remove unused pgdat definition
__add_section defines an unused pointer to the zones pgdat.  Remove this
definition.  This fixes a compile warning.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Paul Jackson
47f3a867f6 [PATCH] mm: fix __alloc_pages cpuset ALLOC_* flags
Two changes to the setting of the ALLOC_CPUSET flag in
mm/page_alloc.c:__alloc_pages()

- A bug fix - the "ignoring mins" case should not be honoring ALLOC_CPUSET.
  This case of all cases, since it is handling a request that will free up
  more memory than is asked for (exiting tasks, e.g.) should be allowed to
  escape cpuset constraints when memory is tight.

- A logic change to make it simpler.  Honor cpusets even on GFP_ATOMIC
  (!wait) requests.  With this, cpuset confinement applies to all requests
  except ALLOC_NO_WATERMARKS, so that in a subsequent cleanup patch, I can
  remove the ALLOC_CPUSET flag entirely.  Since I don't know any real reason
  this logic has to be either way, I am choosing the path of the simplest
  code.

Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Andrew Morton
a576219aca [PATCH] swsusp: resume_store() retval fix
- This function returns -EINVAL all the time.  Fix.

- Decruftify it a bit too.

- Writing to it doesn't seem to do what it's suppoed to do.

Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Andrew Morton
817c41d76e [PATCH] alpha: dma_map_page() fix
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
NeilBrown
1f1e030bf7 [PATCH] knfsd: fix hash function for IP addresses on 64bit little-endian machines.
The hash.h hash_long function, when used on a 64 bit machine, ignores many
of the middle-order bits.  (The prime chosen it too bit-sparse).

IP addresses for clients of an NFS server are very likely to differ only in
the low-order bits.  As addresses are stored in network-byte-order, these
bits become middle-order bits in a little-endian 64bit 'long', and so do
not contribute to the hash.  Thus you can have the situation where all
clients appear on one hash chain.

So, until hash_long is fixed (or maybe forever), us a hash function that
works well on IP addresses - xor the bytes together.

Thanks to "Iozone" <capps@iozone.org> for identifying this problem.

Cc: "Iozone" <capps@iozone.org>

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:21 -08:00
Herbert Xu
4b2f0260c7 [PATCH] nbd: fix TX/RX race condition
Janos Haar of First NetCenter Bt.  reported numerous crashes involving the
NBD driver.  With his help, this was tracked down to bogus bio vectors
which in turn was the result of a race condition between the
receive/transmit routines in the NBD driver.

The bug manifests itself like this:

CPU0				CPU1
do_nbd_request
	add req to queuelist
	nbd_send_request
		send req head
		for each bio
			kmap
			send
				nbd_read_stat
					nbd_find_request
					nbd_end_request
			kunmap

When CPU1 finishes nbd_end_request, the request and all its associated
bio's are freed.  So when CPU0 calls kunmap whose argument is derived from
the last bio, it may crash.

Under normal circumstances, the race occurs only on the last bio.  However,
if an error is encountered on the remote NBD server (such as an incorrect
magic number in the request), or if there were a bug in the server, it is
possible for the nbd_end_request to occur any time after the request's
addition to the queuelist.

The following patch fixes this problem by making sure that requests are not
added to the queuelist until after they have been completed transmission.

In order for the receiving side to be ready for responses involving
requests still being transmitted, the patch introduces the concept of the
active request.

When a response matches the current active request, its processing is
delayed until after the tranmission has come to a stop.

This has been tested by Janos and it has been successful in curing this
race condition.

From: Herbert Xu <herbert@gondor.apana.org.au>

  Here is an updated patch which removes the active_req wait in
  nbd_clear_queue and the associated memory barrier.

  I've also clarified this in the comment.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cc: <djani22@dynamicweb.hu>
Cc: Paul Clements <Paul.Clements@SteelEye.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:20 -08:00
Joshua Kwan
bd6a59b22f [PATCH] hfsplus oops fix
nls_utf8 is available, and the check in hfsplus_fill_super checks the wrong
pointer for NULLness (it checks the saved nls, not the new one that it
needs to use.)

Signed-off-by: Joshua Kwan <joshk@triplehelix.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-06 08:33:20 -08:00
Jens Axboe
e650c305ec [SCSI] scsi_end_async() needs to take an uptodate parameter
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 12:38:30 +01:00
Jens Axboe
15fc858a00 [BLOCK] Correct blk_execute_rq_nowait() prototype 2006-01-06 10:00:50 +01:00
Tejun Heo
ff5b8cf149 [BLOCK] I/O barrier documentation update
Update documentation to match new barrier implementation.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:58:37 +01:00
Tejun Heo
3e087b5754 [BLOCK] update IDE to use new blk_ordered for barriers
Update IDE to use new blk_ordered.  This change makes the
following behavior changes.

* Partial completion of the barrier request is handled as
  failure of the whole ordered sequence.  No more partial
  completion for barrier requests.

* Any failure of pre or post flush request results in failure
  of the whole ordered sequence.

So, successfully completed ordered sequence guarantees that
all requests prior to the barrier made to physical medium and,
then, the while barrier request made to the physical medium.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:57:31 +01:00
Tejun Heo
9a3dccc425 [BLOCK] add FUA support to libata
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:56:18 +01:00
Tejun Heo
93c9338713 [BLOCK] update libata to use new blk_ordered for barriers
Reflect changes in SCSI midlayer and updated to use new
ordered request implementation

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:55:00 +01:00
Tejun Heo
007365ad60 [BLOCK] scsi: add FUA support to sd
Add FUA support for barriers to SCSI disk.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:53:52 +01:00
Tejun Heo
461d4e90c8 [BLOCK] update SCSI to use new blk_ordered for barriers
All ordered request related stuff delegated to HLD.  Midlayer
now doens't deal with ordered setting or prepare_flush
callback.  sd.c updated to deal with blk_queue_ordered
setting.  Currently, ordered tag isn't used as SCSI midlayer
cannot guarantee request ordering.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:52:55 +01:00
Tejun Heo
797e7dbbee [BLOCK] reimplement handling of barrier request
Reimplement handling of barrier requests.

* Flexible handling to deal with various capabilities of
  target devices.
* Retry support for falling back.
* Tagged queues which don't support ordered tag can do ordered.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:51:03 +01:00
Tejun Heo
52d9e67536 [BLOCK] ll_rw_blk: separate out bio init part from __make_request
Separate out bio initialization part from __make_request.  It
will be used by the following blk_ordered_reimpl.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:49:58 +01:00
Tejun Heo
8ffdc6550c [BLOCK] add @uptodate to end_that_request_last() and @error to rq_end_io_fn()
add @uptodate argument to end_that_request_last() and @error
to rq_end_io_fn().  there's no generic way to pass error code
to request completion function, making generic error handling
of non-fs request difficult (rq->errors is driver-specific and
each driver uses it differently).  this patch adds @uptodate
to end_that_request_last() and @error to rq_end_io_fn().

for fs requests, this doesn't really matter, so just using the
same uptodate argument used in the last call to
end_that_request_first() should suffice.  imho, this can also
help the generic command-carrying request jens is working on.

Signed-off-by: tejun heo <htejun@gmail.com>
Signed-Off-By: Jens Axboe <axboe@suse.de>
2006-01-06 09:49:03 +01:00
Arjan van de Ven
64100099ed [BLOCK] mark some block/ variables cons
the patch below marks various read-only variables in block/* as const,
so that gcc can optimize the use of them; eg gcc will replace the use by
the value directly now and will even remove the memory usage of these.

Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:46:02 +01:00
Jens Axboe
80cfd548ee [BLOCK] bio: check for same page merge possibilities in __bio_add_page()
For filesystems with a blocksize < page size, we can merge same page
calls into the bio_vec at the end of the bio. This saves segments
on systems with a page size > the "normal" 4kb fs block size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:43:28 +01:00
Jens Axboe
88ee5ef157 [BLOCK] ll_rw_blk: fastpath get_request()
Originally from: Nick Piggin <nickpiggin@yahoo.com.au>

Move current_io_context out of the get_request fastpth.  Also try to
streamline a few other things in this area.

Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:39:04 +01:00
Tejun Heo
ef9be1d336 [BLOCK] as-iosched: update alias handling
Unlike other ioscheds, as-iosched handles alias by chaing them using
rq->queuelist.  As aliased requests are very rare in the first place,
this complicates merge/dispatch handling without meaningful
performance improvement.  This patch updates as-iosched to dump
aliased requests into dispatch queue as other ioscheds do.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jens Axboe <axboe@suse.de>
2006-01-06 09:39:03 +01:00
Chuck Ebbert
9f155b9802 [PATCH] i386: PTRACE_POKEUSR: allow changing RF bit in EFLAGS register.
Setting RF (resume flag) allows a debugger to resume execution after a
code breakpoint without tripping the breakpoint again.  It is reset by
the CPU after execution of one instruction.

Requested by Stephane Eranian:
  "I am trying to the user HW debug registers on i386 and I am running
   into a problem with ptrace() not allowing access to EFLAGS_RF for
   POKEUSER (see FLAG_MASK).  [ ...  ] It avoids the need to remove the
   breakpoint, single step, and reinstall.  The equivalent functionality
   exists on IA-64 and is allowed by ptrace()"

Cc: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-05 20:50:51 -08:00
Linus Torvalds
29552b1462 Merge http://oss.oracle.com/git/ocfs2 2006-01-05 20:43:11 -08:00
Jody McIntyre
6c59f9d9fb Update MAINTAINERS - Jody is no longer at Steamballoon. 2006-01-05 23:04:08 -05:00
Jody McIntyre
52cab57873 Merge with http://kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2006-01-05 22:22:50 -05:00
Kris Katterjohn
46f25dffba [NET]: Change 1500 to ETH_DATA_LEN in some files
These patches add the header linux/if_ether.h and change 1500 to
ETH_DATA_LEN in some files.

Signed-off-by: Kris Katterjohn <kjak@users.sourceforge.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:56 -08:00
Andrew Morton
e924283bf9 [IPVS]: Another file needs linux/interrupt.h
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2006-01-05 16:48:55 -08:00
Linus Torvalds
d7906de1d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/brodo/pcmcia-2.6 2006-01-05 15:55:49 -08:00
Richard Purdie
725a6abfe3 [PATCH] pcmcia: add some IDs for ide-cs and dtl1_cs
Add some PCMCIA device IDs for the microdrive found in the Sharp Zaurus
and a different revision of the Socket CF+ Bluetooth card.

Signed-off-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2006-01-06 00:31:07 +01:00
Daniel Ritz
6e49388272 [PATCH] pcmcia: cleanup cs.c, reduce size
kill the socket_shutdown()/shutdown_socket() confusion by making it
one single function. move cs_socket_put() in there. nicer to read and
smaller:

original:
   text	   data	    bss	    dec	    hex	filename
  25181	   1076	     32	  26289	   66b1	drivers/pcmcia/pcmcia_core.ko

patched:
   text	   data	    bss	    dec	    hex	filename
  24973	   1076	     32	  26081	   65e1	drivers/pcmcia/pcmcia_core.ko

Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2006-01-06 00:28:15 +01:00
Alexey Dobriyan
3cf89b1859 [PATCH] drivers/pcmcia/cistpl.c: fix endian warnings
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
2006-01-06 00:28:10 +01:00