linux

Author	SHA1	Message	Date
NeilBrown	6712ecf8f6	Drop 'size' argument from bio_endio and bi_end_io As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
NeilBrown	5bb23a688b	Don't decrement bi_size in bio_endio The only caller of bio_endio that does not pass the full bi_size is end_that_request_first. Also, no ->bi_end_io method is really interested in bi_size being decremented. So move the decrement and related code into ll_rw_blk and merge it with order_bio_endio to form req_bio_endio which does endio functionality specific to request completion. As some ->bi_end_io methods do check bi_size of 0, we set it thus for now, but that will go in the next patch. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./block/ll_rw_blk.c \| 42 +++++++++++++++++++++++++++--------------- ./fs/bio.c \| 23 +++++++++++------------ 2 files changed, 38 insertions(+), 27 deletions(-) diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
NeilBrown	d24517d793	Remove flush_dry_bio_endio The entire function of flush_dry_bio_endio is to undo the effects of bio_endio (when called on a barrier request). So remove the function and the call to bio_endio. This allows us to remove "bi_size" from "struct request_queue". Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./block/ll_rw_blk.c \| 39 ++------------------------------------- ./include/linux/blkdev.h \| 1 - 2 files changed, 2 insertions(+), 38 deletions(-) diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:57 +02:00
Satyam Sharma	db47d47537	ll_rw_blk: blk_cpu_notifier should be __cpuinitdata blk_cpu_notifier is marked as __devinitdata, but __devinitdata need not be __init even if HOTPLUG_CPU=n, which wastes space. It should be marked __cpuinitdata, and the callback itself as __cpuinit. Signed-off-by: Satyam Sharma <satyam@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
Jens Axboe	6c92e699b5	Fixup rq_for_each_segment() indentation Remove one level of nesting where appropriate. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	bc1c56fde6	Share code between init_request_from_bio and blk_rq_bio_prep These have very similar functions and should share code where possible. Signed-off-by: Neil Brown <neilb@suse.de> diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	66846572bf	Stop exporting blk_rq_bio_prep blk_rq_bio_prep is exported for use in exactly one place. That place can benefit from using the new blk_rq_append_bio instead. So - change dm-emc to call blk_rq_append_bio - stop exporting blk_rq_bio_prep, and - initialise rq_disk in blk_rq_bio_prep, as dm-emc needs it. Signed-off-by: Neil Brown <neilb@suse.de> diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	3001ca7712	New function blk_req_append_bio ll_back_merge_fn is currently exported to SCSI where is it used, together with blk_rq_bio_prep, in exactly the same way these functions are used in __blk_rq_map_user. So move the common code into a new function (blk_rq_append_bio), and don't export ll_back_merge_fn any longer. Signed-off-by: Neil Brown <neilb@suse.de> diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	5705f70217	Introduce rq_for_each_segment replacing rq_for_each_bio Every usage of rq_for_each_bio wraps a usage of bio_for_each_segment, so these can be combined into rq_for_each_segment. We define "struct req_iterator" to hold the 'bio' and 'index' that are needed for the double iteration. Signed-off-by: Neil Brown <neilb@suse.de> Various compile fixes by me... Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
NeilBrown	9dfa52831e	Merge blk_recount_segments into blk_recalc_rq_segments blk_recalc_rq_segments calls blk_recount_segments on each bio, then does some extra calculations to handle segments that overlap two bios. If we merge the code from blk_recount_segments into blk_recalc_rq_segments, we can process the whole request one bio_vec at a time, and not need the messy cross-bio calculations. Then blk_recount_segments can be implemented by calling blk_recalc_rq_segments, passing it a simple on-stack request which stores just the bio. Signed-off-by: Neil Brown <neilb@suse.de> diff .prev/block/ll_rw_blk.c ./block/ll_rw_blk.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-10-10 09:25:56 +02:00
Nick Piggin	dd941252a8	shared tag queue barrier comment Should add some comments for the tag barriers (they won't be so important if we can switch over to the explicit _lock bitops, but for now we should make it clear). Jens' original patch said a barrier after the test_and_clear_bit was also required. I can't see why (and it would prevent the use of the _lock bitop). Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> --	2007-09-14 13:56:47 -07:00
Jens Axboe	f3da54ba14	Fix race with shared tag queue maps There's a race condition in blk_queue_end_tag() for shared tag maps, users include stex (promise supertrak thingy) and qla2xxx. The former at least has reported bugs in this area, not sure why we haven't seen any for the latter. It could be because the window is narrow and that other conditions in the qla2xxx code hide this. It's a real bug, though, as the stex smp users can attest. We need to ensure two things - the tag bit clearing needs to happen AFTER we cleared the tag pointer, as the tag bit clearing/setting is what protects this map. Secondly, we need to ensure that the visibility of the tag pointer and tag bit clear are ordered properly. [ I removed the SMP barriers - "test_and_clear_bit()" already implies all the required barriers. -- Linus ] Also see http://bugzilla.kernel.org/show_bug.cgi?id=7842 Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-09-13 08:20:25 -07:00
Alan D. Brunelle	c7149d6bce	Fix remap handling by blktrace This patch provides more information concerning REMAP operations on block IOs. The additional information provides clearer details at the user level, and supports post-processing analysis in btt. o Adds in partition remaps on the same device. o Fixed up the remap information in DM to be in the right order o Sent up mapped-from and mapped-to device information Signed-off-by: Alan D. Brunelle <alan.brunelle@hp.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-08-11 22:34:48 +02:00
Jens Axboe	165125e1e4	[BLOCK] Get rid of request_queue_t typedef Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-24 09:28:11 +02:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Christoph Lameter	94f6030ca7	Slab allocators: Replace explicit zeroing with __GFP_ZERO kmalloc_node() and kmem_cache_alloc_node() were not available in a zeroing variant in the past. But with __GFP_ZERO it is possible now to do zeroing while allocating. Use __GFP_ZERO to remove the explicit clearing of memory via memset whereever we can. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-17 10:23:02 -07:00
FUJITA Tomonori	abae1fde63	add a struct request pointer to the request structure This adds a struct request pointer to the request structure for the second data phase (bidi for now). A request queue supporting bidi requests sets QUEUE_FLAG_BIDI. This prevents sending bidi requests to a non-bidi queue. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-16 08:52:46 +02:00
FUJITA Tomonori	4e2872d6b0	bind bsg to all SCSI devices This patch binds bsg to all SCSI devices (their request queues) like the current sg driver does. We can send SCSI commands to non disk and cdrom scsi devices like OSD via bsg. This patch removes bsg_register_queue from blk_register_queue so bsg devices aren't bound to non SCSI block devices. If they want bsg, I'll send a patch to do that. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-16 08:52:46 +02:00
FUJITA Tomonori	d351af01b9	bsg: bind bsg to request_queue instead of gendisk This patch binds bsg devices to request_queue instead of gendisk. Any objects (like transport entities) can define own request_handler and create own bsg device. Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-16 08:52:46 +02:00
Jens Axboe	3d6392cfbd	bsg: support for full generic block layer SG v3 Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-16 08:52:44 +02:00
Tejun Heo	f4b09303d0	[BLOCK] drop unnecessary bvec rewinding from flush_dry_bio_endio Barrier bios are completed twice - once after the barrier write itself is done and again after the whole sequence is complete. flush_dry_bio_endio() is for the first completion. It doesn't really complete the bio. It rewinds bvec and resets bio so that it can be completed again when the whole barrier sequence is complete. The bvec rewinding code has the following problems. 1. The rewinding code is wrong because filesystems may pass bvec with non zero bv_offset. 2. The block layer doesn't guarantee anything about the state of bvec array on request completion. bv_offset and len are updated iff __end_that_request_first() completes the bvec partially. Because of #2, #1 doesn't really matter (nobody cares whether bvec is re-wound correctly or not) but then again by not doing unwinding at all, we'll always give back the same bvec to the caller as full bvec completion doesn't alter bvecs and the final completion is always full completion. Drop unnecessary rewinding code. This is spotted by Neil Brown. Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:03:33 +02:00
Jens Axboe	32eef96411	blk_hw_contig_segment(): bad segment size checks Two bugs in there: - The virt oversize check should use the current bio hardware back size and the next bio front size, not the same bio. Spotted by Neil Brown. - The segment size check should add hw front sizes, not total bio sizes. Spotted by James Bottomley Acked-by: James Bottomley <James.Bottomley@SteelEye.com> Acked-by: NeilBrown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-07-10 08:03:32 +02:00
Tejun Heo	bc90ba093a	block: always requeue !fs requests at the front SCSI marks internal commands with REQ_PREEMPT and push it at the front of the request queue using blk_execute_rq(). When entering suspended or frozen state, SCSI devices are quiesced using scsi_device_quiesce(). In quiesced state, only REQ_PREEMPT requests are processed. This is how SCSI blocks other requests out while suspending and resuming. As all internal commands are pushed at the front of the queue, this usually works. Unfortunately, this interacts badly with ordered requeueing. To preserve request order on requeueing (due to busy device, active EH or other failures), requests are sorted according to ordered sequence on requeue if IO barrier is in progress. The following sequence deadlocks. 1. IO barrier sequence issues. 2. Suspend requested. Queue is quiesced with part or all of IO barrier sequence at the front. 3. During suspending or resuming, SCSI issues internal command which gets deferred and requeued for some reason. As the command is issued after the IO barrier in #1, ordered requeueing code puts the request after IO barrier sequence. 4. The device is ready to process requests again but still is in quiesced state and the first request of the queue isn't REQ_PREEMPT, so command processing is deadlocked - suspending/resuming waits for the issued request to complete while the request can't be processed till device is put back into running state by resuming. This can be fixed by always putting !fs requests at the front when requeueing. The following thread reports this deadlock. http://thread.gmane.org/gmane.linux.kernel/537473 Signed-off-by: Tejun Heo <htejun@gmail.com> Acked-by: David Greaves <david@dgreaves.com> Acked-by: Jeff Garzik <jeff@garzik.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-06-15 16:12:20 -07:00
Jens Axboe	f653c34dd3	ll_rw_blk: fix gcc 4.2 warning on current_io_context() current_io_context() is both static and exported with EXPORT_SYMBOL(). As there are no users outside of ll_rw_blk.c itself, just kill the export. Problem reported by Martin Michlmayr <tbm@cyrius.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-15 10:44:15 -07:00
Neil Brown	d89d87965d	When stacked block devices are in-use (e.g. md or dm), the recursive calls to generic_make_request can use up a lot of space, and we would rather they didn't. As generic_make_request is a void function, and as it is generally not expected that it will have any effect immediately, it is safe to delay any call to generic_make_request until there is sufficient stack space available. As ->bi_next is reserved for the driver to use, it can have no valid value when generic_make_request is called, and as __make_request implicitly assumes it will be NULL (ELEVATOR_BACK_MERGE fork of switch) we can be certain that all callers set it to NULL. We can therefore safely use bi_next to link pending requests together, providing we clear it before making the real call. So, we choose to allow each thread to only be active in one generic_make_request at a time. If a subsequent (recursive) call is made, the bio is linked into a per-thread list, and is handled when the active call completes. As the list of pending bios is per-thread, there are no locking issues to worry about. I say above that it is "safe to delay any call...". There are, however, some behaviours of a make_request_fn which would make it unsafe. These include any behaviour that assumes anything will have changed after a recursive call to generic_make_request. These could include: - waiting for that call to finish and call it's bi_end_io function. md use to sometimes do this (marking the superblock dirty before completing a write) but doesn't any more - inspecting the bio for fields that generic_make_request might change, such as bi_sector or bi_bdev. It is hard to see a good reason for this, and I don't think anyone actually does it. - inspecing the queue to see if, e.g. it is 'full' yet. Again, I think this is very unlikely to be useful, or to be done. Signed-off-by: Neil Brown <neilb@suse.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: <dm-devel@redhat.com> Alasdair G Kergon <agk@redhat.com> said: I can see nothing wrong with this in principle. For device-mapper at the moment though it's essential that, while the bio mappings may now get delayed, they still get processed in exactly the same order as they were passed to generic_make_request(). My main concern is whether the timing changes implicit in this patch will make the rare data-corrupting races in the existing snapshot code more likely. (I'm working on a fix for these races, but the unfinished patch is already several hundred lines long.) It would be helpful if some people on this mailing list would test this patch in various scenarios and report back. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-05-11 13:28:37 +02:00
Linus Torvalds	9a9136e270	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial * git://git.kernel.org/pub/scm/linux/kernel/git/bunk/trivial: (25 commits) sound: convert "sound" subdirectory to UTF-8 MAINTAINERS: Add cxacru website/mailing list include files: convert "include" subdirectory to UTF-8 general: convert "kernel" subdirectory to UTF-8 documentation: convert the Documentation directory to UTF-8 Convert the toplevel files CREDITS and MAINTAINERS to UTF-8. remove broken URLs from net drivers' output Magic number prefix consistency change to Documentation/magic-number.txt trivial: s/i_sem /i_mutex/ fix file specification in comments drivers/base/platform.c: fix small typo in doc misc doc and kconfig typos Remove obsolete fat_cvf help text Fix occurrences of "the the " Fix minor typoes in kernel/module.c Kconfig: Remove reference to external mqueue library Kconfig: A couple of grammatical fixes in arch/i386/Kconfig Correct comments in genrtc.c to refer to correct /proc file. Fix more "deprecated" spellos. Fix "deprecated" typoes. ... Fix trivial comment conflict in kernel/relay.c.	2007-05-09 12:54:17 -07:00
Rafael J. Wysocki	8bb7844286	Add suspend-related notifications for CPU hotplug Since nonboot CPUs are now disabled after tasks and devices have been frozen and the CPU hotplug infrastructure is used for this purpose, we need special CPU hotplug notifications that will help the CPU-hotplug-aware subsystems distinguish normal CPU hotplug events from CPU hotplug events related to a system-wide suspend or resume operation in progress. This patch introduces such notifications and causes them to be used during suspend and resume transitions. It also changes all of the CPU-hotplug-aware subsystems to take these notifications into consideration (for now they are handled in the same way as the corresponding "normal" ones). [oleg@tv-sign.ru: cleanups] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Pavel Machek <pavel@ucw.cz> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Oleg Nesterov	28e53bddf8	unify flush_work/flush_work_keventd and rename it to cancel_work_sync flush_work(wq, work) doesn't need the first parameter, we can use cwq->wq (this was possible from the very beginnig, I missed this). So we can unify flush_work_keventd and flush_work. Also, rename flush_work() to cancel_work_sync() and fix all callers. Perhaps this is not the best name, but "flush_work" is really bad. (akpm: this is why the earlier patches bypassed maintainers) Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Tejun Heo <htejun@gmail.com> Cc: Auke Kok <auke-jan.h.kok@intel.com>, Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:53 -07:00
Andrew Morton	19a75d83ff	kblockd: use flush_work Switch the kblockd flushing from a global flush to a more specific flush_work(). (akpm: bypassed maintainers, sorry. There are other patches which depend on this) Cc: "Maciej W. Rozycki" <macro@linux-mips.org> Cc: David Howells <dhowells@redhat.com> Cc: Jens Axboe <axboe@suse.de> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:51 -07:00
Michael Opdenacker	59c51591a0	Fix occurrences of "the the " Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2007-05-09 08:57:56 +02:00
Mike Christie	821de3a27b	[PATCH] ll_rw_blk: fix missing bounce in blk_rq_map_kern() I think we might just need the blk_map_kern users now. For the async execute I added the bounce code already and the block SG_IO has it atleady. I think the blk_map_kern bounce code got dropped because we thought the correct gfp_t would be passed in. But I think all we need is the patch below and all the paths are take care of. The patch is not tested. Patch was made against scsi-misc. The last place that is sending non sg commands may just be md/dm-emc.c but that is is just waiting on alasdair to take some patches that fix that and a bunch of junk in there including adding bounce support. If the patch below is ok though and dm-emc finally gets converted then it will have sg and bonce buffer support. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-05-08 19:12:23 +02:00
Linus Torvalds	4f7a307dc6	Merge master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6 * master.kernel.org:/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (87 commits) [SCSI] fusion: fix domain validation loops [SCSI] qla2xxx: fix regression on sparc64 [SCSI] modalias for scsi devices [SCSI] sg: cap reserved_size values at max_sectors [SCSI] BusLogic: stop using check_region [SCSI] tgt: fix rdma transfer bugs [SCSI] aacraid: fix aacraid not finding device [SCSI] aacraid: Correct SMC products in aacraid.txt [SCSI] scsi_error.c: Add EH Start Unit retry [SCSI] aacraid: [Fastboot] Panics for AACRAID driver during 'insmod' for kexec test. [SCSI] ipr: Driver version to 2.3.2 [SCSI] ipr: Faster sg list fetch [SCSI] ipr: Return better qc_issue errors [SCSI] ipr: Disrupt device error [SCSI] ipr: Improve async error logging level control [SCSI] ipr: PCI unblock config access fix [SCSI] ipr: Fix for oops following SATA request sense [SCSI] ipr: Log error for SAS dual path switch [SCSI] ipr: Enable logging of debug error data for all devices [SCSI] ipr: Add new PCI-E IDs to device table ...	2007-05-05 13:30:44 -07:00
Jens Axboe	4e521c27ee	ll_rw_blk: add io_context private pointer To be used by as/cfq as they see fit. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-04-30 09:01:23 +02:00
Alan Stern	44ec95425c	[SCSI] sg: cap reserved_size values at max_sectors This patch (as857) modifies the SG_GET_RESERVED_SIZE and SG_SET_RESERVED_SIZE ioctls in the sg driver, capping the values at the device's request_queue's max_sectors value. This will permit cdrecord to obtain a legal value for the maximum transfer length, fixing Bugzilla #7026. The patch also caps the initial reserved_size value. There's no reason to have a reserved buffer larger than max_sectors, since it would be impossible to use the extra space. The corresponding ioctls in the block layer are modified similarly, and the initial value for the reserved_size is set as large as possible. This will effectively make it default to max_sectors. Note that the actual value is meaningless anyway, since block devices don't have a reserved buffer. Finally, the BLKSECTGET ioctl is added to sg, so that there will be a uniform way for users to determine the actual max_sectors value for any raw SCSI transport. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Jens Axboe <jens.axboe@oracle.com> Acked-by: Douglas Gilbert <dougg@torque.net> Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>	2007-04-17 18:09:56 -04:00
Vasily Tarasov	f772b3d9ca	block: blk_max_pfn is somtimes wrong There is a small problem in handling page bounce. At the moment blk_max_pfn equals max_pfn, which is in fact not maximum possible _number_ of a page frame, but the _amount_ of page frames. For example for the 32bit x86 node with 4Gb RAM, max_pfn = 0x100000, but not 0xFFFF. request_queue structure has a member q->bounce_pfn and queue needs bounce pages for the pages _above_ this limit. This routine is handled by blk_queue_bounce(), where the following check is produced: if (q->bounce_pfn >= blk_max_pfn) return; Assume, that a driver has set q->bounce_pfn to 0xFFFF, but blk_max_pfn equals 0x10000. In such situation the check above fails and for each bio we always fall down for iterating over pages tied to the bio. I want to notice, that for quite a big range of device drivers (ide, md, ...) such problem doesn't happen because they use BLK_BOUNCE_ANY for bounce_pfn. BLK_BOUNCE_ANY is defined as blk_max_pfn << PAGE_SHIFT, and then the check above doesn't fail. But for other drivers, which obtain reuired value from drivers, it fails. For example sata_nv uses ATA_DMA_MASK or dev->dma_mask. I propose to use (max_pfn - 1) for blk_max_pfn. And the same for blk_max_low_pfn. The patch also cleanses some checks related with bounce_pfn. Signed-off-by: Vasily Tarasov <vtaras@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2007-03-27 08:52:47 +02:00
Neil Brown	387bb17374	[PATCH] md: fix various bugs with aligned reads in RAID5 It is possible for raid5 to be sent a bio that is too big for an underlying device. So if it is a READ that we pass stright down to a device, it will fail and confuse RAID5. So in 'chunk_aligned_read' we check that the bio fits within the parameters for the target device and if it doesn't fit, fall back on reading through the stripe cache and making lots of one-page requests. Note that this is the earliest time we can check against the device because earlier we don't have a lock on the device, so it could change underneath us. Also, the code for handling a retry through the cache when a read fails has not been tested and was badly broken. This patch fixes that code. Signed-off-by: Neil Brown <neilb@suse.de> Cc: "Kai" <epimetreus@fastmail.fm> Cc: <stable@suse.de> Cc: <org@suse.de> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-09 09:25:46 -08:00
Randy Dunlap	af9997e426	[PATCH] fix kernel-doc warnings in 2.6.20-rc1 Fix kernel-doc warnings in 2.6.20-rc1. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-22 08:55:47 -08:00
Jens Axboe	8e5cfc45e7	[PATCH] Fixup blk_rq_unmap_user() API The blk_rq_unmap_user() API is not very nice. It expects the caller to know that rq->bio has to be reset to the original bio, and it will silently do nothing if that is not done. Instead make it explicit that we need to pass in the first bio, by expecting a bio argument. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-19 11:12:46 +01:00
Jens Axboe	48785bb9fa	[PATCH] __blk_rq_unmap_user() fails to return error If the bio is user copied, the copy back could return -EFAULT. Make sure we return any error seen during unmapping. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-19 11:07:59 +01:00
Jens Axboe	9c9381f942	[PATCH] __blk_rq_map_user() doesn't need to grab the queue_lock It was for driver private back_merge_fn hooks, but they don't exist anymore. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-19 08:34:17 +01:00
Jens Axboe	1aa4f24fe9	[PATCH] Remove queue merging hooks We have full flexibility of merging parameters now, so we can remove the hooks that define back/front/request merge strategies. Nobody is using them anymore. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-19 08:33:11 +01:00
Jens Axboe	2985259b0e	[PATCH] ->nr_sectors and ->hard_nr_sectors are not used for BLOCK_PC requests It's a file system thing, for block requests the only size used in the io paths is ->data_len as it is in bytes, not sectors. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-19 08:27:31 +01:00
Jens Axboe	7749a8d423	[PATCH] Propagate down request sync flag We need to do this, otherwise the io schedulers don't get access to the sync flag. Then they cannot tell the difference between a regular write and an O_DIRECT write, which can cause a performance loss. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-13 13:02:26 +01:00
Boaz Harrosh	2b02a17920	[PATCH] remove blk_queue_activity_fn While working on bidi support at struct request level I have found that blk_queue_activity_fn is actually never used. The only user is in ide-probe.c with this code: /* enable led activity for disk drives only */ if (drive->media == ide_disk && hwif->led_act) blk_queue_activity_fn(q, hwif->led_act, drive); And led_act is never initialized anywhere. (Looking back at older kernels it was used in the PPC arch, but was removed around 2.6.18) Unless it is all for future use off course. (this patch is against linux-2.6-block.git as off 2006/12/4) Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-12 10:22:23 +01:00
Andrew Morton	faccbd4b26	[PATCH] io-accounting: read accounting Wire up read accounting for block devices, within submit_bio(). Cc: Jay Lan <jlan@sgi.com> Cc: Shailabh Nagar <nagar@watson.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Chris Sturtivant <csturtiv@sgi.com> Cc: Tony Ernst <tee@sgi.com> Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net> Cc: David Wright <daw@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-10 09:55:41 -08:00
Akinobu Mita	c17bb49517	[PATCH] fault-injection capability for disk IO This patch provides fault-injection capability for disk IO. Boot option: fail_make_request=<probability>,<interval>,<space>,<times> <interval> -- specifies the interval of failures. <probability> -- specifies how often it should fail in percent. <space> -- specifies the size of free space where disk IO can be issued safely in bytes. <times> -- specifies how many times failures may happen at most. Debugfs: /debug/fail_make_request/interval /debug/fail_make_request/probability /debug/fail_make_request/specifies /debug/fail_make_request/times Example: fail_make_request=10,100,0,-1 echo 1 > /sys/blocks/hda/hda1/make-it-fail generic_make_request() on /dev/hda1 fails once per 10 times. Cc: Jens Axboe <axboe@suse.de> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:29:02 -08:00
Ingo Molnar	0231606785	[PATCH] hotplug CPU: clean up hotcpu_notifier() use There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn, prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus generating compiler warnings of unused symbols, hence forcing people to add #ifdefs. the compiler can skip truly unused functions just fine: text data bss dec hex filename 1624412 728710 3674856 6027978 5bfaca vmlinux.before 1624412 728710 3674856 6027978 5bfaca vmlinux.after [akpm@osdl.org: topology.c fix] Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:39 -08:00
Christoph Lameter	e18b890bb0	[PATCH] slab: remove kmem_cache_t Replace all uses of kmem_cache_t with struct kmem_cache. The patch was generated using the following script: #!/bin/sh # # Replace one string by another in all the kernel sources. # set -e for file in `find * -name ".c" -o -name ".h"\|xargs grep -l $1`; do quilt add $file sed -e "1,\$s/$1/$2/g" $file >/tmp/$$ mv /tmp/$$ $file quilt refresh done The script was run like this sh replace kmem_cache_t "struct kmem_cache" Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-07 08:39:25 -08:00
David Howells	4c1ac1b491	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 14:37:56 +00:00
Mike Christie	0e75f9063f	[PATCH] block: support larger block pc requests This patch modifies blk_rq_map/unmap_user() and the cdrom and scsi_ioctl.c users so that it supports requests larger than bio by chaining them together. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2006-12-01 10:40:55 +01:00

1 2 3

130 Commits