Convert sata_via to new EH. vt6420 used ATA_FLAG_SRST while vt6421
used ATA_FLAG_SATA_RESET. This difference seems to be an accident
rather than intended. This patch makes both flavors use
ata_bmdma_error_handler() which makes use of both SRST and SATA
hardreset. This behavior change is intended and if it breaks
anything, it should be very easy to spot.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
We don't need to use the heavier spin lock in the irq handler.
It's quite possible we can do this in nv_generic_interrupt() as well,
but I didn't take the time to pursue that train of thought.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
nf2/3 and ck804 have an irq status register. Implement a better irq
handler for those flavors of nv. This patch makes different flavors
of nv controllers use different irq handlers by using a separate
port_info for each flavor.
This change also makes the following EH and hotplug updates easier to
integrate.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Simplify interrupt constants and make NFORCE3 equal to NFORCE2.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
nv_host_desc and nv_host are used to discern different generations of
nv controllers. Kill those. New EH/hotplug implementation will use
standard port_info/ata_port_operations for that.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
sata_nv contained hotplug code which was mainly for demonstrating how
a hotplug event is handled. This patch kills the demo code in
preparation for the real hotplug implementation.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Reflect the fact that the Cell Broadband Engine supports 64k
pages by adding the bit to the CPU features.
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The page size encoding passed to tlbie is incorrect for new-style
large pages. This fixes it. This doesn't affect anything on older
machines because mmu_psize_defs[psize].penc (the page size encoding)
is 0 for 4k and 16M pages (the two are distinguished by a separate "is
a large page" bit).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Arnd Bergmann <arnd.bergmann@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
arm_timer() checks PF_EXITING to prevent BUG_ON(->exit_state)
in run_posix_cpu_timers().
However, for some reason it does so only for the CPUCLOCK_PERTHREAD
case (which is imho wrong).
Also, this check is not reliable: PF_EXITING could be set on
another cpu without any locks/barriers just after the check,
so it can't prevent the timer from being attached to the exiting
task.
The previous patch makes this check unneeded.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
do_exit() clears ->it_##clock##_expires, but nothing prevents
another cpu from attaching the timer to the exiting process after
that. arm_timer() tries to protect against this race, but the check
is racy.
After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
before do_exit() calls 'schedule()', the local timer interrupt can
find tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
does sys_wait4), the interrupted task has ->signal == NULL.
At this point the exiting task has no pending cpu timers; they were
cleaned up in __exit_signal()->posix_cpu_timers_exit{,_group}(),
so we can just return from the irq.
John Stultz recently confirmed this bug, see
http://marc.theaimsgroup.com/?l=linux-kernel&m=115015841413687
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the local timer interrupt happens just after do_exit() sets PF_EXITING
(and before it clears ->it_xxx_expires) run_posix_cpu_timers() will call
check_process_timers() with tasklist_lock + ->siglock held and
check_process_timers:
	t = tsk;
	do {
		....
		do {
			t = next_thread(t);
		} while (unlikely(t->flags & PF_EXITING));
	} while (t != tsk);
the outer loop will never stop.
Actually, the window is bigger. Another process can attach the timer
after ->it_xxx_expires was cleared (see the next commit) and the 'if
(PF_EXITING)' check in arm_timer() is racy (see the one after that).
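The non-terminating walk can be reproduced in userspace with a small
model. This is only a sketch: struct task, the PF_EXITING value and
walk_terminates() are hypothetical stand-ins, and a step budget is added
so the demonstration itself always returns.

```c
#include <stddef.h>

#define PF_EXITING 0x4	/* illustrative value, not taken from kernel headers */

/* Hypothetical stand-in for task_struct: threads form a circular list. */
struct task {
	unsigned int flags;
	struct task *next;
};

static struct task *next_thread(struct task *t)
{
	return t->next;
}

/*
 * Walk the thread group the way the loop quoted above does, but give up
 * after max_steps calls to next_thread() so the sketch always terminates.
 * Returns 1 if the walk got back to tsk, 0 if the step budget ran out.
 */
static int walk_terminates(struct task *tsk, unsigned long max_steps)
{
	struct task *t = tsk;
	unsigned long steps = 0;

	do {
		do {
			if (steps++ >= max_steps)
				return 0;
			t = next_thread(t);
		} while (t->flags & PF_EXITING);
	} while (t != tsk);

	return 1;
}
```

With three threads and none of them exiting, the walk returns to tsk
after three steps. Once tsk itself has PF_EXITING set, the inner loop
always skips over it, so 't != tsk' stays true and only the step budget
ends the walk.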
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of fixes that should prevent crashes when using netconsole and
suspend/resume. First, netconsole poll routine shouldn't run unless the
device is up; second, the NAPI poll should be disabled during suspend.
This is only an issue on sky2, because it has to have one NAPI poll
routine for both ports on dual port boards. Normal drivers use
netif_rx_schedule_prep and that checks for netif_running.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If get_user_pages() returns fewer pages than we asked for, we jump
to out_unmap which will return ERR_PTR(ret). But ret can contain a
positive number just smaller than local_nr_pages, so be sure to set it
to -EFAULT always.
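A minimal userspace sketch of the rule, assuming a hypothetical helper
pin_result_to_errno() that is not part of the kernel: a short count is
never handed on as if it were an error code.

```c
#include <errno.h>

/*
 * Hypothetical helper: given how many pages were requested and what the
 * pinning call returned, produce the error code for the failure path.
 * Returns 0 when every page was pinned, -EFAULT otherwise.
 */
static int pin_result_to_errno(int requested, int pinned)
{
	if (pinned == requested)
		return 0;
	/*
	 * pinned may be a positive count smaller than requested. Encoding
	 * that with ERR_PTR() would not look like an error to the caller,
	 * so the failure path always reports -EFAULT.
	 */
	return -EFAULT;
}
```

For example, pin_result_to_errno(8, 3) yields -EFAULT rather than 3.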
Problem found and diagnosed by Damien Le Moal <damien@sdl.hitachi.co.jp>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some time ago the cdrom open routine was changed so that we call the
driver's open routine before checking to see if it is read only. However,
if we discovered that a read write open was not possible and the open
flags required a writable open, we just returned -EROFS without calling
the driver's release routine. This seems to work for most cdrom drivers,
but breaks the PowerPC iSeries virtual cdrom rather badly.
This just inserts the release call in the error path to balance the call
to "->open()" done by "open_for_data()".
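The balancing can be sketched in userspace as follows. struct
demo_drive and the demo_*() functions are hypothetical and only model
the open/release pairing; they are not the cdrom layer's interfaces.

```c
#include <errno.h>

/* Hypothetical drive with a counter that tracks open/release balance. */
struct demo_drive {
	int read_only;
	int open_count;
};

static int demo_open(struct demo_drive *d)
{
	d->open_count++;
	return 0;
}

static void demo_release(struct demo_drive *d)
{
	d->open_count--;
}

/* Open the drive; fail with -EROFS if a writable open is not possible. */
static int demo_cdrom_open(struct demo_drive *d, int want_write)
{
	int ret = demo_open(d);

	if (ret)
		return ret;

	if (want_write && d->read_only) {
		demo_release(d);	/* balance the demo_open() above */
		return -EROFS;
	}
	return 0;
}
```

After a failed writable open of a read-only drive, open_count is back at
zero, which is the invariant a driver like the iSeries one depends on.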
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().
Just memset the entire cic instead of filling individual values
independently.
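A userspace sketch of the idea, assuming a hypothetical struct demo_cic
whose layout is illustrative rather than the real cfq_io_context:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical subset of the per-context seek statistics. */
struct demo_cic {
	unsigned int seek_samples;
	unsigned long long seek_total;
	long long seek_mean;
};

/* Allocate a context with every field zeroed, including any added later. */
static struct demo_cic *demo_alloc_cic(void)
{
	struct demo_cic *cic = malloc(sizeof(*cic));

	if (cic == NULL)
		return NULL;

	/* One memset instead of per-field assignments that can miss a field. */
	memset(cic, 0, sizeof(*cic));
	return cic;
}
```

Zeroing the whole structure means a newly added field cannot start out
holding whatever the allocator left behind.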
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If flock_lock_file() fails to allocate a flock with locks_alloc_lock()
then "error = 0" is returned. It needs to return a non-zero error
instead.
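The failure path might look like the sketch below. The demo_*() names
are hypothetical, and -ENOMEM is this sketch's assumption: the message
above only asks for a non-zero return.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct file_lock. */
struct demo_lock {
	int type;
};

/* Hypothetical allocator; fail_alloc forces the failure path for the demo. */
static struct demo_lock *demo_alloc_lock(int fail_alloc)
{
	return fail_alloc ? NULL : calloc(1, sizeof(struct demo_lock));
}

/* Returns 0 on success and a negative errno on failure. */
static int demo_lock_file(int type, int fail_alloc)
{
	int error = 0;
	struct demo_lock *new_fl = demo_alloc_lock(fail_alloc);

	if (new_fl == NULL) {
		error = -ENOMEM;	/* this path used to leave error == 0 */
		goto out;
	}
	new_fl->type = type;
	/* ... the lock would be inserted into the lock list here ... */
	free(new_fl);	/* the demo does not keep the lock around */
out:
	return error;
}
```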
Signed-off-by: Pavel Emelianov <xemul@openvz.org>
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The resume bug was caused not by an early interrupt but because the idle
timeout was not being stopped on suspend. Also disable hardware IRQs
on suspend. Will need to revisit this with hotplug?
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The hardware should be fully shut off during suspend, and the base
irq mask restored during resume.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the poll routine detects no hardware available, it needs to dequeue
itself from the network poll list. Linus didn't understand NAPI.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
It is cleaner not to loop over both ports if only one exists.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The set power state function is cleaner if it doesn't return anything.
The only caller that could fail is in suspend() and it can check the argument
there.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Randy Dunlap <rdunlap@xenotime.net>
According to include/asm-alpha/bitops.h, only ALPHA_EV67 has hardware
hweight support, so ALPHA_EV6 needs to use GENERIC_HWEIGHT.
Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Ernst Herzberg <earny@net4u.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
shmem_rmdir() must undo the increment of i_nlink done in
shmem_get_inode() for directories, otherwise at least
IN_DELETE_SELF inotify event generation is broken.
Signed-off-by: Sergey Vlasov <vsu@altlinux.ru>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
I noticed a strange behavior in a tmpfs file system the other day, while
building packages - occasionally, and seemingly at random, make decided to
rebuild a target. However, only on tmpfs.
A file would be created, and if checked, it had a sub-second timestamp.
However, after a utimes-related call where sub-seconds should be set, they
were zeroed instead. In the case that a file was created, and utimes(...,NULL)
was used on it in the same second, the timestamp on the file moved backwards.
After some digging, I found that this was being caused by tmpfs not having a
time granularity set, thus inheriting the default 1 second granularity.
Hugh adds: yes, we missed tmpfs when the s_time_gran mods went into 2.6.11.
Unfortunately, the granularity of CURRENT_TIME, often used in filesystems,
does not match the default granularity set by alloc_super. A few more such
discrepancies have been found, but this is the most important to fix now.
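The effect of the granularity can be sketched in userspace. demo_trunc()
is a hypothetical helper written for this illustration; it rounds a
timestamp down to a multiple of gran nanoseconds.

```c
#include <time.h>

/*
 * Round ts down to a multiple of gran nanoseconds.
 * gran == 1 keeps full nanosecond resolution; gran >= 1000000000
 * (one second) drops the sub-second part entirely.
 */
static struct timespec demo_trunc(struct timespec ts, long gran)
{
	if (gran >= 1000000000L)
		ts.tv_nsec = 0;
	else if (gran > 1)
		ts.tv_nsec -= ts.tv_nsec % gran;
	return ts;
}
```

With a one second granularity, a timestamp of 100.123456789 becomes
100.000000000. A file stamped with the untruncated current time and then
restamped through a truncating path within the same second therefore
appears to move backwards, which is what make tripped over.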
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This fixes two independent problems: it would not save the PCI state on
suspend (and thus try to resume a nonexistent state on resume), and
while shut off, if an interrupt happened on the same shared irq, the irq
handler would react very badly to the interrupt status being an invalid
all-ones state.
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
For a legacy ATA controller, libata registers two separate host sets.
There was no connection between the two hosts making it impossible to
traverse all ports related to the controller. This patch adds
host_set->next which points to the second host_set and makes
ata_pci_remove_one() remove all associated host_sets.
* On device removal, all ports hanging off the device are properly
detached. Prior to this patch, ports on the first host_set weren't
detached, causing an oops on driver unloading.
* On device removal, both host_sets are properly freed.
This will also be used by new power management code to suspend and
resume all ports of a controller. host_set/port representation will
be improved to handle legacy controllers better and this host_set
linking will go away with it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Depending on timing, ata_scsi_dev_rescan() might encounter a device
which is enabled but not yet attached to an sdev. In such cases, the
original code caused an oops. This patch makes ata_scsi_dev_rescan()
rescan only devices which are attached to sdevs.
While at it, properly indent leading comment and add description about
how it's synchronized with sdev attach/detach.
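The check can be sketched in userspace as below. struct demo_dev,
struct demo_sdev and demo_rescan() are hypothetical and only model the
"enabled but not yet attached" case; locking is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the SCSI device and the ATA device. */
struct demo_sdev {
	int rescans;
};

struct demo_dev {
	int enabled;
	struct demo_sdev *sdev;	/* NULL until the device is attached */
};

/* Rescan every enabled device that has an sdev; return how many were hit. */
static int demo_rescan(struct demo_dev *devs, size_t n)
{
	int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Skipping sdev == NULL avoids the NULL dereference. */
		if (!devs[i].enabled || devs[i].sdev == NULL)
			continue;
		devs[i].sdev->rescans++;
		count++;
	}
	return count;
}
```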
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
SIEN on some 3112 controllers doesn't mask SATA IRQ properly. IRQ
stays asserted even after SIEN is masked and IRQ is acked. Also, even
while frozen, any SATA PHY event including hardreset raises SATA IRQ.
Clearing SError seems to be the only way to deassert SATA IRQ.
This patch makes sil_host_intr() clear SError on SATA IRQs and ignore
SATA IRQs reported while frozen so that hardreset doesn't trigger
hotplug event (which ends up hardresetting again).
In such cases, the port still gets re-frozen to minimize the danger of
screaming interrupts. This results in one nil EH repeat on
controllers with broken SIEN but other than that does no harm.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
ATA_EH_REVALIDATE should be cleared after all devices on the target
port have been revalidated. Fix ata_eh_revalidate_and_attach()
accordingly.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Short-circuit interrupt handling if BMDMA2 is reported as 0xffffffff
indicating device removal.
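A sketch of the short-circuit, with hypothetical names and return
values; what the real handler reports for this case is not stated above.

```c
#include <stdint.h>

enum {
	DEMO_IRQ_NONE = 0,	/* nothing for us to do */
	DEMO_IRQ_HANDLED = 1
};

/*
 * status is the value read from the interrupt status register.
 * A removed device reads back as all ones, which must not be decoded
 * as "every event bit is set".
 */
static int demo_interrupt(uint32_t status, unsigned int *events)
{
	if (status == 0xffffffffu)
		return DEMO_IRQ_NONE;	/* device is gone, bail out early */

	if (status == 0)
		return DEMO_IRQ_NONE;	/* not our interrupt */

	(*events)++;			/* stand-in for real event handling */
	return DEMO_IRQ_HANDLED;
}
```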
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
From: Aki M Nyrhinen <anyrhine@cs.helsinki.fi>
IMHO the current fix to the problem (in_flight underflow in reno)
is incorrect. it treats the symptoms but ignores the problem. the
problem is timing out packets other than the head packet when we
don't have sack. i'll try to explain (sorry if explaining the obvious).
with sack, scanning the retransmit queue for timed out packets is
fine because we know which packets in our retransmit queue have been
acked by the receiver.
without sack, we know only how many packets in our retransmit queue the
receiver has acknowledged, but have no idea which packets.
think of a "typical" slow-start overshoot case, where for example
every third packet in a window gets lost because a router buffer gets
full.
with sack, we check for timeouts on those every third packet (as the
rest have been sacked). the packet counting works out and if there
is no reordering, we'll retransmit exactly the packets that were
lost.
without sack, however, we check for timeout on every packet and end up
retransmitting consecutive packets in the retransmit queue. in our
slow-start example, 2/3 of those retransmissions are unnecessary. these
unnecessary retransmissions eat the congestion window and eventually
prevent fast recovery from continuing, if enough packets were lost.
Signed-off-by: David S. Miller <davem@davemloft.net>
A soft lockup existed in the handling of ack vector records.
Specifically, when a tail of the list of ack vector records was
removed, it was possible to end up iterating infinitely on an element
of the tail.
Signed-off-by: Andrea Bittau <a.bittau@cs.ucl.ac.uk>
Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
People have been reporting that PPP connections over ptys, such as
used with PPTP, will hang randomly when transferring large amounts of
data, for instance in http://bugzilla.kernel.org/show_bug.cgi?id=6530.
I have managed to reproduce the problem, and the patch below fixes the
actual cause.
The problem is not in fact in ppp_async.c but in n_tty.c. What
happens is that when pptp reads from the pty, we call read_chan() in
drivers/char/n_tty.c on the master side of the pty. That copies all
the characters out of its buffer to userspace and then calls
check_unthrottle(), which calls the pty unthrottle routine, which
calls tty_wakeup on the slave side, which calls ppp_asynctty_wakeup,
which calls tasklet_schedule. So far so good. Since we are in
process context, the tasklet runs immediately and calls
ppp_async_process(), which calls ppp_async_push, which calls the
tty->driver->write function to send some more output.
However, tty->driver->write() returns zero, because the master
tty->receive_room is still zero. We haven't returned from
check_unthrottle() yet, and read_chan() only updates tty->receive_room
_after_ calling check_unthrottle. That means that the driver->write
call in ppp_async_process() returns 0. That would be fine if we were
going to get a subsequent wakeup call, but we aren't (we just had it,
and the buffer is now empty).
The solution is for n_tty.c to update tty->receive_room _before_
calling the driver unthrottle routine. The patch below does this.
With this patch I was able to transfer a 900MB file over a PPTP
connection (taking about 25 minutes), whereas without the patch the
connection would always stall in under a minute.
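The ordering problem can be modelled in userspace. This is a sketch
under stated assumptions: struct demo_tty and the demo_*() functions are
hypothetical, and the writer is reduced to "push as much as the reader
currently advertises".

```c
/* Hypothetical model of the master side of the pty. */
struct demo_tty {
	int receive_room;	/* space the reader advertises to the writer */
	int bytes_accepted;	/* what the writer pushed during its wakeup */
};

/* Writer side: runs from the wakeup and gets no second chance. */
static void demo_writer_wakeup(struct demo_tty *tty, int pending)
{
	int n = pending < tty->receive_room ? pending : tty->receive_room;

	tty->bytes_accepted += n;
	tty->receive_room -= n;
}

/*
 * Reader drained its buffer of buf_size bytes. update_first selects
 * whether receive_room is refreshed before (fixed) or after (old
 * behaviour) the writer is woken.
 */
static void demo_reader_drained(struct demo_tty *tty, int buf_size,
				int pending, int update_first)
{
	if (update_first) {
		tty->receive_room = buf_size;
		demo_writer_wakeup(tty, pending);
	} else {
		demo_writer_wakeup(tty, pending);
		tty->receive_room = buf_size;
	}
}
```

With the old ordering the writer sees receive_room == 0 and accepts
nothing, and since no further wakeup follows, the link stalls. With the
new ordering the pending bytes go out during the same wakeup.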
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>