* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: fix reservations in btrfs_page_mkwrite
Btrfs: advance window_start if we're using a bitmap
btrfs: mask out gfp flags in releasepage
Btrfs: fix enospc error caused by wrong checks of the chunk
Btrfs: do not defrag a file partially
Btrfs: fix warning for 32-bit build of fs/btrfs/check-integrity.c
Btrfs: use cluster->window_start when allocating from a cluster bitmap
Btrfs: Check for NULL page in extent_range_uptodate
btrfs: Fix busyloops in transaction waiting code
Btrfs: make sure a bitmap has enough bytes
Btrfs: fix uninit warning in backref.c
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm: (31 commits)
ARM: 7304/1: ioremap: fix boundary check when reusing static mapping
ARM: 7301/1: Rename the T() macro to TUSER() to avoid namespace conflicts
ARM: 7299/1: ftrace: clear zero bit in reported IPs for Thumb-2
ARM: 7298/1: realview: fix mapping of MPCore private memory region
PCMCIA: fix sa1111 oops on remove
ARM: 7288/1: mach-sa1100: add missing module_init() call
ARM: 7297/1: smp_twd: make sure timer is stopped before registering it
ARM: 7296/1: proc-v7.S: remove HARVARD_CACHE preprocessor guards
ARM: 7295/1: cortex-a7: move proc_info out of !CONFIG_ARM_LPAE block
ARM: 7293/1: logical_cpu_map: decouple CPU mapping from SMP
ARM: 7291/1: cache: assume 64-byte L1 cachelines for ARMv7 CPUs
ARM: 7290/1: vmlinux.lds.S: align the exception fixup table to a 4-byte boundary
ARM: 7289/1: vmlinux.lds.S: do not hardcode cacheline size as 32 bytes
MFD: ucb1x00-ts: fix resume failure
MFD: ucb1x00-core: fix gpiolib direction_output handling
MFD: ucb1x00-core: fix missing restore of io output data on resume
MFD: mcp-core: fix mcp_priv() to be more type safe
MFD: mcp-core: fix complaints from the genirq layer
Revert "ARM: sa11x0: Implement autoloading of codec and codec pdata for mcp bus."
Revert "ARM: sa1100: Refactor mcp-sa11x0 to use platform resources."
...
Fix up conflict due to arch/arm/mach-mx5/Kconfig having been merged into
mach-imx5 (commit 784a90c0a7: "ARM i.MX: Merge i.MX5 support into
mach-imx"), but the ARM_L1_CACHE_SHIFT_6 entry was moved to be driven by
the CPU_V7 logic from it in the old location in rmk's branch (commit
a092f2b153: "ARM: 7291/1: cache: assume 64-byte L1 cachelines for
ARMv7 CPUs").
Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
arm-soc fixes for 3.3-rc:
AT91 needed reset fixes which resulted in some minor code refactoring,
it also adds a feature-removal for one of their platforms for 3.4.
The USB patches have been acked by Greg K-H.
i.MX and ux500 both have some minor fixes, nothing controversial.
* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
arch/arm/mach-imx/mach-mx53_ard.c: add missing iounmap
ARM: imx: iomux-v1.h: Fix build error due to __init annotation
ARM: at91: Fix at91sam9g45 and at91cap9 reset
ARM: at91: make rstc soc independent
ARM: at91: introduce AT91_SAM9_ALT_RESET to select the at91sam9 alternative reset
ARM: at91: merge at91cap9_ddrsdr.h in at91sam9_ddrsdr.h
ARM: at91: fix cap9 ddrsdr register
ARM/USB: at91/ohci-at91: rename vbus_pin_inverted to vbus_pin_active_low
USB: at91: fix clk_get error handling
ARM: at91: removal of CAP9 SoC family
ARM: at91: fix at91rm9200 soc subtype handling
mach-ux500: no MMC_CAP_SD_HIGHSPEED on Snowball
mach-ux500: enable ARM errata 764369
mach-ux500: do not override outer.inv_all
mach-ux500: musb: now musb is always in OTG mode
ARM: imx6: add missing twd_clk for imx6q clock
Smatch complains that we have some inconsistent NULL checking.
If "task" were NULL then it would lead to a NULL dereference
later. We can remove this test because earlier on in the
function we have:
	if (!task)
		task = current;
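A minimal sketch of the pattern Smatch flags. The names `struct task`, `current_task` and `pick_task_pid()` are hypothetical stand-ins, not the kernel code touched by this patch; the point is only that a NULL test after the pointer has been defaulted is dead code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's types. */
struct task { int pid; };

static struct task current_task = { .pid = 1 };

/* Returns the pid of the task that will actually be used. */
static int pick_task_pid(struct task *task)
{
	if (!task)
		task = &current_task;	/* task is non-NULL from here on */

	/*
	 * A later "if (task)" test would always be true, and it would
	 * wrongly suggest that the dereference below can see NULL.
	 */
	return task->pid;
}
```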
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Clemens Ladisch <clemens@ladisch.de>
Link: http://lkml.kernel.org/r/20120128105246.GA25092@elgon.mountain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Can be necessary if an inode gets deleted (through -ENOSPC) before being
written. Might be better to move this into logfs_write_rec(), but for
now go with the stupid&safe patch.
Signed-off-by: Joern Engel <joern@logfs.org>
This is a bad one. I wonder whether we were so far protected by
no_free_segments(sb) usually being smaller than LOGFS_NO_AREAS.
Found by Dan Carpenter <dan.carpenter@oracle.com> using smatch.
Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
LogFS sets the PG_private flag to indicate a pinned page. We assumed that
marking a page as private is enough to ensure its existence. But
instead it is necessary to hold a reference count to the page.
The change resolves the following BUG:
BUG: Bad page state in process flush-253:16 pfn:6a6d0
page flags: 0x100000000000808(uptodate|private)
Suggested-and-Acked-by: Joern Engel <joern@logfs.org>
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
caif is a subsystem and as such it needs to register with
register_pernet_subsys instead of register_pernet_device.
Among other problems, using register_pernet_device was resulting in
net_generic being called before the caif_net structure was allocated,
which has been causing net_generic to fail, either with BUG_ONs or by
returning NULL pointers.
An uglier problem that could be caused is packets in flight while the
subsystem is shutting down.
To remove confusion, also remove the cruft caused by inappropriately
trying to fix this bug.
With the aid of the previous patch I have tested this patch and
confirmed that using register_pernet_subsys makes the failure go away as
it should.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Sjur Brændeland <sjur.brandeland@stericsson.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By definition net_generic should never be called when it can return
NULL. Fail conspicuously with a BUG_ON to make it clear when people mess
up that a NULL return should never happen.
Recently there was a bug in the CAIF subsystem where it was registered
with register_pernet_device instead of register_pernet_subsys. It was
erroneously concluded that net_generic could validly return NULL and
that net_assign_generic was buggy (when it was just inefficient).
Hopefully this BUG_ON will prevent people from coming to similar erroneous
conclusions in the future.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Tested-by: Sasha Levin <levinsasha928@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use virtio_mb() to make sure the available index is exposed before
checking the avail event. Otherwise we may read a stale value of the
avail event in the guest and never kick the host afterwards.
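The ordering requirement can be sketched in portable C11, assuming a simplified, hypothetical `struct ring`; this is not the virtio driver code, and `atomic_thread_fence()` merely stands in for virtio_mb(). The wrap-safe comparison in `need_event()` follows the usual event-index test.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified ring state shared between guest and host. */
struct ring {
	_Atomic uint16_t avail_idx;	/* written by the guest */
	_Atomic uint16_t avail_event;	/* written by the host */
};

/* True if event lies in [old_idx, new_idx), modulo 2^16. */
static bool need_event(uint16_t event, uint16_t new_idx, uint16_t old_idx)
{
	return (uint16_t)(new_idx - event - 1) < (uint16_t)(new_idx - old_idx);
}

/* Publish new_idx, then decide whether the host must be kicked. */
static bool publish_and_check(struct ring *r, uint16_t old_idx, uint16_t new_idx)
{
	atomic_store_explicit(&r->avail_idx, new_idx, memory_order_relaxed);

	/* Full barrier: the index store must be visible before the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	uint16_t event = atomic_load_explicit(&r->avail_event,
					      memory_order_relaxed);
	return need_event(event, new_idx, old_idx);
}
```

Without the fence, the load of the event field could be satisfied before the index store is visible, which is the stale-read case described above.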
Note: this fixes a bug introduced by ee7cd8981e.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: stable@kernel.org
Note: this fixes a bug introduced recently in
7b21e34fd1.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since commit 576d2f2525 "ARM: add
generic ioremap optimization by reusing static mappings" ioremap()
is trying to reuse existing static mapping when possible.
The condition checking the boundaries of the requested and existing
mappings didn't take the in-page offset into consideration though,
which led to obscure and hard-to-debug problems when the requested
mapping crossed the end of the static one.
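A minimal sketch of such a boundary check, assuming a 4096-byte page and a hypothetical `struct static_map`; it is not the actual ioremap code. The request only fits if its last byte, computed with the in-page offset included, lies inside the static mapping.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Hypothetical description of an existing static mapping. */
struct static_map {
	uint64_t phys;	/* page-aligned start */
	uint64_t size;	/* length in bytes, assumed non-zero */
};

/* Can [paddr, paddr + size) be served entirely from the static mapping? */
static bool fits_static_map(const struct static_map *m,
			    uint64_t paddr, uint64_t size)
{
	uint64_t page_start = paddr & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
	uint64_t offset = paddr - page_start;	/* in-page offset */

	if (size == 0 || page_start < m->phys)
		return false;

	/* Compare last bytes, with the in-page offset included. */
	return page_start + offset + size - 1 <= m->phys + m->size - 1;
}
```

Dropping `offset` from the comparison would accept a one-page request that starts in the middle of the last mapped page, which is the failure mode described above.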
Signed-off-by: Pawel Moll <pawel.moll@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Fix fast memory registration opcode in local invalidate completion.
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Zero high order word of fast memory registration (FMR) length field.
FMR length field is 32 bits, so high word should always be zero.
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Donald Wood <Donald.E.Wood@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
After reporting a new connection request to user space, the rdma_ucm
will discard subsequent events until the user has associated a user
space identifier with the kernel cm_id. This is needed to avoid
reporting a reject/disconnect event to the user for a request that
they may not have processed.
The user space identifier is set once the user tries to accept the
connection request. However, the following race exists in ucma_accept():
	ctx->uid = cmd.uid;
	<events may be reported now>
	ret = rdma_accept(ctx->cm_id, ...);
Once ctx->uid has been set, new events may be reported to the user.
While the above mentioned race is avoided, there is an issue that the
user _may_ receive a reject/disconnect event if rdma_accept() fails,
depending on when the event is processed. To simplify the use of
rdma_accept(), discard all events unless rdma_accept() succeeds.
This problem was discovered based on questions from Roland Dreier
<roland@purestorage.com>.
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 8d4548f2b ("IB/qib: Default some module parameters optimally")
introduced an issue with older root complexes. They cannot handle the
pcie_caps of 0x51 (MaxReadReq 4096, MaxPayload=256).
A typical diagnostic in this situation reported by syslog contains
the text:
[PCIe Poisoned TLP][Send DMA memory read]
Restore the module parameter default to zero, which will avoid any
changes in the root complex.
Reviewed-by: Mark Debbage <mark.debbage@qlogic.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
alloc_dummy_hdrq() is called with locks held and thus should not use
GFP_KERNEL.
The semantic patch that makes this report is available in
scripts/coccinelle/locks/call_kern.cocci.
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Make sure all exit paths from this function unlock everything.
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Set a reject flag when sending an MPA reject message, to inform the peer
that the application has rejected the connection.
Signed-off-by: Tatyana Nikolova <Tatyana.E.Nikolova@intel.com>
Signed-off-by: Faisal Latif <Faisal.Latif@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
"dentry" is a valid pointer. "*dentry" was intended.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@qlogic.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
We have just been investigating kernel panics related to
cq->ibcq.event_handler() completion calls. The problem is that
ib_destroy_qp() fails with -EBUSY.
Further investigation revealed qp->usecnt is not initialized. This
counter was introduced in linux-3.2 by commit 0e0ec7e063
("RDMA/core: Export ib_open_qp() to share XRC TGT QPs"). It only
gets initialized for IB_QPT_XRC_TGT, yet it is checked in
ib_destroy_qp() for any QP type.
Fix this by initializing qp->usecnt for every QP we create.
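A minimal userspace sketch of why the initialization matters, using hypothetical `struct qp`, `create_qp()` and `destroy_qp()` helpers rather than the real verbs API: the destroy path tests the counter for every QP type, so the create path has to set it for every QP type.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

enum qp_type { QPT_RC, QPT_XRC_TGT };	/* hypothetical subset of QP types */

struct qp {
	enum qp_type type;
	atomic_int usecnt;	/* number of extra users sharing this QP */
};

static struct qp *create_qp(enum qp_type type)
{
	struct qp *qp = malloc(sizeof(*qp));	/* contents are indeterminate */

	if (!qp)
		return NULL;
	qp->type = type;
	atomic_init(&qp->usecnt, 0);	/* initialize for every type */
	return qp;
}

static int destroy_qp(struct qp *qp)
{
	if (atomic_load(&qp->usecnt))
		return -EBUSY;	/* checked regardless of QP type */
	free(qp);
	return 0;
}
```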
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: Sven Breuner <sven.breuner@itwm.fraunhofer.de>
[ Initialize qp->usecnt in uverbs too. - Sean ]
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Commit 5a05a8200a ("davinci_emac:
use an unique MDIO bus name"), introduced during the v3.3 merge
window, updated the davinci mdio bus name to make it unique.
Update the bus name in board files which use the DaVinci MDIO bus
to match the new name. Without this, the PHY is not detected, with
errors like:
PHY 0:01 not found
net eth0: could not connect to phy 0:01
Tested on DM365 and DA850 EVMs.
Cc: Florian Fainelli <florian@openwrt.org>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
* commit 'v3.3-rc1': (9775 commits)
Linux 3.3-rc1
x86, syscall: Need __ARCH_WANT_SYS_IPC for 32 bits
qnx4: don't leak ->BitMap on late failure exits
qnx4: reduce the insane nesting in qnx4_checkroot()
qnx4: di_fname is an array, for crying out loud...
KEYS: Permit key_serial() to be called with a const key pointer
keys: fix user_defined key sparse messages
ima: fix cred sparse warning
uml: fix compile for x86-64
MPILIB: Add a missing ENOMEM check
tpm: fix (ACPI S3) suspend regression
nvme: fix merge error due to change of 'make_request_fn' fn type
xen: using EXPORT_SYMBOL requires including export.h
gpio: tps65910: Use correct offset for gpio initialization
acpi/apei/einj: Add extensions to EINJ from rev 5.0 of acpi spec
intel_idle: Split up and provide per CPU initialization func
ACPI processor: Remove unneeded variable passed by acpi_processor_hotadd_init V2
tg3: Fix single-vector MSI-X code
openvswitch: Fix multipart datapath dumps.
ipv6: fix per device IP snmp counters
...
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
gma500: Fix suspend/resume functions
drm/exynos: fixed pm feature for fimd module.
MAINTAINERS: added maintainer entry for Exynos DRM Driver.
drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
drm/exynos: use release_mem_region instead of release_resource
agp: fix scratch page cleanup
drm/i915: fixup forcewake spinlock fallout in drpc debugfs function
drm/i915: debugfs: show semaphore registers also on gen7
drm/i915: allow userspace forcewake references also on gen7
drm/i915: Re-enable gen7 RC6 and GPU turbo after resume.
drm/i915: Correct debugfs printout for RC1e.
Revert "drm/i915: Work around gen7 BLT ring synchronization issues."
drm/i915: rip out the HWSTAM missed irq workaround
drm/i915: paper over missed irq issues with force wake voodoo
drm/i915: Hold gt_lock across forcewake register reads
drm/i915: Hold gt_lock during reset
drm/i915: Move reset forcewake processing to gen6_do_reset
drm/i915: protect force_wake_(get|put) with the gt_lock
drm/i915: convert force_wake_get to func pointer in the gpu reset code
...
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix silent output on Haier W18 laptop
ALSA: hda: set mute led polarity for laptops with buggy BIOS based on SSID
ALSA: hda - Fix silent output on ASUS A6Rp
ALSA: Fix memory leak on error in snd_compr_set_params()
ALSA: ymfpci - Don't create invalid PCM & mixers when AC97 doesn't support
Josef fixed btrfs_page_mkwrite to properly release reserved
extents if there was an error. But if we fail to get a reservation
and we fail to dirty the inode (for ENOSPC reasons), we'll end up
trying to release a reservation we never had.
This makes sure we only release if we were able to reserve.
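The rule can be sketched with a hypothetical space accounting structure; none of these names are the btrfs API. The error path releases the reservation only if taking it actually succeeded.

```c
#include <stdbool.h>

/* Hypothetical space accounting; not the btrfs API. */
struct space { long free_bytes; long reserved; };

static int reserve(struct space *s, long bytes)
{
	if (s->free_bytes < bytes)
		return -1;	/* stands in for -ENOSPC */
	s->free_bytes -= bytes;
	s->reserved += bytes;
	return 0;
}

static void release(struct space *s, long bytes)
{
	s->reserved -= bytes;
	s->free_bytes += bytes;
}

/* dirty_ok simulates whether the step after the reservation succeeds. */
static int page_mkwrite_sketch(struct space *s, long bytes, bool dirty_ok)
{
	bool reserved = false;
	int ret = reserve(s, bytes);

	if (!ret)
		reserved = true;
	if (!ret && !dirty_ok)
		ret = -1;

	if (ret && reserved)
		release(s, bytes);	/* undo only what was actually reserved */
	return ret;
}
```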
Signed-off-by: Chris Mason <chris.mason@oracle.com>
The user reports that he needs to add model=auto for audio to
work properly. In fact, since node 0x15 is not even a pin node,
the existing fixup is definitely wrong. Relevant information can
be found in the buglink below.
Cc: stable@kernel.org (3.2+)
BugLink: https://bugs.launchpad.net/bugs/918254
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Both the suspend and resume functions incorrectly set psbfb =
to_psb_fb(NULL) outside of the loop over all of the framebuffers. Fix
this by moving the assignment of psbfb inside the loop and removing the
initialisation of fb.
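A minimal sketch of the corrected loop shape, with hypothetical `struct fb` and `struct psb_fb` types and a container-style `to_psb_fb()` macro standing in for the driver's; it is not the gma500 code. The wrapper pointer is derived from the current list element inside the loop, never from NULL before it.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the generic and driver framebuffer types. */
struct fb { struct fb *next; };
struct psb_fb { int id; int resumed; struct fb base; };

/* Map an embedded struct fb back to its containing struct psb_fb. */
#define to_psb_fb(p) \
	((struct psb_fb *)((char *)(p) - offsetof(struct psb_fb, base)))

static int resume_all(struct fb *head)
{
	int count = 0;

	for (struct fb *fb = head; fb; fb = fb->next) {
		/* Derive the wrapper from the current fb, inside the loop. */
		struct psb_fb *psbfb = to_psb_fb(fb);

		psbfb->resumed = 1;
		count++;
	}
	return count;
}
```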
Signed-off-by: Ryan Mallon <rmallon@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
This issue happens under the following conditions:
1. preemption is off
2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined
3. RT scheduling class
4. SMP system
Sequence is as follows:
1. Suppose the current task is A. It starts schedule().
2. Task A is enqueued as a pushable task at the entry of schedule().
   __schedule
    prev = rq->curr;
    ...
    put_prev_task
     put_prev_task_rt
      enqueue_pushable_task
3. Task B is picked as the next task.
   next = pick_next_task(rq);
4. rq->curr is set to task B and context_switch is started.
   rq->curr = next;
5. At the entry of context_switch, this cpu's rq->lock is released.
   context_switch
    prepare_task_switch
     prepare_lock_switch
      raw_spin_unlock_irq(&rq->lock);
6. Shortly after rq->lock is released, an interrupt occurs and IRQ context starts.
7. try_to_wake_up(), called by the ISR, acquires rq->lock.
   try_to_wake_up
    ttwu_remote
     rq = __task_rq_lock(p)
     ttwu_do_wakeup(rq, p, wake_flags);
      task_woken_rt
8. push_rt_task picks task A, which was enqueued before.
   task_woken_rt
    push_rt_tasks(rq)
     next_task = pick_next_pushable_task(rq)
9. At find_lock_lowest_rq(), if double_lock_balance() returns 0,
   lowest_rq can be the remote rq.
   (But if preemption is on, double_lock_balance always returns 1 and this
   doesn't happen.)
   push_rt_task
    find_lock_lowest_rq
     if (double_lock_balance(rq, lowest_rq))..
10. find_lock_lowest_rq returns the available rq. Task A is migrated to
    the remote cpu/rq.
    push_rt_task
     ...
     deactivate_task(rq, next_task, 0);
     set_task_cpu(next_task, lowest_rq->cpu);
     activate_task(lowest_rq, next_task, 0);
11. But task A is still in irq context on this cpu.
    So task A is scheduled by two cpus at the same time until it returns from the IRQ.
    Task A's stack is corrupted.
To fix it, don't migrate an RT task if it's still running.
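The idea can be sketched with a hypothetical, simplified task list; these are not the scheduler's data structures, and the `on_cpu` flag is an assumption standing in for whatever "still running" test the real code uses. A task that is still executing is skipped when choosing a migration candidate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified task state; not the scheduler's structures. */
struct rt_task {
	int prio;
	bool on_cpu;	/* still executing on a CPU, e.g. mid context switch */
	struct rt_task *next;
};

/* Return the first pushable task that is safe to migrate, or NULL. */
static struct rt_task *pick_migratable(struct rt_task *pushable)
{
	for (struct rt_task *t = pushable; t; t = t->next) {
		if (t->on_cpu)
			continue;	/* its stack is still in use on this CPU */
		return t;
	}
	return NULL;
}
```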
Signed-off-by: Chanho Min <chanho.min@lge.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/CAOAMb1BHA=5fm7KTewYyke6u-8DP0iUuJMpgQw54vNeXFsGpoQ@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch fixes the sampling interrupt throttling mechanism.
It was broken in v3.2. Events were not being unthrottled. The
unthrottling mechanism required that events be checked at each
timer tick.
This patch solves this problem and also separates:
- unthrottling
- multiplexing
- frequency-mode period adjustments
Not all of them need to be executed at each timer tick.
This third version of the patch is based on my original patch +
PeterZ proposal (https://lkml.org/lkml/2012/1/7/87).
At each timer tick, for each context:
- if the current CPU has throttled events, we unthrottle events
- if context has frequency-based events, we adjust sampling periods
- if we have reached the jiffies interval, we multiplex (rotate)
We decoupled rotation (multiplexing) from frequency-mode sampling
period adjustments. They should not necessarily happen at the same
rate. Multiplexing is subject to jiffies_interval (currently at 1
but could be higher once the tunable is exposed via sysfs).
We have grouped frequency-mode adjustment and unthrottling into the
same routine to minimize code duplication. When throttled while in
frequency mode, we scan the events only once.
We have fixed the threshold enforcement code in __perf_event_overflow().
There was a bug whereby it would allow more than the authorized rate
because an increment of hwc->interrupts was not executed at the right
place.
The patch was tested with low sampling limit (2000) and fixed periods,
frequency mode, overcommitted PMU.
On a 2.1GHz AMD CPU:
$ cat /proc/sys/kernel/perf_event_max_sample_rate
2000
We set a rate of 3000 samples/sec (2.1GHz/3000 = 700000):
$ perf record -e cycles,cycles -c 700000 noploop 10
$ perf report -D | tail -21
Aggregated stats:
TOTAL events: 80086
MMAP events: 88
COMM events: 2
EXIT events: 4
THROTTLE events: 19996
UNTHROTTLE events: 19996
SAMPLE events: 40000
cycles stats:
TOTAL events: 40006
MMAP events: 5
COMM events: 1
EXIT events: 4
THROTTLE events: 9998
UNTHROTTLE events: 9998
SAMPLE events: 20000
cycles stats:
TOTAL events: 39996
THROTTLE events: 9998
UNTHROTTLE events: 9998
SAMPLE events: 20000
For 10s, the cap is 2x2000x10 = 40000 samples.
We get exactly that: 20000 samples/event.
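The arithmetic behind these figures can be checked with a short sketch; the helper names `period_for_rate()` and `sample_cap()` are made up for illustration and are not part of perf.

```c
#include <stdint.h>

/* Fixed period that yields roughly `rate` samples/sec at `hz` cycles/sec. */
static uint64_t period_for_rate(uint64_t hz, uint64_t rate)
{
	return hz / rate;
}

/* Upper bound on samples when each event is capped at max_rate per second. */
static uint64_t sample_cap(uint64_t nr_events, uint64_t max_rate,
			   uint64_t seconds)
{
	return nr_events * max_rate * seconds;
}
```

With a 2.1GHz clock and a target of 3000 samples/sec the period is 700000, and two events capped at 2000 samples/sec over 10s give the 40000-sample limit seen in the report.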
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: <stable@kernel.org> # v3.2+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120126160319.GA5655@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
try_to_wake_up() has a problem: in a race involving an SMI, or when
running as a guest in a virtual machine, it may change a task's state from
TASK_DEAD to TASK_RUNNING. As a result, the exited task is scheduled
again and a panic occurs.
Here is the sequence in which it occurs:
----------------------------------+-----------------------------
                                  |
            CPU A                 |             CPU B
----------------------------------+-----------------------------
TASK A calls exit()....
do_exit()
  exit_mm()
    down_read(mm->mmap_sem);
    rwsem_down_failed_common()
      set TASK_UNINTERRUPTIBLE
      set waiter.task <= task A
      list_add to sem->wait_list
           :
      raw_spin_unlock_irq()
      (I/O interruption occurred)
                                      __rwsem_do_wake(mmap_sem)
                                        list_del(&waiter->list);
                                        waiter->task = NULL
                                        wake_up_process(task A)
                                          try_to_wake_up()
                                            (task is still
                                             TASK_UNINTERRUPTIBLE,
                                             p->on_rq is still 1.)
                                            ttwu_do_wakeup()
                                              (*A)
                                                :
      (I/O interruption handler finished)
      if (!waiter.task)
          schedule() is not called
          due to waiter.task is NULL.
      tsk->state = TASK_RUNNING
          :
                                            check_preempt_curr();
                                                :
  task->state = TASK_DEAD
                                            (*B)
                                      <---  set TASK_RUNNING (*C)
     schedule()
     (exit task is running again)
     BUG_ON() is called!
--------------------------------------------------------
The execution time between (*A) and (*B) is usually very short,
because interrupts are disabled, so setting TASK_RUNNING at (*C)
normally executes before TASK_DEAD is set.
HOWEVER, if an SMI occurs between (*A) and (*B),
(*C) is able to execute AFTER setting TASK_DEAD!
Then the exited task is scheduled again, and BUG_ON() is called....
If the system runs as a guest in a virtual machine, the time
between (*A) and (*B) may also be long due to hypervisor scheduling,
and the same phenomenon can occur.
With this patch, do_exit() waits for task->pi_lock, which is used
in try_to_wake_up(), to be released. This guarantees the task becomes
TASK_DEAD only after the wakeup has completed.
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120117174031.3118.E1E9C6FF@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch adds the TCO Watchdog DeviceIDs for the Intel Lynx Point PCH.
Signed-off-by: Seth Heasley <seth.heasley@intel.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Let the watchdog core check the valid value range of min_timeout/max_timeout.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Correct typo "unexpectdly" to "unexpectedly" in pnx4008_wdt.c
and stmp3xxx_wdt.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When receiving the WDIOS_DISABLECARD option for the WDIOC_SETOPTIONS command,
call wafwdt_stop() to disable the watchdog.
Call wafwdt_start() when receiving the WDIOS_ENABLECARD option.
The current code has the reverse behavior.
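A minimal sketch of the intended option handling. The `wdt_start()`, `wdt_stop()` and `wdt_set_options()` helpers and the `wdt_running` flag are hypothetical; the two option values mirror WDIOS_DISABLECARD and WDIOS_ENABLECARD from <linux/watchdog.h> and are redefined here only so the example builds on its own.

```c
/* Same values as in <linux/watchdog.h>; repeated so the sketch stands alone. */
#define SKETCH_WDIOS_DISABLECARD 0x0001
#define SKETCH_WDIOS_ENABLECARD  0x0002

static int wdt_running;	/* hypothetical device state */

static void wdt_start(void) { wdt_running = 1; }
static void wdt_stop(void)  { wdt_running = 0; }

/* Handle the option bits of a WDIOC_SETOPTIONS request. */
static int wdt_set_options(int options)
{
	int ret = -1;	/* stands in for -EINVAL */

	if (options & SKETCH_WDIOS_DISABLECARD) {
		wdt_stop();	/* DISABLECARD stops the watchdog */
		ret = 0;
	}
	if (options & SKETCH_WDIOS_ENABLECARD) {
		wdt_start();	/* ENABLECARD starts the watchdog */
		ret = 0;
	}
	return ret;
}
```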
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
When receiving the WDIOS_DISABLECARD option for the WDIOC_SETOPTIONS command,
call wm8350_wdt_stop() to disable the watchdog.
Call wm8350_wdt_start() when receiving the WDIOS_ENABLECARD option.
The current code has the reverse behavior.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
It is only used in this driver, so no need to make the symbol global.
Signed-off-by: Axel Lin <axel.lin@gmail.com>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>