there's a new ptrace arch level feature in .28:
config X86_PTRACE_BTS
bool "Branch Trace Store"
it has broken fork() handling: the old DS area gets copied over into
a new task without clearing it.
Fixes exist but they came too late:
c5dee61: x86, bts: memory accounting
bf53de9: x86, bts: add fork and exit handling
and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.
Debugged-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Tested-by: Helge Deller <deller@gmx.de>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure. Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed. This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier. The edac_poller will wake up sleepers when
it is found.
EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices. They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device. This is exactly
where edac_poller and rmmod would have a great chance to deadlock.
In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,
device_ctls_mutex would have already been grabbed by
edac_device_del_device(). So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.
edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex. Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.
Moreover, an edac_dev.work should check first if it is being removed. If
this is the case, then it should bail out immediately. Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed. The current edac_dev.op_state can
be used to serve this purpose.
The original deadlock problem and the solution have been witnessed and
tested on actual hardware. Without the solution, rmmod an edac driver
would result in below deadlock:
root@localhost:/root> rmmod mv64x60_edac
EDAC DEBUG: mv64x60_dma_err_remove()
EDAC DEBUG: edac_device_del_device()
EDAC DEBUG: find_edac_device_by_dev()
(hang for a moment)
INFO: task edac-poller:2030 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
edac-poller D 00000000 0 2030 2
Call Trace:
[df159dc0] [c0071e3c] free_hot_cold_page+0x17c/0x304 (unreliable)
[df159e80] [c000a024] __switch_to+0x6c/0xa0
[df159ea0] [c03587d8] schedule+0x2f4/0x4d8
[df159f00] [c03598a8] __mutex_lock_slowpath+0xa0/0x174
[df159f40] [e1030434] edac_device_workq_function+0x28/0xd8 [edac_core]
[df159f60] [c003beb4] run_workqueue+0x114/0x218
[df159f90] [c003c674] worker_thread+0x5c/0xc8
[df159fd0] [c004106c] kthread+0x5c/0xa0
[df159ff0] [c0013538] original_kernel_thread+0x44/0x60
INFO: task rmmod:2062 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
rmmod D 0ff2c9fc 0 2062 1839
Call Trace:
[df119c00] [c0437a74] 0xc0437a74 (unreliable)
[df119cc0] [c000a024] __switch_to+0x6c/0xa0
[df119ce0] [c03587d8] schedule+0x2f4/0x4d8
[df119d40] [c03591dc] schedule_timeout+0xb0/0xf4
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If cgroup_get_rootdir() failed, free_cg_links() will be called in the
failure path, but tmp_cg_links hasn't been initialized at that time.
I introduced this bug in the 2.6.27 merge window.
Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
Acked-by: Serge Hallyn <serue@us.ibm.com>
Cc: Paul Menage <menage@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
During test of the w1-gpio driver i found that in "w1.c:679
w1_slave_found()" the device id is converted to little-endian with
"cpu_to_le64()", but its not converted back to cpu format in "w1_io.c:293
w1_reset_select_slave()".
Based on a patch created by Andreas Hummel.
[akpm@linux-foundation.org: remove unneeded cast]
Reported-by: Andreas Hummel <andi_hummel@gmx.de>
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch for the rtc-isl1208 driver makes it reject invalid dates.
Signed-off-by: Chris Elston <celston@katalix.com>
[a.zummo@towertech.it: added comment explaining the check]
Signed-off-by: Alessandro Zummo <a.zummo@towertech.it>
Cc: Hebert Valerio Riedel <hvr@gnu.org>
Cc: David Brownell <david-b@pacbell.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fix a NULL pointer dereference that would occur if the video decoder tied to
the em28xx supports the VIDIOC_INT_RESET call (for example: the cx25840 driver)
Signed-off-by: Devin Heitmueller <dheitmueller@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This check was introduced with the logic the wrong way around.
Fixes regression: http://bugzilla.kernel.org/show_bug.cgi?id=12216
Tested-by: François Valenduc <francois.valenduc@tvcablenet.be>
Signed-off-by: Dave Airlie <airlied@redhat.com>
In each case, if the NULL test is necessary, then the dereference should be
moved below the NULL test.
The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@@
type T;
expression E;
identifier i,fld;
statement S;
@@
- T i = E->fld;
+ T i;
... when != E
when != i
if (E == NULL) S
+ i = E->fld;
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
The way the code is written it was assuming dshd has the function of a
hypothetical dshw instruction ...
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Impact: Prevent kernel crash with posix timer clockid CLOCK_MONOTONIC_RAW
commit 2d42244ae7 (clocksource:
introduce CLOCK_MONOTONIC_RAW) introduced a new clockid, which is only
available to read out the raw not NTP adjusted system time.
The above commit did not prevent that a posix timer can be created
with that clockid. The timer_create() syscall succeeds and initializes
the timer to a non existing hrtimer base. When the timer is deleted
either by timer_delete() or by the exit() cleanup the kernel crashes.
Prevent the creation of timers for CLOCK_MONOTONIC_RAW by setting the
posix clock function to no_timer_create which returns an error code.
Reported-and-tested-by: Eric Sesterhenn <snakebyte@gmx.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs:
fs/9p: change simple_strtol to simple_strtoul
9p: convert d_iname references to d_name.name
9p: Remove potentially bad parameter from function entry debug print.
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: fix resume (S2R) broken by Intel microcode module, on A110L
x86 gart: don't complain if no AMD GART found
AMD IOMMU: panic if completion wait loop fails
AMD IOMMU: set cmd buffer pointers to zero manually
x86: re-enable MCE on secondary CPUS after suspend/resume
AMD IOMMU: allocate rlookup_table with __GFP_ZERO
Impact: fix deadlock
This is in response to the following bug report:
Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12100
Subject : resume (S2R) broken by Intel microcode module, on A110L
Submitter : Andreas Mohr <andi@lisas.de>
Date : 2008-11-25 08:48 (19 days old)
Handled-By : Dmitry Adamushko <dmitry.adamushko@gmail.com>
[ The deadlock scenario has been discovered by Andreas Mohr ]
I think I might have a logical explanation why the system:
(http://bugzilla.kernel.org/show_bug.cgi?id=12100)
might hang upon resuming, OTOH it should have likely hanged each and every time.
(1) possible deadlock in microcode_resume_cpu() if either 'if' section is
taken;
(2) now, I don't see it in spec. and can't experimentally verify it (newer
ucodes don't seem to be available for my Core2duo)... but logically-wise, I'd
think that when read upon resuming, the 'microcode revision' (MSR 0x8B) should
be back to its original one (we need to reload ucode anyway so it doesn't seem
logical if a cpu doesn't drop the version)... if so, the comparison with
memcmp() for the full 'struct cpu_signature' is wrong... and that's how one of
the aforementioned 'if' sections might have been triggered - leading to a
deadlock.
Obviously, in my tests I simulated loading/resuming with the ucode of the same
version (just to see that the file is loaded/re-loaded upon resuming) so this
issue has never popped up.
I'd appreciate if someone with an appropriate system might give a try to the
2nd patch (titled "fix a comparison && deadlock...").
In any case, the deadlock situation is a must-have fix.
Reported-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Tested-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: move the BTS buffer accounting to the mlock bucket
Add alloc_locked_buffer() and free_locked_buffer() functions to mm/mlock.c
to kalloc a buffer and account the locked memory to current.
Account the memory for the BTS buffer to the tracer.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Impact: introduce new ptrace facility
Add arch_ptrace_untrace() function that is called when the tracer
detaches (either voluntarily or when the tracing task dies);
ptrace_disable() is only called on a voluntary detach.
Add ptrace_fork() and arch_ptrace_fork(). They are called when a
traced task is forked.
Clear DS and BTS related fields on fork.
Release DS resources and reclaim memory in ptrace_untrace(). This
releases resources already when the tracing task dies. We used to do
that when the traced task dies.
Signed-off-by: Markus Metzger <markus.t.metzger@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Since v9ses->uid is unsigned, it would seem better to use simple_strtoul that
simple_strtol.
A simplified version of the semantic patch that makes this change is as
follows: (http://www.emn.fr/x-info/coccinelle/)
// <smpl>
@r2@
long e;
position p;
@@
e = simple_strtol@p(...)
@@
position p != r2.p;
type T;
T e;
@@
e =
- simple_strtol@p
+ simple_strtoul
(...)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
d_iname is rubbish for long file names.
Use d_name.name in printks instead.
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] mpt fusion: clear list of outstanding commands on host reset
[SCSI] scsi_lib: only call scsi_unprep_request() under queue lock
[SCSI] ibmvstgt: move crq_queue_create to the end of initialization
[SCSI] libiscsi REGRESSION: fix passthrough support with older iscsi tools
[SCSI] aacraid: disable Dell Percraid quirk on Adaptec 2200S and 2120S
* 'drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm/i915: GEM on PAE has problems - disable it for now.
drm/i915: Don't return busy for buffers left on the flushing list.
There will be a Oops or frequent underrun messages when playing music with
omap soc driver, this is because a data region is incorretly sized, other data
region will be overwriten when writing to this data region.
Signed-off-by: Stanley Miao <stanley.miao@windriver.com>
Acked-by: Jarkko Nikula <jarkko.nikula@nokia.com>
Cc: stable@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The previous commit re-enabled hp_nid setup for IDT92HD73*, but
it's unneeded indeed for Dell laptops that have multiple headphones.
Setting the extra hp_nid results in a non-working "Headpohne" mixer
control. Thus hp_nid should be 0 for these dell models.
Also, the automatic addition of hp_nid should check whether it's
a dual-HP model or not. For dual-HPs, the pins are already checked
by the early workaround.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
The ACPI interpreter usually runs with irqs enabled.
However, during suspend/resume it runs with
irqs disabled to evaluate _GTS/_BFS, as well as
by irqrouter_resume() which evaluates _CRS, _PRS, _SRS.
http://bugzilla.kernel.org/show_bug.cgi?id=12252
Signed-off-by: Wu Fengguang <wfg@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
acpi_early_init() was changed to over-write the cmdline param,
making it really inconvenient to set debug flags at boot-time.
Also,
This sets the default level to "info", which is what all the ACPI
drivers use. So to enable messages from drivers, you only have to
supply the "layer" (a.k.a. "component"). For non-"info" ACPI core
and ACPI interpreter messages, you have to supply both level and
layer masks, as before.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Added the model without the jack-detection for some desktops that
have really no jack-detection. The recent driver caused regressions
regarding the sound output on such machines.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
This reverts commit 07f455f779.
ALSA: hda: removed unneeded hp_nid references
Removed unneeded hp_nid references for 92hd73xx codec family.
This caused the silent output on some Intel desktops due to missing
routing of widget 0x0a and 0x0d.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Fix problem that deleting multiple logical drives could cause a panic.
It fixes a panic which can be easily reproduced in the following way: Just
create several "arrays," each with multiple logical drives via hpacucli,
then delete the first array, and it will blow up in deregister_disk(), in
the call to get_host() when it tries to dig the hba pointer out of a NULL
queue pointer.
The problem has been present since my code to make rebuild_lun_table
behave better went in.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cca.cpqcorp.net>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
On PAE systems, GEM allocates pages using shmem, and passes these
pages to be bound into AGP, however the AGP interfaces + the x86
set_memory interfaces all take unsigned long not dma_addr_t.
The initial fix for this was a mess, so we need to do this correctly
for 2.6.29.
Signed-off-by: Dave Airlie <airlied@redhat.com>
These buffers don't have active rendering still occurring to them, they just
need either a flush to be emitted or a retire_requests to occur so that we
notice they're done. Return unbusy so that one of the two occurs. The two
expected consumers of this interface (OpenGL and libdrm_intel BO cache) both
want this behavior.
Signed-off-by: Eric Anholt <eric@anholt.net>
Acked-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
When we read the write-intent-bitmap off the device, we currently
read a whole number of pages.
When PAGE_SIZE is 4K, this works due to the alignment we enforce
on the superblock and bitmap.
When PAGE_SIZE is 64K, this case read past the end-of-device
which causes an error.
When we write the superblock, we ensure to clip the last page
to just be the required size. Copy that code into the read path
to just read the required number of sectors.
Signed-off-by: Neil Brown <neilb@suse.de>
Cc: stable@kernel.org