Instead of creating idle threads at boot for all possible CPUs, we
create them on demand, like x86 or ARM, and we properly call init_idle
to re-initialize an idle thread when a CPU was unplugged and is now
re-plugged.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Instead, keep it static, expose an accessor and use that from
the PowerMac code. Avoids easy namespace collisions and will
make it easier to consolidate with other implementations.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
On some machines that use i2c to synchronize the timebases (such
as PowerMac7,2/7,3 G5 machines), hotplug CPU would crash when
putting back a new CPU online due to the underlying i2c bus being
closed.
This uses the newly added bringup_done() callback to move the close
along with other housekeeping calls, and adds a CPU notifier to
re-open the i2c bus around subsequent hotplug operations
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This allows us to stop abusing smp_ops->setup_cpu() for cleanup
tasks that have to take place after the initial boot time CPU
bringup.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current code soft-disables, and then goes to NAP mode which
turns interrupts on. That means that if an interrupt occurs, we
will hit the masked interrupt code path which isn't what we want,
as it will return with EE off, which will either get us out of
NAP mode, or fail to enter it (according to spec).
Instead, let's just rely on the fact that it is safe to take
decrementer interrupts on an offline CPU and leave interrupts
enabled. We can also get rid of the special case in asm for
power4_cpu_offline_powersave() and just use power4_idle().
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Those instructions do nothing on non-threaded processors such
as 970's used on those machines.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use the generic code, just add the MPIC priority setting,
I don't see any use in mucking around with the decrementer,
as 32-bit will have EE off all along, and 64-bit will be able
to deal with it.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Use generic cpu_state, call idle_task_exit() properly, and
remove smp_core99_cpu_die() which isn't useful, the generic
function does the job just fine.
Remove the last remnants of cpu_enable(), everybody uses the normal
__cpu_up() path now
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This is used by some "soft" hotplug implementations. I needs to
call idle_task_exit() when the CPU is going away, and we remove
the now no-longer needed set_cpu_online() and local_irq_enable()
which are handled by the return to start_secondary
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Various thing are torn down when a CPU is hot-unplugged. That CPU
is expected to go back to start_secondary when re-plugged to re
initialize everything, such as clock sources, maps, ...
Some implementations just return from cpu_die() callback
in the idle loop when the CPU is "re-plugged". This is not enough.
We fix it using a little asm trampoline which resets the stack
and calls back into start_secondary as if we were all fresh from
boot. The trampoline already existed on ppc64, but we add it for
ppc32
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
With some implementations, it is possible that a timer interrupt
occurs every few seconds on an offline CPU. In this case, just
re-arm the decrementer and return immediately
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Commit 60d9f461a2 ("appletalk: remove
the BKL") added a dereference of "sk" before checking for NULL in
atalk_release().
Guard the code block completely, rather than partially, with the
NULL check.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
* 'keithp/drm-intel-fixes' of /ssd/git/drm-next:
drm/i915: Reset GMBUS controller after NAK
drm/i915: Busy-spin wait_for condition in atomic contexts
drm/i915/lvds: Always return connected in the absence of better information
Nouveau needs access to this structure to build an ELD block for use
by the HDA audio codec.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
As underlying ST driver registration API's have changed with
latest 2.6.38-rc8 kernel this patch will update the FM driver
accordingly.
Signed-off-by: Manjunatha Halli <manjunatha_halli@ti.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
altera-jtag.c needs to include <linux/delay.h> to fix a build error:
drivers/staging/altera-stapl/altera-jtag.c:398: error: implicit declaration of function 'udelay'
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Igor M. Liplianin <liplianin@netup.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
This quirk is needed for the docking station mic of
Lenovo Thinkpad X220 to function correctly.
BugLink: http://bugs.launchpad.net/bugs/746259
Cc: stable@kernel.org
Tested-by: James Ferguson <james.ferguson@canonical.com>
Signed-off-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
After a crash dump on an SGI Altix UV system the crash kernel
fails to cause a reboot. EFI mode is disabled in the kdump
kernel, so only the reboot_type of BOOT_ACPI works.
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: rja@sgi.com
LKML-Reference: <E1Q5Iuo-00013b-UK@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
sys_perf_event_open() had an imbalance in the number of task refs it
took causing memory leakage
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: stable@kernel.org # .37+
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jiri reported:
|
| - once an event is created by sys_perf_event_open, task context
| is created and it stays even if the event is closed, until the
| task is finished ... thats what I see in code and I assume it's
| correct
|
| - when the task opens event, perf_sched_events jump label is
| incremented and following callbacks are started from scheduler
|
| __perf_event_task_sched_in
| __perf_event_task_sched_out
|
| These callback *in/out set/unset cpuctx->task_ctx value to the
| task context.
|
| - close is called on event on CPU 0:
| - the task is scheduled on CPU 0
| - __perf_event_task_sched_in is called
| - cpuctx->task_ctx is set
| - perf_sched_events jump label is decremented and == 0
| - __perf_event_task_sched_out is not called
| - cpuctx->task_ctx on CPU 0 stays set
|
| - exit is called on CPU 1:
| - the task is scheduled on CPU 1
| - perf_event_exit_task is called
| - task_ctx_sched_out unsets cpuctx->task_ctx on CPU 1
| - put_ctx destroys the context
|
| - another call of perf_rotate_context on CPU 0 will use invalid
| task_ctx pointer, and eventualy panic.
|
Cure this the simplest possibly way by partially reverting the
jump_label optimization for the sched_out case.
Reported-and-tested-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@kernel.org> # .37+
LKML-Reference: <1301520405.4859.213.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The default setting of perf record is to mmap 128 pages if the user
did not override with -m.
However the page size may vary accross different architecture
settings, giving different default size between each.
Moreover the kernel side still has a default max number of mlocked
pages of 512 kiB + 1 page for unprivileged users. 128 + 1 pages
with page size > 4096 overlaps this threshold.
Thus, better adapt to this limitation and set the default number of
pages to fit those 512 kiB + 1 page.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1301535324-9735-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The interval for checking scheduling domains if they are due to be
balanced currently depends on boot state NR_CPUS, which may not
accurately reflect the number of online CPUs at the time of check.
Thus replace NR_CPUS with num_online_cpus().
(ed: Should only affect those who set NR_CPUS really high, such as 4096
or so :-)
Signed-off-by: Sisir Koppaka <sisir.koppaka@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <AANLkTikqHWid2Q93F5U5Qw5snJH8C5PXoa7J6=6hYO94@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Correct all function names pertaining to load balancing and explain
shortly how load balancing is performed.
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1301241433-3790-1-git-send-email-bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
sched_setscheduler() (in sched.c) is called in order of changing the
scheduling policy and/or the real-time priority of a task. Thus,
if we find out that neither of those are actually being modified, it
is possible to return earlier and save the overhead of a full
deactivate+activate cycle of the task in question.
Beside that, if we have more than one SCHED_FIFO task with the same
priority on the same rq (which means they share the same priority queue)
having one of them changing its position in the priority queue because of
a sched_setscheduler (as it happens by means of the deactivate+activate)
that does not actually change the priority violates POSIX which states,
for SCHED_FIFO:
"If a thread whose policy or priority has been modified by
pthread_setschedprio() is a running thread or is runnable, the effect on
its position in the thread list depends on the direction of the
modification, as follows: a. <...> b. If the priority is unchanged, the
thread does not change position in the thread list. c. <...>"
http://pubs.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_08.html
(ed: And the POSIX specification here does, briefly and somewhat unexpectedly,
match what common sense tells us as well. )
Signed-off-by: Dario Faggioli <raistlin@linux.it>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1300971618.3960.82.camel@Palantir>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
We should reduce the number of reserved completion queues from the total
number of entries. Since the queue size is power of two, not reducing the
reserved entries, caused a double queue size, which may lead to allocation
failures in some cases.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
In case of allocation failure, tried to use the promiscuous QP
entry that was previously freed.
Now freeing this entry only in case we will not put it back to the list
of promiscuous entries.
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once a NAK has been asserted by the slave, we need to reset the GMBUS
controller in order to continue. This is done by asserting the Software
Clear Interrupt bit and then clearing it again to restore operations.
If we don't clear the NAK, then all future GMBUS xfers will fail,
including DDC probes and EDID retrieval.
v2: Add some comments as suggested by Keith Packard.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35781
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Tested-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Tested-by: "Mengmeng Meng" <mengmeng.meng@intel.com>
During modesetting, we need to wait for the hardware to report
readiness by polling the registers. Normally, we call msleep() between
reads, because some state changes may take a whole vblank or more
to complete. However during a panic, we are in an atomic context and
cannot sleep. Instead, busy spin polling the termination condition.
References: https://bugzilla.kernel.org/show_bug.cgi?id=31772
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
We have to pass task_struct of previous process to function
schedule_tail(). Currently in ret_from_fork previous thread_info
is passed:
switch_to: mov %g6, %g3 /* previous thread_info in g6 */
ret_from_fork: call schedule_tail
mov %g3, %o0 /* previous thread_info is passed */
void schedule_tail(struct task_struct *prev);
Signed-off-by: Tkhai Kirill <tkhai@yandex.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
On a "really fragile" laptop I noticed a single
i8042.c: i8042 controller selftest failed. (0x1 != 0x55)
error in the log. But there's no reason to print this message at
KERN_ERR level each time that loop fails, especially since the message
telling about the overall selftest failure is printed at KERN_INFO level
(on X86).
Add an actual error message for non-X86 systems, where a selftest
failure is (apparently) more serious. Remove a space in an another error
message.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
'struct dmi_system_id' arrays must always have a terminator to keep
dmi_check_system() from looking at data (and possibly crashing) it
isn't supposed to look at.
The issue went unnoticed until ef8313bb1a,
but was introduced about a year earlier with
7705d548cb (which also similarly changed
lifebook.c, but the problem there got eliminated shortly afterwards).
The first hunk therefore is a stable candidate back to 2.6.33, while
the full change is needed only on 2.6.38.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: stable@kernel.org
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
We should return IRQ_NONE from interrupt handler in case keyboard
does not report DATA_AVAIL condition.
Signed-off-by: Rajeev Kumar <rajeev-dlh.kumar@st.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Some devices provide absolute axes with min/max of 0/0 (e.g. wacom's
ABS_MISC axis). Current uinput restrictions do not allow duplication of
these devices and require hacks in userspace to work around this.
If the kernel accepts physical devices with a min/max of 0/0, uinput
shouldn't disallow the same range.
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
With increasing number of PCI function ids, add the PCI function
id in the define name instead of its symbolic name in the BKDG
for more clarity. This renames function 4 define.
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
LKML-Reference: <20110330183447.GA3668@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This is further fallout from delay.h removal from asm/apic.h and asm/dma.h:
ca444564a9: x86: Stop including <linux/delay.h> in two asm header files
Which caused this build failure:
sound/soc/codecs/sn95031.c: In function ‘sn95031_get_mic_bias’:
sound/soc/codecs/sn95031.c:153:2: error: implicit declaration of function ‘msleep’ [-Werror=implicit-function-declaration]
Cc: Jean Delvare <khali@linux-fr.org>
Cc: James E.J. Bottomley <James.Bottomley@suse.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
LKML-Reference: <20110325152014.297890ec@endymion.delvare>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
It absolutely needs to be able to get at pdev_archdata members
which are sparc specific.
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Like DCCP and other similar pieces of code, there are mechanisms
here to try allocating smaller hash tables if the allocation
fails. So pass in __GFP_NOWARN like the others do instead of
emitting a scary message.
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix all of the problems spotted by CONFIG_DEBUG_SECTION_MISMATCH under
arch/sparc during a 64-bit defconfig build.
They fall into two categorites:
1) of_device_id is marked as __initdata, and we can never do this
since these objects sit in the device core data structures way
past boot. So even if a driver will never be reloaded, we have
to keep the device ID table around.
Mark such cases const instead.
2) The bootmem alloc/free handling code in mdesc.c was not fully
marked __init as it should be, thus generating a reference
to free_bootmem_late() (which is __init) from non-__init code.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commits 01a16b21 (netlink: kill eff_cap from struct netlink_skb_parms)
and c53fa1ed (netlink: kill loginuid/sessionid/sid members from struct
netlink_skb_parms) removed some members from struct netlink_skb_parms
that depend on the current context, all netlink users are now required
to do synchronous message processing.
connector however queues received messages and processes them in a work
queue, which is not valid anymore. This patch converts connector to do
synchronous message processing by invoking the registered callback handler
directly from the netlink receive function.
In order to avoid invoking the callback with connector locks held, a
reference count is added to struct cn_callback_entry, the reference
is taken when finding a matching callback entry on the device's queue_list
and released after the callback handler has been invoked.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Daniel J Blueman reported a lockdep splat in trie_firstleaf(), caused by
RTNL being not locked before a call to fib_table_flush()
Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>