1
Commit Graph

119783 Commits

Author SHA1 Message Date
Steven Rostedt
f1eecf0e4f powerpc/ppc32: static ftrace fixes for PPC32
Impact: fix for PowerPC 32 code

There were some early init code that was not safe for static
ftrace to boot on my PowerBook. This code must only use relative
addressing, and static mcount performs a compare of the
ftrace_trace_function pointer, and gets that with an absolute address.
In the early init boot up code, this will cause a fault.

This patch removes tracing from the files containing the offending
functions.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:08:07 +01:00
Steven Rostedt
0029ff8752 powerpc: ftrace, use create_branch
Impact: clean up

Paul Mackerras pointed out that the code to determine if the branch
can reach the destination is incorrect. Michael Ellerman suggested
to pull out the code from create_branch and use that.

Simply using create_branch is probably the best.

Reported-by: Michael Ellerman <michael@ellerman.id.au>
Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:08:01 +01:00
Steven Rostedt
ec682cef2d powerpc: ftrace, added missing icache flush
Impact: fix to PowerPC code modification

After modifying code it is essential to flush the icache. This patch
adds the missing flush.

Reported-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:07:56 +01:00
Steven Rostedt
d9af12b72b powerpc: ftrace, fix cast aliasing and add code verification
Impact: clean up and robustness addition

This patch addresses the comments made by Paul Mackerras.
It removes the type casting between unsigned int and unsigned char
pointers, and replaces them with a use of all unsigned int.

Verification that the jump is indeed made to a trampoline has also
been added.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:07:50 +01:00
Steven Rostedt
c7b0d17366 powerpc: ftrace, do nothing in mcount call for dyn ftrace
Impact: quicken mcount calls that are not replaced by dyn ftrace

Dynamic ftrace no longer does on the fly recording of mcount locations.
The mcount locations are now found at compile time. The mcount
function no longer needs to store registers and call a stub function.
It can now just simply return.

Since there are some functions that do not get converted to a nop
(.init sections and other code that may disappear), this patch should
help speed up that code.

Also, the stub for mcount on PowerPC 32 can not be a simple branch
link register like it is on PowerPC 64. According to the ABI specification:

"The _mcount routine is required to restore the link register from
 the stack so that the profiling code can be inserted transparently,
 whether or not the profiled function saves the link register itself."

This means that we must restore the link register that was used
to make the call to mcount.  The minimal mcount function for PPC32
ends up being:

 mcount:
        mflr    r0
        mtctr   r0
        lwz     r0, 4(r1)
        mtlr    r0
        bctr

Where we move the link register used to call mcount into the
ctr register, and then restore the link register from the stack.
Then we use the ctr register to jump back to the mcount caller.
The r0 register is free for us to use.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 14:07:45 +01:00
walimis
c072c24975 ftrace: improve documentation
Impact: extend documentation with notice of using wild cards correctly

We know that we can use wild cards to set set_ftrace_filter, but there's
problem when using them naively such as:

   echo h* > /debug/tracing/set_ftrace_filter

If there are files named with "h" prefix in current directory,
echo "h*" will echo these filenames to set_ftrace_filter, not the
intended "h*".

For example:

  $ cat /debug/tracing/available_filter_functions |grep ^hr |wc -l
  23
  $ ls
  $ touch hraa hrdd
  $ ls
  hraa  hrdd
  $ echo hr* > /debug/tracing/set_ftrace_filter
  $ cat /debug/tracing/set_ftrace_filter

No output in /debug/tracing/set_ftrace_filter!

If we use '' to escape wild cards, it works:

  $ ls
  hraa  hrdd
  $ echo "hr*" > /debug/tracing/set_ftrace_filter
  $ cat /debug/tracing/set_ftrace_filter |wc -l
  23

This problem can lead to unexpected result if current directory has a
lot of files.

Signed-off-by: walimis <walimisdev@gmail.com>
Acked-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 13:15:14 +01:00
Liming Wang
50cdaf08a8 ftrace: improve seq_operation of ftrace
Impact: make ftrace position computing more sane

First remove useless ->pos field. Then we needn't check seq_printf
in .show like other place.

Signed-off-by: Liming Wang <liming.wang@windriver.com>
Reviewed-by: Bruce Ashfield <bruce.ashfield@windriver.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 12:30:40 +01:00
Török Edwin
c7425acb42 tracing, alpha: fix build: add missing #ifdef CONFIG_STACKTRACE
There are architectures that still have no stacktrace support.

Signed-off-by: Török Edwin <edwintorok@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 11:33:00 +01:00
Ingo Molnar
d51090b346 tracing/function-graph-tracer: more output tweaks
Impact: prettify the output some more

Before:

0)           |     sys_read() {
0)      0.796 us |   fget_light();
0)           |       vfs_read() {
0)           |         rw_verify_area() {
0)           |           security_file_permission() {
------------8<---------- thread sshd-1755 ------------8<----------

After:

 0)               |  sys_read() {
 0)      0.796 us |    fget_light();
 0)               |    vfs_read() {
 0)               |      rw_verify_area() {
 0)               |        security_file_permission() {
 ------------------------------------------
 | 1)  migration/0--1  =>  sshd-1755
 ------------------------------------------

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 10:15:06 +01:00
Frederic Weisbecker
1a056155ed tracing/function-graph-tracer: adjustments of the trace informations
Impact: increase the visual qualities of the call-graph-tracer output

This patch applies various trace output formatting changes:

 - CPU is now a decimal number, followed by a parenthesis.

 - Overhead is now on the second column (gives a good visibility)

 - Cost is now on the third column, can't exceed 9999.99 us. It is
   followed by a virtual line based on a "|" character.

 - Functions calls are now the last column on the right. This way, we
   haven't dynamic column (which flow is harder to follow) on its right.

 - CPU and Overhead have their own option flag. They are default-on but you
   can disable them easily:

      echo nofuncgraph-cpu > trace_options
      echo nofuncgraph-overhead > trace_options

TODO:

_ Refactoring of the thread switch output.
_ Give a default-off option to output the thread and its pid on each row.
_ Provide headers
_ ....

Here is an example of the new trace style:

0)           |             mutex_unlock() {
0)      0.639 us |           __mutex_unlock_slowpath();
0)      1.607 us |         }
0)           |             remove_wait_queue() {
0)      0.616 us |           _spin_lock_irqsave();
0)      0.616 us |           _spin_unlock_irqrestore();
0)      2.779 us |         }
0)      0.495 us |         n_tty_set_room();
0) ! 9999.999 us |       }
0)           |           tty_ldisc_deref() {
0)      0.615 us |         _spin_lock_irqsave();
0)      0.616 us |         _spin_unlock_irqrestore();
0)      2.793 us |       }
0)           |           current_fs_time() {
0)      0.488 us |         current_kernel_time();
0)      0.495 us |         timespec_trunc();
0)      2.486 us |       }
0) ! 9999.999 us |     }
0) ! 9999.999 us |   }
0) ! 9999.999 us | }
0)           |     sys_read() {
0)      0.796 us |   fget_light();
0)           |       vfs_read() {
0)           |         rw_verify_area() {
0)           |           security_file_permission() {
0)      0.488 us |         cap_file_permission();
0)      1.720 us |       }
0)      3.  4 us |     }
0)           |         tty_read() {
0)      0.488 us |       tty_paranoia_check();
0)           |           tty_ldisc_ref_wait() {
0)           |             tty_ldisc_try() {
0)      0.615 us |           _spin_lock_irqsave();
0)      0.615 us |           _spin_unlock_irqrestore();
0)      5.436 us |         }
0)      6.427 us |       }

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-28 09:45:04 +01:00
Russell King
af38d90d6a Merge branch 'for-rmk' of git://git.kernel.org/pub/scm/linux/kernel/git/ycmiao/pxa-linux-2.6 2008-11-27 23:50:31 +00:00
Russell King
487ff32082 Allow architectures to override copy_user_highpage()
With aliasing VIPT cache support, the ARM implementation of
clear_user_page() and copy_user_page() sets up a temporary kernel space
mapping such that we have the same cache colour as the userspace page.
This avoids having to consider any userspace aliases from this operation.

However, when highmem is enabled, kmap_atomic() have to setup mappings.
The copy_user_highpage() and clear_user_highpage() call these functions
before delegating the copies to copy_user_page() and clear_user_page().

The effect of this is that each of the *_user_highpage() functions setup
their own kmap mapping, followed by the *_user_page() functions setting
up another mapping.  This is rather wasteful.

Thankfully, copy_user_highpage() can be overriden by architectures by
defining __HAVE_ARCH_COPY_USER_HIGHPAGE.  However, replacement of
clear_user_highpage() is more difficult because its inline definition
is not conditional.  It seems that you're expected to define
__HAVE_ARCH_ALLOC_ZEROED_USER_HIGHPAGE and provide a replacement
__alloc_zeroed_user_highpage() implementation instead.

The allocation itself is fine, so we don't want to override that.  What
we really want to do is to override clear_user_highpage() with our own
version which doesn't kmap_atomic() unnecessarily.

Other VIPT architectures (PARISC and SH) would also like to override
this function as well.

Acked-by: Hugh Dickins <hugh@veritas.com>
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2008-11-27 23:39:48 +00:00
Jan Kara
52b19ac993 udf: Fix BUG_ON() in destroy_inode()
udf_clear_inode() can leave behind buffers on mapping's i_private list (when
we truncated preallocation). Call invalidate_inode_buffers() so that the list
is properly cleaned-up before we return from udf_clear_inode(). This is ugly
and suggest that we should cleanup preallocation earlier than in clear_inode()
but currently there's no such call available since drop_inode() is called under
inode lock and thus is unusable for disk operations.

Signed-off-by: Jan Kara <jack@suse.cz>
2008-11-27 17:38:28 +01:00
Marek Vasut
a730b327ca [ARM] pxa/palmtx: misc fixes to use generic GPIO API
Signed-off-by: Marek Vasut <marek.vasut@gmail.com>
Signed-off-by: Eric Miao <eric.miao@marvell.com>
2008-11-27 22:47:26 +08:00
Joerg Roedel
b627c8b17c x86: always define DECLARE_PCI_UNMAP* macros
Impact: fix boot crash on AMD IOMMU if CONFIG_GART_IOMMU is off

Currently these macros evaluate to a no-op except the kernel is compiled
with GART or Calgary support. But we also need these macros when we have
SWIOTLB, VT-d or AMD IOMMU in the kernel. Since we always compile at
least with SWIOTLB we can define these macros always.

This patch is also for stable backport for the same reason the SWIOTLB
default selection patch is.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 12:44:08 +01:00
Russell King
6417a917b5 Merge branch 'omap-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap-2.6 2008-11-27 11:13:10 +00:00
Martin Schwidefsky
abd942194d [S390] Update default configuration.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:58 +01:00
Heiko Carstens
0778dc3a62 [S390] Fix alignment of initial kernel stack.
We need an alignment of 16384 bytes for the initial kernel stack if
the kernel is configured for 16384 bytes stacks but the linker script
currently guarantees only an alignment of 8192 bytes.

So fix this and simply use THREAD_SIZE as alignment value which will
always do the right thing.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:58 +01:00
Christian Borntraeger
2944a5c971 [S390] pgtable.h: Fix oops in unmap_vmas for KVM processes
When running several kvm processes with lots of memory overcommitment,
we have seen an oops during process shutdown:
------------[ cut here ]------------
Kernel BUG at 0000000000193434 [verbose debug info unavailable]
addressing exception: 0005 [#1] PREEMPT SMP
Modules linked in: kvm sunrpc qeth_l2 dm_mod qeth ccwgroup
CPU: 10 Not tainted 2.6.28-rc4-kvm-bigiron-00521-g0ccca08-dirty #8
Process kuli (pid: 14460, task: 0000000149822338, ksp: 0000000024f57650)
Krnl PSW : 0704e00180000000 0000000000193434 (unmap_vmas+0x884/0xf10)
R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 EA:3
Krnl GPRS: 0000000000000002 0000000000000000 000000051008d000 000003e05e6034e0
           00000000001933f6 00000000000001e9 0000000407259e0a 00000002be88c400
           00000200001c1000 0000000407259608 0000000407259e08 0000000024f577f0
           0000000407259e09 0000000000445fa8 00000000001933f6 0000000024f577f0
Krnl Code: 0000000000193426: eb22000c000d sllg %r2,%r2,12
           000000000019342c: a7180000 lhi %r1,0
           0000000000193430: b2290012 iske %r1,%r2
          >0000000000193434: a7110002 tmll %r1,2
           0000000000193438: a7840006 brc 8,193444
           000000000019343c: 9602c000 oi 0(%r12),2
           0000000000193440: 96806000 oi 0(%r6),128
           0000000000193444: a7110004 tmll %r1,4
Call Trace:
([<00000000001933f6>] unmap_vmas+0x846/0xf10)
[<0000000000199680>] exit_mmap+0x210/0x458
[<000000000012a8f8>] mmput+0x54/0xfc
[<000000000012f714>] exit_mm+0x134/0x144
[<000000000013120c>] do_exit+0x240/0x878
[<00000000001318dc>] do_group_exit+0x98/0xc8
[<000000000013e6b0>] get_signal_to_deliver+0x30c/0x358
[<000000000010bee0>] do_signal+0xec/0x860
[<0000000000112e30>] sysc_sigpending+0xe/0x22
[<000002000013198a>] 0x2000013198a
INFO: lockdep is turned off.
Last Breaking-Event-Address:
[<00000000001a68d0>] free_swap_and_cache+0x1a0/0x1a4
<4>---[ end trace bc19f1d51ac9db7c ]---

The faulting instruction is the storage key operation (iske) in
ptep_rcp_copy (called by pte_clear, called by unmap_vmas). iske
reads dirty and reference bit information for a physical page and
requires a valid physical address. Since we are in pte_clear, we
cannot rely on the pte containing a valid address. Fortunately we
dont need these information in pte_clear - after all there is no
mapping. The best fix is to remove the needless call to ptep_rcp_copy
that contains the iske.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:57 +01:00
Christian Borntraeger
8107d8296b [S390] fix/cleanup sched_clock
CONFIG_PRINTK_TIME reveals that sched_clock has a wrong offset during boot:
..
[    0.000000]   Movable zone: 0 pages used for memmap
[    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 775679
[    0.000000] Kernel command line: dasd=4b6c root=/dev/dasda1 ro noinitrd
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[6920575.975232] console [ttyS0] enabled
[6920575.987586] Dentry cache hash table entries: 524288 (order: 10, 4194304 bytes)
[6920575.991404] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
..

The s390 implementation of sched_clock uses the store clock instruction and
subtracts jiffies_timer_cc.
jiffies_timer_cc is a local variable in arch/s390/kernel/time.c and only used
for sched_clock and monotonic clock. For historical reasons there is an offset
on that value. With todays code this offset is unnecessary. By removing that
offset we can get a sched_clock which returns the nanoseconds after time_init.
This improves CONFIG_PRINTK_TIME.

Since sched_clock is the only user, I have also renamed jiffies_timer_cc to
sched_clock_base_cc. In addition, the local variable init_timer_cc is redundant
and can be romved as well.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:57 +01:00
Martin Schwidefsky
59da21398e [S390] fix system call parameter functions.
syscall_get_nr() currently returns a valid result only if the call
chain of the traced process includes do_syscall_trace_enter(). But
collect_syscall() can be called for any sleeping task, the result of
syscall_get_nr() in general is completely bogus.

To make syscall_get_nr() work for any sleeping task the traps field
in pt_regs is replace with svcnr - the system call number the process
is executing. If svcnr == 0 the process is not on a system call path.

The syscall_get_arguments and syscall_set_arguments use regs->gprs[2]
for the first system call parameter. This is incorrect since gprs[2]
may have been overwritten with the system call number if the call
chain includes do_syscall_trace_enter. Use regs->orig_gprs2 instead.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-27 11:06:56 +01:00
Frederic Weisbecker
83a8df618e tracing/function-graph-tracer: enhancements for the trace output
Impact: enhance the output of the graph-tracer

This patch applies some ideas of Ingo Molnar and Steven Rostedt.

* Output leaf functions in one line with parenthesis, semicolon and duration
  output.

* Add a second column (after cpu) for an overhead sign.
  if duration > 100 us, "!"
  if duration > 10 us, "+"
  else " "

* Print output in us with remaining nanosec: u.n

* Print duration on the right end, following the indentation of the functions.
  Use also visual clues: "-" on entry call (no duration to output) and "+" on
  return (duration output).

The name of the tracer has been fixed as well: function-branch becomes
function_branch.

Here is an example of the new output:

CPU[000]           dequeue_entity() {                    -
CPU[000]             update_curr() {                    -
CPU[000]               update_min_vruntime();                    + 0.512 us
CPU[000]             }                                + 1.504 us
CPU[000]             clear_buddies();                    + 0.481 us
CPU[000]             update_min_vruntime();                    + 0.504 us
CPU[000]           }                                + 4.557 us
CPU[000]           hrtick_update() {                    -
CPU[000]             hrtick_start_fair();                    + 0.489 us
CPU[000]           }                                + 1.443 us
CPU[000] +       }                                + 14.655 us
CPU[000] +     }                                + 15.678 us
CPU[000] +   }                                + 16.686 us
CPU[000]     msecs_to_jiffies();                    + 0.481 us
CPU[000]     put_prev_task_fair();                    + 0.504 us
CPU[000]     pick_next_task_fair();                    + 0.482 us
CPU[000]     pick_next_task_rt();                    + 0.504 us
CPU[000]     pick_next_task_fair();                    + 0.481 us
CPU[000]     pick_next_task_idle();                    + 0.489 us
CPU[000]     _spin_trylock();                    + 0.655 us
CPU[000]     _spin_unlock();                    + 0.609 us

CPU[000]  ------------8<---------- thread bash-2794 ------------8<----------

CPU[000]               finish_task_switch() {                    -
CPU[000]                 _spin_unlock_irq();                    + 0.722 us
CPU[000]               }                                + 2.369 us
CPU[000] !           }                                + 501972.605 us
CPU[000] !         }                                + 501973.763 us
CPU[000]           copy_from_read_buf() {                    -
CPU[000]             _spin_lock_irqsave();                    + 0.670 us
CPU[000]             _spin_unlock_irqrestore();                    + 0.699 us
CPU[000]             copy_to_user() {                    -
CPU[000]               might_fault() {                    -
CPU[000]                 __might_sleep();                    + 0.503 us
CPU[000]               }                                + 1.632 us
CPU[000]               __copy_to_user_ll();                    + 0.542 us
CPU[000]             }                                + 3.858 us
CPU[000]             tty_audit_add_data() {                    -
CPU[000]               _spin_lock_irq();                    + 0.609 us
CPU[000]               _spin_unlock_irq();                    + 0.624 us
CPU[000]             }                                + 3.196 us
CPU[000]             _spin_lock_irqsave();                    + 0.624 us
CPU[000]             _spin_unlock_irqrestore();                    + 0.625 us
CPU[000] +         }                                + 13.611 us
CPU[000]           copy_from_read_buf() {                    -
CPU[000]             _spin_lock_irqsave();                    + 0.624 us
CPU[000]             _spin_unlock_irqrestore();                    + 0.616 us
CPU[000]           }                                + 2.820 us
CPU[000]

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 10:59:14 +01:00
Ingo Molnar
c7cc773076 Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/function-graph-tracer' and 'tracing/power-tracer' into tracing/core 2008-11-27 10:56:13 +01:00
Steven Rostedt
4cd4262034 sched: prevent divide by zero error in cpu_avg_load_per_task
Impact: fix divide by zero crash in scheduler rebalance irq

While testing the branch profiler, I hit this crash:

divide error: 0000 [#1] PREEMPT SMP
[...]
RIP: 0010:[<ffffffff8024a008>]  [<ffffffff8024a008>] cpu_avg_load_per_task+0x50/0x7f
[...]
Call Trace:
 <IRQ> <0> [<ffffffff8024fd43>] find_busiest_group+0x3e5/0xcaa
 [<ffffffff8025da75>] rebalance_domains+0x2da/0xa21
 [<ffffffff80478769>] ? find_next_bit+0x1b2/0x1e6
 [<ffffffff8025e2ce>] run_rebalance_domains+0x112/0x19f
 [<ffffffff8026d7c2>] __do_softirq+0xa8/0x232
 [<ffffffff8020ea7c>] call_softirq+0x1c/0x3e
 [<ffffffff8021047a>] do_softirq+0x94/0x1cd
 [<ffffffff8026d5eb>] irq_exit+0x6b/0x10e
 [<ffffffff8022e6ec>] smp_apic_timer_interrupt+0xd3/0xff
 [<ffffffff8020e4b3>] apic_timer_interrupt+0x13/0x20

The code for cpu_avg_load_per_task has:

	if (rq->nr_running)
		rq->avg_load_per_task = rq->load.weight / rq->nr_running;

The runqueue lock is not held here, and there is nothing that prevents
the rq->nr_running from going to zero after it passes the if condition.

The branch profiler simply made the race window bigger.

This patch saves off the rq->nr_running to a local variable and uses that
for both the condition and the division.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 10:29:52 +01:00
Lai Jiangshan
4f5a7f40dd ftrace: prevent recursion
Impact: prevent unnecessary stack recursion

if the resched flag was set before we entered, then don't reschedule.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-27 10:11:53 +01:00
Lin Ming
e899b6485c ACPICA: disable _BIF warning
A generic work-around from ACPICA is in the queue,
but since Linux has a work-around in its battery
driver, we can disable this warning now.

Allow _BIF method to return an Package with Buffer elements

http://bugzilla.kernel.org/show_bug.cgi?id=11822

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-27 02:17:30 -05:00
Len Brown
a6e0887f21 ACPI: delete OSI(Linux) DMI dmesg spam
Linux will continue to ignore OSI(Linux),
except for a white-list containing a few systems.

So delete the black-list,
and stop soliciting user-feedback on the console.

Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-27 01:55:21 -05:00
Bob Moore
95a28ed086 ACPICA: Allow _WAK method to return an Integer
This can happen if the _WAK method returns nothing (as per ACPI
1.0) but does return an integer if the implicit return mechanism
is enabled.  This is the only method that has this problem,
since it is also defined to return a package of two integers
(ACPI 1.0b+). In all other cases, if a method returns an object
when one was not expected, no warning is issued.

Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-27 01:55:13 -05:00
Henrique de Moraes Holschuh
0081b16202 ACPI: thinkpad-acpi: fix fan sleep/resume path
This fixes a regression from v2.6.27, caused by commit
5814f737e1cd2cfa2893badd62189acae3e1e1fd, "ACPI: thinkpad-acpi:
attempt to preserve fan state on resume".

It is possible for fan_suspend() to fail to properly initialize
fan_control_desired_level as required by fan_resume(), resulting on
the fan always being set to level 7 on resume if the user didn't
touch the fan controller.

In order to get fan sleep/resume handling to work right:

1. Fix the fan_suspend handling of the T43 firmware quirk. If it is
still undefined, we didn't touch the fan yet and that means we have no
business doing it on resume.

2. Store the fan level on its own variable to avoid any possible
issues with hijacking fan_control_desired_level (which isn't supposed
to have anything other than 0-7 in it, anyway).

3. Change the fan_resume code to me more straightforward to understand
(although we DO optimize the boolean logic there, otherwise it looks
disgusting).

4. Add comments to help understand what the code is supposed to be
doing.

5. Change fan_set_level to be less strict about how auto and
full-speed modes are requested.

http://bugzilla.kernel.org/show_bug.cgi?id=11982

Signed-off-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
Reported-by: Tino Keitel <tino.keitel@tikei.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 19:24:22 -05:00
Alessandro Guido
3fedd90fdf sony-laptop: printk tweak
There's no need to print "Sony: " just after "sony-laptop: " (DRV_PFX).

Signed-off-by: Alessandro Guido <ag@alessandroguido.name>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 18:12:14 -05:00
Alessandro Guido
38cfc148e1 sony-laptop: brightness regression fix
After commit 540b8bb9c3:

  sony-laptop: fingers off backlight if video.ko is serving this functionality

I can't set brightness on my sony laptop (nothing in /sys/class/backlight).
dmesg says "sony-laptop: Sony: Brightness ignored, must be controlled by ACPI
video driver".

The function acpi_video_backlight_support returns 0 if we should use the
vendor-specific backlight support, while non-0 if the ACPI generic should
be used. Because of this, the check introduced by the said commit appears
reversed.

Signed-off-by: Alessandro Guido <ag@alessandroguido.name>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 18:12:07 -05:00
Len Brown
3bdca1b863 Revert "ACPI: don't enable control method power button as wakeup device when Fixed Power button is used"
This reverts commit faee816b15.

http://bugzilla.kernel.org/show_bug.cgi?id=12091

Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 17:55:15 -05:00
Rafael J. Wysocki
65df78473f ACPI suspend: Blacklist boxes that require us to set SCI_EN directly on resume
Some Apple boxes evidently require us to set SCI_EN on resume
directly, because if we don't do that, they hung somewhere in the
resume code path.  Moreover, on these boxes it is not sufficient to
use acpi_enable() to turn ACPI on during resume.  All of this is
against the ACPI specification which states that (1) the BIOS is
supposed to return from the S3 sleep state with ACPI enabled
(SCI_EN set) and (2) the SCI_EN bit is owned by the hardware and we
are not supposed to change it.

For this reason, blacklist the affected systems so that the SCI_EN
bit is set during resume on them.

[NOTE: Unconditional setting SCI_EN for all system on resume doesn't
 work, because it makes some other systems crash (that's to be
 expected).  Also, it is not entirely clear right now if all of the
 Apple boxes require this workaround.]

This patch fixes the recent regression tracked as
http://bugzilla.kernel.org/show_bug.cgi?id=12038

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Tested-by: Tino Keitel <tino.keitel@gmx.de>
Tested-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 17:53:13 -05:00
Pavel Machek
40599072dc ACPI: scheduling in atomic via acpi_evaluate_integer ()
Now I know why I had strange "scheduling in atomic" problems:
acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
: GFP_KERNEL)... which is (of course) broken.

There's no way to reliably tell if we need GFP_ATOMIC or not from
code, this one for example fails to detect spinlocks held.

Fortunately, allocation seems small enough to be done on stack.

Signed-off-by: Pavel Machek <pavel@suse.cz>
Acked-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 17:39:06 -05:00
Tero Kristo
723fdb781a ARM: OMAP: Fixes for suspend / resume GPIO wake-up handling
Use the correct wake-up enable register, and make it
work with 34xx also.

Signed-off-by: Tero Kristo <tero.kristo@nokia.com>
Signed-off-by: Kevin Hilman <khilman@deeprootsystems.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2008-11-26 14:35:16 -08:00
Alexey Starikovskiy
558073dd56 ACPI: battery: Convert discharge energy rate to current properly
ACPI battery interface reports its state either in mW or in mA, and
discharge rate in your case is reported in mW. power_supply interface
does not have such a parameter, so current_now parameter is used
for all cases. But in case of mW, reported discharge should
be converted into mA.

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Tested-by: Ferenc Wagner <wferi@niif.hu>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 17:23:10 -05:00
Kay Sievers
90f671301a parisc: struct device - replace bus_id with dev_name(), dev_set_name()
(I did not compile or test it, please let me know, or help fixing
 it, if something is wrong with the conversion)

This patch is part of a larger patch series which will remove
the "char bus_id[20]" name string from struct device. The device
name is managed in the kobject anyway, and without any size
limitation, and just needlessly copied into "struct device".

To set and read the device name dev_name(dev) and dev_set_name(dev)
must be used. If your code uses static kobjects, which it shouldn't
do, "const char *init_name" can be used to statically provide the
name the registered device should have. At registration time, the
init_name field is cleared, to enforce the use of dev_name(dev) to
access the device name at a later time.

We need to get rid of all occurrences of bus_id in the entire tree
to be able to enable the new interface. Please apply this patch,
and possibly convert any remaining remaining occurrences of bus_id.

We want to submit a patch to -next, which will remove bus_id from
"struct device", to find the remaining pieces to convert, and finally
switch over to the new api, which will remove the 20 bytes array
and does no longer have a size limitation.

Thanks,
Kay

Cc: Matthew Wilcox <matthew@wil.cx>
Cc: linux-parisc@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2008-11-26 22:22:41 +00:00
Helge Deller
7a3f5134a8 parisc: fix kernel crash when unwinding a userspace process
Any user on existing parisc 32- and 64bit-kernels can easily crash
the kernel and as such enforce a DSO.
A simple testcase is available here:
        http://gsyprf10.external.hp.com/~deller/crash.tgz

The problem is introduced by the fact, that the handle_interruption()
crash handler calls the show_regs() function, which in turn tries to
unwind the stack by calling parisc_show_stack().  Since the stack contains
userspace addresses, a try to unwind the stack is dangerous and useless
and leads to the crash.

The fix is trivial: For userspace processes
a) avoid to unwind the stack, and
b) avoid to resolve userspace addresses to kernel symbol names.

While touching this code, I converted print_symbol() to %pS
printk formats and made parisc_show_stack() static.

An initial patch for this was written by Kyle McMartin back in August:
http://marc.info/?l=linux-parisc&m=121805168830283&w=2

Compile and run-tested with a 64bit parisc kernel.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Grant Grundler <grundler@parisc-linux.org>
Cc: Matthew Wilcox <matthew@wil.cx>
Cc: <stable@kernel.org>		[2.6.25.x, 2.6.26.x, 2.6.27.x, earlier...]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2008-11-26 22:22:39 +00:00
Geert Uytterhoeven
9860d1b08b parisc: __kernel_time_t is always long
__kernel_time_t is always long on PA-RISC, irrespective of CONFIG_64BIT,
hence move it out of the #ifdef CONFIG_64BIT / #else / #endif block.

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2008-11-26 22:22:36 +00:00
Alexey Starikovskiy
7b4d469228 ACPI: EC: count interrupts only if called from interrupt handler.
fix 2.6.28 EC interrupt storm regression

Signed-off-by: Alexey Starikovskiy <astarikovskiy@suse.de>
Signed-off-by: Len Brown <len.brown@intel.com>
2008-11-26 17:16:45 -05:00
Jeff Layton
a98ee8c1c7 [CIFS] fix regression in cifs_write_begin/cifs_write_end
The conversion to write_begin/write_end interfaces had a bug where we
were passing a bad parameter to cifs_readpage_worker. Rather than
passing the page offset of the start of the write, we needed to pass the
offset of the beginning of the page. This was reliably showing up as
data corruption in the fsx-linux test from LTP.

It also became evident that this code was occasionally doing unnecessary
read calls. Optimize those away by using the PG_checked flag to indicate
that the unwritten part of the page has been initialized.

CC: Nick Piggin <npiggin@suse.de>
Acked-by: Dave Kleikamp <shaggy@us.ibm.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
2008-11-26 19:32:33 +00:00
Ping Cheng
545f4e99de Input: wacom - add support for new USB Tablet PCs
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2008-11-26 10:14:18 -05:00
Ingo Molnar
0bfc24559d blktrace: port to tracepoints, update
Port to the new tracepoints API: split DEFINE_TRACE() and DECLARE_TRACE()
sites. Spread them out to the usage sites, as suggested by
Mathieu Desnoyers.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
2008-11-26 13:04:35 +01:00
Arnaldo Carvalho de Melo
5f3ea37c77 blktrace: port to tracepoints
This was a forward port of work done by Mathieu Desnoyers, I changed it to
encode the 'what' parameter on the tracepoint name, so that one can register
interest in specific events and not on classes of events to then check the
'what' parameter.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 12:13:34 +01:00
Arjan van de Ven
f3f47a6768 tracing: add "power-tracer": C/P state tracer to help power optimization
Impact: new "power-tracer" ftrace plugin

This patch adds a C/P-state ftrace plugin that will generate
detailed statistics about the C/P-states that are being used,
so that we can look at detailed decisions that the C/P-state
code is making, rather than the too high level "average"
that we have today.

An example way of using this is:

 mount -t debugfs none /sys/kernel/debug
 echo cstate > /sys/kernel/debug/tracing/current_tracer
 echo 1 > /sys/kernel/debug/tracing/tracing_enabled
 sleep 1
 echo 0 > /sys/kernel/debug/tracing/tracing_enabled
 cat /sys/kernel/debug/tracing/trace | perl scripts/trace/cstate.pl > out.svg

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 08:29:32 +01:00
Steven Rostedt
437f24fb89 ftrace: add cpu annotation for function graph tracer
Impact: enhancement for function graph tracer

When run on a SMP box, the function graph tracer is confusing because
it shows the different CPUS as changes in the trace.

This patch adds the annotation of 'CPU[###]' where ### is a three digit
number. The output will look similar to this:

CPU[001]     dput() {
CPU[000] } 726
CPU[001]     } 487
CPU[000] do_softirq() {
CPU[001]   } 2221
CPU[000]   __do_softirq() {
CPU[000]     __local_bh_disable() {
CPU[001]   unroll_tree_refs() {
CPU[000]     } 569
CPU[001]   } 501
CPU[000]     rcu_process_callbacks() {
CPU[001]   kfree() {

What makes this nice is that now you can grep the file and produce
readable format for a particular CPU.

 # cat /debug/tracing/trace > /tmp/trace
 # grep '^CPU\[000\]' /tmp/trace > /tmp/trace0
 # grep '^CPU\[001\]' /tmp/trace > /tmp/trace1

Will give you:

 # head /tmp/trace0
CPU[000] ------------8<---------- thread sshd-3899 ------------8<----------
CPU[000]     inotify_dentry_parent_queue_event() {
CPU[000]     } 2531
CPU[000]     inotify_inode_queue_event() {
CPU[000]     } 505
CPU[000]   } 69626
CPU[000] } 73089
CPU[000] audit_syscall_exit() {
CPU[000]   path_put() {
CPU[000]     dput() {

 # head /tmp/trace1
CPU[001] ------------8<---------- thread pcscd-3446 ------------8<----------
CPU[001]               } 4186
CPU[001]               dput() {
CPU[001]               } 543
CPU[001]               vfs_permission() {
CPU[001]                 inode_permission() {
CPU[001]                   shmem_permission() {
CPU[001]                     generic_permission() {
CPU[001]                     } 501
CPU[001]                   } 2205

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:57 +01:00
Steven Rostedt
660c7f9be9 ftrace: add thread comm to function graph tracer
Impact: enhancement to function graph tracer

Export the trace_find_cmdline so the function graph tracer can
use it to print the comms of the threads.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:56 +01:00
Steven Rostedt
e53a6319cc ftrace: let function tracing and function return run together
Impact: feature

This patch enables function tracing and function return to run together.
I've tested this by enabling the stack tracer and return tracer, where
both the function entry and function return are used together with
dynamic ftrace.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:55 +01:00
Steven Rostedt
5a45cfe1c6 ftrace: use code patching for ftrace graph tracer
Impact: more efficient code for ftrace graph tracer

This patch uses the dynamic patching, when available, to patch
the function graph code into the kernel.

This patch will ease the way for letting both function tracing
and function graph tracing run together.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:54 +01:00
Steven Rostedt
df4fc31558 ftrace: add function tracing to single thread
Impact: feature to function trace a single thread

This patch adds the ability to function trace a single thread.
The file:

  /debugfs/tracing/set_ftrace_pid

contains the pid to trace. Valid pids are any positive integer.
Writing any negative number to this file will disable the pid
tracing and the function tracer will go back to tracing all of
threads.

This feature works with both static and dynamic function tracing.

Signed-off-by: Steven Rostedt <srostedt@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-11-26 06:52:52 +01:00