1
Commit Graph

351 Commits

Author SHA1 Message Date
Ingo Molnar
b226f744d4 Merge branch 'linus' into perf/core
Merge reason: pick up tools/perf/ changes from upstream.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-15 08:44:44 +02:00
Arnaldo Carvalho de Melo
d5b889f2ec perf tools: Move threads & last_match to threads.c
This was just being copy'n'pasted all over.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091013141629.GD21809@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 17:12:18 +02:00
Mike Galbraith
f4f0b41818 perf tools: Remove expensive old debug code from perf top
Calling gettimeofday() at high frequency is painful for handicapped
boxen. The spot calling gettimeofday() is old unneeded debug code,
so remove it.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1255438640.7173.1.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 15:30:15 +02:00
Vincent Legoll
cfed95a693 perf tools: Do not manually count string lengths
Use strlen & macros instead of manually counting string lengths as
this is error prone and may lend to bugs.

Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Cc: Linus Torvalds <torvalds@osdl.org>
LKML-Reference: <4727185d0910130118m5387058dndb02ac9b384af9f0@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-13 11:55:31 +02:00
Ashwin Chaugule
63c9e01e1a perf tools: Remove static debugfs path from parse-events
Timechart doesn't work if debugfs is not in /sys/kernel/debug/.
Fixed by using global debugfs_path which is filled in by perf.

Signed-off-by: Ashwin Chaugule <ashwinc@quicinc.com>
Cc: "Arjan van de Ven" <arjan@linux.intel.com>
LKML-Reference: <a751bdc6978478de6d10440e587a2cc7.squirrel@www.codeaurora.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 22:41:05 +02:00
Ingo Molnar
55621ccf2b perf tools: Fix the NO_64BIT build on pure 64-bit systems
Randy Dunlap reported that 'make NO_64BIT=1' fails to build
a pure 32-b it binary on 64-bit/64-bit x86 systems.

The reason is that we dont pass in the -m32 and GCC defaults
to -m64.

So pass it in - and also extend the warning message about libelf
dependencies - glibc-dev[el] is needed as well beyond the libelf
library.

Reported-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: Message-Id: <20091005131729.78444bfb.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 10:00:27 +02:00
Mike Galbraith
55ffb7a6bd perf sched: Add -C option to measure on a specific CPU
To refresh, trying to sched record only one CPU results in bogus
latencies as below.

I fixed^Wmade it stop doing the bad thing today, by
following task migration events properly.

Before:

  marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10
  marge:/root/tmp # perf sched lat
   -----------------------------------------------------------------------------------------
    Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms |
   -----------------------------------------------------------------------------------------
    Xorg:4943             |      1.290 ms |        1 | avg: 1670.132 ms | max: 1670.132 ms |
    hald-addon-stor:3569  |      0.091 ms |        3 | avg:  658.609 ms | max: 1975.797 ms |
    hald-addon-stor:3573  |      0.209 ms |        4 | avg:  499.138 ms | max: 1990.565 ms |
    audispd:4270          |      0.012 ms |        1 | avg:    0.015 ms | max:    0.015 ms |
  ....

  marge:/root/tmp # perf sched trace|grep 'Xorg:4943'
           swapper-0     [000]   401.184013288: sched_stat_runtime: task: Xorg:4943 runtime: 1233188 [ns], vruntime: 19105169779 [ns]
   rt2870TimerQHan-4947  [000]   402.854140127: sched_stat_wait: task: Xorg:4943 wait: 580073 [ns]
   rt2870TimerQHan-4947  [000]   402.854141770: sched_migrate_task: task Xorg:4943 [140] from: 1  to: 0
   rt2870TimerQHan-4947  [000]   402.854143854: sched_stat_wait: task: Xorg:4943 wait: 0 [ns]
   rt2870TimerQHan-4947  [000]   402.854145397: sched_switch: task rt2870TimerQHan:4947 [140] (D) ==> Xorg:4943 [140]
              Xorg-4943  [000]   402.854193133: sched_stat_runtime: task: Xorg:4943 runtime: 56546 [ns], vruntime: 11766332500 [ns]
              Xorg-4943  [000]   402.854196842: sched_switch: task Xorg:4943 [140] (S) ==> swapper:0 [140]

After:

  marge:/root/tmp # taskset -c 1 perf sched record -C 0 -- sleep 10
  marge:/root/tmp # perf sched lat
   -----------------------------------------------------------------------------------------
    Task                  |   Runtime ms  | Switches | Average delay ms | Maximum delay ms |
   -----------------------------------------------------------------------------------------
    amarokapp:11150       |    271.297 ms |      878 | avg:    0.130 ms | max:    1.057 ms |
    konsole:5965          |      1.370 ms |       12 | avg:    0.092 ms | max:    0.855 ms |
    Xorg:4943             |    179.980 ms |     1109 | avg:    0.087 ms | max:    1.206 ms |
    hald-addon-stor:3574  |      0.212 ms |        9 | avg:    0.040 ms | max:    0.169 ms |
    hald-addon-stor:3570  |      0.223 ms |        9 | avg:    0.037 ms | max:    0.223 ms |
    klauncher:5864        |      0.550 ms |        8 | avg:    0.032 ms | max:    0.048 ms |

The 'Maximum delay ms' results are now sane.

Signed-off-by: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 09:10:55 +02:00
Mike Galbraith
7e4ff9e3e8 perf tools: Fix counter sample frequency breakage
Commit 42e59d7d19 switched to a default sample frequency of
1KHz, which overrides any user supplied count, causing sched, top
and timechart to miss events due to their discrete events
being flagged PERF_SAMPLE_PERIOD.

Override default sample frequency when the user profides a
period count, and make both record and top honor that user
supplied option.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <1255326963.15107.2.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 09:10:55 +02:00
Randy Dunlap
cbef79a82a perf tools: Fix const char type propagation
The following perf build warnings/errors in function
argument types:

  builtin-sched.c:1894: warning: passing argument 1 of 'sort_dimension__add' discards qualifiers from pointer target type
  util/trace-event-parse.c:685: warning: passing argument 2 of 'read_expected' discards qualifiers from pointer target type
  util/trace-event-parse.c:741: warning: passing argument 4 of 'test_type_token' discards qualifiers from pointer target type
  util/trace-event-parse.c:706: warning: passing argument 2 of 'read_expected_item' discards qualifiers from pointer target type

... trigger because older GCC is not able to prove that
sort_dimension__add() does not change the string.

Some goes for test_type_token().

Fix this by improving type consistency.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091005131729.78444bfb.randy.dunlap@oracle.com>
[ Also remove ugly type cast now unnecessary. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-12 08:35:00 +02:00
Frederic Weisbecker
26dd2cb074 perf tools: Provide backward compatibility with previous perf.data version
We have merged the trace.info file into perf.data by adding one
section in the perf headers. This makes it incompatible with
previous version: the new perf tools can't read the older
perf.data.

To support the previous format, we check the headers size. If they
have the same size than in the previous format, then ignore the
trace info section that doesn't exist.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255032449-12022-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 22:11:02 +02:00
Frederic Weisbecker
97ea1a7fa6 perf tools: Fix thread comm resolution in perf sched
This reverts commit 9a92b479b2 ("perf
tools: Improve thread comm resolution in perf sched") and fixes the
real bug.

The bug was elsewhere:

We are failing to resolve thread names in perf sched because the
table of threads we are building, on top of comm events, has a per
process granularity. But perf sched, unlike the other perf tools,
needs a per thread granularity as we are profiling every tasks
individually.

So fix it by building our threads table using the tid instead of
the pid as the thread identifier.

v2: Revert the previous fix - it is not really needed

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255028657-11158-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 21:10:21 +02:00
Arnaldo Carvalho de Melo
2e538c4a18 perf tools: Improve kernel/modules symbol lookup
This removes the ovelapping of vmlinux addresses with modules,
using the ELF section name when using --vmlinux and creating a
unique DSO name when using /proc/kallsyms ([kernel].N).

This is done by creating multiple 'struct map' instances for
address ranges backed by DSOs that have just the symbols for that
range and a name that is derived from the ELF section name.o

Now it is possible to ask for just the symbols in some particular
kernel section:

$ perf report -m --vmlinux ../build/tip-recvmmsg/vmlinux \
	--dsos [kernel].vsyscall_fn | head -15
    52.73%             Xorg  [.] vread_hpet
    18.61%          firefox  [.] vread_hpet
    14.50%     npviewer.bin  [.] vread_hpet
     6.83%           compiz  [.] vread_hpet
     5.73%         glxgears  [.] vread_hpet
     0.63%             java  [.] vread_hpet
     0.30%   gnome-terminal  [.] vread_hpet
     0.23%             perf  [.] vread_hpet
     0.18%            xchat  [.] vread_hpet
$

Now we don't have to first lookup the list of modules and then, if
it fails, vmlinux symbols, its just a simple lookup for the map
then the symbols, just like for threads.

Reports generated using /proc/kallsyms and --vmlinux should provide
the same results, modulo the DSO name for sections other than
".text".

But they don't right now because things like:

 ffffffff81011c20-ffffffff81012068 system_call
 ffffffff81011c30-ffffffff81011c9b system_call_after_swapgs
 ffffffff81011c9c-ffffffff81011cb6 system_call_fastpath
 ffffffff81011cb7-ffffffff81011cbb ret_from_sys_call

I.e. overlapping symbols, again some ASM special case that we have
to fixup.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254934136-8503-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 19:27:11 +02:00
Arnaldo Carvalho de Melo
da21d1b547 perf tools: Up the verbose level for some really verbose stuff
Like printing every symbol created.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254923340-4870-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 19:27:10 +02:00
Frederic Weisbecker
9a92b479b2 perf tools: Improve thread comm resolution in perf sched
When we get sched traces that involve a task that was already
created before opening the event, we won't have the comm event for
it.

So if we can't find the comm event for a given thread, we look at
the traces that may contain these informations.

Before:

 ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
 kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
 kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
 :5124:5124            |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
 :6244:6244            |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
 firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
 npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
 :6245:6245            |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |

After:

 ata/1:371             |      0.000 ms |        1 | avg: 3988.693 ms | max: 3988.693 ms |
 kondemand/1:421       |      0.096 ms |        3 | avg:  345.346 ms | max: 1035.989 ms |
 kondemand/0:420       |      0.025 ms |        3 | avg:  421.332 ms | max:  964.014 ms |
 firefox:5124          |      0.103 ms |        5 | avg:   74.082 ms | max:  277.194 ms |
 npviewer.bin:6244     |      0.691 ms |        9 | avg:  125.655 ms | max:  271.306 ms |
 firefox:5080          |      0.924 ms |        5 | avg:   53.833 ms | max:  257.828 ms |
 npviewer.bin:6225     |     21.871 ms |       53 | avg:   22.462 ms | max:  220.835 ms |
 npviewer.bin:6245     |      9.631 ms |       21 | avg:   41.864 ms | max:  213.349 ms |

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1255012632-7882-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 16:56:33 +02:00
Frederic Weisbecker
016e92fbc9 perf tools: Unify perf.data mapping and events handling
This librarizes the perf.data file mapping and handling in various
perf tools, roughly reducing the amount of code and fixing the
places that mmap from beginning of the file whereas we want to mmap
from the beginning of the data, leading to page fault because the
mmap window is too small since the trace info are written in the
file too.

TODO:

 - convert perf timechart too

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arjan van de Ven <arjan@infradead.org>
LKML-Reference: <20091007104729.GD5043@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-08 16:56:32 +02:00
Frederic Weisbecker
03456a158d perf tools: Merge trace.info content into perf.data
This drops the trace.info file and move its contents into the
common perf.data file.

This is done by creating a new trace_info section into this file. A
user of perf headers needs to call perf_header__set_trace_info() to
save the trace meta informations into the perf.data file.

A file created by perf after his patch is unsupported by previous
version because the size of the headers have increased.

That said, it's two new fields that have been added in the end of
the headers, and those could be ignored by previous versions if
they just handled the dynamic header size and then ignore the
unknow part. The offsets guarantee the compatibility. We'll do a
-stable fix for that.

But current previous versions handle the header size using its
static size, not dynamic, then it's not backward compatible with
trace records.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <20091006213643.GA5343@nowhere>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07 08:36:10 +02:00
Frederic Weisbecker
b209aa1f83 perf tools: Start the perf.data mapping at data offset in perf trace
Currently, we are mapping perf.data in the beginning of the file
and use the data offset as a buffer offset.

This may exceed the mapping area if the data offset is upper than
page_size * mmap_window and result in a page fault (thing that
happen if we merge trace.info in perf.data).

Instead, let's start the mapping in the page that matches our data
offset.

v2: Drop a junk from another patch (trace_report() removal)

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1254856886-10348-1-git-send-email-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-07 08:36:10 +02:00
Ingo Molnar
42e59d7d19 perf tools: Default to 1 KHz auto-sampling freq events
Use auto-freq events by default in perf record and
perf top.

This allows more consistent hardware event sampling,
regardless of the intensity of the underlying event.

It also keeps us from over-sampling on larger/busier
systems.

(also make surrounding initializations more consistent)

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:41:09 +02:00
Tom Zanussi
064739bc4b perf trace: Add string/dynamic cases to format_flags
Needed for distinguishing string fields in event stream processing.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-4-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:46 +02:00
Tom Zanussi
2774601811 perf trace: Add subsystem string to struct event
Needed to fully qualify event names for event stream processing.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:46 +02:00
Tom Zanussi
26a50744b2 tracing/events: Add 'signed' field to format files
The sign info used for filters in the kernel is also useful to
applications that process the trace stream.  Add it to the format
files and make it available to userspace.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: hch@infradead.org
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254809398-8078-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:04:45 +02:00
Ingo Molnar
d9b2002c40 Merge branch 'perf/urgent' into perf/core
Merge reason: Upcoming patch is dependent on a fix in perf/urgent.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 15:02:34 +02:00
Peter Zijlstra
906010b213 perf_event: Provide vmalloc() based mmap() backing
Some architectures such as Sparc, ARM and MIPS (basically
everything with flush_dcache_page()) need to deal with dcache
aliases by carefully placing pages in both kernel and user maps.

These architectures typically have to use vmalloc_user() for this.

However, on other architectures, vmalloc() is not needed and has
the downsides of being more restricted and slower than regular
allocations.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: David Miller <davem@davemloft.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Paul Mackerras <paulus@samba.org>
LKML-Reference: <1254830228.21044.272.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 14:21:50 +02:00
Arnaldo Carvalho de Melo
818331303b perf tools: elf_sym__is_function() should accept "zero" sized functions
Asm routines that end up having size equal to zero are not really
zero sized, and as now we do kernel_maps__fixup_sym_end, at least
for kernel routines this gets fixed.

A similar fixup needs to be done for the userspace bits as well,
but as this fixup started only because in /proc/kallsyms we don't
have the end address nor the function size, it appeared here first.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <1254796503-27203-1-git-send-email-acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 12:08:08 +02:00
Tom Zanussi
b934cdd55f perf trace: Update eval_flag() flags array to match interrupt.h
Add missing BLOCK_IOPOLL_SOFTIRQ entry.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254808849-7829-3-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 12:02:34 +02:00
Tom Zanussi
d4c3768faa perf trace: Remove unused code in builtin-trace.c
And some minor whitespace cleanup.

Signed-off-by: Tom Zanussi <tzanussi@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: rostedt@goodmis.org
Cc: lizf@cn.fujitsu.com
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
LKML-Reference: <1254808849-7829-2-git-send-email-tzanussi@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-06 12:02:33 +02:00
Arnaldo Carvalho de Melo
c3b32fcbc7 perf report: Use kernel_maps__find_symbol as fallback to find vdsos, etc
In resolve_symbol, as we're moving to breaking the kernel symbols
list per address ranges, i.e. kernel linking sections, so that we
don't have a big kernel_map that in its range covers what is in the
modules.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:24 +02:00
Arnaldo Carvalho de Melo
a2a99e8e12 perf tools: /proc/modules names don't always match its name
$ cut -d' ' -f1 /proc/modules|grep _|wc -l
 29
 $ cut -d' ' -f1 /proc/modules|grep _|sed 's/$/.ko'/g|while read n;do find /lib/modules/`uname -r` -name $n;done|wc -l
 12

For instance:

 $ grep ^aes_x86 /proc/modules
 aes_x86_64 9056 2 - Live 0xffffffffa0091000
 $ l /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko
 -rw-r--r-- 1 root root 136438 2009-09-22 19:05 /lib/modules/2.6.31-tip/kernel/arch/x86/crypto/aes-x86_64.ko

Handle that by introducing a strxfrchar routine that replaces
dashes with underscores when matching file names to loaded modules.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:23 +02:00
Arnaldo Carvalho de Melo
af427bf529 perf tools: Create maps for modules when processing kallsyms
So that we get kallsyms processing closer to vmlinux + modules
symtabs processing.

One change in behaviour is that since when one specifies --vmlinux
-m should be used to ask for modules, so it is now for kallsyms as
well.

Also continue if one manages to load the vmlinux data but module
processing fails, so that at least some analisys can be done with
part of the needed symbols.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:23 +02:00
Arnaldo Carvalho de Melo
5c2068059a perf top: Keep the default of asking for kernel module symbols
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-05 20:35:22 +02:00
Chris Wilson
933da83aa1 perf: Propagate term signal to child
If we launch the child on behalf of the user, ensure that it dies
along with ourselves when we are interrupted.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
LKML-Reference: <1254616502-4728-1-git-send-email-chris@chris-wilson.co.uk>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04 19:37:39 +02:00
Arnaldo Carvalho de Melo
ec218fc4a7 perf tools: Remove show_mask bitmask
As it was not being exposed via any command line and with --dsos/--comms
we can do this and even more, like asking for just kernel + some module:

[root@doppio linux-2.6-tip]# perf report --dsos \[kernel\],\[drm\]
--vmlinux /home/acme/git/build/tip-recvmmsg/vmlinux --modules | head -15
 # Samples: 619669
 #
 # Overhead          Command  Shared Object  Symbol
 # ........  ...............  .............  ......
 #
      7.12%          swapper  [kernel]       [k] read_hpet
      6.86%             init  [kernel]       [k] read_hpet
      6.22%             init  [kernel]       [k] mwait_idle_with_hints
      5.34%          swapper  [kernel]       [k] mwait_idle_with_hints
      3.01%          firefox  [kernel]       [.] vread_hpet
      2.14%             Xorg  [drm]          [k] drm_clflush_pages
      2.09%           pidgin  [kernel]       [.] vread_hpet
      1.58%     npviewer.bin  [kernel]       [.] vread_hpet
      1.37%          swapper  [kernel]       [k] hpet_next_event
      1.23%             Xorg  [kernel]       [k] read_hpet
[root@doppio linux-2.6-tip]#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20091003233048.GA30535@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-04 19:37:11 +02:00
Arnaldo Carvalho de Melo
9735abf11b perf tools: Move hist_entry__add common code to hist.c
Now perf report and annotate do the callgraph/hit processing in
their specialized hist_entry__add functions.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-03 16:01:59 +02:00
Arnaldo Carvalho de Melo
439d473b47 perf tools: Rewrite and improve support for kernel modules
Representing modules as struct map entries, backed by a DSO, etc,
using /proc/modules to find where the module is loaded.

DSOs now can have a short and long name, so that in verbose mode we
can show exactly which .ko or vmlinux image was used.

As kernel modules now are a DSO separate from the kernel, we can
ask for just the hits for a particular set of kernel modules, just
like we can do with shared libraries:

[root@doppio linux-2.6-tip]# perf report -n --vmlinux
/home/acme/git/build/tip-recvmmsg/vmlinux --modules --dsos \[drm\] | head -15
    84.58%      13266             Xorg  [k] drm_clflush_pages
     4.02%        630             Xorg  [k] trace_kmalloc.clone.0
     3.95%        619             Xorg  [k] drm_ioctl
     2.07%        324             Xorg  [k] drm_addbufs
     1.68%        263             Xorg  [k] drm_gem_close_ioctl
     0.77%        120             Xorg  [k] drm_setmaster_ioctl
     0.70%        110             Xorg  [k] drm_lastclose
     0.68%        106             Xorg  [k] drm_open
     0.54%         85             Xorg  [k] drm_mm_search_free
[root@doppio linux-2.6-tip]#

Specifying --dsos /lib/modules/2.6.31-tip/kernel/drivers/gpu/drm/drm.ko
would have the same effect. Allowing specifying just 'drm.ko' is left
for another patch.

Processing kallsyms so that per kernel module struct map are
instantiated was also left for another patch. That will allow
removing the module name from each of its symbols.

struct symbol was reduced by removing the ->module backpointer and
moving it (well now the map) to struct symbol_entry in perf top,
that is its only user right now.

The total linecount went down by ~500 lines.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Avi Kivity <avi@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-02 10:48:42 +02:00
Mulyadi Santosa
1ad0560e8c perf tools: Run generate-cmdlist.sh properly
Right now generate-cmdlist.sh is not executable, so we
should call it as an argument ".".

This fixes cases where due to different umask defaults
the generate-cmdlist.sh script is not executable in
a kernel tree checkout.

Signed-off-by: Mulyadi Santosa <mulyadi.santosa@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <f284c33d0909251201w422e9687x8cd3a784e85adf7d@mail.gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01 10:12:03 +02:00
Arjan van de Ven
39a90a8ef1 perf timechart: Add a power-only mode
For doing work on the Linux power management components, I need to
make long (30+ seconds) traces. Currently, this then results in a
HUGE svg file, with mostly process data that isn't interesting.

This patch adds a --power-only mode to perf timechart that only
outputs the CPU power section of the SVG; this significantly
reduces the size of the SVG file, making even 30+ second traces
viewable with inkscape.

As a minor tweak for the same effect, the minimum text size is
decreased; current inkscape cannot zoom in deep enough to show text
this small, but it reduces inkscape compute time.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: peterz@infradead.org
LKML-Reference: <20090924154013.0675ab71@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01 09:26:40 +02:00
Arnaldo Carvalho de Melo
2ccdc450e6 perf top: Remove dead {min,max}_ip unused variables
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20090924212400.GA15321@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-10-01 08:46:15 +02:00
Arnaldo Carvalho de Melo
cad3071424 perf trace: Remove dead code
Several variables are not used at all, cut'n'paste leftovers.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
LKML-Reference: <20090928200818.GF3361@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:57 +02:00
Arnaldo Carvalho de Melo
a80deb622d perf sched: Remove dead code
Several variables are not used at all, cut'n'paste leftovers.

Also check if the sample_type is RAW earlier, to avoid needless
searches.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:57 +02:00
Arnaldo Carvalho de Melo
1b46cddfcc perf tools: Use rb_tree for maps
Threads can have many and kernel modules will be represented as a
tree of maps as well.

Ah, and for a perf.data with 146607 samples:

Before:

[root@doppio ~]# perf stat -r 5 perf report > /dev/null

 Performance counter stats for 'perf report' (5 runs):

     699.823680  task-clock-msecs         #      0.991 CPUs    ( +-   0.454% )
             74  context-switches         #      0.000 M/sec   ( +-   1.709% )
              2  CPU-migrations           #      0.000 M/sec   ( +-  17.008% )
          23114  page-faults              #      0.033 M/sec   ( +-   0.000% )
     1381257019  cycles                   #   1973.721 M/sec   ( +-   0.290% )
     1456894438  instructions             #      1.055 IPC     ( +-   0.007% )
       18779818  cache-references         #     26.835 M/sec   ( +-   0.380% )
         641799  cache-misses             #      0.917 M/sec   ( +-   1.200% )

    0.705972729  seconds time elapsed   ( +-   0.501% )

[root@doppio ~]#

After

 Performance counter stats for 'perf report' (5 runs):

     691.261451  task-clock-msecs         #      0.993 CPUs    ( +-   0.307% )
             72  context-switches         #      0.000 M/sec   ( +-   0.829% )
              6  CPU-migrations           #      0.000 M/sec   ( +-  18.409% )
          23127  page-faults              #      0.033 M/sec   ( +-   0.000% )
     1366395876  cycles                   #   1976.670 M/sec   ( +-   0.153% )
     1443136016  instructions             #      1.056 IPC     ( +-   0.012% )
       17956402  cache-references         #     25.976 M/sec   ( +-   0.325% )
         661924  cache-misses             #      0.958 M/sec   ( +-   1.335% )

    0.696127275  seconds time elapsed   ( +-   0.377% )

I.e. we see some speedup too.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
LKML-Reference: <20090928174846.GA3361@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:56 +02:00
John Kacur
3d1d07ecd2 perf tools: Put common histogram functions in their own file
Move histogram related functions into their own files (hist.c and
hist.h) and make use of them in builtin-annotate.c and
builtin-report.c.

Signed-off-by: John Kacur <jkacur@redhat.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <alpine.LFD.2.00.0909281531180.8316@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:57:56 +02:00
Arnaldo Carvalho de Melo
8357275bb9 perf top: Add poll_idle to the skip list
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mike Galbraith <efault@gmx.de>
LKML-Reference: <20090925220239.GA5488@ghostprotocols.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-30 13:56:54 +02:00
John Kacur
dd68ada2d4 perf tools: Create util/sort.and use it
Create util/sort.[ch] and move common functionality for
builtin-report.c and builtin-annotate.c there, and make use of it.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0909241758390.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:27:52 +02:00
John Kacur
8b40f521cf perf tools: Protect header files with a consistent style
There was a colorful mix of header guards - standardize them.

Signed-off-by: John Kacur <jkacur@redhat.com>
LKML-Reference: <alpine.LFD.2.00.0909241756530.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:27:51 +02:00
John Kacur
cbfeb267cb perf annotate: Add the cmp_null function and make use of it
This function exists in builtin-report.c but not in
builtin-annotate.c Functions that use cmp_null are shorter and
clearer.

Synchronizing functions between these two files will also make it
easier to potential share code in the future.

Signed-off-by: John Kacur <jkacur@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <alpine.LFD.2.00.0909241754031.11383@localhost.localdomain>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:27:50 +02:00
Eric Dumazet
725b13685c perf tools: Dont use openat()
openat() is still a young glibc facility, better to not use it in a
non performance critical program (perf list)

Many machines have older glibc (RHEL 4 Update 5 -> glibc-2.3.4-2.36
on my dev machine for example).

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ulrich Drepper <drepper@redhat.com>
LKML-Reference: <4ABB767D.6080004@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 21:20:08 +02:00
Eric Dumazet
a255a9981a perf tools: Fix buffer allocation
"perf top" cores dump on my dev machine, if run from a directory
where vmlinux is present:

  *** glibc detected *** malloc(): memory corruption: 0x085670d0 ***

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@kernel.org>
LKML-Reference: <4ABB6EB7.7000002@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 15:11:23 +02:00
Kirill Smelkov
67030036eb perf tools: .gitignore += perf*.html
I've tried building the docs in tools/perf/Documentation/ , and after
that `git status` showed dozen of untracked htmls. Let's ignore them.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
LKML-Reference: <1253790022-10300-1-git-send-email-kirr@mns.spb.ru>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 14:01:22 +02:00
Mike Galbraith
dd906a0fe8 perf tools: Handle relative paths while loading module symbols
Inform util/module.c::mod_dso__load_module_paths() that relative
paths do exist in some modules.dep, and make it fail noisily should
it encounter a path that it doesn't understand, or a module it
cannot open.

Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1253779628.10513.8.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-24 11:40:35 +02:00
Mike Galbraith
508c4d0874 perf tools: Fix module symbol loading bug
Avi Kivity reported 'perf annotate' failures with modules, the
requested function was not annotated.

If there are no modules currently loaded, or the last module
scanned is not loaded, dso__load_modules() steps on the value from
dso__load_vmlinux(), so we happily load the kallsyms symbols on top
of what we've already loaded.

Fix that such that the total count of symbols loaded is returned.
Should module symbol load fail after parsing of vmlinux, is's a
hard failure, so do not silently fall-back to kallsyms.

Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: rostedt@goodmis.org
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
LKML-Reference: <1253697658.11461.36.camel@marge.simson.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-23 13:45:48 +02:00