License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied to
a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0                                               11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
SPDX license identifier                             # files
---------------------------------------------------|-------
GPL-2.0 WITH Linux-syscall-note                         270
GPL-2.0+ WITH Linux-syscall-note                        169
((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)      21
((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)      17
LGPL-2.1+ WITH Linux-syscall-note                        15
GPL-1.0+ WITH Linux-syscall-note                         14
((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)      5
LGPL-2.0+ WITH Linux-syscall-note                         4
LGPL-2.1 WITH Linux-syscall-note                          3
((GPL-2.0 WITH Linux-syscall-note) OR MIT)                3
((GPL-2.0 WITH Linux-syscall-note) AND MIT)               1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
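The comment-style distinction described above can be sketched in C. This is not the script that generated the patches (that script is not shown here); `spdx_tag_line()` is a hypothetical helper. It assumes the convention that `.c` files take a `//` comment, as this file's first line does, and that headers take a `/* */` comment.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not the script used for the patches: format the
 * SPDX line for a file, picking the comment style from the extension.
 * Returns the number of characters written, or -1 if the extension is
 * not handled or the buffer is too small.
 */
static int spdx_tag_line(const char *filename, const char *license,
			 char *buf, size_t len)
{
	const char *ext;
	int n;

	assert(filename && license && buf);

	ext = strrchr(filename, '.');
	if (ext == NULL)
		return -1;

	if (strcmp(ext, ".c") == 0)
		n = snprintf(buf, len, "// SPDX-License-Identifier: %s", license);
	else if (strcmp(ext, ".h") == 0)
		n = snprintf(buf, len, "/* SPDX-License-Identifier: %s */", license);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

Other file types (Makefiles, assembly, scripts) would need their own comment syntax; the sketch simply rejects them.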
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 07:07:57 -07:00
// SPDX-License-Identifier: GPL-2.0
2017-04-17 11:23:08 -07:00
#include <inttypes.h>
2021-09-16 05:09:39 -07:00
#include <signal.h>
2013-09-10 22:09:28 -07:00
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
2021-09-16 05:09:39 -07:00
#include <sys/types.h>
2013-09-10 22:09:28 -07:00

perf report: Support LLVM for addr2line()
In addition to the existing support for libbfd and calling out to
an external addr2line command, add support for using libllvm directly.
This is both faster than libbfd, and can be enabled in distro builds
(the LLVM license has an explicit provision for GPLv2 compatibility).
Thus, it is set as the primary choice if available.
As an example, running 'perf report' on a medium-size profile with
DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
libbfd, 153 seconds with external llvm-addr2line, and I got tired and
aborted the test after waiting for 55 minutes with external bfd
addr2line (which is the default for perf as compiled by distributions
today).
Evidently, for this case, the bfd addr2line process needs 18 seconds (on
a 5.2 GHz Zen 3) to load the .debug ELF in question, hits the 1-second
timeout and gets killed during initialization, getting restarted anew
every time. Having an in-process addr2line makes this much more robust.
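The failure mode described above comes from a read-with-timeout pattern: if the child produces nothing before the deadline, the caller gives up and kills it. The following is a minimal sketch of that primitive using POSIX `poll()`. It is not perf's own code, and `read_with_timeout()` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for data on fd, then read it.
 * Returns the number of bytes read, 0 on EOF, -1 on error and -2 on timeout.
 */
static ssize_t read_with_timeout(int fd, char *buf, size_t len, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int ready;

	assert(buf != NULL);

	/* Retry if a signal interrupts the wait. */
	do {
		ready = poll(&pfd, 1, timeout_ms);
	} while (ready < 0 && errno == EINTR);

	if (ready < 0)
		return -1;
	if (ready == 0)
		return -2;	/* nothing arrived within timeout_ms */

	return read(fd, buf, len);
}
```

With a 1000 ms timeout and a child that needs 18 seconds to initialise, every call would hit the timeout branch, which matches the restart loop the commit message describes.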
As future extensions, libllvm can be used in many other places where
we currently use libbfd or other libraries:
- Symbol enumeration (in particular, for PE binaries).
- Demangling (including non-Itanium demangling, e.g. Microsoft
or Rust).
- Disassembling (perf annotate).
However, these are much less pressing; most people don't profile PE
binaries, and perf has non-bfd paths for ELF. The same with demangling;
the default _cxa_demangle path works fine for most users, and while bfd
objdump can be slow on large binaries, it is possible to use
--objdump=llvm-objdump to get the speed benefits. (It appears
LLVM-based demangling is very simple, should we want that.)
Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
correctly detected using feature_check, and thus was not tested.
Committer notes:
Added the name and a __maybe_unused to address:
1 13.50 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-22) (GCC)
util/srcline.c: In function 'dso__free_a2l':
util/srcline.c:184:20: error: parameter name omitted
void dso__free_a2l(struct dso *)
^~~~~~~~~~~~
make[3]: *** [/git/perf-6.11.0-rc3/tools/build/Makefile.build:158: util] Error 2
Signed-off-by: Steinar H. Gunderson <sesse@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240803152008.2818485-1-sesse@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-03 08:20:06 -07:00
#include <linux/compiler.h>
2013-09-10 22:09:28 -07:00
#include <linux/kernel.h>
2019-06-26 08:13:13 -07:00
#include <linux/string.h>
2019-07-04 07:32:27 -07:00
#include <linux/zalloc.h>
2013-09-10 22:09:28 -07:00

2023-04-03 11:40:31 -07:00
#include <api/io.h>

2013-09-10 22:09:30 -07:00
#include "util/dso.h"
2013-09-10 22:09:28 -07:00
#include "util/debug.h"
2017-03-25 13:34:26 -07:00
#include "util/callchain.h"
2019-06-28 02:23:03 -07:00
#include "util/symbol_conf.h"
2024-08-03 08:20:06 -07:00
#ifdef HAVE_LIBLLVM_SUPPORT
#include "util/llvm-c-helpers.h"
#endif
2017-04-17 12:30:49 -07:00
#include "srcline.h"
2017-10-30 19:06:54 -07:00
#include "string2.h"
2014-11-12 19:05:27 -07:00
#include "symbol.h"
2021-09-16 05:09:39 -07:00
#include "subcmd/run-command.h"
2014-11-12 19:05:27 -07:00

2023-06-07 23:18:12 -07:00
/* If addr2line doesn't return data for 1 second then timeout. */
int addr2line_timeout_ms = 1 * 1000;
2015-08-07 15:24:05 -07:00
bool srcline_full_filename;

2023-06-12 07:10:46 -07:00
char *srcline__unknown = (char *)"??:0";

2024-05-04 14:38:01 -07:00
static const char *srcline_dso_name(struct dso *dso)
2017-03-25 13:34:25 -07:00
{
	const char *dso_name;

2024-05-04 14:38:01 -07:00
	if (dso__symsrc_filename(dso))
		dso_name = dso__symsrc_filename(dso);
2017-03-25 13:34:25 -07:00
	else
2024-05-04 14:38:01 -07:00
		dso_name = dso__long_name(dso);
2017-03-25 13:34:25 -07:00

	if (dso_name[0] == '[')
		return NULL;

2024-06-22 23:48:49 -07:00
	if (is_perf_pid_map_name(dso_name))
2017-03-25 13:34:25 -07:00
		return NULL;

	return dso_name;
}

2017-10-09 13:32:58 -07:00
static int inline_list__append(struct symbol *symbol, char *srcline,
			       struct inline_node *node)
2017-03-25 13:34:26 -07:00
{
	struct inline_list *ilist;

	ilist = zalloc(sizeof(*ilist));
	if (ilist == NULL)
		return -1;

2017-10-09 13:32:57 -07:00
	ilist->symbol = symbol;
2017-10-09 13:32:58 -07:00
	ilist->srcline = srcline;
2017-03-25 13:34:26 -07:00

2017-05-23 23:21:27 -07:00
	if (callchain_param.order == ORDER_CALLEE)
		list_add_tail(&ilist->list, &node->val);
	else
		list_add(&ilist->list, &node->val);
2017-03-25 13:34:26 -07:00

	return 0;
}

2017-10-09 13:32:58 -07:00
/* basename version that takes a const input string */
static const char *gnu_basename(const char *path)
{
	const char *base = strrchr(path, '/');

	return base ? base + 1 : path;
}

static char *srcline_from_fileline(const char *file, unsigned int line)
{
	char *srcline;

	if (!file)
		return NULL;

	if (!srcline_full_filename)
		file = gnu_basename(file);

	if (asprintf(&srcline, "%s:%u", file, line) < 0)
		return NULL;

	return srcline;
}

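The two helpers above can be tried outside perf with a small standalone sketch. The names `basename_const()` and `format_srcline()` are hypothetical. The global `srcline_full_filename` is replaced by a parameter, and `asprintf()` is swapped for `snprintf()` plus `malloc()` so the example does not depend on GNU extensions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Same idea as gnu_basename(): never modifies its argument. */
static const char *basename_const(const char *path)
{
	const char *base;

	assert(path != NULL);

	base = strrchr(path, '/');
	return base ? base + 1 : path;
}

/*
 * Hypothetical stand-in for srcline_from_fileline(); the caller frees
 * the result. full_filename plays the role of srcline_full_filename.
 */
static char *format_srcline(const char *file, unsigned int line,
			    bool full_filename)
{
	char *srcline;
	int len;

	if (!file)
		return NULL;

	if (!full_filename)
		file = basename_const(file);

	/* First pass measures the string, second pass writes it. */
	len = snprintf(NULL, 0, "%s:%u", file, line);
	if (len < 0)
		return NULL;

	srcline = malloc((size_t)len + 1);
	if (srcline)
		snprintf(srcline, (size_t)len + 1, "%s:%u", file, line);

	return srcline;
}
```

Calling `format_srcline("/usr/include/c++/8.2.1/complex", 610, false)` yields `complex:610`, and passing `true` keeps the full path.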
2017-10-30 19:06:54 -07:00
static struct symbol *new_inline_sym(struct dso *dso,
				     struct symbol *base_sym,
				     const char *funcname)
{
	struct symbol *inline_sym;
	char *demangled = NULL;

perf report: Don't crash on invalid inline debug information
When the function name for an inline frame is invalid, we must not try
to demangle this symbol, otherwise we crash with:
#0 0x0000555555895c01 in bfd_demangle ()
#1 0x0000555555823262 in demangle_sym (dso=0x555555d92b90, elf_name=0x0, kmodule=0) at util/symbol-elf.c:215
#2 dso__demangle_sym (dso=dso@entry=0x555555d92b90, kmodule=<optimized out>, kmodule@entry=0, elf_name=elf_name@entry=0x0) at util/symbol-elf.c:400
#3 0x00005555557fef4b in new_inline_sym (funcname=0x0, base_sym=0x555555d92b90, dso=0x555555d92b90) at util/srcline.c:89
#4 inline_list__append_dso_a2l (dso=dso@entry=0x555555c7bb00, node=node@entry=0x555555e31810, sym=sym@entry=0x555555d92b90) at util/srcline.c:264
#5 0x00005555557ff27f in addr2line (dso_name=dso_name@entry=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf", addr=addr@entry=2888, file=file@entry=0x0,
line=line@entry=0x0, dso=dso@entry=0x555555c7bb00, unwind_inlines=unwind_inlines@entry=true, node=0x555555e31810, sym=0x555555d92b90) at util/srcline.c:313
#6 0x00005555557ffe7c in addr2inlines (sym=0x555555d92b90, dso=0x555555c7bb00, addr=2888, dso_name=0x555555d92430 "/home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf")
at util/srcline.c:358
So instead handle the case where we get invalid function names for
inlined frames and use a fallback '??' function name instead.
While this crash was originally reported by Hadrien for rust code, I can
now also reproduce it with trivial C++ code. Indeed, it seems like
libbfd fails to interpret the debug information for the inline frame
symbol name:
$ addr2line -e /home/milian/.debug/.build-id/f7/186d14bb94f3c6161c010926da66033d24fce5/elf -if b48
main
/usr/include/c++/8.2.1/complex:610
??
/usr/include/c++/8.2.1/complex:618
??
/usr/include/c++/8.2.1/complex:675
??
/usr/include/c++/8.2.1/complex:685
main
/home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39
I've reported this bug upstream and also attached a patch there which
should fix this issue:
https://sourceware.org/bugzilla/show_bug.cgi?id=23715
Reported-by: Hadrien Grasland <grasland@lal.in2p3.fr>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: a64489c56c30 ("perf report: Find the inline stack for a given address")
[ The above 'Fixes:' cset is where originally the problem was
introduced, i.e. using a2l->funcname without checking if it is NULL,
but this current patch fixes the current codebase, i.e. multiple csets
were applied after a64489c56c30 before the problem was reported by Hadrien ]
Link: http://lkml.kernel.org/r/20180926135207.30263-3-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-09-26 06:52:07 -07:00
	if (!funcname)
		funcname = "??";

2017-10-30 19:06:54 -07:00
	if (dso) {
		demangled = dso__demangle_sym(dso, 0, funcname);
		if (demangled)
			funcname = demangled;
	}

	if (base_sym && strcmp(funcname, base_sym->name) == 0) {
		/* reuse the real, existing symbol */
		inline_sym = base_sym;
		/* ensure that we don't alias an inlined symbol, which could
		 * lead to double frees in inline_node__delete
		 */
		assert(!base_sym->inlined);
	} else {
		/* create a fake symbol for the inline frame */
		inline_sym = symbol__new(base_sym ? base_sym->start : 0,
2019-02-19 06:05:31 -07:00
					 base_sym ? (base_sym->end - base_sym->start) : 0,
2017-10-30 19:06:54 -07:00
					 base_sym ? base_sym->binding : 0,
2018-04-26 07:09:10 -07:00
					 base_sym ? base_sym->type : 0,
2017-10-30 19:06:54 -07:00
					 funcname);
		if (inline_sym)
			inline_sym->inlined = 1;
	}

	free(demangled);

	return inline_sym;
}

2021-09-16 05:09:39 -07:00
#define MAX_INLINE_NEST 1024

2024-08-03 08:20:06 -07:00
#ifdef HAVE_LIBLLVM_SUPPORT

static void free_llvm_inline_frames(struct llvm_a2l_frame *inline_frames,
				    int num_frames)
{
	if (inline_frames != NULL) {
		for (int i = 0; i < num_frames; ++i) {
			zfree(&inline_frames[i].filename);
			zfree(&inline_frames[i].funcname);
		}
		zfree(&inline_frames);
	}
}

static int addr2line(const char *dso_name, u64 addr,
		     char **file, unsigned int *line, struct dso *dso,
		     bool unwind_inlines, struct inline_node *node,
		     struct symbol *sym)
{
	struct llvm_a2l_frame *inline_frames = NULL;
	int num_frames = llvm_addr2line(dso_name, addr, file, line,
					node && unwind_inlines, &inline_frames);

	if (num_frames == 0 || !inline_frames) {
		/* Error, or we didn't want inlines. */
		return num_frames;
	}

	for (int i = 0; i < num_frames; ++i) {
		struct symbol *inline_sym =
			new_inline_sym(dso, sym, inline_frames[i].funcname);
		char *srcline = NULL;

		if (inline_frames[i].filename) {
			srcline =
				srcline_from_fileline(inline_frames[i].filename,
						      inline_frames[i].line);
		}
		if (inline_list__append(inline_sym, srcline, node) != 0) {
			free_llvm_inline_frames(inline_frames, num_frames);
			return 0;
		}
	}
	free_llvm_inline_frames(inline_frames, num_frames);

	return num_frames;
}

void dso__free_a2l(struct dso *dso __maybe_unused)
{
	/* Nothing to free. */
}

#elif defined(HAVE_LIBBFD_SUPPORT)
2013-09-10 22:09:32 -07:00

/*
 * Implement addr2line using libbfd.
 */
#define PACKAGE "perf"
#include <bfd.h>

struct a2l_data {
	const char *input;
2014-12-15 23:19:06 -07:00
	u64 addr;
2013-09-10 22:09:32 -07:00

	bool found;
	const char *filename;
	const char *funcname;
	unsigned line;

	bfd *abfd;
	asymbol **syms;
};

static int bfd_error(const char *string)
{
	const char *errmsg;

	errmsg = bfd_errmsg(bfd_get_error());
	fflush(stdout);

	if (string)
		pr_debug("%s: %s\n", string, errmsg);
	else
		pr_debug("%s\n", errmsg);

	return -1;
}

static int slurp_symtab(bfd *abfd, struct a2l_data *a2l)
{
	long storage;
	long symcount;
	asymbol **syms;
	bfd_boolean dynamic = FALSE;

	if ((bfd_get_file_flags(abfd) & HAS_SYMS) == 0)
		return bfd_error(bfd_get_filename(abfd));

	storage = bfd_get_symtab_upper_bound(abfd);
	if (storage == 0L) {
		storage = bfd_get_dynamic_symtab_upper_bound(abfd);
		dynamic = TRUE;
	}
	if (storage < 0L)
		return bfd_error(bfd_get_filename(abfd));

	syms = malloc(storage);
	if (dynamic)
		symcount = bfd_canonicalize_dynamic_symtab(abfd, syms);
	else
		symcount = bfd_canonicalize_symtab(abfd, syms);

	if (symcount < 0) {
		free(syms);
		return bfd_error(bfd_get_filename(abfd));
	}

	a2l->syms = syms;
	return 0;
}

static void find_address_in_section(bfd *abfd, asection *section, void *data)
{
	bfd_vma pc, vma;
	bfd_size_type size;
	struct a2l_data *a2l = data;
2020-01-28 08:29:38 -07:00
	flagword flags;
2013-09-10 22:09:32 -07:00

	if (a2l->found)
		return;

2020-01-28 08:29:38 -07:00
#ifdef bfd_get_section_flags
	flags = bfd_get_section_flags(abfd, section);
#else
	flags = bfd_section_flags(section);
#endif
	if ((flags & SEC_ALLOC) == 0)
2013-09-10 22:09:32 -07:00
		return;

	pc = a2l->addr;
2020-01-28 08:29:38 -07:00
#ifdef bfd_get_section_vma
2013-09-10 22:09:32 -07:00
	vma = bfd_get_section_vma(abfd, section);
2020-01-28 08:29:38 -07:00
#else
	vma = bfd_section_vma(section);
#endif
#ifdef bfd_get_section_size
2013-09-10 22:09:32 -07:00
	size = bfd_get_section_size(section);
2020-01-28 08:29:38 -07:00
#else
	size = bfd_section_size(section);
#endif
2013-09-10 22:09:32 -07:00

	if (pc < vma || pc >= vma + size)
		return;

	a2l->found = bfd_find_nearest_line(abfd, section, a2l->syms, pc - vma,
					   &a2l->filename, &a2l->funcname,
					   &a2l->line);
2017-08-06 14:24:45 -07:00

	if (a2l->filename && !strlen(a2l->filename))
		a2l->filename = NULL;
2013-09-10 22:09:32 -07:00
}

static struct a2l_data *addr2line_init(const char *path)
{
	bfd *abfd;
	struct a2l_data *a2l = NULL;

	abfd = bfd_openr(path, NULL);
	if (abfd == NULL)
		return NULL;

	if (!bfd_check_format(abfd, bfd_object))
		goto out;

	a2l = zalloc(sizeof(*a2l));
	if (a2l == NULL)
		goto out;

	a2l->abfd = abfd;
	a2l->input = strdup(path);
	if (a2l->input == NULL)
		goto out;

	if (slurp_symtab(abfd, a2l))
		goto out;

	return a2l;

out:
	if (a2l) {
2014-01-09 07:07:59 -07:00
		zfree((char **)&a2l->input);
2013-09-10 22:09:32 -07:00
		free(a2l);
	}
	bfd_close(abfd);
	return NULL;
}

static void addr2line_cleanup(struct a2l_data *a2l)
{
	if (a2l->abfd)
		bfd_close(a2l->abfd);
2014-01-09 07:07:59 -07:00
	zfree((char **)&a2l->input);
2013-12-27 12:55:14 -07:00
	zfree(&a2l->syms);
2013-09-10 22:09:32 -07:00
	free(a2l);
}

2017-05-23 23:21:28 -07:00
static int inline_list__append_dso_a2l(struct dso *dso,
2017-10-09 13:32:57 -07:00
				       struct inline_node *node,
				       struct symbol *sym)
2017-05-23 23:21:28 -07:00
{
2024-07-03 18:17:45 -07:00
	struct a2l_data *a2l = dso__a2l(dso);
2017-10-09 13:32:57 -07:00
	struct symbol *inline_sym = new_inline_sym(dso, sym, a2l->funcname);
2017-10-09 13:32:58 -07:00
	char *srcline = NULL;
2017-05-23 23:21:28 -07:00

2017-10-09 13:32:58 -07:00
	if (a2l->filename)
		srcline = srcline_from_fileline(a2l->filename, a2l->line);

	return inline_list__append(inline_sym, srcline, node);
2017-05-23 23:21:28 -07:00
}

2014-12-15 23:19:06 -07:00
static int addr2line(const char *dso_name, u64 addr,
2015-09-01 11:47:19 -07:00
		     char **file, unsigned int *line, struct dso *dso,
2017-10-09 13:32:57 -07:00
		     bool unwind_inlines, struct inline_node *node,
		     struct symbol *sym)
2013-09-10 22:09:32 -07:00
{
	int ret = 0;
2024-07-03 18:17:45 -07:00
	struct a2l_data *a2l = dso__a2l(dso);
2013-12-03 00:23:07 -07:00

	if (!a2l) {
2024-07-03 18:17:45 -07:00
		a2l = addr2line_init(dso_name);
		dso__set_a2l(dso, a2l);
2013-12-03 00:23:07 -07:00
	}
2013-09-10 22:09:32 -07:00

	if (a2l == NULL) {
2019-06-28 02:23:03 -07:00
		if (!symbol_conf.disable_add2line_warn)
			pr_warning("addr2line_init failed for %s\n", dso_name);
2013-09-10 22:09:32 -07:00
		return 0;
	}

	a2l->addr = addr;
2013-12-03 00:23:07 -07:00
	a2l->found = false;

2013-09-10 22:09:32 -07:00
	bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);

2017-05-23 23:21:24 -07:00
	if (!a2l->found)
		return 0;

	if (unwind_inlines) {
2015-09-01 11:47:19 -07:00
		int cnt = 0;

2017-10-09 13:32:57 -07:00
		if (node && inline_list__append_dso_a2l(dso, node, sym))
2017-05-23 23:21:28 -07:00
			return 0;

2015-09-01 11:47:19 -07:00
		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
					     &a2l->funcname, &a2l->line) &&
2017-03-25 13:34:26 -07:00
		       cnt++ < MAX_INLINE_NEST) {

2017-08-06 14:24:45 -07:00
			if (a2l->filename && !strlen(a2l->filename))
				a2l->filename = NULL;

2017-03-25 13:34:26 -07:00
			if (node != NULL) {
2017-10-09 13:32:57 -07:00
				if (inline_list__append_dso_a2l(dso, node, sym))
2017-03-25 13:34:26 -07:00
					return 0;
2017-05-23 23:21:24 -07:00
				// found at least one inline frame
				ret = 1;
2017-03-25 13:34:26 -07:00
			}
		}
2015-09-01 11:47:19 -07:00
	}

2017-05-23 23:21:24 -07:00
	if (file) {
		*file = a2l->filename ? strdup(a2l->filename) : NULL;
		ret = *file ? 1 : 0;
2013-09-10 22:09:32 -07:00
	}

2017-05-23 23:21:24 -07:00
	if (line)
		*line = a2l->line;

2013-09-10 22:09:32 -07:00
	return ret;
}

2013-12-03 00:23:07 -07:00
void dso__free_a2l(struct dso *dso)
{
2024-07-03 18:17:45 -07:00
	struct a2l_data *a2l = dso__a2l(dso);
2013-12-03 00:23:07 -07:00

	if (!a2l)
		return;

	addr2line_cleanup(a2l);

2024-07-03 18:17:45 -07:00
	dso__set_a2l(dso, NULL);
2013-12-03 00:23:07 -07:00
}

2013-09-10 22:09:32 -07:00
|
|
|
#else /* HAVE_LIBBFD_SUPPORT */
|
|
|
|
|
2017-03-25 13:34:25 -07:00
|
|
|
static int filename_split(char *filename, unsigned int *line_nr)
|
|
|
|
{
|
|
|
|
char *sep;
|
|
|
|
|
|
|
|
sep = strchr(filename, '\n');
|
|
|
|
if (sep)
|
|
|
|
*sep = '\0';
|
|
|
|
|
|
|
|
if (!strcmp(filename, "??:0"))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
sep = strchr(filename, ':');
|
|
|
|
if (sep) {
|
|
|
|
*sep++ = '\0';
|
|
|
|
*line_nr = strtoul(sep, NULL, 0);
|
|
|
|
return 1;
|
|
|
|
}
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("addr2line missing ':' in filename split\n");
|
2017-03-25 13:34:25 -07:00
|
|
|
return 0;
|
|
|
|
}
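The splitting logic of filename_split() can be tried outside perf. What follows is a minimal standalone sketch, not the perf code itself: the name split_file_line is hypothetical, the pr_debug() call is dropped, and it assumes the input is one "file:line" line as printed by addr2line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical standalone variant of filename_split(): cuts "file:line\n"
 * in place and stores the line number in *line_nr.
 * Returns 1 on success, 0 when there is no usable file:line pair.
 */
static int split_file_line(char *s, unsigned int *line_nr)
{
	char *sep;

	assert(s != NULL && line_nr != NULL);

	/* Drop the trailing newline, if any. */
	sep = strchr(s, '\n');
	if (sep)
		*sep = '\0';

	/* addr2line prints "??:0" when it has no line information. */
	if (!strcmp(s, "??:0"))
		return 0;

	sep = strchr(s, ':');
	if (!sep)
		return 0;

	/* Terminate the file name and parse what follows the colon. */
	*sep++ = '\0';
	*line_nr = (unsigned int)strtoul(sep, NULL, 0);
	return 1;
}
```

Because the string is modified in place, the caller has to pass a writable buffer rather than a string literal.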
|
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
static void addr2line_subprocess_cleanup(struct child_process *a2l)
|
2013-09-10 22:09:28 -07:00
|
|
|
{
|
2023-04-03 11:40:31 -07:00
|
|
|
if (a2l->pid != -1) {
|
|
|
|
kill(a2l->pid, SIGKILL);
|
|
|
|
finish_command(a2l); /* ignore result, we don't care */
|
|
|
|
a2l->pid = -1;
|
2024-01-31 17:15:02 -07:00
|
|
|
close(a2l->in);
|
|
|
|
close(a2l->out);
|
2021-09-16 05:09:39 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
free(a2l);
|
|
|
|
}
|
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
static struct child_process *addr2line_subprocess_init(const char *addr2line_path,
|
2023-03-28 16:55:43 -07:00
|
|
|
const char *binary_path)
|
2021-09-16 05:09:39 -07:00
|
|
|
{
|
2023-03-28 16:55:43 -07:00
|
|
|
const char *argv[] = {
|
|
|
|
addr2line_path ?: "addr2line",
|
|
|
|
"-e", binary_path,
|
2023-06-12 20:48:17 -07:00
|
|
|
"-a", "-i", "-f", NULL
|
2023-03-28 16:55:43 -07:00
|
|
|
};
|
2023-04-03 11:40:31 -07:00
|
|
|
struct child_process *a2l = zalloc(sizeof(*a2l));
|
2021-09-16 05:09:39 -07:00
|
|
|
int start_command_status = 0;
|
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
if (a2l == NULL) {
|
|
|
|
pr_err("Failed to allocate memory for addr2line\n");
|
|
|
|
return NULL;
|
|
|
|
}
|
2021-09-16 05:09:39 -07:00
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
a2l->pid = -1;
|
|
|
|
a2l->in = -1;
|
|
|
|
a2l->out = -1;
|
|
|
|
a2l->no_stderr = 1;
|
2021-09-16 05:09:39 -07:00
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
a2l->argv = argv;
|
|
|
|
start_command_status = start_command(a2l);
|
|
|
|
a2l->argv = NULL; /* it's not used after start_command; avoid dangling pointers */
|
2021-09-16 05:09:39 -07:00
|
|
|
|
|
|
|
if (start_command_status != 0) {
|
2023-03-28 16:55:43 -07:00
|
|
|
pr_warning("could not start addr2line (%s) for %s: start_command return code %d\n",
|
|
|
|
addr2line_path ?: "addr2line", binary_path, start_command_status);
|
2023-04-03 11:40:31 -07:00
|
|
|
addr2line_subprocess_cleanup(a2l);
|
|
|
|
return NULL;
|
2013-09-10 22:09:28 -07:00
|
|
|
}
|
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
return a2l;
|
2013-09-10 22:09:28 -07:00
|
|
|
}
|
2013-12-03 00:23:07 -07:00
|
|
|
|
2023-04-03 11:40:32 -07:00
|
|
|
enum a2l_style {
|
|
|
|
BROKEN,
|
|
|
|
GNU_BINUTILS,
|
|
|
|
LLVM,
|
|
|
|
};
|
|
|
|
|
2023-06-12 20:48:16 -07:00
|
|
|
static enum a2l_style addr2line_configure(struct child_process *a2l, const char *dso_name)
|
2023-04-03 11:40:32 -07:00
|
|
|
{
|
|
|
|
static bool cached;
|
|
|
|
static enum a2l_style style;
|
|
|
|
|
|
|
|
if (!cached) {
|
|
|
|
char buf[128];
|
|
|
|
struct io io;
|
|
|
|
int ch;
|
2023-06-12 20:48:16 -07:00
|
|
|
int lines;
|
2023-04-03 11:40:32 -07:00
|
|
|
|
|
|
|
if (write(a2l->in, ",\n", 2) != 2)
|
|
|
|
return BROKEN;
|
|
|
|
|
|
|
|
io__init(&io, a2l->out, buf, sizeof(buf));
|
|
|
|
ch = io__get_char(&io);
|
|
|
|
if (ch == ',') {
|
|
|
|
style = LLVM;
|
|
|
|
cached = true;
|
2023-06-12 20:48:16 -07:00
|
|
|
lines = 1;
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("Detected LLVM addr2line style\n");
|
2023-06-12 20:48:17 -07:00
|
|
|
} else if (ch == '0') {
|
2023-04-03 11:40:32 -07:00
|
|
|
style = GNU_BINUTILS;
|
|
|
|
cached = true;
|
2023-06-12 20:48:17 -07:00
|
|
|
lines = 3;
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("Detected binutils addr2line style\n");
|
2023-04-03 11:40:32 -07:00
|
|
|
} else {
|
2023-06-12 20:48:16 -07:00
|
|
|
if (!symbol_conf.disable_add2line_warn) {
|
|
|
|
char *output = NULL;
|
|
|
|
size_t output_len;
|
|
|
|
|
|
|
|
io__getline(&io, &output, &output_len);
|
|
|
|
pr_warning("%s %s: addr2line configuration failed\n",
|
|
|
|
__func__, dso_name);
|
|
|
|
pr_warning("\t%c%s", ch, output);
|
|
|
|
}
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("Unknown/broken addr2line style\n");
|
2023-06-12 20:48:16 -07:00
|
|
|
return BROKEN;
|
2023-04-03 11:40:32 -07:00
|
|
|
}
|
2023-06-12 20:48:16 -07:00
|
|
|
while (lines) {
|
2023-04-03 11:40:32 -07:00
|
|
|
ch = io__get_char(&io);
|
2023-06-12 20:48:16 -07:00
|
|
|
if (ch <= 0)
|
|
|
|
break;
|
|
|
|
if (ch == '\n')
|
|
|
|
lines--;
|
2023-04-03 11:40:32 -07:00
|
|
|
}
|
2023-04-03 11:40:33 -07:00
|
|
|
/* Ignore SIGPIPE in the event addr2line exits. */
|
|
|
|
signal(SIGPIPE, SIG_IGN);
|
2023-04-03 11:40:32 -07:00
|
|
|
}
|
|
|
|
return style;
|
|
|
|
}
|
|
|
|
|
2023-04-03 11:40:31 -07:00
|
|
|
static int read_addr2line_record(struct io *io,
|
2023-04-03 11:40:32 -07:00
|
|
|
enum a2l_style style,
|
2023-06-14 19:50:41 -07:00
|
|
|
const char *dso_name,
|
|
|
|
u64 addr,
|
|
|
|
bool first,
|
2021-09-16 05:09:39 -07:00
|
|
|
char **function,
|
|
|
|
char **filename,
|
|
|
|
unsigned int *line_nr)
|
2013-12-03 00:23:07 -07:00
|
|
|
{
|
2021-09-16 05:09:39 -07:00
|
|
|
/*
|
|
|
|
* Returns:
|
|
|
|
* -1 ==> error
|
|
|
|
* 0 ==> sentinel (or other ill-formed) record read
|
|
|
|
* 1 ==> a genuine record read
|
|
|
|
*/
|
|
|
|
char *line = NULL;
|
|
|
|
size_t line_len = 0;
|
|
|
|
unsigned int dummy_line_nr = 0;
|
|
|
|
int ret = -1;
|
|
|
|
|
|
|
|
if (function != NULL)
|
|
|
|
zfree(function);
|
|
|
|
|
|
|
|
if (filename != NULL)
|
|
|
|
zfree(filename);
|
|
|
|
|
|
|
|
if (line_nr != NULL)
|
|
|
|
*line_nr = 0;
|
|
|
|
|
2023-06-12 20:48:17 -07:00
|
|
|
/*
|
2023-06-14 19:50:41 -07:00
|
|
|
* Read the first line. Without an error this will be:
|
|
|
|
* - for the first line an address like 0x1234,
|
|
|
|
* - the binutils sentinel 0x0000000000000000,
|
|
|
|
* - the llvm-addr2line sentinel ',' character,
|
|
|
|
* - the function name line for an inlined function.
|
2023-06-12 20:48:17 -07:00
|
|
|
*/
|
2023-04-03 11:40:31 -07:00
|
|
|
if (io__getline(io, &line, &line_len) < 0 || !line_len)
|
2021-09-16 05:09:39 -07:00
|
|
|
goto error;
|
2023-04-03 11:40:32 -07:00
|
|
|
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("%s %s: addr2line read address for sentinel: %s", __func__, dso_name, line);
|
|
|
|
if (style == LLVM && line_len == 2 && line[0] == ',') {
|
|
|
|
/* Found the llvm-addr2line sentinel character. */
|
|
|
|
zfree(&line);
|
|
|
|
return 0;
|
|
|
|
} else if (style == GNU_BINUTILS && (!first || addr != 0)) {
|
2023-06-12 20:48:17 -07:00
|
|
|
int zero_count = 0, non_zero_count = 0;
|
2023-06-14 19:50:41 -07:00
|
|
|
/*
|
|
|
|
* Check for binutils sentinel ignoring it for the case the
|
|
|
|
* requested address is 0.
|
|
|
|
*/
|
2023-06-12 20:48:17 -07:00
|
|
|
|
2023-06-14 19:50:41 -07:00
|
|
|
/* A given address should always start 0x. */
|
|
|
|
if (line_len >= 2 && line[0] == '0' && line[1] == 'x') {
|
|
|
|
for (size_t i = 2; i < line_len; i++) {
|
|
|
|
if (line[i] == '0')
|
|
|
|
zero_count++;
|
|
|
|
else if (line[i] != '\n')
|
|
|
|
non_zero_count++;
|
|
|
|
}
|
|
|
|
if (!non_zero_count) {
|
|
|
|
int ch;
|
|
|
|
|
|
|
|
if (first && !zero_count) {
|
|
|
|
/* Line was erroneous: just '0x'. */
|
|
|
|
goto error;
|
|
|
|
}
|
|
|
|
/*
|
|
|
|
* Line was 0x0..0, the sentinel for binutils. Remove
|
|
|
|
* the function and filename lines.
|
|
|
|
*/
|
|
|
|
zfree(&line);
|
|
|
|
do {
|
|
|
|
ch = io__get_char(io);
|
|
|
|
} while (ch > 0 && ch != '\n');
|
|
|
|
do {
|
|
|
|
ch = io__get_char(io);
|
|
|
|
} while (ch > 0 && ch != '\n');
|
|
|
|
return 0;
|
2023-06-12 20:48:17 -07:00
|
|
|
}
|
|
|
|
}
|
2023-04-03 11:40:32 -07:00
|
|
|
}
|
2023-06-14 19:50:41 -07:00
|
|
|
/* Read the second function name line (if inline data then this is the first line). */
|
|
|
|
if (first && (io__getline(io, &line, &line_len) < 0 || !line_len))
|
2023-06-12 20:48:17 -07:00
|
|
|
goto error;
|
|
|
|
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("%s %s: addr2line read line: %s", __func__, dso_name, line);
|
2021-09-16 05:09:39 -07:00
|
|
|
if (function != NULL)
|
|
|
|
*function = strdup(strim(line));
|
|
|
|
|
|
|
|
zfree(&line);
|
|
|
|
line_len = 0;
|
|
|
|
|
2023-06-12 20:48:17 -07:00
|
|
|
/* Read the third filename and line number line. */
|
2023-04-03 11:40:31 -07:00
|
|
|
if (io__getline(io, &line, &line_len) < 0 || !line_len)
|
2021-09-16 05:09:39 -07:00
|
|
|
goto error;
|
|
|
|
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_debug("%s %s: addr2line filename:number : %s", __func__, dso_name, line);
|
2023-04-03 11:40:32 -07:00
|
|
|
if (filename_split(line, line_nr == NULL ? &dummy_line_nr : line_nr) == 0 &&
|
|
|
|
style == GNU_BINUTILS) {
|
2021-09-16 05:09:39 -07:00
|
|
|
ret = 0;
|
|
|
|
goto error;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (filename != NULL)
|
|
|
|
*filename = strdup(line);
|
|
|
|
|
|
|
|
zfree(&line);
|
|
|
|
line_len = 0;
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
|
|
|
|
error:
|
|
|
|
free(line);
|
|
|
|
if (function != NULL)
|
|
|
|
zfree(function);
|
|
|
|
if (filename != NULL)
|
|
|
|
zfree(filename);
|
|
|
|
return ret;
|
2013-12-03 00:23:07 -07:00
|
|
|
}
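The GNU binutils sentinel test in read_addr2line_record() amounts to recognising a line of the form "0x0...0". A minimal standalone sketch of that check is given here under stated assumptions: the helper name is_zero_address_line is hypothetical, and the sketch treats a bare "0x" as not matching instead of reporting an error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true if `line` is "0x" followed by one or more
 * '0' characters and an optional trailing newline.
 */
static bool is_zero_address_line(const char *line)
{
	size_t i, zeros = 0;

	assert(line != NULL);

	/* An address line is expected to start with "0x". */
	if (strncmp(line, "0x", 2) != 0)
		return false;

	for (i = 2; line[i] != '\0' && line[i] != '\n'; i++) {
		if (line[i] != '0')
			return false;
		zeros++;
	}

	/* A bare "0x" carries no digits and is not a sentinel. */
	return zeros > 0;
}
```

Requiring the "0x" prefix first means a function name that merely ends in zeros cannot be mistaken for the sentinel.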
|
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
static int inline_list__append_record(struct dso *dso,
|
|
|
|
struct inline_node *node,
|
|
|
|
struct symbol *sym,
|
|
|
|
const char *function,
|
|
|
|
const char *filename,
|
|
|
|
unsigned int line_nr)
|
2017-03-25 13:34:26 -07:00
|
|
|
{
|
2021-09-16 05:09:39 -07:00
|
|
|
struct symbol *inline_sym = new_inline_sym(dso, sym, function);
|
2017-03-25 13:34:26 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
return inline_list__append(inline_sym, srcline_from_fileline(filename, line_nr), node);
|
|
|
|
}
|
2017-03-25 13:34:26 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
static int addr2line(const char *dso_name, u64 addr,
|
|
|
|
char **file, unsigned int *line_nr,
|
|
|
|
struct dso *dso,
|
|
|
|
bool unwind_inlines,
|
|
|
|
struct inline_node *node,
|
|
|
|
struct symbol *sym __maybe_unused)
|
|
|
|
{
|
2024-05-04 14:38:01 -07:00
|
|
|
struct child_process *a2l = dso__a2l(dso);
|
2021-09-16 05:09:39 -07:00
|
|
|
char *record_function = NULL;
|
|
|
|
char *record_filename = NULL;
|
|
|
|
unsigned int record_line_nr = 0;
|
|
|
|
int record_status = -1;
|
|
|
|
int ret = 0;
|
|
|
|
size_t inline_count = 0;
|
2023-04-03 11:40:31 -07:00
|
|
|
int len;
|
|
|
|
char buf[128];
|
|
|
|
ssize_t written;
|
2023-06-07 23:18:12 -07:00
|
|
|
struct io io = { .eof = false };
|
2023-04-03 11:40:32 -07:00
|
|
|
enum a2l_style a2l_style;
|
2021-09-16 05:09:39 -07:00
|
|
|
|
|
|
|
if (!a2l) {
|
2022-12-15 12:28:12 -07:00
|
|
|
if (!filename__has_section(dso_name, ".debug_line"))
|
|
|
|
goto out;
|
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso__set_a2l(dso,
|
|
|
|
addr2line_subprocess_init(symbol_conf.addr2line_path, dso_name));
|
|
|
|
a2l = dso__a2l(dso);
|
2017-03-25 13:34:26 -07:00
|
|
|
}
|
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
if (a2l == NULL) {
|
|
|
|
if (!symbol_conf.disable_add2line_warn)
|
|
|
|
pr_warning("%s %s: addr2line_subprocess_init failed\n", __func__, dso_name);
|
2017-03-25 13:34:26 -07:00
|
|
|
goto out;
|
|
|
|
}
|
2023-06-12 20:48:16 -07:00
|
|
|
a2l_style = addr2line_configure(a2l, dso_name);
|
|
|
|
if (a2l_style == BROKEN)
|
2023-04-03 11:40:32 -07:00
|
|
|
goto out;
|
2017-03-25 13:34:26 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
/*
|
2023-06-14 19:50:41 -07:00
|
|
|
* Send our request and then *deliberately* send something that can't be
|
|
|
|
* interpreted as a valid address to ask addr2line about (namely,
|
|
|
|
* ","). This causes addr2line to first write out the answer to our
|
|
|
|
* request, in an unbounded/unknown number of records, and then to write
|
|
|
|
* out the lines "0x0...0", "??" and "??:0", for GNU binutils, or ","
|
|
|
|
* for llvm-addr2line, so that we can detect when it has finished giving
|
|
|
|
* us anything useful.
|
2021-09-16 05:09:39 -07:00
|
|
|
*/
|
2023-04-03 11:40:31 -07:00
|
|
|
len = snprintf(buf, sizeof(buf), "%016"PRIx64"\n,\n", addr);
|
|
|
|
written = len > 0 ? write(a2l->in, buf, len) : -1;
|
|
|
|
if (written != len) {
|
2022-12-15 12:28:13 -07:00
|
|
|
if (!symbol_conf.disable_add2line_warn)
|
|
|
|
pr_warning("%s %s: could not send request\n", __func__, dso_name);
|
2021-09-16 05:09:39 -07:00
|
|
|
goto out;
|
|
|
|
}
|
2023-04-03 11:40:31 -07:00
|
|
|
io__init(&io, a2l->out, buf, sizeof(buf));
|
2023-06-07 23:18:12 -07:00
|
|
|
io.timeout_ms = addr2line_timeout_ms;
|
2023-06-14 19:50:41 -07:00
|
|
|
switch (read_addr2line_record(&io, a2l_style, dso_name, addr, /*first=*/true,
|
2023-04-03 11:40:32 -07:00
|
|
|
&record_function, &record_filename, &record_line_nr)) {
|
2021-09-16 05:09:39 -07:00
|
|
|
case -1:
|
2022-12-15 12:28:13 -07:00
|
|
|
if (!symbol_conf.disable_add2line_warn)
|
|
|
|
pr_warning("%s %s: could not read first record\n", __func__, dso_name);
|
2021-09-16 05:09:39 -07:00
|
|
|
goto out;
|
|
|
|
case 0:
|
|
|
|
/*
|
2023-06-12 20:48:17 -07:00
|
|
|
* The first record was invalid, so return failure, but first
|
|
|
|
* read another record, since we sent a sentinel ',' for the
|
2023-06-14 19:50:41 -07:00
|
|
|
* sake of detecting the last inlined function. Treat this as the
|
|
|
|
* first of a record as the ',' generates a new start with GNU
|
|
|
|
* binutils, also force a non-zero address as we're no longer
|
|
|
|
* reading that record.
|
2021-09-16 05:09:39 -07:00
|
|
|
*/
|
2023-06-14 19:50:41 -07:00
|
|
|
switch (read_addr2line_record(&io, a2l_style, dso_name,
|
|
|
|
/*addr=*/1, /*first=*/true,
|
|
|
|
NULL, NULL, NULL)) {
|
2021-09-16 05:09:39 -07:00
|
|
|
case -1:
|
2022-12-15 12:28:13 -07:00
|
|
|
if (!symbol_conf.disable_add2line_warn)
|
2023-06-14 19:50:41 -07:00
|
|
|
pr_warning("%s %s: could not read sentinel record\n",
|
2022-12-15 12:28:13 -07:00
|
|
|
__func__, dso_name);
|
2021-09-16 05:09:39 -07:00
|
|
|
break;
|
|
|
|
case 0:
|
2023-06-14 19:50:41 -07:00
|
|
|
/* The sentinel as expected. */
|
2021-09-16 05:09:39 -07:00
|
|
|
break;
|
|
|
|
default:
|
2022-12-15 12:28:13 -07:00
|
|
|
if (!symbol_conf.disable_add2line_warn)
|
|
|
|
pr_warning("%s %s: unexpected record instead of sentinel\n",
|
|
|
|
__func__, dso_name);
|
2021-09-16 05:09:39 -07:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
goto out;
|
|
|
|
default:
|
2023-06-14 19:50:41 -07:00
|
|
|
/* First record as expected. */
|
2021-09-16 05:09:39 -07:00
|
|
|
break;
|
|
|
|
}
|
2017-10-30 19:06:54 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
if (file) {
|
|
|
|
*file = strdup(record_filename);
|
|
|
|
ret = 1;
|
|
|
|
}
|
|
|
|
if (line_nr)
|
|
|
|
*line_nr = record_line_nr;
|
2017-10-09 13:32:57 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
if (unwind_inlines) {
|
|
|
|
if (node && inline_list__append_record(dso, node, sym,
|
|
|
|
record_function,
|
|
|
|
record_filename,
|
|
|
|
record_line_nr)) {
|
|
|
|
ret = 0;
|
2017-03-25 13:34:26 -07:00
|
|
|
goto out;
|
2021-09-16 05:09:39 -07:00
|
|
|
}
|
|
|
|
}
|
2017-03-25 13:34:26 -07:00
|
|
|
|
2023-06-14 19:50:41 -07:00
|
|
|
/*
|
|
|
|
* We have to read the records even if we don't care about the inline
|
|
|
|
* info. This isn't the first record, so force the address to non-zero
|
|
|
|
* as we're reading records beyond the first.
|
|
|
|
*/
|
2023-04-03 11:40:31 -07:00
|
|
|
while ((record_status = read_addr2line_record(&io,
|
2023-04-03 11:40:32 -07:00
|
|
|
a2l_style,
|
2023-06-14 19:50:41 -07:00
|
|
|
dso_name,
|
|
|
|
/*addr=*/1,
|
|
|
|
/*first=*/false,
|
2021-09-16 05:09:39 -07:00
|
|
|
&record_function,
|
|
|
|
&record_filename,
|
|
|
|
&record_line_nr)) == 1) {
|
|
|
|
if (unwind_inlines && node && inline_count++ < MAX_INLINE_NEST) {
|
|
|
|
if (inline_list__append_record(dso, node, sym,
|
|
|
|
record_function,
|
|
|
|
record_filename,
|
|
|
|
record_line_nr)) {
|
|
|
|
ret = 0;
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
ret = 1; /* found at least one inline frame */
|
2017-10-30 19:06:54 -07:00
|
|
|
}
|
2017-03-25 13:34:26 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
2021-09-16 05:09:39 -07:00
|
|
|
free(record_function);
|
|
|
|
free(record_filename);
|
2023-06-07 23:18:12 -07:00
|
|
|
if (io.eof) {
|
2024-05-04 14:38:01 -07:00
|
|
|
dso__set_a2l(dso, NULL);
|
2023-06-07 23:18:12 -07:00
|
|
|
addr2line_subprocess_cleanup(a2l);
|
|
|
|
}
|
2021-09-16 05:09:39 -07:00
|
|
|
return ret;
|
|
|
|
}
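The request written to the addr2line subprocess is the address as 16 hex digits, a newline, and then the "," line that provokes the sentinel record. A small sketch of building that request follows; the function name format_a2l_request is hypothetical, and the truncation check is an addition for illustration only.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: formats "<16 hex digits>\n,\n" into buf.
 * Returns the number of bytes to write, or -1 if buf is too small.
 */
static int format_a2l_request(char *buf, size_t size, uint64_t addr)
{
	int len;

	assert(buf != NULL && size > 0);

	len = snprintf(buf, size, "%016" PRIx64 "\n,\n", addr);
	if (len < 0 || (size_t)len >= size)
		return -1;	/* encoding error or truncated output */

	return len;
}
```

The returned length is what a caller would compare against the result of write(), in the same way the function above compares written with len.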
|
2017-03-25 13:34:26 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
void dso__free_a2l(struct dso *dso)
|
|
|
|
{
|
2024-05-04 14:38:01 -07:00
|
|
|
struct child_process *a2l = dso__a2l(dso);
|
2021-09-16 05:09:39 -07:00
|
|
|
|
|
|
|
if (!a2l)
|
|
|
|
return;
|
|
|
|
|
|
|
|
addr2line_subprocess_cleanup(a2l);
|
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso__set_a2l(dso, NULL);
|
2017-03-25 13:34:26 -07:00
|
|
|
}
|
|
|
|
|
2013-09-10 22:09:32 -07:00
|
|
|
#endif /* HAVE_LIBBFD_SUPPORT */
|
2013-09-10 22:09:28 -07:00
|
|
|
|
2021-09-16 05:09:39 -07:00
|
|
|
static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
|
|
|
|
struct dso *dso, struct symbol *sym)
|
|
|
|
{
|
|
|
|
struct inline_node *node;
|
|
|
|
|
|
|
|
node = zalloc(sizeof(*node));
|
|
|
|
if (node == NULL) {
|
|
|
|
perror("not enough memory for the inline node");
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
|
|
|
INIT_LIST_HEAD(&node->val);
|
|
|
|
node->addr = addr;
|
|
|
|
|
|
|
|
addr2line(dso_name, addr, NULL, NULL, dso, true, node, sym);
|
|
|
|
return node;
|
|
|
|
}
|
|
|
|
|
2013-12-03 00:23:10 -07:00
|
|
|
/*
|
|
|
|
* Number of addr2line failures (without success) before disabling it for that
|
|
|
|
* dso.
|
|
|
|
*/
|
|
|
|
#define A2L_FAIL_LIMIT 123
|
|
|
|
|
2015-09-01 11:47:19 -07:00
|
|
|
char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
|
2017-12-29 09:26:52 -07:00
|
|
|
bool show_sym, bool show_addr, bool unwind_inlines,
|
|
|
|
u64 ip)
|
2013-09-10 22:09:28 -07:00
|
|
|
{
|
2013-10-09 20:51:31 -07:00
|
|
|
char *file = NULL;
|
|
|
|
unsigned line = 0;
|
2013-09-10 22:09:31 -07:00
|
|
|
char *srcline;
|
2013-12-10 11:19:23 -07:00
|
|
|
const char *dso_name;
|
2013-09-10 22:09:28 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
if (!dso__has_srcline(dso))
|
2014-11-12 19:05:24 -07:00
|
|
|
goto out;
|
2013-09-10 22:09:31 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso_name = srcline_dso_name(dso);
|
2017-03-25 13:34:25 -07:00
|
|
|
if (dso_name == NULL)
|
2024-05-04 14:38:01 -07:00
|
|
|
goto out_err;
|
2013-09-10 22:09:29 -07:00
|
|
|
|
2017-10-09 13:32:57 -07:00
|
|
|
if (!addr2line(dso_name, addr, &file, &line, dso,
|
|
|
|
unwind_inlines, NULL, sym))
|
2024-05-04 14:38:01 -07:00
|
|
|
goto out_err;
|
2013-09-10 22:09:28 -07:00
|
|
|
|
2017-10-09 13:32:58 -07:00
|
|
|
srcline = srcline_from_fileline(file, line);
|
|
|
|
free(file);
|
|
|
|
|
|
|
|
if (!srcline)
|
2024-05-04 14:38:01 -07:00
|
|
|
goto out_err;
|
2013-12-03 00:23:10 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso__set_a2l_fails(dso, 0);
|
2013-09-10 22:09:28 -07:00
|
|
|
|
|
|
|
return srcline;
|
2013-09-10 22:09:31 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
out_err:
|
|
|
|
dso__set_a2l_fails(dso, dso__a2l_fails(dso) + 1);
|
|
|
|
if (dso__a2l_fails(dso) > A2L_FAIL_LIMIT) {
|
|
|
|
dso__set_has_srcline(dso, false);
|
2013-12-03 00:23:10 -07:00
|
|
|
dso__free_a2l(dso);
|
|
|
|
}
|
2024-05-04 14:38:01 -07:00
|
|
|
out:
|
2017-03-18 14:49:28 -07:00
|
|
|
if (!show_addr)
|
|
|
|
return (show_sym && sym) ?
|
2022-12-15 12:28:09 -07:00
|
|
|
strndup(sym->name, sym->namelen) : SRCLINE_UNKNOWN;
|
2017-03-18 14:49:28 -07:00
|
|
|
|
2014-11-12 19:05:27 -07:00
|
|
|
if (sym) {
|
2014-12-15 23:19:06 -07:00
|
|
|
if (asprintf(&srcline, "%s+%" PRIu64, show_sym ? sym->name : "",
|
2017-12-29 09:26:52 -07:00
|
|
|
ip - sym->start) < 0)
|
2014-11-12 19:05:27 -07:00
|
|
|
return SRCLINE_UNKNOWN;
|
2024-05-04 14:38:01 -07:00
|
|
|
} else if (asprintf(&srcline, "%s[%" PRIx64 "]", dso__short_name(dso), addr) < 0)
|
2014-11-12 19:05:24 -07:00
|
|
|
return SRCLINE_UNKNOWN;
|
|
|
|
return srcline;
|
2013-09-10 22:09:28 -07:00
|
|
|
}
|
|
|
|
|
2018-12-03 17:18:48 -07:00
|
|
|
/* Returns filename and fills in line number in line */
|
|
|
|
char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line)
|
|
|
|
{
|
|
|
|
char *file = NULL;
|
|
|
|
const char *dso_name;
|
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
if (!dso__has_srcline(dso))
|
|
|
|
return NULL;
|
2018-12-03 17:18:48 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso_name = srcline_dso_name(dso);
|
2018-12-03 17:18:48 -07:00
|
|
|
if (dso_name == NULL)
|
2024-05-04 14:38:01 -07:00
|
|
|
goto out_err;
|
2018-12-03 17:18:48 -07:00
|
|
|
|
|
|
|
if (!addr2line(dso_name, addr, &file, line, dso, true, NULL, NULL))
|
2024-05-04 14:38:01 -07:00
|
|
|
goto out_err;
|
2018-12-03 17:18:48 -07:00
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso__set_a2l_fails(dso, 0);
|
2018-12-03 17:18:48 -07:00
|
|
|
return file;
|
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
out_err:
|
|
|
|
dso__set_a2l_fails(dso, dso__a2l_fails(dso) + 1);
|
|
|
|
if (dso__a2l_fails(dso) > A2L_FAIL_LIMIT) {
|
|
|
|
dso__set_has_srcline(dso, false);
|
2018-12-03 17:18:48 -07:00
|
|
|
dso__free_a2l(dso);
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2023-06-08 16:28:22 -07:00
|
|
|
void zfree_srcline(char **srcline)
|
2013-09-10 22:09:28 -07:00
|
|
|
{
|
2023-06-08 16:28:22 -07:00
|
|
|
if (*srcline == NULL)
|
|
|
|
return;
|
|
|
|
|
2023-06-12 07:10:46 -07:00
|
|
|
if (*srcline != SRCLINE_UNKNOWN)
|
2023-06-08 16:28:22 -07:00
|
|
|
free(*srcline);
|
|
|
|
|
|
|
|
*srcline = NULL;
|
2013-09-10 22:09:28 -07:00
|
|
|
}
|
2015-09-01 11:47:19 -07:00
|
|
|
|
|
|
|
char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
|
2017-12-29 09:26:52 -07:00
|
|
|
bool show_sym, bool show_addr, u64 ip)
|
2015-09-01 11:47:19 -07:00
|
|
|
{
|
2017-12-29 09:26:52 -07:00
|
|
|
return __get_srcline(dso, addr, sym, show_sym, show_addr, false, ip);
|
2015-09-01 11:47:19 -07:00
|
|
|
}
|
2017-03-25 13:34:26 -07:00
|
|
|
|
perf report: Cache srclines for callchain nodes
On one hand this ensures that the memory is properly freed when the DSO
gets freed. On the other hand this significantly speeds up the
processing of the callchain nodes when lots of srclines are requested.
For one of my data files e.g.:
Before:
Performance counter stats for 'perf report -s srcline -g srcline --stdio':
52496.495043 task-clock (msec) # 0.999 CPUs utilized
634 context-switches # 0.012 K/sec
2 cpu-migrations # 0.000 K/sec
191,561 page-faults # 0.004 M/sec
165,074,498,235 cycles # 3.144 GHz
334,170,832,408 instructions # 2.02 insn per cycle
90,220,029,745 branches # 1718.591 M/sec
654,525,177 branch-misses # 0.73% of all branches
52.533273822 seconds time elapsed
Processed 236605 events and lost 40 chunks!
After:
Performance counter stats for 'perf report -s srcline -g srcline --stdio':
22606.323706 task-clock (msec) # 1.000 CPUs utilized
31 context-switches # 0.001 K/sec
0 cpu-migrations # 0.000 K/sec
185,471 page-faults # 0.008 M/sec
71,188,113,681 cycles # 3.149 GHz
133,204,943,083 instructions # 1.87 insn per cycle
34,886,384,979 branches # 1543.214 M/sec
278,214,495 branch-misses # 0.80% of all branches
22.609857253 seconds time elapsed
Note that the difference is only this large when `--inline` is not
passed. In such situations, we would use the inliner cache and thus do
not run this code path that often.
I think that this cache should actually be used in other places, too.
When looking at the valgrind leak report for perf report, we see tons of
srclines being leaked, most notably from calls to
hist_entry__get_srcline. The problem is that get_srcline has many
different formatting options (show_sym, show_addr, potentially even
unwind_inlines when calling __get_srcline directly). As such, the
srcline cannot easily be cached for all calls, or we'd have to add
caches for all formatting combinations (6 so far). An alternative would
be to remove the formatting options and handle that on a different level
- i.e. print the sym/addr on demand wherever we actually output
something. And the unwind_inlines could be moved into a separate
function that does not return the srcline.
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171019113836.5548-4-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-10-19 04:38:34 -07:00
|
|
|
struct srcline_node {
|
|
|
|
u64 addr;
|
|
|
|
char *srcline;
|
|
|
|
struct rb_node rb_node;
|
|
|
|
};
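To illustrate the address-keyed cache idea described in the commit message, here is a minimal sketch that swaps perf's rb_root_cached for a plain unbalanced binary search tree, so that it compiles on its own. All names (cache_node, cache_insert, cache_find, cache_free) are hypothetical, and unlike the perf tree this sketch copies the string and ignores a second insert for the same address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct cache_node {
	uint64_t addr;
	char *srcline;
	struct cache_node *left, *right;
};

/* Insert a copy of srcline keyed by addr. Returns 0 on success, -1 on OOM. */
static int cache_insert(struct cache_node **root, uint64_t addr,
			const char *srcline)
{
	struct cache_node **p = root;
	struct cache_node *node;
	size_t len;

	assert(root != NULL && srcline != NULL);

	while (*p != NULL) {
		if (addr < (*p)->addr)
			p = &(*p)->left;
		else if (addr > (*p)->addr)
			p = &(*p)->right;
		else
			return 0;	/* already cached, keep the old entry */
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return -1;

	len = strlen(srcline) + 1;
	node->srcline = malloc(len);
	if (!node->srcline) {
		free(node);
		return -1;
	}
	memcpy(node->srcline, srcline, len);
	node->addr = addr;
	*p = node;
	return 0;
}

/* Return the cached srcline for addr, or NULL if it is not cached. */
static const char *cache_find(const struct cache_node *root, uint64_t addr)
{
	while (root) {
		if (addr < root->addr)
			root = root->left;
		else if (addr > root->addr)
			root = root->right;
		else
			return root->srcline;
	}
	return NULL;
}

/* Free every node together with the string it owns. */
static void cache_free(struct cache_node *root)
{
	if (!root)
		return;
	cache_free(root->left);
	cache_free(root->right);
	free(root->srcline);
	free(root);
}
```

Owning the strings inside the tree is what lets a single cache_free() release everything at once, which mirrors the commit message's point about memory being freed together with the DSO.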
|
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
void srcline__tree_insert(struct rb_root_cached *tree, u64 addr, char *srcline)
|
2017-10-19 04:38:34 -07:00
|
|
|
{
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node **p = &tree->rb_root.rb_node;
|
2017-10-19 04:38:34 -07:00
|
|
|
struct rb_node *parent = NULL;
|
|
|
|
struct srcline_node *i, *node;
|
2018-12-06 12:18:15 -07:00
|
|
|
bool leftmost = true;
|
2017-10-19 04:38:34 -07:00
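The caching idea in the commit message above can be sketched outside of perf as follows. This is a minimal illustration only: it uses a plain unbalanced binary search tree rather than the rbtree used in the real code, and every `demo_` name, including the `demo_resolve` stand-in for the slow srcline lookup, is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache node: a plain BST keyed by address (perf uses an rbtree). */
struct demo_node {
	uint64_t addr;
	char *srcline;                  /* string owned by the node */
	struct demo_node *left, *right;
};

static int demo_resolve_calls;      /* counts how often the slow path runs */

/* Hypothetical slow resolver standing in for an addr2line-style lookup. */
static char *demo_resolve(uint64_t addr)
{
	char buf[64];
	char *copy;
	size_t len;

	demo_resolve_calls++;
	snprintf(buf, sizeof(buf), "file.c:%llu",
		 (unsigned long long)(addr & 0xff));
	len = strlen(buf) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, buf, len);
	return copy;
}

/* Return the cached srcline for addr, resolving and caching it on a miss. */
static const char *demo_cache_get(struct demo_node **root, uint64_t addr)
{
	struct demo_node **p = root;
	struct demo_node *node;

	assert(root != NULL);
	while (*p) {
		if (addr < (*p)->addr)
			p = &(*p)->left;
		else if (addr > (*p)->addr)
			p = &(*p)->right;
		else
			return (*p)->srcline;   /* cache hit: no resolver call */
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return NULL;
	node->addr = addr;
	node->srcline = demo_resolve(addr);
	if (!node->srcline) {
		free(node);
		return NULL;
	}
	*p = node;                          /* link the new leaf into the tree */
	return node->srcline;
}

/* Free every node together with the string it owns. */
static void demo_cache_delete(struct demo_node **root)
{
	struct demo_node *n = *root;

	if (!n)
		return;
	demo_cache_delete(&n->left);
	demo_cache_delete(&n->right);
	free(n->srcline);
	free(n);
	*root = NULL;
}
```

Because the tree owns the strings, deleting the tree releases every cached srcline at once, which mirrors the "memory is properly freed when the DSO gets freed" point made in the message.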
|
|
|
|
|
|
|
node = zalloc(sizeof(struct srcline_node));
|
|
|
|
if (!node) {
|
|
|
|
perror("not enough memory for the srcline node");
|
|
|
|
return;
|
|
|
|
}
|
|
|
|
|
|
|
|
node->addr = addr;
|
|
|
|
node->srcline = srcline;
|
|
|
|
|
|
|
|
while (*p != NULL) {
|
|
|
|
parent = *p;
|
|
|
|
i = rb_entry(parent, struct srcline_node, rb_node);
|
|
|
|
if (addr < i->addr)
|
|
|
|
p = &(*p)->rb_left;
|
2018-12-06 12:18:15 -07:00
|
|
|
else {
|
2017-10-19 04:38:34 -07:00
|
|
|
p = &(*p)->rb_right;
|
2018-12-06 12:18:15 -07:00
|
|
|
leftmost = false;
|
|
|
|
}
|
2017-10-19 04:38:34 -07:00
|
|
|
}
|
|
|
|
rb_link_node(&node->rb_node, parent, p);
|
2018-12-06 12:18:15 -07:00
|
|
|
rb_insert_color_cached(&node->rb_node, tree, leftmost);
|
2017-10-19 04:38:34 -07:00
|
|
|
}
|
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
char *srcline__tree_find(struct rb_root_cached *tree, u64 addr)
|
2017-10-19 04:38:34 -07:00
|
|
|
{
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node *n = tree->rb_root.rb_node;
|
2017-10-19 04:38:34 -07:00
|
|
|
|
|
|
|
while (n) {
|
|
|
|
struct srcline_node *i = rb_entry(n, struct srcline_node,
|
|
|
|
rb_node);
|
|
|
|
|
|
|
|
if (addr < i->addr)
|
|
|
|
n = n->rb_left;
|
|
|
|
else if (addr > i->addr)
|
|
|
|
n = n->rb_right;
|
|
|
|
else
|
|
|
|
return i->srcline;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
void srcline__tree_delete(struct rb_root_cached *tree)
|
2017-10-19 04:38:34 -07:00
|
|
|
{
|
|
|
|
struct srcline_node *pos;
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node *next = rb_first_cached(tree);
|
2017-10-19 04:38:34 -07:00
|
|
|
|
|
|
|
while (next) {
|
|
|
|
pos = rb_entry(next, struct srcline_node, rb_node);
|
|
|
|
next = rb_next(&pos->rb_node);
|
2018-12-06 12:18:15 -07:00
|
|
|
rb_erase_cached(&pos->rb_node, tree);
|
2023-06-08 16:28:22 -07:00
|
|
|
zfree_srcline(&pos->srcline);
|
2017-10-19 04:38:34 -07:00
|
|
|
zfree(&pos);
|
|
|
|
}
|
|
|
|
}
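The delete loop above reads the successor with `rb_next()` before the current node is erased and freed. The same ordering can be shown on a simpler structure. The following is a sketch on a singly linked list, with hypothetical `demo_` names that do not exist in perf.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list element, standing in for a tree node. */
struct demo_item {
	int value;
	struct demo_item *next;
};

/* Push a new element at the head; returns the new head (old head on OOM). */
static struct demo_item *demo_push(struct demo_item *head, int value)
{
	struct demo_item *it = malloc(sizeof(*it));

	if (!it)
		return head;
	it->value = value;
	it->next = head;
	return it;
}

/* Free every element; returns how many were freed. */
static int demo_delete_all(struct demo_item **head)
{
	struct demo_item *pos, *next;
	int freed = 0;

	assert(head != NULL);
	next = *head;
	while (next) {
		pos = next;
		next = pos->next;   /* fetch the successor before pos is freed */
		free(pos);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Reading `pos->next` after `free(pos)` would be a use-after-free, which is why the successor is saved first in both this sketch and the loop above.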
|
|
|
|
|
2017-10-09 13:32:57 -07:00
|
|
|
struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr,
|
|
|
|
struct symbol *sym)
|
2017-03-25 13:34:26 -07:00
|
|
|
{
|
|
|
|
const char *dso_name;
|
|
|
|
|
2024-05-04 14:38:01 -07:00
|
|
|
dso_name = srcline_dso_name(dso);
|
2017-03-25 13:34:26 -07:00
|
|
|
if (dso_name == NULL)
|
|
|
|
return NULL;
|
|
|
|
|
2017-10-09 13:32:57 -07:00
|
|
|
return addr2inlines(dso_name, addr, dso, sym);
|
2017-03-25 13:34:26 -07:00
|
|
|
}
|
|
|
|
|
|
|
|
void inline_node__delete(struct inline_node *node)
|
|
|
|
{
|
|
|
|
struct inline_list *ilist, *tmp;
|
|
|
|
|
|
|
|
list_for_each_entry_safe(ilist, tmp, &node->val, list) {
|
|
|
|
list_del_init(&ilist->list);
|
2023-06-08 16:28:22 -07:00
|
|
|
zfree_srcline(&ilist->srcline);
|
2017-10-09 13:32:57 -07:00
|
|
|
/* only the inlined symbols are owned by the list */
|
|
|
|
if (ilist->symbol && ilist->symbol->inlined)
|
|
|
|
symbol__delete(ilist->symbol);
|
2017-03-25 13:34:26 -07:00
|
|
|
free(ilist);
|
|
|
|
}
|
|
|
|
|
|
|
|
free(node);
|
|
|
|
}
|
2017-10-09 13:32:59 -07:00
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
void inlines__tree_insert(struct rb_root_cached *tree,
|
|
|
|
struct inline_node *inlines)
|
2017-10-09 13:32:59 -07:00
|
|
|
{
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node **p = &tree->rb_root.rb_node;
|
2017-10-09 13:32:59 -07:00
|
|
|
struct rb_node *parent = NULL;
|
|
|
|
const u64 addr = inlines->addr;
|
|
|
|
struct inline_node *i;
|
2018-12-06 12:18:15 -07:00
|
|
|
bool leftmost = true;
|
2017-10-09 13:32:59 -07:00
|
|
|
|
|
|
|
while (*p != NULL) {
|
|
|
|
parent = *p;
|
|
|
|
i = rb_entry(parent, struct inline_node, rb_node);
|
|
|
|
if (addr < i->addr)
|
|
|
|
p = &(*p)->rb_left;
|
2018-12-06 12:18:15 -07:00
|
|
|
else {
|
2017-10-09 13:32:59 -07:00
|
|
|
p = &(*p)->rb_right;
|
2018-12-06 12:18:15 -07:00
|
|
|
leftmost = false;
|
|
|
|
}
|
2017-10-09 13:32:59 -07:00
|
|
|
}
|
|
|
|
rb_link_node(&inlines->rb_node, parent, p);
|
2018-12-06 12:18:15 -07:00
|
|
|
rb_insert_color_cached(&inlines->rb_node, tree, leftmost);
|
2017-10-09 13:32:59 -07:00
|
|
|
}
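The `leftmost` flag in the insert function above starts as true and is cleared as soon as the descent takes a right branch, so it is still true only when the new node ends up as the smallest key. A minimal sketch of that bookkeeping, assuming a plain unbalanced binary search tree with a cached leftmost pointer instead of `struct rb_root_cached` (all `demo_` names are hypothetical):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_tnode {
	uint64_t addr;
	struct demo_tnode *left, *right;
};

/* Hypothetical tree root that caches its smallest node. */
struct demo_tree {
	struct demo_tnode *root;
	struct demo_tnode *leftmost;
};

/* Insert addr; equal keys go to the right, as in the code above. */
static bool demo_tree_insert(struct demo_tree *tree, uint64_t addr)
{
	struct demo_tnode **p;
	struct demo_tnode *node;
	bool leftmost = true;

	assert(tree != NULL);
	p = &tree->root;
	while (*p) {
		if (addr < (*p)->addr) {
			p = &(*p)->left;
		} else {
			p = &(*p)->right;
			leftmost = false;   /* went right once: not the minimum */
		}
	}

	node = calloc(1, sizeof(*node));
	if (!node)
		return false;
	node->addr = addr;
	*p = node;
	if (leftmost)
		tree->leftmost = node;      /* update the cached minimum */
	return true;
}

/* Recursively free a subtree. */
static void demo_subtree_free(struct demo_tnode *n)
{
	if (!n)
		return;
	demo_subtree_free(n->left);
	demo_subtree_free(n->right);
	free(n);
}

/* Free the whole tree and reset the cached pointers. */
static void demo_tree_free(struct demo_tree *tree)
{
	demo_subtree_free(tree->root);
	tree->root = NULL;
	tree->leftmost = NULL;
}
```

With the minimum cached, finding the first node for an in-order walk needs no descent from the root, which is the benefit `rb_first_cached()` provides in the delete functions.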
|
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
struct inline_node *inlines__tree_find(struct rb_root_cached *tree, u64 addr)
|
2017-10-09 13:32:59 -07:00
|
|
|
{
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node *n = tree->rb_root.rb_node;
|
2017-10-09 13:32:59 -07:00
|
|
|
|
|
|
|
while (n) {
|
|
|
|
struct inline_node *i = rb_entry(n, struct inline_node,
|
|
|
|
rb_node);
|
|
|
|
|
|
|
|
if (addr < i->addr)
|
|
|
|
n = n->rb_left;
|
|
|
|
else if (addr > i->addr)
|
|
|
|
n = n->rb_right;
|
|
|
|
else
|
|
|
|
return i;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2018-12-06 12:18:15 -07:00
|
|
|
void inlines__tree_delete(struct rb_root_cached *tree)
|
2017-10-09 13:32:59 -07:00
|
|
|
{
|
|
|
|
struct inline_node *pos;
|
2018-12-06 12:18:15 -07:00
|
|
|
struct rb_node *next = rb_first_cached(tree);
|
2017-10-09 13:32:59 -07:00
|
|
|
|
|
|
|
while (next) {
|
|
|
|
pos = rb_entry(next, struct inline_node, rb_node);
|
|
|
|
next = rb_next(&pos->rb_node);
|
2018-12-06 12:18:15 -07:00
|
|
|
rb_erase_cached(&pos->rb_node, tree);
|
2017-10-09 13:32:59 -07:00
|
|
|
inline_node__delete(pos);
|
|
|
|
}
|
|
|
|
}
|