linux/kernel
Steven Rostedt d01343244a ring-buffer: Fix typo of time extends per page
Time stamps in the ring buffer are stored as the difference between
two events. Each page of the ring buffer holds a full 64 bit timestamp.
Each event has a 27 bit delta stamp from the last event. The unit of time
is nanoseconds, so 27 bits can hold ~134 milliseconds. If two events
happen more than 134 milliseconds apart, a time extend is inserted
to add more bits for the delta. The time extend has 59 bits, which
is good for ~18 years.
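
The two ranges quoted above follow from the bit widths alone. A minimal sketch of that arithmetic, assuming nanosecond units as stated; the helper name `max_delta_ns` is hypothetical and does not come from the kernel source:

```python
# Largest delta (in nanoseconds) that fits in a field of the given width.
# max_delta_ns is a hypothetical helper used only for illustration.
def max_delta_ns(bits):
    return (1 << bits) - 1


NS_PER_MS = 1_000_000
NS_PER_YEAR = 365.25 * 24 * 3600 * 1_000_000_000

delta_ms = max_delta_ns(27) / NS_PER_MS        # 27 bit per-event delta
extend_years = max_delta_ns(59) / NS_PER_YEAR  # 59 bit time extend

print(f"27 bit delta covers about {delta_ms:.0f} ms")
print(f"59 bit time extend covers about {extend_years:.1f} years")
```

This gives roughly 134 milliseconds for the per-event delta and a little over 18 years for the time extend.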

Currently the time extend is committed separately from the event.
If an event is discarded before it is committed, due to filtering,
the time extend still exists. If all events are being filtered, then
after ~134 milliseconds a new time extend will be added to the buffer.

This can only happen until the end of the page. Since each page holds
a full timestamp, there is no reason to add a time extend to the
beginning of a page. Time extends can only fill a page that has actual
data at the beginning, so there is no fear that time extends will fill
more than a page without any data.

When reading an event, a loop skips over time extends, since they are
only used to maintain the time stamp and are never given to the caller.
As a paranoid check to prevent the loop from running forever, and
knowing that time extends may fill at most a page, the loop counts its
iterations. If the count exceeds the number of time extends that can
fit in a page, a warning is printed and the ring buffer is disabled
(all of ftrace is also disabled with it).
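
The bounded skip loop described above can be modelled roughly as follows. This is a sketch and not the kernel code: the tuple-based event format, the 4096 byte page size (ignoring any page header), and the names `next_data_event` and `RingBufferDisabled` are all assumptions made for illustration.

```python
# Hypothetical model of the reader loop; names and sizes are assumptions.
PAGE_DATA_BYTES = 4096   # assumed page size, ignoring any page header
TIME_EXTEND_BYTES = 8    # size of a time extend, per the commit text
MAX_EXTENDS_PER_PAGE = PAGE_DATA_BYTES // TIME_EXTEND_BYTES


class RingBufferDisabled(Exception):
    """Stands in for printing the warning and disabling the ring buffer."""


def next_data_event(events, limit=MAX_EXTENDS_PER_PAGE):
    """Skip time extends and return (elapsed_ns, payload) of the next data event.

    `events` yields ("extend", delta_ns) or ("data", delta_ns, payload) tuples.
    Returns None if no data event is found.
    """
    skipped = 0
    elapsed = 0
    for ev in events:
        elapsed += ev[1]  # every entry advances the running time stamp
        if ev[0] == "extend":
            skipped += 1
            if skipped > limit:
                # Paranoid check: more extends than a page could ever hold.
                raise RingBufferDisabled("too many time extends in a row")
            continue
        return elapsed, ev[2]
    return None
```

If the limit is computed too low, a perfectly legal run of time extends trips the check, which is the false positive this commit addresses.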

There is another event type that is called a TIMESTAMP which can
hold 64 bits of data in the theoretical case that two events happen
18 years apart. This code has not been implemented, but the name
of this event exists, as well as the structure for it. The
size of a TIMESTAMP is 16 bytes, whereas a time extend is only
8 bytes. The macro used to calculate how many time extends can fit on
a page used the TIMESTAMP size instead of the time extend size,
cutting the amount in half.
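
The effect of dividing by the wrong size can be seen with a short calculation. The 16 and 8 byte sizes come from the text above; the 4096 byte page size is an assumption (the real data area of a page is somewhat smaller because of its header), and the variable names are illustrative rather than the kernel's macro names.

```python
PAGE_DATA_BYTES = 4096   # assumed page size, ignoring any page header
TIMESTAMP_BYTES = 16     # size of a TIMESTAMP, per the commit text
TIME_EXTEND_BYTES = 8    # size of a time extend, per the commit text

# Limit computed with the wrong size (the typo) and with the intended size.
buggy_limit = PAGE_DATA_BYTES // TIMESTAMP_BYTES
fixed_limit = PAGE_DATA_BYTES // TIME_EXTEND_BYTES

print(f"limit with TIMESTAMP size:   {buggy_limit}")
print(f"limit with time extend size: {fixed_limit}")
```

Whatever the exact page size, the limit computed from the 16 byte size is half of the one computed from the 8 byte size.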

The following test case can easily trigger the warning since we only
need to have half the page filled with time extends to trigger the
warning:

 # cd /sys/kernel/debug/tracing/
 # echo function > current_tracer
 # echo 'common_pid < 0' > events/ftrace/function/filter
 # echo > trace
 # echo 1 > trace_marker
 # sleep 120
 # cat trace

This enables the function tracer, sets the filter to only trace
functions where the process id is negative (no events), clears the
trace buffer to ensure that nothing is in it, writes to trace_marker
to add an event to the beginning of a page, sleeps for 2 minutes (only
35 seconds is probably needed, but this guarantees the bug), and
finally reads the trace, which triggers the bug.
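
The 35 second estimate can be checked with a rough calculation. This sketch assumes a 4096 byte page (ignoring its header) and that one time extend is added about every 2^27 nanoseconds while all events are filtered; `seconds_until_limit_exceeded` is a hypothetical helper, not a kernel function.

```python
PAGE_DATA_BYTES = 4096          # assumed page size, ignoring any page header
TIMESTAMP_BYTES = 16            # size mistakenly used by the macro
TIME_EXTEND_BYTES = 8           # size the macro should have used
MAX_DELTA_NS = (1 << 27) - 1    # largest delta that fits in 27 bits


def seconds_until_limit_exceeded(entry_bytes, page_bytes=PAGE_DATA_BYTES):
    """Seconds of fully filtered tracing before the extend count passes the limit."""
    limit = page_bytes // entry_bytes
    # One time extend per MAX_DELTA_NS; the check fires on the (limit + 1)th.
    return (limit + 1) * MAX_DELTA_NS / 1e9


print(f"with the typo:         {seconds_until_limit_exceeded(TIMESTAMP_BYTES):.1f} s")
print(f"with the correct size: {seconds_until_limit_exceeded(TIME_EXTEND_BYTES):.1f} s")
```

Under these assumptions the buggy limit is passed after a bit more than 34 seconds. The figure for the correct size is only the time needed to fill a whole page with time extends; since time extends cannot spill past the page, the corrected check should not fire in this scenario.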

This patch fixes the typo and prevents the false positive of that warning.

Reported-by: Hans J. Koch <hjk@linutronix.de>
Tested-by: Hans J. Koch <hjk@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Stable Kernel <stable@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2010-10-12 12:06:43 -04:00
debug Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip 2010-09-08 11:13:42 -07:00
gcov gcov: fix null-pointer dereference for certain module types 2010-09-09 18:57:23 -07:00
irq
power Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6 2010-09-11 15:50:53 -07:00
time
trace ring-buffer: Fix typo of time extends per page 2010-10-12 12:06:43 -04:00
.gitignore
acct.c
async.c
audit_tree.c
audit_watch.c
audit.c
audit.h
auditfilter.c
auditsc.c
backtracetest.c
bounds.c
capability.c
cgroup_freezer.c
cgroup.c cgroups: fix API thinko 2010-09-09 18:57:23 -07:00
compat.c compat: Make compat_alloc_user_space() incorporate the access_ok() 2010-09-14 16:08:45 -07:00
configs.c
cpu.c
cpuset.c
cred.c
delayacct.c
dma.c
early_res.c
elfcore.c
exec_domain.c
exit.c
extable.c
fork.c rmap: fix walk during fork 2010-09-22 17:22:39 -07:00
freezer.c
futex_compat.c
futex.c
groups.c kernel/groups.c: fix integer overflow in groups_search 2010-09-09 18:57:24 -07:00
hrtimer.c gcc-4.6: kernel/*: Fix unused but set warnings 2010-09-05 14:36:58 +02:00
hung_task.c
hw_breakpoint.c hw breakpoints: Fix pid namespace bug 2010-09-17 04:42:59 +02:00
itimer.c
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kexec.c
kfifo.c kfifo: fix scatterlist usage 2010-10-01 10:50:58 -07:00
kmod.c
kprobes.c
ksysfs.c
kthread.c
latencytop.c
lockdep_internals.h
lockdep_proc.c
lockdep_states.h
lockdep.c
Makefile
module.c modules: Fix module_bug_list list corruption race 2010-10-05 11:29:27 -07:00
mutex-debug.c
mutex-debug.h
mutex.c mutex: Fix annotations to include it in kernel-locking docbook 2010-09-03 08:19:51 +02:00
mutex.h
notifier.c
ns_cgroup.c
nsproxy.c
padata.c
panic.c
params.c
perf_event.c perf: Fix incorrect copy_from_user() usage 2010-10-12 11:45:01 +02:00
pid_namespace.c
pid.c
pm_qos_params.c PM QoS: Correct pr_debug() misuse and improve parameter checks 2010-09-11 00:53:05 +02:00
posix-cpu-timers.c
posix-timers.c
printk.c
profile.c
ptrace.c
range.c
rcupdate.c
rcutiny_plugin.h
rcutiny.c
rcutorture.c
rcutree_plugin.h
rcutree_trace.c
rcutree.c
rcutree.h
relay.c
res_counter.c
resource.c
rtmutex_common.h
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rwsem.c
sched_clock.c
sched_cpupri.c
sched_cpupri.h
sched_debug.c
sched_fair.c sched: Fix nohz balance kick 2010-09-21 13:50:50 +02:00
sched_features.h
sched_idletask.c
sched_rt.c
sched_stats.h
sched.c sched: Fix user time incorrectly accounted as system time on 32-bit 2010-09-15 10:41:36 +02:00
seccomp.c
semaphore.c
signal.c HWPOISON: Copy si_addr_lsb to user 2010-10-07 09:41:25 +02:00
smp.c generic-ipi: Fix deadlock in __smp_call_function_single 2010-09-10 16:48:40 +02:00
softirq.c
spinlock.c
srcu.c
stacktrace.c
stop_machine.c
sys_ni.c
sys.c pid: make setpgid() system call use RCU read-side critical section 2010-08-31 17:00:18 -07:00
sysctl_binary.c
sysctl_check.c
sysctl.c sysctl: fix min/max handling in __do_proc_doulongvec_minmax() 2010-10-07 13:31:21 -07:00
taskstats.c
test_kprobes.c
time.c
timeconst.pl
timer.c
tracepoint.c
tsacct.c
uid16.c
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
wait.c
watchdog.c lockup_detector: Sync touch_*_watchdog back to old semantics 2010-09-01 10:02:28 +02:00
workqueue_sched.h
workqueue.c workqueue: add documentation 2010-09-13 10:26:52 +02:00