linux

Author	SHA1	Message	Date
Wenji Huang	af51309845	tracing: use the more proper parameter Pass tsk to tracing_record_cmdline instead of current. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-17 12:38:23 -05:00
Steven Rostedt	b6887d7916	ftrace: rename _hook to _probe Impact: clean up Ingo Molnar did not like the _hook naming convention used by the select function tracer. Luis Claudio R. Goncalves suggested using the "_probe" extension. This patch implements the change of calling the functions and variables "_hook" and replacing them with "_probe". Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-17 12:32:04 -05:00
Steven Rostedt	6a24a244cd	ftrace: clean up coding style Ingo Molnar pointed out some coding style issues with the recent ftrace updates. This patch cleans them up. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-17 11:20:26 -05:00
Ingo Molnar	73d3fd96e7	ftrace: fix !CONFIG_DYNAMIC_FTRACE ftrace_swapper_pid definition Impact: build fix Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-17 11:50:42 +01:00
Ingo Molnar	f492d3f838	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-17 11:31:01 +01:00
Ingo Molnar	c4e2b432d5	Merge branches 'tracing/hw-branch-tracing' and 'tracing/power-tracer' into tracing/core	2009-02-17 11:29:53 +01:00
Steven Rostedt	e110e3d1ea	ftrace: add pretty print function for traceon and traceoff hooks This patch adds a pretty print version of traceon and traceoff output for set_ftrace_filter. # echo 'sys_open:traceon:4' > set_ftrace_filter # cat set_ftrace_filter #### all functions enabled #### sys_open:traceon:count=4 Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 23:38:13 -05:00
Steven Rostedt	809dcf29ce	ftrace: add pretty print to selected fuction traces This patch adds a call back for the tracers that have hooks to selected functions. This allows the tracer to show better output in the set_ftrace_filter file. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 23:06:01 -05:00
Steven Rostedt	8fc0c701c5	ftrace: show selected functions in set_ftrace_filter This patch adds output to show what functions have tracer hooks attached to them. # echo 'sys_open:traceon:4' > /debug/tracing/set_ftrace_filter # cat set_ftrace_filter #### all functions enabled #### sys_open:ftrace_traceon:0000000000000004 # echo 'do_fork:traceoff:' > set_ftrace_filter # cat set_ftrace_filter #### all functions enabled #### sys_open:ftrace_traceon:0000000000000002 do_fork:ftrace_traceoff:ffffffffffffffff Note the 4 changed to a 2. This is because The code was executed twice since the traceoff was added. If a cat is done again: #### all functions enabled #### sys_open:ftrace_traceon do_fork:ftrace_traceoff:ffffffffffffffff The number disappears. That is because it will not print a NULL. Callbacks to allow the tracer to pretty print will be implemented soon. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 22:50:06 -05:00
Steven Rostedt	23b4ff3aa4	ftrace: add traceon traceoff commands to enable/disable the buffers This patch adds the new function selection commands traceon and traceoff. traceon sets the function to enable the ring buffers while traceoff disables the ring buffers. You can pass in the number of times you want the command to be executed when the function is hit. It will only execute if the state of the buffers are not already in that state. Example: # echo do_fork:traceon:4 Will enable the ring buffers if they are disabled every time it hits do_fork, up to 4 times. # echo sys_close:traceoff This will disable the ring buffers every time (unlimited) when sys_close is called. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 22:50:04 -05:00
Steven Rostedt	988ae9d6b2	ring-buffer: add tracing_is_on to test if ring buffer is enabled This patch adds the tracing_is_on() interface to tell if the ring buffer is turned on or not. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 22:50:01 -05:00
Steven Rostedt	59df055f19	ftrace: trace different functions with a different tracer Impact: new feature Currently, the function tracer only gives you an ability to hook a tracer to all functions being traced. The dynamic function trace allows you to pick and choose which of those functions will be traced, but all functions being traced will call all tracers that registered with the function tracer. This patch adds a new feature that allows a tracer to hook to specific functions, even when all functions are being traced. It allows for different functions to call different tracer hooks. The way this is accomplished is by a special function that will hook to the function tracer and will set up a hash table knowing which tracer hook to call with which function. This is the most general and easiest method to accomplish this. Later, an arch may choose to supply their own method in changing the mcount call of a function to call a different tracer. But that will be an exercise for the future. To register a function: struct ftrace_hook_ops { void (func)(unsigned long ip, unsigned long parent_ip, void data); int (callback)(unsigned long ip, void *data); void (free)(void *data); }; int register_ftrace_function_hook(char glob, struct ftrace_hook_ops ops, void data); glob is a simple glob to search for the functions to hook. ops is a pointer to the operations (listed below) data is the default data to be passed to the hook functions when traced ops: func is the hook function to call when the functions are traced callback is a callback function that is called when setting up the hash. That is, if the tracer needs to do something special for each function, that is being traced, and wants to give each function its own data. The address of the entry data is passed to this callback, so that the callback may wish to update the entry to whatever it would like. free is a callback for when the entry is freed. In case the tracer allocated any data, it is give the chance to free it. To unregister we have three functions: void unregister_ftrace_function_hook(char glob, struct ftrace_hook_ops ops, void data) This will unregister all hooks that match glob, point to ops, and have its data matching data. (note, if glob is NULL, blank or '', all functions will be tested). void unregister_ftrace_function_hook_func(char glob, struct ftrace_hook_ops ops) This will unregister all functions matching glob that has an entry pointing to ops. void unregister_ftrace_function_hook_all(char *glob) This simply unregisters all funcs. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 22:44:09 -05:00
Steven Rostedt	e6ea44e9b4	ftrace: consolidate mutexes Impact: clean up Now that ftrace_lock is a mutex, there is no reason to have three different mutexes protecting similar data. All the mutex paths are not in hot paths, so having a mutex to cover more data is not a problem. This patch removes the ftrace_sysctl_lock and ftrace_start_lock and uses the ftrace_lock to protect the locations that were protected by these locks. By doing so, this change also removes some of the lock nesting that was taking place. There are still more mutexes in ftrace.c that can probably be consolidated, but they can be dealt with later. We need to be careful about the way the locks are nested, and by consolidating, we can cause a recursive deadlock. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 18:15:31 -05:00
Steven Rostedt	52baf11922	ftrace: convert ftrace_lock from a spinlock to mutex Impact: clean up The older versions of ftrace required doing the ftrace list search under atomic context. Now all the calls are in non-atomic context. There is no reason to keep the ftrace_lock as a spinlock. This patch converts it to a mutex. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 17:33:14 -05:00
Steven Rostedt	f6180773d9	ftrace: add command interface for function selection Allow for other tracers to add their own commands for function selection. This interface gives a trace the ability to name a command for function selection. Right now it is pretty limited in what it offers, but this is a building step for more features. The :mod: command is converted to this interface and also serves as a template for other implementations. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 17:06:02 -05:00
Steven Rostedt	e68746a271	ftrace: enable filtering only when a function is filtered on Impact: fix to prevent empty set_ftrace_filter and no ftrace output The function filter is used to only trace a given set of functions. The filter is enabled when a function name is echoed into the set_ftrace_filter file. But if the name has a typo and the function is not found, the filter is enabled, but no function is listed. This makes a confusing situation where set_ftrace_filter is empty but no functions ever get enabled for tracing. For example: # cat /debug/tracing/set_ftrace_filter #### all functions enabled #### # echo bad_name > set_ftrace_filter # cat /debug/tracing/set_ftrace_filter # echo function > current_tracer # cat trace # tracer: nop # # TASK-PID CPU# TIMESTAMP FUNCTION # \| \| \| \| \| This patch changes that to only enable filtering if a function is set to be filtered on. Now, the filter is not enabled if a bad name is echoed into set_ftrace_filter. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 17:03:49 -05:00
Steven Rostedt	64e7c44061	ftrace: add module command function filter selection This patch adds a "command" syntax to the function filtering files: /debugfs/tracing/set_ftrace_filter /debugfs/tracing/set_ftrace_notrace Of the format: <function>:<command>:<parameter> The command is optional, and dependent on the command, so are the parameters. echo do_fork > set_ftrace_filter Will only trace 'do_fork'. echo 'sched_' > set_ftrace_filter Will only trace functions starting with the letters 'sched_'. echo ':mod:ext3' > set_ftrace_filter Will trace only the ext3 module functions. echo 'write:mod:ext3' > set_ftrace_notrace Will prevent the ext3 functions with the letters 'write' in the name from being traced. echo '!*_allocate:mod:ext3' > set_ftrace_filter Will remove the functions in ext3 that end with the letters '_allocate' from the ftrace filter. Although this patch implements the 'command' format, only the 'mod' command is supported. More commands to follow. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 16:55:50 -05:00
Steven Rostedt	9f4801e30a	ftrace: break up ftrace_match_records into smaller components Impact: clean up ftrace_match_records does a lot of things that other features can use. This patch breaks up ftrace_match_records and pulls out ftrace_setup_glob and ftrace_match_record. ftrace_setup_glob prepares a simple glob expression for use with ftrace_match_record. ftrace_match_record compares a single record with a glob type. Breaking this up will allow for more features to run on individual records. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 16:49:57 -05:00
Steven Rostedt	7f24b31b01	ftrace: rename ftrace_match to ftrace_match_records Impact: clean up ftrace_match is too generic of a name. What it really does is search all records and matches the records with the given string, and either sets or unsets the functions to be traced depending on if the parameter 'enable' is set or not. This allows us to make another function called ftrace_match that can be used to test a single record. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 16:33:15 -05:00
Steven Rostedt	265c831cb0	ftrace: add do_for_each_ftrace_rec and while_for_each_ftrace_rec Impact: clean up To iterate over all the functions that dynamic trace knows about it requires two for loops. One to iterate over the pages and the other to iterate over the records within the page. There are several duplications of these loops in ftrace.c. This patch creates the macros do_for_each_ftrace_rec and while_for_each_ftrace_rec to handle this logic, and removes the duplicate code. While making this change, I also discovered and fixed a small bug that one of the iterations should exit the loop after it found the record it was searching for. This used a break when it should have used a goto, since there were two loops it needed to break out from. No real harm was done by this bug since it would only continue to search the other records, and the code was in a slow path anyway. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 16:25:12 -05:00
Steven Rostedt	0c75a3ed63	ftrace: state that all functions are enabled in set_ftrace_filter Impact: clean up, make set_ftrace_filter less confusing The set_ftrace_filter shows only the functions that will be traced. But when it is empty, it will trace all functions. This can be a bit confusing. This patch makes set_ftrace_filter show: #### all functions enabled #### When all functions will be traced, and we do not filter only a select few. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-16 16:21:35 -05:00
Ingo Molnar	72b623c736	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/power-tracer	2009-02-15 20:43:03 +01:00
Rakib Mullick	a234aa9ecd	tracing: fix section mismatch in trace_hw_branches.c The function bts_trace_init() references a variable bts_hotcpu_notifier which is marked as __cpuinitdata. Thus causes section mismatch. This patch fixes it. LD kernel/trace/built-in.o WARNING: kernel/trace/built-in.o(.text+0xc90c): Section mismatch in reference from the function bts_trace_init() to the variable .cpuinit.data:bts_hotcpu_notifier The function bts_trace_init() references the variable __cpuinitdata bts_hotcpu_notifier. This is often because bts_trace_init lacks a __cpuinitdata annotation or the annotation of bts_hotcpu_notifier is wrong. WARNING: kernel/trace/built-in.o(.text+0xc92a): Section mismatch in reference from the function bts_trace_reset() to the variable .cpuinit.data:bts_hotcpu_notifier The function bts_trace_reset() references the variable __cpuinitdata bts_hotcpu_notifier. This is often because bts_trace_reset lacks a __cpuinitdata annotation or the annotation of bts_hotcpu_notifier is wrong. Signed-off-by: Rakib Mullick <rakib.mullick@gmail.com> Cc: markus.t.metzger@gmail.com Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-15 20:41:08 +01:00
Ingo Molnar	9abd603048	Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core Conflicts: kernel/trace/trace_mmiotrace.c	2009-02-15 20:08:55 +01:00
Pekka Paalanen	6bc5c366b1	trace: mmiotrace to the tracer menu in Kconfig Impact: cosmetic change in Kconfig menu layout This patch was originally suggested by Peter Zijlstra, but seems it was forgotten. CONFIG_MMIOTRACE and CONFIG_MMIOTRACE_TEST were selectable directly under the Kernel hacking / debugging menu in the kernel configuration system. They were present only for x86 and x86_64. Other tracers that use the ftrace tracing framework are in their own sub-menu. This patch moves the mmiotrace configuration options there. Since the Kconfig file, where the tracer menu is, is not architecture specific, HAVE_MMIOTRACE_SUPPORT is introduced and provided only by x86/x86_64. CONFIG_MMIOTRACE now depends on it. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-15 20:03:28 +01:00
Pekka Paalanen	391b170f10	mmiotrace: count events lost due to not recording Impact: enhances lost events counting in mmiotrace The tracing framework, or the ring buffer facility it uses, has a switch to stop recording data. When recording is off, the trace events will be lost. The framework does not count these, so mmiotrace has to count them itself. Signed-off-by: Pekka Paalanen <pq@iki.fi> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-15 20:02:42 +01:00
Serge E. Hallyn	fb5ae64fdd	User namespaces: Only put the userns when we unhash the uid uids in namespaces other than init don't get a sysfs entry. For those in the init namespace, while we're waiting to remove the sysfs entry for the uid the uid is still hashed, and alloc_uid() may re-grab that uid without getting a new reference to the user_ns, which we've already put in free_user before scheduling remove_user_sysfs_dir(). Reported-and-tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-13 08:07:40 -08:00
Jason Baron	b5f9fd0f8a	tracing: convert c/p state power tracer to use tracepoints Convert the c/p state "power" tracer to use tracepoints. Avoids a function call when the tracer is disabled. Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-13 09:06:18 -05:00
Peter Zijlstra	3997ad317f	timers: more consistently use clock vs timer While reviewing the manpages, I noticed I'd missed some clock vs timer sites. Make sure that all timer functions call cpu_timer_sample_group() and not cpu_clock_sample_group(). This ensures that we enable the process wide timer in time, and therefore pay the O(n) thread group cost from the syscall. Not doing it here, will result in the first jiffy tick after setting the timer doing this, resulting in a very expensive tick (but only once) and a delay in actually starting the timer. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-13 13:04:05 +01:00
Ingo Molnar	d351c8db95	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-13 10:26:45 +01:00
Ingo Molnar	1c511f740f	Merge branches 'tracing/ftrace', 'tracing/ring-buffer', 'tracing/sysprof', 'tracing/urgent' and 'linus' into tracing/core	2009-02-13 10:25:18 +01:00
Steven Rostedt	45141d4667	ring-buffer: rename label out_unlock to out_reset Impact: clean up While reviewing the ring buffer code, I thougth I saw a bug with if (!__raw_spin_trylock(&cpu_buffer->lock)) goto out_unlock; But I forgot that we use a variable "lock_taken" that is set if the spinlock is taken, and only unlock it if that variable is set. To avoid further confusion from other reviewers, this patch renames the label out_unlock with out_reset, which is the more appropriate name. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-12 13:39:46 -05:00
Ingo Molnar	a0490fa35d	sched: cpu hotplug fix rq_attach_root() does a kfree() with the runqueue lock held. That's not a very wise move, fix it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-12 11:57:36 +01:00
Li Zefan	cfebe563bd	cgroups: fix lockdep subclasses overflow I enabled all cgroup subsystems when compiling kernel, and then: # mount -t cgroup -o net_cls xxx /mnt # mkdir /mnt/0 This showed up immediately: BUG: MAX_LOCKDEP_SUBCLASSES too low! turning off the locking correctness validator. It's caused by the cgroup hierarchy lock: for (i = 0; i < CGROUP_SUBSYS_COUNT; i++) { struct cgroup_subsys *ss = subsys[i]; if (ss->root == root) mutex_lock_nested(&ss->hierarchy_mutex, i); } Now we have 9 cgroup subsystems, and the above 'i' for net_cls is 8, but MAX_LOCKDEP_SUBCLASSES is 8. This patch uses different lockdep keys for different subsystems. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-11 14:25:36 -08:00
Sven Wegener	fc3501d411	mm: fix dirty_bytes/dirty_background_bytes sysctls on 64bit arches We need to pass an unsigned long as the minimum, because it gets casted to an unsigned long in the sysctl handler. If we pass an int, we'll access four more bytes on 64bit arches, resulting in a random minimum value. [rientjes@google.com: fix type of `old_bytes'] Signed-off-by: Sven Wegener <sven.wegener@stealer.net> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-11 14:25:35 -08:00
Peter Zijlstra	2fff78c784	futex: fix reference leak Catalin noticed that (`38d47c1b70`: futex: rely on get_user_pages() for shared futexes) caused an mm_struct leak. Some tracing with the function graph tracer quickly pointed out that futex_wait() has exit paths with unbalanced reference counts. This regression was discovered by kmemleak. Reported-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com> Tested-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 18:24:08 +01:00
Linus Torvalds	6c6f1f0f4d	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: revert recent sync wakeup changes	2009-02-11 08:25:06 -08:00
Linus Torvalds	94dba89533	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers: fix TIMER_ABSTIME for process wide cpu timers timers: split process wide cpu clocks/timers, fix x86: clean up hpet timer reinit timers: split process wide cpu clocks/timers, remove spurious warning timers: split process wide cpu clocks/timers signal: re-add dead task accumulation stats. x86: fix hpet timer reinit for x86_64 sched: fix nohz load balancer on cpu offline	2009-02-11 08:24:32 -08:00
Linus Torvalds	9ce04f9238	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: ptrace, x86: fix the usage of ptrace_fork() i8327: fix outb() parameter order x86: fix math_emu register frame access x86: math_emu info cleanup x86: include correct %gs in a.out core dump x86, vmi: put a missing paravirt_release_pmd in pgd_dtor x86: find nr_irqs_gsi with mp_ioapic_routing x86: add clflush before monitor for Intel 7400 series x86: disable intel_iommu support by default x86: don't apply __supported_pte_mask to non-present ptes x86: fix grammar in user-visible BIOS warning x86/Kconfig.cpu: make Kconfig help readable in the console x86, 64-bit: print DMI info in the oops trace	2009-02-11 08:23:22 -08:00
Peter Zijlstra	fc631c82e1	sched: revert recent sync wakeup changes Intel reported a 10% regression (mysql+sysbench) on a 16-way machine with these patches: `1596e29`: sched: symmetric sync vs avg_overlap `d942fb6`: sched: fix sync wakeups Revert them. Reported-by: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> Bisected-by: Lin Ming <ming.m.lin@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 14:43:35 +01:00
Peter Zijlstra	4da94d49b2	timers: fix TIMER_ABSTIME for process wide cpu timers The POSIX timer interface allows for absolute time expiry values through the TIMER_ABSTIME flag, therefore we have to synchronize the timer to the clock every time we start it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 14:04:21 +01:00
Peter Zijlstra	3fccfd67df	timers: split process wide cpu clocks/timers, fix To decrease the chance of a missed enable, always enable the timer when we sample it, we'll always disable it when we find that there are no active timers in the jiffy tick. This fixes a flood of warnings reported by Mike Galbraith. Reported-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 14:04:19 +01:00
Arnaldo Carvalho de Melo	00f62f614b	ring_buffer: pahole struct ring_buffer While fixing some bugs in pahole (built-in.o files were not being processed due to relocation problems) I found out about these packable structures: $ pahole --packable kernel/trace/ring_buffer.o \| grep ring ring_buffer 72 64 8 ring_buffer_per_cpu 112 104 8 If we take a look at the current layout of struct ring_buffer we can see that we have two 4 bytes holes. $ pahole -C ring_buffer kernel/trace/ring_buffer.o struct ring_buffer { unsigned int pages; /* 0 4 / unsigned int flags; / 4 4 / int cpus; / 8 4 / / XXX 4 bytes hole, try to pack / cpumask_var_t cpumask; / 16 8 / atomic_t record_disabled; / 24 4 / / XXX 4 bytes hole, try to pack / struct mutex mutex; / 32 32 / / --- cacheline 1 boundary (64 bytes) --- / struct ring_buffer_per_cpu * buffers; /* 64 8 / / size: 72, cachelines: 2, members: 7 / / sum members: 64, holes: 2, sum holes: 8 / / last cacheline: 8 bytes / }; So, if I ask pahole to reorganize it: $ pahole -C ring_buffer --reorganize kernel/trace/ring_buffer.o struct ring_buffer { unsigned int pages; / 0 4 / unsigned int flags; / 4 4 / int cpus; / 8 4 / atomic_t record_disabled; / 12 4 / cpumask_var_t cpumask; / 16 8 / struct mutex mutex; / 24 32 / struct ring_buffer_per_cpu * buffers; /* 56 8 / / --- cacheline 1 boundary (64 bytes) --- / / size: 64, cachelines: 1, members: 7 / }; / saved 8 bytes and 1 cacheline! / We get it using just one 64 bytes cacheline. To see what it did: $ pahole -C ring_buffer --reorganize --show_reorg_steps \ kernel/trace/ring_buffer.o \| grep \/ / Moving 'record_disabled' from after 'cpumask' to after 'cpus' */ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 13:21:40 +01:00
Frederic Weisbecker	b22f485812	tracing/sysprof: add missing tracing_{start,stop}_record_cmdline() Add the missing pair tracing_{start,stop}_record_cmdline() to record well the cmdline associated with pid. Changes in v2: - fix a build error, the sched_switch tracer is needed to record the cmdline. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 12:55:19 +01:00
Oleg Nesterov	06eb23b1ba	ptrace, x86: fix the usage of ptrace_fork() I noticed by pure accident we have ptrace_fork() and friends. This was added by "x86, bts: add fork and exit handling", commit `bf53de907d`. I can't test this, ds_request_bts() returns -EOPNOTSUPP, but I strongly believe this needs the fix. I think something like this program int main(void) { int pid = fork(); if (!pid) { ptrace(PTRACE_TRACEME, 0, NULL, NULL); kill(getpid(), SIGSTOP); fork(); } else { struct ptrace_bts_config bts = { .flags = PTRACE_BTS_O_ALLOC, .size = 4 * 4096, }; wait(NULL); ptrace(PTRACE_SETOPTIONS, pid, NULL, PTRACE_O_TRACEFORK); ptrace(PTRACE_BTS_CONFIG, pid, &bts, sizeof(bts)); ptrace(PTRACE_CONT, pid, NULL, NULL); sleep(1); } return 0; } should crash the kernel. If the task is traced by its natural parent ptrace_reparented() returns 0 but we should clear ->btsxxx anyway. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Markus Metzger <markus.t.metzger@intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 10:32:46 +01:00
Hannes Eder	e7669b8e32	tracing: fix sparse warning: attribute function with __acquires/__releases Fix this sparse warning: kernel/trace/trace.c:458:9: warning: context imbalance in 'register_tracer' - unexpected unlock Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 10:15:42 +01:00
Hannes Eder	5e39841c45	tracing: fix sparse warnings: fix (un-)signedness Fix these sparse warnings: kernel/trace/ring_buffer.c:70:37: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:84:39: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:96:43: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2475:13: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2475:13: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2478:42: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2478:42: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2500:40: warning: incorrect type in argument 3 (different signedness) kernel/trace/ring_buffer.c:2505:44: warning: incorrect type in argument 2 (different signedness) kernel/trace/ring_buffer.c:2507:46: warning: incorrect type in argument 2 (different signedness) kernel/trace/trace.c:2130:40: warning: incorrect type in argument 3 (different signedness) kernel/trace/trace.c:2280:40: warning: incorrect type in argument 3 (different signedness) Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 10:15:42 +01:00
Hannes Eder	4fd2735881	tracing: fix sparse warnings: make symbols static Impact: make global variables and a global function static The function '__trace_userstack' does not seem to have a caller, so it is commented out. Fix this sparse warnings: kernel/trace/trace.c:82:5: warning: symbol 'tracing_disabled' was not declared. Should it be static? kernel/trace/trace.c:600:10: warning: symbol 'trace_record_cmdline_disabled' was not declared. Should it be static? kernel/trace/trace.c:957:6: warning: symbol '__trace_userstack' was not declared. Should it be static? kernel/trace/trace.c:1694:5: warning: symbol 'tracing_release' was not declared. Should it be static? Signed-off-by: Hannes Eder <hannes@hanneseder.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-11 10:15:41 +01:00
Ingo Molnar	4040068dce	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-11 10:03:53 +01:00
Wenji Huang	c3706f005c	tracing: fix typos in comments Impact: clean up. Fix typos in the comments. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 12:32:35 -05:00
Wenji Huang	810dc73265	tracing: provide correct return value after outputting the event This patch is to make the function return early on failure, and give correct return value on success. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 12:32:33 -05:00
Wenji Huang	f54fc98aa6	tracing: remove unneeded variable Impact: clean up. Remove the unnecessary variable ret. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 12:32:18 -05:00
Tobias Klauser	4543ae7ce1	tracing: storage class should be before const qualifier The C99 specification states in section 6.11.5: The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 11:58:45 -05:00
Lai Jiangshan	667d241258	ring_buffer: fix ring_buffer_read_page() Impact: change API and init bpage when copy ring_buffer_read_page()/rb_remove_entries() may be called for a partially consumed page. Add a parameter for rb_remove_entries() and make it update cpu_buffer->entries correctly for partially consumed pages. ring_buffer_read_page() now returns the offset to the next event. Init the bpage's time_stamp when return value is 0. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 09:17:37 -05:00
Lai Jiangshan	b85fa01ed9	ring_buffer: fix typing mistake Impact: Fix bug I found several very very curious line. It's so curious that it may be brought by typing mistake. When (cpu_buffer->reader_page == cpu_buffer->commit_page): 1) We haven't copied it for bpage is changed: bpage = cpu_buffer->reader_page->page; memcpy(bpage->data, cpu_buffer->reader_page->page->data + read ... ) 2) We need update cpu_buffer->reader_page->read, but "cpu_buffer->reader_page += read;" is not right. [ This bug was a typo. The commit->reader_page is a page pointer and not an index into the page. The line should have been commit->reader_page->read += read. The other changes by Lai are nice clean ups to the code. - SDR ] Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-10 09:17:19 -05:00
Ingo Molnar	ae216dd239	Merge branch 'tip/tracing/ftrace' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace	2009-02-10 13:58:28 +01:00
Ingo Molnar	f9915bfef3	Merge branches 'tracing/ftrace' and 'tracing/urgent' into tracing/core	2009-02-10 13:25:42 +01:00
Hugh Dickins	acd895795d	profiling: fix broken profiling regression Impact: fix broken /proc/profile on UP machines Commit `c309b917ca` "cpumask: convert kernel/profile.c" broke profiling. prof_cpu_mask was previously initialized to CPU_MASK_ALL, but left uninitialized in that commit. We need to copy cpu_possible_mask (cpu_online_mask is not enough). Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-10 00:50:37 +01:00
Steven Rostedt	34cd4998d3	tracing: clean up splice code Ingo Molnar suggested a series of clean ups for the splice code. This patch implements those suggestions. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-09 12:24:58 -05:00
Eduard - Gabriel Munteanu	ff98781bab	tracing: Move pipe waiting code out of tracing_read_pipe(). This moves the pipe waiting code from tracing_read_pipe() into tracing_wait_pipe(), which is useful to implement other fops, like splice_read. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-09 12:24:51 -05:00
Eduard - Gabriel Munteanu	3c56819b14	tracing: splice support for tracing_pipe Added and implemented tracing_pipe_fops->splice_read(). This allows userspace programs to get tracing data more efficiently. Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-09 12:24:34 -05:00
Frederic Weisbecker	b91facc367	tracing/function-graph-tracer: handle the leaf functions from trace_pipe When one cats the trace file, the leaf functions are printed without brackets: function(); whereas in the trace_pipe file we'll see the following: function() { } This is because the ring_buffer handling is not the same between those two files. On the trace file, when an entry is printed, the iterator advanced and then we can check the next entry. There is no iterator with trace_pipe, the current entry to print has been peeked and not consumed. So checking the next entry will still return the current one while we don't consume it. This patch introduces a new value for the output callbacks to ask the tracing core to not consume the current entry after printing it. We need it because we will have to consume the current entry ourself to check the next one. Now the trace_pipe is able to handle well the leaf functions. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 12:37:27 +01:00
Ingo Molnar	1dfba05d0f	tracing/blktrace: move the tracing file to kernel/trace, fix Impact: build fix The BLK_DEV_IO_TRACE entry used to be in block/Kconfig - which file itself was dependent on CONFIG_BLOCK. But now the entry is in kernel/trace/Kconfig - which is present even on !CONFIG_BLOCK. So add a 'depends on BLOCK' to BLK_DEV_IO_TRACE. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 12:07:28 +01:00
Arnaldo Carvalho de Melo	b5db03c435	tracing: handle unregistering the current tracer Impact: simplification Instead of requiring that plugins have the sequence: my_tracer_stop(my_trace_array); unregister_tracer(my_tracer); it should be possible just do a: unregister_tracer(my_tracer); Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:56:53 +01:00
Frederic Weisbecker	3861a17bcc	tracing/function-graph-tracer: drop the kernel_text_address check When the function graph tracer picks a return address, it ensures this address is really a kernel text one by calling __kernel_text_address() Actually this path has never been taken.Its role was more likely to debug the tracer on the beginning of its development but this function is wasteful since it is called for every traced function. The fault check is already sufficient. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:51:38 +01:00
Frederic Weisbecker	1292211058	tracing/power: move the power trace headers to a dedicated file Impact: cleanup Move the power tracer headers to trace/power.h to keep ftrace.h and power bits more easy to maintain as separated topics. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:51:38 +01:00
Frederic Weisbecker	7447dce96f	tracing/function-graph-tracer: provide a selftest for the function graph tracer Making it more easy to do a basic regression test for this tracer. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:51:37 +01:00
Frederic Weisbecker	2db270a80b	tracing/blktrace: move the tracing file to kernel/trace Impact: cleanup Move blktrace.c to kernel/trace, also move its config entry. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-09 10:51:02 +01:00
Ingo Molnar	44b0635481	Merge branch 'tip/tracing/core/devel' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into tracing/ftrace Conflicts: kernel/trace/trace_hw_branches.c	2009-02-09 10:35:12 +01:00
Ingo Molnar	4ad476e11f	Merge commit 'v2.6.29-rc4' into tracing/core	2009-02-09 10:32:48 +01:00
Stefan Richter	f7de7621f0	async: use list_move_tail list.h provides a dedicated primitive for "list_del followed by list_add_tail"... list_move_tail. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>	2009-02-08 10:00:26 -08:00
Cornelia Huck	766ccb9ed4	async: Rename _special -> _domain for clarity. Rename the async__special() functions to async__domain(), which describes the purpose of these functions much better. [Broke up long lines to silence checkpatch] Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-02-08 09:56:11 -08:00
Cornelia Huck	f30d5b307c	async: Add some documentation. Add some kerneldoc to the async interface. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-02-08 09:56:11 -08:00
Cornelia Huck	86532d8b16	async: Handle kthread_run() return codes. If we fail to create the manager thread, fall back to non-fastboot. If we fail to create an async thread, try again after waiting for a bit. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-02-08 09:56:10 -08:00
Cornelia Huck	7a89bbc749	async: Fix running list handling. async_schedule() should pass in async_running as the running list, and run_one_entry() should put the entry to be run on the provided running list instead of always on the generic one. Reported-by: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>	2009-02-08 09:56:10 -08:00
Wenji Huang	57794a9d48	trace: trivial fixes in comment typos. Impact: clean up Fixed several typos in the comments. Signed-off-by: Wenji Huang <wenji.huang@oracle.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-07 20:03:36 -05:00
Steven Rostedt	a81bd80a0b	ring-buffer: use generic version of in_nmi Impact: clean up Now that a generic in_nmi is available, this patch removes the special code in the ring_buffer and implements the in_nmi generic version instead. With this change, I was also able to rename the "arch_ftrace_nmi_enter" back to "ftrace_nmi_enter" and remove the code from the ring buffer. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-07 20:03:33 -05:00
Steven Rostedt	78d904b46a	ring-buffer: add NMI protection for spinlocks Impact: prevent deadlock in NMI The ring buffers are not yet totally lockless with writing to the buffer. When a writer crosses a page, it grabs a per cpu spinlock to protect against a reader. The spinlocks taken by a writer are not to protect against other writers, since a writer can only write to its own per cpu buffer. The spinlocks protect against readers that can touch any cpu buffer. The writers are made to be reentrant with the spinlocks disabling interrupts. The problem arises when an NMI writes to the buffer, and that write crosses a page boundary. If it grabs a spinlock, it can be racing with another writer (since disabling interrupts does not protect against NMIs) or with a reader on the same CPU. Luckily, most of the users are not reentrant and protects against this issue. But if a user of the ring buffer becomes reentrant (which is what the ring buffers do allow), if the NMI also writes to the ring buffer then we risk the chance of a deadlock. This patch moves the ftrace_nmi_enter called by nmi_enter() to the ring buffer code. It replaces the current ftrace_nmi_enter that is used by arch specific code to arch_ftrace_nmi_enter and updates the Kconfig to handle it. When an NMI is called, it will set a per cpu variable in the ring buffer code and will clear it when the NMI exits. If a write to the ring buffer crosses page boundaries inside an NMI, a trylock is used on the spin lock instead. If the spinlock fails to be acquired, then the entry is discarded. This bug appeared in the ftrace work in the RT tree, where event tracing is reentrant. This workaround solved the deadlocks that appeared there. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-07 20:00:17 -05:00
Steven Rostedt	1830b52d0d	trace: remove deprecated entry->cpu Impact: fix to prevent developers from using entry->cpu With the new ring buffer infrastructure, the cpu for the entry is implicit with which CPU buffer it is on. The original code use to record the current cpu into the generic entry header, which can be retrieved by entry->cpu. When the ring buffer was introduced, the users were convert to use the the cpu number of which cpu ring buffer was in use (this was passed to the tracers by the iterator: iter->cpu). Unfortunately, the cpu item in the entry structure was never removed. This allowed for developers to use it instead of the proper iter->cpu, unknowingly, using an uninitialized variable. This was not the fault of the developers, since it would seem like the logical place to retrieve the cpu identifier. This patch removes the cpu item from the entry structure and fixes all the users that should have been using iter->cpu. Signed-off-by: Steven Rostedt <srostedt@redhat.com>	2009-02-07 19:38:43 -05:00
Len Brown	2d29c6a075	Merge branches 'release', 'asus', 'bugzilla-12450', 'cpuidle', 'debug', 'ec', 'misc', 'printk' and 'processor' into release	2009-02-07 01:34:56 -05:00
Li Zefan	04ec93fe9b	fork.c: fix NULL pointer dereference when nr_threads == threads-max I happened to forked lots of processes, and hit NULL pointer dereference. It is because in copy_process() after checking max_threads, 0 is returned but not -EAGAIN. The bug is introduced by "CRED: Detach the credentials from task_struct" (commit `f1752eec61`). Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: James Morris <jmorris@namei.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-06 08:43:11 -08:00
Arnaldo Carvalho de Melo	b6f11df26f	trace: Call tracing_reset_online_cpus before tracer->init() Impact: cleanup To make it easy for ftrace plugin writers, as this was open coded in the existing plugins Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-06 01:01:41 +01:00
Arnaldo Carvalho de Melo	51a763dd84	tracing: Introduce trace_buffer_{lock_reserve,unlock_commit} Impact: new API These new functions do what previously was being open coded, reducing the number of details ftrace plugin writers have to worry about. It also standardizes the handling of stacktrace, userstacktrace and other trace options we may introduce in the future. With this patch, for instance, the blk tracer (and some others already in the tree) can use the "userstacktrace" /d/tracing/trace_options facility. $ codiff /tmp/vmlinux.before /tmp/vmlinux.after linux-2.6-tip/kernel/trace/trace.c: trace_vprintk \| -5 trace_graph_return \| -22 trace_graph_entry \| -26 trace_function \| -45 __ftrace_trace_stack \| -27 ftrace_trace_userstack \| -29 tracing_sched_switch_trace \| -66 tracing_stop \| +1 trace_seq_to_user \| -1 ftrace_trace_special \| -63 ftrace_special \| +1 tracing_sched_wakeup_trace \| -70 tracing_reset_online_cpus \| -1 13 functions changed, 2 bytes added, 355 bytes removed, diff: -353 linux-2.6-tip/block/blktrace.c: __blk_add_trace \| -58 1 function changed, 58 bytes removed, diff: -58 linux-2.6-tip/kernel/trace/trace.c: trace_buffer_lock_reserve \| +88 trace_buffer_unlock_commit \| +86 2 functions changed, 174 bytes added, diff: +174 /tmp/vmlinux.after: 16 functions changed, 176 bytes added, 413 bytes removed, diff: -237 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-06 01:01:41 +01:00
Arnaldo Carvalho de Melo	0a9877514c	ring_buffer: remove unused flags parameter Impact: API change, cleanup >From ring_buffer_{lock_reserve,unlock_commit}. $ codiff /tmp/vmlinux.before /tmp/vmlinux.after linux-2.6-tip/kernel/trace/trace.c: trace_vprintk \| -14 trace_graph_return \| -14 trace_graph_entry \| -10 trace_function \| -8 __ftrace_trace_stack \| -8 ftrace_trace_userstack \| -8 tracing_sched_switch_trace \| -8 ftrace_trace_special \| -12 tracing_sched_wakeup_trace \| -8 9 functions changed, 90 bytes removed, diff: -90 linux-2.6-tip/block/blktrace.c: __blk_add_trace \| -1 1 function changed, 1 bytes removed, diff: -1 /tmp/vmlinux.after: 10 functions changed, 91 bytes removed, diff: -91 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frédéric Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-06 01:01:40 +01:00
Johannes Weiner	777c6c5f1f	wait: prevent exclusive waiter starvation With exclusive waiters, every process woken up through the wait queue must ensure that the next waiter down the line is woken when it has finished. Interruptible waiters don't do that when aborting due to a signal. And if an aborting waiter is concurrently woken up through the waitqueue, noone will ever wake up the next waiter. This has been observed with __wait_on_bit_lock() used by lock_page_killable(): the first contender on the queue was aborting when the actual lock holder woke it up concurrently. The aborted contender didn't acquire the lock and therefor never did an unlock followed by waking up the next waiter. Add abort_exclusive_wait() which removes the process' wait descriptor from the waitqueue, iff still queued, or wakes up the next waiter otherwise. It does so under the waitqueue lock. Racing with a wake up means the aborting process is either already woken (removed from the queue) and will wake up the next waiter, or it will remove itself from the queue and the concurrent wake up will apply to the next waiter after it. Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and __wait_on_bit_lock() when they were interrupted by other means than a wake up through the queue. [akpm@linux-foundation.org: coding-style fixes] Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Mentored-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chuck Lever <cel@citi.umich.edu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> ["after some testing"] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-05 12:56:48 -08:00
Andrew Morton	60fd760fb9	revert "rlimit: permit setting RLIMIT_NOFILE to RLIM_INFINITY" Revert commit `0c2d64fb6c` because it causes (arguably poorly designed) existing userspace to spend interminable periods closing billions of not-open file descriptors. We could bring this back, with some sort of opt-in tunable in /proc, which defaults to "off". Peter's alanysis follows: : I spent several hours trying to get to the bottom of a serious : performance issue that appeared on one of our servers after upgrading to : 2.6.28. In the end it's what could be considered a userspace bug that : was triggered by a change in 2.6.28. Since this might also affect other : people I figured I'd at least document what I found here, and maybe we : can even do something about it: : : : So, I upgraded some of debian.org's machines to 2.6.28.1 and immediately : the team maintaining our ftp archive complained that one of their : scripts that previously ran in a few minutes still hadn't even come : close to being done after an hour or so. Downgrading to 2.6.27 fixed : that. : : Turns out that script is forking a lot and something in it or python or : whereever closes all the file descriptors it doesn't want to pass on. : That is, it starts at zero and goes up to ulimit -n/RLIMIT_NOFILE and : closes them all with a few exceptions. : : Turns out that takes a long time when your limit -n is now 2^20 (1048576). : : With 2.6.27.* the ulimit -n was the standard 1024, but with 2.6.28 it is : now a thousand times that. : : 2.6.28 included a patch titled "rlimit: permit setting RLIMIT_NOFILE to : RLIM_INFINITY" (`0c2d64fb6c`)[1] that : allows, as the title implies, to set the limit for number of files to : infinity. : : Closer investigation showed that the broken default ulimit did not apply : to "system" processes (like stuff started from init). In the end I : could establish that all processes that passed through pam_limit at one : point had the bad resource limit. : : Apparently the pam library in Debian etch (4.0) initializes the limits : to some default values when it doesn't have any settings in limit.conf : to override them. Turns out that for nofiles this is RLIM_INFINITY. : Commenting out "case RLIMIT_NOFILE" in pam_limit.c:267 of our pam : package version 0.79-5 fixes that - tho I'm not sure what side effects : that has. : : Debian lenny (the upcoming 5.0 version) doesn't have this issue as it : uses a different pam (version). Reported-by: Peter Palfrader <weasel@debian.org> Cc: Adam Tkac <vonsch@gmail.com> Cc: Michael Kerrisk <mtk.manpages@googlemail.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-05 12:56:47 -08:00
Andrew Morton	58763a2974	kernel/async.c: fix printk warnings alpha: kernel/async.c: In function 'run_one_entry': kernel/async.c:141: warning: format '%lli' expects type 'long long int', but argument 2 has type 'async_cookie_t' kernel/async.c:149: warning: format '%lli' expects type 'long long int', but argument 2 has type 'async_cookie_t' kernel/async.c:149: warning: format '%lld' expects type 'long long int', but argument 4 has type 's64' kernel/async.c: In function 'async_synchronize_cookie_special': kernel/async.c:250: warning: format '%lli' expects type 'long long int', but argument 3 has type 's64' Cc: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-02-05 12:56:46 -08:00
Steven Rostedt	dac7494028	trace: code style clean up Ingo Molnar suggested using goto logic to keep the indentation down and to be able to remove the nasty line breaks. This actually makes the code a bit more readable. Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 14:35:47 +01:00
Arnaldo Carvalho de Melo	7be421510b	trace: Remove unused trace_array_cpu parameter Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 14:35:47 +01:00
Arnaldo Carvalho de Melo	97e5b191ae	trace_branch: Remove unused function Impact: cleanup Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 14:35:46 +01:00
Arnaldo Carvalho de Melo	268ccda0cb	trace: assign defaults at register_ftrace_event Impact: simplification of tracers As all tracers are doing this we might as well do it in register_ftrace_event and save one branch each time we call these callbacks. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <srostedt@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 14:35:46 +01:00
Peter Zijlstra	4cd4c1b40d	timers: split process wide cpu clocks/timers Change the process wide cpu timers/clocks so that we: 1) don't mess up the kernel with too many threads, 2) don't have a per-cpu allocation for each process, 3) have no impact when not used. In order to accomplish this we're going to split it into two parts: - clocks; which can take all the time they want since they run from user context -- ie. sys_clock_gettime(CLOCK_PROCESS_CPUTIME_ID) - timers; which need constant time sampling but since they're explicity used, the user can pay the overhead. The clock readout will go back to a full sum of the thread group, while the timers will run of a global 'clock' that only runs when needed, so only programs that make use of the facility pay the price. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 13:04:33 +01:00
Peter Zijlstra	32bd671d6c	signal: re-add dead task accumulation stats. We're going to split the process wide cpu accounting into two parts: - clocks; which can take all the time they want since they run from user context. - timers; which need constant time tracing but can affort the overhead because they're default off -- and rare. The clock readout will go back to a full sum of the thread group, for this we need to re-add the exit stats that were removed in the initial itimer rework (`f06febc9`: timers: fix itimer/many thread hang). Furthermore, since that full sum can be rather slow for large thread groups and we have the complete dead task stats, revert the do_notify_parent time computation. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-05 13:04:33 +01:00
Linus Torvalds	647802d6db	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86: APIC: enable workaround on AMD Fam10h CPUs xen: disable interrupts before saving in percpu x86: add x86@kernel.org to MAINTAINERS x86: push old stack address on irqstack for unwinder irq, x86: fix lock status with numa_migrate_irq_desc x86: add cache descriptors for Intel Core i7 x86/Voyager: make it build and boot	2009-02-04 13:58:50 -08:00
Suresh Siddha	483b4ee60e	sched: fix nohz load balancer on cpu offline Christian Borntraeger reports: > After a logical cpu offline, even on a complete idle system, there > is one cpu with full ticks. It turns out that nohz.cpu_mask has the > the offlined cpu still set. > > In select_nohz_load_balancer() we check if the system is completely > idle to turn of load balancing. We compare cpu_online_map with > nohz.cpu_mask. Since cpu_online_map is updated on cpu unplug, > but nohz.cpu_mask is not, the check fails and the scheduler believes > that we need an "idle load balancer" even on a fully idle system. > Since the ilb cpu does not deactivate the timer tick this breaks NOHZ. Fix the select_nohz_load_balancer() to not set the nohz.cpu_mask while a cpu is going offline. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-04 22:31:19 +01:00
Arnaldo Carvalho de Melo	ae7462b4f1	trace: make the trace_event callbacks return enum print_line_t As they actually all return these enumerators. Reported-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-04 20:48:39 +01:00
Arnaldo Carvalho de Melo	d9793bd801	trace: judicious error checking of trace_seq results Impact: bugfix and cleanup Some callsites were returning either TRACE_ITER_PARTIAL_LINE if the trace_seq routines (trace_seq_printf, etc) returned 0 meaning its buffer was full, or zero otherwise. But... /* Return values for print_line callback / enum print_line_t { TRACE_TYPE_PARTIAL_LINE = 0, / Retry after flushing the seq / TRACE_TYPE_HANDLED = 1, TRACE_TYPE_UNHANDLED = 2 / Relay to other output functions */ }; In other cases the return value was not being relayed at all. Most of the time it didn't hurt because the page wasn't get filled, but for correctness sake, handle the return values everywhere. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-04 20:48:30 +01:00
Ingo Molnar	ce70a0b472	Merge branches 'tracing/blktrace', 'tracing/ftrace', 'tracing/urgent' and 'linus' into tracing/core	2009-02-04 20:45:41 +01:00
Ingo Molnar	bb960a1e42	Merge branch 'core/xen' into x86/urgent	2009-02-04 14:54:56 +01:00
Oleg Nesterov	229c4ef8ae	ftrace: do_each_pid_task() needs rcu lock "ftrace: use struct pid" commit `978f3a45d9` converted ftrace_pid_trace to "struct pid*". But we can't use do_each_pid_task() without rcu_read_lock() even if we know the pid itself can't go away (it was pinned in ftrace_pid_write). The exiting task can detach itself from this pid at any moment. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-02-03 22:50:58 +01:00

1 2 3 4 5 ...

6142 Commits