- 17 Aug, 2005 3 commits
-
-
Amy Griffis authored
The following patch against audit.81 prevents duplicate syscall rules in a given filter list by walking the list on each rule add. I also removed the unused struct audit_entry in audit.c and made the static inlines in auditsc.c consistent. Signed-off-by:
Amy Griffis <amy.griffis@hp.com> Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
David Woodhouse authored
It was showing up fairly high on profiles even when no rules were set. Make sure the common path stays as fast as possible. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
David Woodhouse authored
Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
- 10 Aug, 2005 1 commit
-
-
James Bottomley authored
We have a chek in there to make sure that the name won't overflow task_struct.comm[], but it's triggering for scsi with lots of HBAs, only scsi is using single-threaded workqueues which don't append the "/%d" anyway. All too hard. Just kill the BUG_ON. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by:
Andrew Morton <akpm@osdl.org> [ kthread_create() uses vsnprintf() and limits the thing, so no actual overflow can actually happen regardless ] Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 09 Aug, 2005 1 commit
-
-
Paul Jackson authored
Fix possible cpuset_sem ABBA deadlock if 'notify_on_release' set. For a particular usage pattern, creating and destroying cpusets fairly frequently using notify_on_release, on a very large system, this deadlock can be seen every few days. If you are not using the cpuset notify_on_release feature, you will never see this deadlock. The existing code, on task exit (or cpuset deletion) did: get cpuset_sem if cpuset marked notify_on_release and is ready to release: compute cpuset path relative to /dev/cpuset mount point call_usermodehelper() forks /sbin/cpuset_release_agent with path drop cpuset_sem Unfortunately, the fork in call_usermodehelper can allocate memory, and allocating memory can require cpuset_sem, if the mems_generation values changed in the interim. This results in an ABBA deadlock, trying to obtain cpuset_sem when it is already held by the current task. To fix this, I put the cpuset path (which must be computed while holding cpuset_sem) in a temporary buffer, to be used in the call_usermodehelper call of /sbin/cpuset_release_agent only _after_ dropping cpuset_sem. So the new logic is: get cpuset_sem if cpuset marked notify_on_release and is ready to release: compute cpuset path relative to /dev/cpuset mount point stash path in kmalloc'd buffer drop cpuset_sem call_usermodehelper() forks /sbin/cpuset_release_agent with path free path The sharp eyed reader might notice that this patch does not contain any calls to kmalloc. The existing code in the check_for_release() routine was already kmalloc'ing a buffer to hold the cpuset path. In the old code, it just held the buffer for a few lines, over the cpuset_release_agent() call that in turn invoked call_usermodehelper(). In the new code, with the application of this patch, it returns that buffer via the new char **ppathbuf parameter, for later use and freeing in cpuset_release_agent(), which is called after cpuset_sem is dropped. Whereas the old code has just one call to cpuset_release_agent(), right in the check_for_release() routine, the new code has three calls to cpuset_release_agent(), from the various places that a cpuset can be released. This patch has been build and booted on SN2, and passed a stress test that previously hit the deadlock within a few seconds. Signed-off-by:
Paul Jackson <pj@sgi.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 04 Aug, 2005 2 commits
-
-
Andrew Morton authored
Revert this June 17 patch: it broke persistence of timers across execve(). Cc: Roland McGrath <roland@redhat.com> Cc: george anzinger <george@mvista.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Benjamin Herrenschmidt authored
This removes the calls to device_suspend() from the shutdown path that were added sometime during 2.6.13-rc*. They aren't working properly on a number of configs (I got reports from both ppc powerbook users and x86 users) causing the system to not shutdown anymore. I think it isn't the right approach at the moment anyway. We have already a shutdown() callback for the drivers that actually care about shutdown and the suspend() code isn't yet in a good enough shape to be so much generalized. Also, the semantics of suspend and shutdown are slightly different on a number of setups and the way this was patched in provides little way for drivers to cleanly differenciate. It should have been at least a different message. For 2.6.13, I think we should revert to 2.6.12 behaviour and have a working suspend back. Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 02 Aug, 2005 1 commit
-
-
Rusty Russell authored
The module code assumes noone will ever ask for a per-cpu area more than SMP_CACHE_BYTES aligned. However, as these cases show, gcc asks sometimes asks for 32-byte alignment for the per-cpu section on a module, and if CONFIG_X86_L1_CACHE_SHIFT is 4, we hit that BUG_ON(). This is obviously an unusual combination, as there have been few reports, but better to warn than die. See: http://www.ussg.iu.edu/hypermail/linux/kernel/0409.0/0768.html And more recently: http://bugs.gentoo.org/show_bug.cgi?id=97006 Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 01 Aug, 2005 1 commit
-
-
Ingo Molnar authored
This removes sys_set_zone_reclaim() for now. While i'm sure Martin is trying to solve a real problem, we must not hard-code an incomplete and insufficient approach into a syscall, because syscalls are pretty much for eternity. I am quite strongly convinced that this syscall must not hit v2.6.13 in its current form. Firstly, the syscall lacks basic syscall design: e.g. it allows the global setting of VM policy for unprivileged users. (!) [ Imagine an Oracle installation and a SAP installation on the same NUMA box fighting over the 'optimal' setting for this flag. What will they do? Will they try to set the flag to their own preferred value every second or so? ] Secondly, it was added based on a single datapoint from Martin: http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2 where Martin characterizes the numbers the following way: ' Run-to-run variability for "make -j" is huge, so these numbers aren't terribly useful except to see that with reclaim the benchmark still finishes in a reasonable amount of time. ' in other words: the fundamental problem has likely not been solved, only a tendential move into the right direction has been observed, and a handful of numbers were picked out of a set of hugely variable results, without showing the variability data. How much variance is there run-to-run? I'd really suggest to first walk the walk and see what's needed to get stable & predictable kernel compilation numbers on that NUMA box, before adding random syscalls to tune a particular aspect of the VM ... which approach might not even matter once the whole picture has been analyzed and understood! The third, most important point is that the syscall exposes VM tuning internals in a completely unstructured way. What sense does it make to have a _GLOBAL_ per-node setting for 'should we go to another node for reclaim'? If then it might make sense to do this per-app, via numalib or so. The change is minimalistic in that it doesnt remove the syscall and the underlying infrastructure changes, only the user-visible changes. We could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a'ka /proc/sys/vm/swappiness, but even that looks quite counterproductive when the generic approach is that we are trying to reduce the number of external factors in the VM balance picture. Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 30 Jul, 2005 1 commit
-
-
Andrew Morton authored
This snuck in with an x86_64 change. Thanks to Richard Purdie <rpurdie@rpsys.net> for spotting it. Cc: Andi Kleen <ak@muc.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 29 Jul, 2005 3 commits
-
-
Eric W. Biederman authored
If device_suspend(PMSG_FREEZE) is not ready to be called in kernel_restart it is definitely not ready to be called in the even more fickle kernel_kexec. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
George Anzinger authored
(We found this (after a customer complained) and it is in the kernel.org kernel. Seems that for CLOCK_MONOTONIC absolute timers and clock_nanosleep calls both the request time and wall_to_monotonic are subtracted prior to the normalize resulting in an overflow in the existing normalize test. This causes the result to be shifted ~4 seconds ahead instead of ~2 seconds back in time.) The normalize code in posix-timers.c fails when the tv_nsec member is ~1.2 seconds negative. This can happen on absolute timers (and clock_nanosleeps) requested on CLOCK_MONOTONIC (both the request time and wall_to_monotonic are subtracted resulting in the possibility of a number close to -2 seconds.) This fix uses the set_normalized_timespec() (which does not have an overflow problem) to fix the problem and as a side effect makes the code cleaner. Signed-off-by:
George Anzinger <george@mvista.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Andi Kleen authored
This avoids some potential stack overflows with very deep softirq callchains. i386 does this too. TOADD CFI annotation Signed-off-by:
Andi Kleen <ak@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 27 Jul, 2005 8 commits
-
-
Andrew Morton authored
My fairly ordinary x86 test box gets stuck during reboot on the wait_for_completion() in ide_do_drive_cmd(): Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Jesper Juhl authored
`gcc -W' likes to complain if the static keyword is not at the beginning of the declaration. This patch fixes all remaining occurrences of "inline static" up with "static inline" in the entire kernel tree (140 occurrences in 47 files). While making this change I came across a few lines with trailing whitespace that I also fixed up, I have also added or removed a blank line or two here and there, but there are no functional changes in the patch. Signed-off-by:
Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Randy Dunlap authored
Add kerneldoc to kernel/crash_dump.c Signed-off-by:
Randy Dunlap <rdunlap@xenotime.net> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Randy Dunlap authored
Add kerneldoc to kernel/cpuset.c Fix cpuset typos in init/Kconfig Signed-off-by:
Randy Dunlap <rdunlap@xenotime.net> Acked-by:
Paul Jackson <pj@sgi.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Randy Dunlap authored
Add kerneldoc to kernel/capability.c Signed-off-by:
Randy Dunlap <rdunlap@xenotime.net> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Martin Schwidefsky authored
Split spin lock and r/w lock implementation into a single try which is done inline and an out of line function that repeatedly tries to get the lock before doing the cpu_relax(). Add a system control to set the number of retries before a cpu is yielded. The reason for the spin lock retry is that the diagnose 0x44 that is used to give up the virtual cpu is quite expensive. For spin locks that are held only for a short period of time the costs of the diagnoses outweights the savings for spin locks that are held for a longer timer. The default retry count is 1000. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
George Anzinger authored
Fix the recent off-by-one fix in the itimer code: 1. The repeating timer is figured using the requested time (not +1 as we know where we are in the jiffie). 2. The tests for interval too large are left to the time_val to jiffie code. Signed-off-by:
George Anzinger <george@mvista.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Nigel Cunningham authored
This patch fixes a warning in the disable_nonboot_cpus call in kernel/power/smp.c. Signed-off by: Nigel Cunningham <nigel@suspend2.net> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 26 Jul, 2005 9 commits
-
-
Steven Rostedt authored
Here's the patch again to fix the code to handle if the values between MAX_USER_RT_PRIO and MAX_RT_PRIO are different. Without this patch, an SMP system will crash if the values are different. Signed-off-by:
Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by:
Dean Nelson <dcn@sgi.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Andreas Steinmetz authored
RLIMIT_RTPRIO is supposed to grant non privileged users the right to use SCHED_FIFO/SCHED_RR scheduling policies with priorites bounded by the RLIMIT_RTPRIO value via sched_setscheduler(). This is usually used by audio users. Unfortunately this is broken in 2.6.13rc3 as you can see in the excerpt from sched_setscheduler below: /* * Allow unprivileged RT tasks to decrease priority: */ if (!capable(CAP_SYS_NICE)) { /* can't change policy */ if (policy != p->policy) return -EPERM; After the above unconditional test which causes sched_setscheduler to fail with no regard to the RLIMIT_RTPRIO value the following check is made: /* can't increase priority */ if (policy != SCHED_NORMAL && param->sched_priority > p->rt_priority && param->sched_priority > p->signal->rlim[RLIMIT_RTPRIO].rlim_cur) return -EPERM; Thus I do believe that the RLIMIT_RTPRIO value must be taken into account for the policy check, especially as the RLIMIT_RTPRIO limit is of no use without this change. The attached patch fixes this problem. Signed-off-by:
Andreas Steinmetz <ast@domdv.de> Acked-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
The suspend to disk code was a poor copy of the code in sys_reboot now that we have kernel_power_off, kernel_restart and kernel_halt use them instead of poorly duplicating them inline. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
We know the system is in trouble so there is no question if this is an emergecy :) Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
We already do all of the gymnastics to run from process context to call the power off code so call into the power off code cleanly. This especially helps acpi as part of it's shutdown logic should run acpi_shutdown called from device_shutdown which was not being called from here. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
When the kernel is working well and we want to restart cleanly kernel_restart is the function to use. But in many instances the kernel wants to reboot when thing are expected to be working very badly such as from panic or a software watchdog handler. This patch adds the function emergency_restart() so that callers can be clear what semantics they expect when calling restart. emergency_restart() is expected to be callable from interrupt context and possibly reliable in even more trying circumstances. This is an initial generic implementation for all architectures. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
It is obvious we wanted to call kernel_restart here but since we don't have it the code was expanded inline and hasn't been correct since sometime in 2.4. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
Because the factors of sys_reboot don't exist people calling into the reboot path duplicate the code badly, leading to inconsistent expectations of code in the reboot path. This patch should is just code motion. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Eric W. Biederman authored
In the recent addition of device_suspend calls into sys_reboot two code paths were missed. Signed-off-by:
Eric W. Biederman <ebiederm@xmission.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 18 Jul, 2005 1 commit
-
-
David Woodhouse authored
... by generating serial numbers only if an audit context is actually _used_, rather than doing so at syscall entry even when the context isn't necessarily marked auditable. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
- 15 Jul, 2005 1 commit
-
-
David Woodhouse authored
The tricks with atomic_t were bizarre. Just do it sensibly instead. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
- 14 Jul, 2005 1 commit
-
-
David Woodhouse authored
We didn't rename it to audit_tgid after all. Except once... Doh. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
- 13 Jul, 2005 4 commits
-
-
David Woodhouse authored
When we flush a pending syscall audit record due to audit_free(), we might be doing that in the context of the idle thread. So use GFP_ATOMIC Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
David Woodhouse authored
and not just the one thread. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
Victor Fusco authored
Fix the sparse warning "implicit cast to nocast type" Signed-off-by:
Victor Fusco <victor@cetuc.puc-rio.br> Signed-off-by:
Domen Puncer <domen@coderock.org> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
Robert Love authored
This moves the inotify sysctl knobs to "/proc/sys/fs/inotify" from "/proc/sys/fs". Also some related cleanup. Signed-off-by:
Robert Love <rml@novell.com> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 12 Jul, 2005 3 commits
-
-
Robert Love authored
inotify is intended to correct the deficiencies of dnotify, particularly its inability to scale and its terrible user interface: * dnotify requires the opening of one fd per each directory that you intend to watch. This quickly results in too many open files and pins removable media, preventing unmount. * dnotify is directory-based. You only learn about changes to directories. Sure, a change to a file in a directory affects the directory, but you are then forced to keep a cache of stat structures. * dnotify's interface to user-space is awful. Signals? inotify provides a more usable, simple, powerful solution to file change notification: * inotify's interface is a system call that returns a fd, not SIGIO. You get a single fd, which is select()-able. * inotify has an event that says "the filesystem that the item you were watching is on was unmounted." * inotify can watch directories or files. Inotify is currently used by Beagle (a desktop search infrastructure), Gamin (a FAM replacement), and other projects. See Documentation/filesystems/inotify.txt. Signed-off-by:
Robert Love <rml@novell.com> Cc: John McCutchan <ttb@tentacle.dhs.org> Cc: Christoph Hellwig <hch@lst.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Hugh Dickins authored
dup_mmap of a VM_DONTCOPY vma forgot to lower the child's total_vm. (But no way does this account for the recent report of total_vm seen too low.) Signed-off-by:
Hugh Dickins <hugh@veritas.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Andrew Morton authored
kernel/power/disk.c needs a declaration of name_to_dev_t() in scope. mount.h seems like an appropriate choice. Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-