- 13 Jun, 2014 5 commits
-
-
Devin Kim authored
temp[64] is used as an internal temporary buffer for the dname in dynamic_dname(), but it is too small: the dname's size may exceed 64 bytes, and in that case the function returns -ENAMETOOLONG. Increase the buffer size to 256 to avoid this issue. The following warning was caused by the small buffer.
WARNING: at /kernel/mm/page_alloc.c:2470 __alloc_pages_nodemask+0x24c/0x938()
CPU: 2 PID: 505 Comm: android.bg Not tainted 3.10.0-g2f73780-00003-g2ff41d9-dirty #13
[<c010ba3c>] (unwind_backtrace+0x0/0x11c) from [<c0109cac>] (show_stack+0x10/0x14)
[<c0109cac>] (show_stack+0x10/0x14) from [<c01939a0>] (warn_slowpath_common+0x48/0x68)
[<c01939a0>] (warn_slowpath_common+0x48/0x68) from [<c0193a7c>] (warn_slowpath_null+0x18/0x20)
[<c0193a7c>] (warn_slowpath_null+0x18/0x20) from [<c0222454>] (__alloc_pages_nodemask+0x24c/0x938)
[<c0222454>] (__alloc_pages_nodemask+0x24c/0x938) from [<c0222b50>] (__get_free_pages+0x10/0x24)
[<c0222b50>] (__get_free_pages+0x10/0x24) from [<c024faf8>] (kmalloc_order_trace+0x24/0xf0)
[<c024faf8>] (kmalloc_order_trace+0x24/0xf0) from [<c024fe20>] (__kmalloc+0x30/0x244)
[<c024fe20>] (__kmalloc+0x30/0x244) from [<c02723c8>] (seq_read+0x270/0x464)
[<c02723c8>] (seq_read+0x270/0x464) from [<c0256a18>] (vfs_read+0xa4/0x134)
[<c0256a18>] (vfs_read+0xa4/0x134) from [<c0256de8>] (SyS_read+0x38/0x68)
[<c0256de8>] (SyS_read+0x38/0x68) from [<c0106140>] (ret_fast_syscall+0x0/0x30)
Change-Id: I74f5217ba3c4be73e91f33f900f1f0c26810cc05 Signed-off-by:
Devin Kim <dojip.kim@lge.com>
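For context, a minimal sketch of dynamic_dname() with the enlarged buffer, assuming the 3.x fs/dcache.c shape (the exact body in this tree may differ slightly):

```c
/* fs/dcache.c -- sketch only: temp[] grown from 64 to 256 bytes */
static char *dynamic_dname(struct dentry *dentry, char *buffer, int buflen,
			   const char *fmt, ...)
{
	va_list args;
	char temp[256];		/* was temp[64]; long dnames overflowed it */
	int sz;

	va_start(args, fmt);
	sz = vsnprintf(temp, sizeof(temp), fmt, args) + 1;
	va_end(args);

	/* still reject names that do not fit the (larger) buffer or the caller's */
	if (sz > sizeof(temp) || sz > buflen)
		return ERR_PTR(-ENAMETOOLONG);

	buffer += buflen - sz;
	return memcpy(buffer, temp, sz);
}
```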
-
Devin Kim authored
Once we'd freed m->buf, m->count should become zero - we have no valid contents reachable via m->buf. Cherry-picked from https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/seq_file.c?id=801a76050bcf8d4e500eb8d048ff6265f37a61c8 Change-Id: I4c1d3e69db4ecf5362e2a5d05bfd7db754dc6dc6 Reported-by:
Charley (Hao Chuan) Chu <charley.chu@broadcom.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Devin Kim <dojip.kim@lge.com>
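A sketch of the pattern the cherry-picked commit fixes in fs/seq_file.c's buffer-grow path (the same reset is needed wherever m->buf is freed and reallocated):

```c
		/* grow the output buffer: the old contents are discarded */
		kfree(m->buf);
		m->count = 0;	/* buf is gone, so there are no valid contents to count */
		m->buf = kmalloc(m->size <<= 1, GFP_KERNEL);
		if (!m->buf)
			goto Efault;	/* error label as in 3.x seq_read(); may differ here */
```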
-
Devin Kim authored
This issue was first pointed out by Jiaxing Wang several months ago, but drew no further comments: https://lkml.org/lkml/2013/6/29/41
As we know, pread() does not change f_pos, so after pread(), file->f_pos and m->read_pos become different. And seq_lseek() does not update file->f_pos if offset equals m->read_pos, so after pread() and seq_lseek() (an lseek to m->read_pos), a subsequent read may read from the wrong position. The following program produces the problem:
char str1[32] = { 0 };
char str2[32] = { 0 };
int poffset = 10;
int count = 20;
/* open any seq file */
int fd = open("/proc/modules", O_RDONLY);
pread(fd, str1, count, poffset);
printf("pread:%s\n", str1);
/* seek to where m->read_pos is */
lseek(fd, poffset+count, SEEK_SET);
/* supposed to read from poffset+count, but this reads from position 0 */
read(fd, str2, count);
printf("read:%s\n", str2);
Output:
pread: ck_netbios_ns 12665
read: nf_conntrack_netbios
/proc/modules:
nf_conntrack_netbios_ns 12665 0 - Live 0xffffffffa038b000
nf_conntrack_broadcast 12589 1 nf_conntrack_netbios_ns, Live 0xffffffffa0386000
So we always update file->f_pos to offset in seq_lseek() to fix this issue. Cherry-picked from https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/seq_file.c?id=05e16745c0c471bba313961b605b6da3b21a853d Signed-off-by:
Jiaxing Wang <hello.wjx@gmail.com> Signed-off-by:
Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Conflicts: fs/seq_file.c Change-Id: If419b92498e2c3e08669a4342e9b9ebf99ad3768 Signed-off-by:
Devin Kim <dojip.kim@lge.com>
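A sketch of the resulting seq_lseek() logic, per the cherry-picked commit: f_pos is synced to the requested offset even when it already equals m->read_pos:

```c
	retval = offset;
	if (offset != m->read_pos) {
		/* ... traverse() to the new position and update m->read_pos, as before ... */
	} else {
		/* offset == m->read_pos: pread() may still have left file->f_pos
		 * somewhere else, so always bring it back in sync */
		file->f_pos = offset;
	}
```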
-
Jongrak Kwon authored
Improved ghost touch issues by:
- Upgrading the baseline check algorithm
- Adjusting touch sensitivity
This does not help in all cases for b/9236385. Bug: 7725315 Bug: 9236385 Change-Id: I1850427bac84620f34cad61bdfd9b16bbf692e32 Signed-off-by:
Jongrak Kwon <jongrak.kwon@lge.com>
-
Naseer Ahmed authored
When there is no secure pipe configured, we can unmap secure memory instead of relying on an unset with the secure flag set from userspace. Bug: 11857675 Signed-off-by:
Naseer Ahmed <naseer@codeaurora.org>
-
- 06 Apr, 2014 29 commits
-
-
hellsgod authored
-
Greg Kroah-Hartman authored
staging: speakup: Prefix externally-visible symbols commit ca2beaf84d9678c12b17d92623f0e90829d6ca13 upstream. This prefixes all externally-visible symbols of speakup with "spk_". Signed-off-by:
Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Kamal Mostafa <kamal@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> ext4: atomically set inode->i_flags in ext4_set_inode_flags() commit 00a1a053ebe5febcfc2ec498bd894f035ad2aa06 upstream. Use cmpxchg() to atomically set i_flags instead of clearing out the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race where an immutable file has the immutable flag cleared for a brief window of time. Reported-by:
John Sullivan <jsrhbz@kanargh.force9.co.uk> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Input: synaptics - add manual min/max quirk commit 421e08c41fda1f0c2ff6af81a67b491389b653a5 upstream. The new Lenovo Haswell series (-40's) contains a new Synaptics touchpad. However, these new Synaptics devices report bad axis ranges. Under Windows, it is not a problem because the Windows driver uses RMI4 over SMBus to talk to the device. Under Linux, we are using the PS/2 fallback interface, and it turns out the reported ranges are wrong. Of course, it would be too easy to have only one range for the whole series; each touchpad seems to be calibrated in a different way. We cannot use SMBus to get the actual range because I suspect the firmware will switch into SMBus mode and stop talking through PS/2 (this is the case for hybrid HID over I2C / PS/2 Synaptics touchpads). So as a temporary solution (until RMI4 lands upstream), start a new list of quirks with the min/max manually set. Signed-off-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> Input: synaptics - add manual min/max quirk for ThinkPad X240 commit 8a0435d958fb36d93b8df610124a0e91e5675c82 upstream. This extends Benjamin Tissoires manual min/max quirk table with support for the ThinkPad X240. Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> x86: fix boot on uniprocessor systems commit 825600c0f20e595daaa7a6dd8970f84fa2a2ee57 upstream. On x86 uniprocessor systems topology_physical_package_id() returns -1, which causes rapl_cpu_prepare() to leave the rapl_pmu variable uninitialized, which leads to a GPF in rapl_pmu_init(). See arch/x86/kernel/cpu/perf_event_intel_rapl.c. It turns out that physical_package_id and core_id can actually be retrieved for uniprocessor systems too. Enabling them also fixes the rapl_pmu code. Signed-off-by:
Artem Fetishev <artem_fetishev@epam.com> Cc: Stephane Eranian <eranian@google.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> netfilter: nf_conntrack_dccp: fix skb_header_pointer API usages commit b22f5126a24b3b2f15448c3f2a254fc10cbc2b92 upstream. Some occurrences in the netfilter tree use skb_header_pointer() in the following way ... struct dccp_hdr _dh, *dh; ... skb_header_pointer(skb, dataoff, sizeof(_dh), &dh); ... where dh itself is a pointer that is being passed as the copy buffer. Instead, we need to use &_dh as the fourth argument so that we're copying the data into an actual buffer that sits on the stack. Currently, we probably could overwrite memory on the stack (e.g. with a possibly malformed DCCP packet), but unintentionally, as we only want the buffer to be placed into the _dh variable. Fixes: 2bc78049 ("[NETFILTER]: nf_conntrack: add DCCP protocol support") Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Signed-off-by:
Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
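For the DCCP conntrack fix, a sketch of the corrected skb_header_pointer() usage: the on-stack _dh is the copy buffer (fourth argument) and the returned pointer is what gets used (the error value shown is illustrative):

```c
	struct dccp_hdr _dh, *dh;

	/* copy the header into the on-stack buffer and use the returned pointer;
	 * passing &dh here (the old bug) would scribble over stack memory instead */
	dh = skb_header_pointer(skb, dataoff, sizeof(_dh), &_dh);
	if (dh == NULL)
		return -NF_ACCEPT;
```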
-
Hareesh Gundu authored
Decrement the entry refcount, which was incremented in kgsl_sharedmem_find_region(). CRs-Fixed: 635747 Change-Id: I621ba8f8e119a9ab8ba5455b28a565e3cae2f7cd Signed-off-by:
Hareesh Gundu <hareeshg@codeaurora.org>
-
anarkia1976 authored
-
anarkia1976 authored
-
anarkia1976 authored
-
Paul Reioux authored
Fix a merge derp, pointed out by neobuddy89. Signed-off-by:
Paul Reioux <reioux@gmail.com>
-
Paul Reioux authored
Signed-off-by:
Paul Reioux <reioux@gmail.com>
-
Oleg Nesterov authored
while_each_thread() and next_thread() should die, almost every lockless usage is wrong.
1. Unless g == current, the lockless while_each_thread() is not safe. while_each_thread(g, t) can loop forever if g exits, next_thread() can't reach the unhashed thread in this case. Note that this can happen even if g is the group leader, it can exec.
2. Even if while_each_thread() itself was correct, people often use it wrongly. It was never safe to just take rcu_read_lock() and loop unless you verify that pid_alive(g) == T, even the first next_thread() can point to the already freed/reused memory.
This patch adds signal_struct->thread_head and task->thread_node to create the normal rcu-safe list with the stable head. The new for_each_thread(g, t) helper is always safe under rcu_read_lock() as long as this task_struct can't go away.
Note: of course it is ugly to have both task_struct->thread_node and the old task_struct->thread_group, we will kill it later, after we change the users of while_each_thread() to use for_each_thread().
Perhaps we can kill it even before we convert all users, we can reimplement next_thread(t) using the new thread_head/thread_node. But we can't do this right now because this will lead to subtle behavioural changes. For example, do/while_each_thread() always sees at least one task, while for_each_thread() can do nothing if the whole thread group has died. Or thread_group_empty(), currently its semantics is not clear unless thread_group_leader(p) and we need to audit the callers before we can change it.
So this patch adds the new interface which has to coexist with the old one for some time, hopefully the next changes will be more or less straightforward and the old one will go away soon. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reviewed-by:
Sergey Dyasly <dserrg@gmail.com> Tested-by:
Sergey Dyasly <dserrg@gmail.com> Reviewed-by:
Sameer Nanda <snanda@chromium.org> Acked-by:
David Rientjes <rientjes@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> backported to Linux 3.4 Conflicts: kernel/fork.c
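A sketch of the new interface and its intended use, following the description above (the list head lives in signal_struct, the node in task_struct):

```c
/* rcu-safe thread iteration with a stable head (sketch of the new helpers) */
#define __for_each_thread(signal, t) \
	list_for_each_entry_rcu(t, &(signal)->thread_head, thread_node)

#define for_each_thread(p, t) \
	__for_each_thread((p)->signal, t)

/* usage: safe under rcu_read_lock() as long as the task_struct g can't go away */
struct task_struct *g = current, *t;

rcu_read_lock();
for_each_thread(g, t) {
	/* ... inspect t; the walk terminates even if g has since exited ... */
}
rcu_read_unlock();
```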
-
David Rientjes authored
A 3% of system memory bonus is sometimes too excessive in comparison to other processes. With commit a63d83f4 ("oom: badness heuristic rewrite"), the OOM killer tries to avoid killing privileged tasks by subtracting 3% of overall memory (system or cgroup) from their per-task consumption. But as a result, all root tasks that consume less than 3% of overall memory are considered equal, and so it only takes 33+ privileged tasks pushing the system out of memory for the OOM killer to do something stupid and kill dhclient or other root-owned processes. For example, on a 32G machine it can't tell the difference between the 1M agetty and the 10G fork bomb member. The changelog describes this 3% boost as the equivalent to the global overcommit limit being 3% higher for privileged tasks, but this is not the same as discounting 3% of overall memory from _every privileged task individually_ during OOM selection. Replace the 3% of system memory bonus with a 3% of current memory usage bonus. By giving root tasks a bonus that is proportional to their actual size, they remain comparable even when relatively small. In the example above, the OOM killer will discount the 1M agetty's 256 badness points down to 179, and the 10G fork bomb's 262144 points down to 183500 points and make the right choice, instead of discounting both to 0 and killing agetty because it's first in the task list. Signed-off-by:
David Rientjes <rientjes@google.com> Reported-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
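Roughly, the change to the root bonus in oom_badness() looks like this (a sketch; surrounding code elided):

```c
	/*
	 * Root bonus: previously the heuristic discounted the equivalent of
	 * 3% of total memory, which collapsed every small root task to the
	 * same score; now it discounts 3% of the task's own badness points.
	 */
	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
		points -= (points * 3) / 100;
```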
-
David Rientjes authored
When two threads have the same badness score, it's preferable to kill the thread group leader so that the actual process name is printed to the kernel log rather than the thread group name which may be shared amongst several processes. This was the behavior when select_bad_process() used to do for_each_process(), but it now iterates threads instead and leads to ambiguity. Signed-off-by:
David Rientjes <rientjes@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Greg Thelen <gthelen@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
find_lock_task_mm() expects it is called under rcu or tasklist lock, but it seems that at least oom_unkillable_task()->task_in_mem_cgroup() and mem_cgroup_out_of_memory()->oom_badness() can call it lockless. Perhaps we could fix the callers, but this patch simply adds rcu lock into find_lock_task_mm(). This also allows to simplify a bit one of its callers, oom_kill_process(). Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Cc: Sergey Dyasly <dserrg@gmail.com> Cc: Sameer Nanda <snanda@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Acked-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
At least out_of_memory() calls has_intersects_mems_allowed() without even rcu_read_lock(), this is obviously buggy. Add the necessary rcu_read_lock(). This means that we can not simply return from the loop, we need "bool ret" and "break". While at it, swap the names of task_struct's (the argument and the local). This cleans up the code a little bit and avoids the unnecessary initialization. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reviewed-by:
Sergey Dyasly <dserrg@gmail.com> Tested-by:
Sergey Dyasly <dserrg@gmail.com> Reviewed-by:
Sameer Nanda <snanda@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Acked-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
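A simplified sketch of the resulting shape (the NUMA/mempolicy branch of the real check is omitted here):

```c
static bool has_intersects_mems_allowed(struct task_struct *start,
					const nodemask_t *mask)
{
	struct task_struct *tsk = start;
	bool ret = false;

	rcu_read_lock();			/* was missing for some callers */
	do {
		if (cpuset_mems_allowed_intersects(current, tsk)) {
			ret = true;		/* break instead of returning ... */
			break;			/* ... so the unlock below always runs */
		}
	} while_each_thread(start, tsk);
	rcu_read_unlock();

	return ret;
}
```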
-
Oleg Nesterov authored
Change oom_kill.c to use for_each_thread() rather than the racy while_each_thread() which can loop forever if we race with exit. Note also that most users were buggy even if while_each_thread() was fine, the task can exit even _before_ rcu_read_lock(). Fortunately the new for_each_thread() only requires the stable task_struct, so this change fixes both problems. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Reviewed-by:
Sergey Dyasly <dserrg@gmail.com> Tested-by:
Sergey Dyasly <dserrg@gmail.com> Reviewed-by:
Sameer Nanda <snanda@chromium.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Acked-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Rusty Russell authored
The normal expectation for ERR_PTR() is to put a negative errno into a pointer. oom_kill puts the magic -1 in the result (and has since pre-git), which is probably clearer with an explicit cast. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Rusty Russell <rusty@rustcorp.com.au>
-
David Rientjes authored
To lock the entire system from parallel oom killing, it's possible to pass in a zonelist with all zones rather than using for_each_populated_zone() for the iteration. This obsoletes try_set_system_oom() and clear_system_oom() so that they can be removed. Signed-off-by:
David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Lai Jiangshan authored
N_HIGH_MEMORY stands for the nodes that have normal or high memory. N_MEMORY stands for the nodes that have any memory. The code here needs to handle the nodes which have memory, so we should use N_MEMORY instead. Signed-off-by:
Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by:
Hillf Danton <dhillf@gmail.com> Signed-off-by:
Wen Congyang <wency@cn.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Lin Feng <linfeng@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
David Rientjes authored
Exiting threads, those with PF_EXITING set, can pagefault and require memory before they can make forward progress. This happens, for instance, when a process must fault task->robust_list, a userspace structure, before detaching its memory. These threads also aren't guaranteed to get access to memory reserves unless oom killed or killed from userspace. The oom killer won't grant memory reserves if other threads are also exiting other than current and stalling at the same point. This prevents needlessly killing processes when others are already exiting. Instead of special casing all the possible situations between PF_EXITING getting set and a thread detaching its mm where it may allocate memory, which probably wouldn't get updated when a change is made to the exit path, the solution is to give all exiting threads access to memory reserves if they call the oom killer. This allows each of them to quickly allocate, detach its mm, and free the memory it represents.
Summary of Luigi's bug report:
: He had an oom condition where threads were faulting on task->robust_list
: and repeatedly called the oom killer but it would defer killing a thread
: because it saw other PF_EXITING threads. This can happen anytime we need
: to allocate memory after setting PF_EXITING and before detaching our mm;
: if there are other threads in the same state then the oom killer won't do
: anything unless one of them happens to be killed from userspace.
:
: So instead of only deferring for PF_EXITING and !task->robust_list, it's
: better to just give them access to memory reserves to prevent a potential
: livelock so that any other faults that may be introduced in the future in
: the exit path don't cause the same problem (and hopefully we don't allow
: too many of those!).
Signed-off-by:
David Rientjes <rientjes@google.com> Acked-by:
Minchan Kim <minchan@kernel.org> Tested-by:
Luigi Semenzato <semenzato@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
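The mechanism, roughly: if the caller is already exiting, mark it TIF_MEMDIE (which is what grants access to the memory reserves) and return rather than selecting a victim. A sketch of the check in out_of_memory():

```c
	/*
	 * If current is exiting it only needs memory to make forward
	 * progress towards releasing its mm, so give it access to the
	 * reserves and bail out instead of deferring or killing.
	 */
	if (current->flags & PF_EXITING) {
		set_thread_flag(TIF_MEMDIE);
		return;
	}
```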
-
Peter Zijlstra authored
Since commit 2f36825b ("sched: Next buddy hint on sleep and preempt path") it is likely we pick a new task from the same cgroup; doing a put and then set on all intermediate entities is a waste of time, so try to avoid this.
Measured using:
mount nodev /cgroup -t cgroup -o cpu
cd /cgroup
mkdir a; cd a
mkdir b; cd b
mkdir c; cd c
echo $$ > tasks
perf stat --repeat 10 -- taskset 1 perf bench sched pipe
PRE : 4.542422684 seconds time elapsed ( +- 0.33% )
POST: 4.389409991 seconds time elapsed ( +- 0.32% )
Which shows a significant improvement of ~3.5%. Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1328936700.2476.17.camel@laptop Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> Backported for Linux 3.4 kernel Conflicts: kernel/sched/fair.c
-
Peter Zijlstra authored
Use for_each_cpu_and() and thereby avoid computing the capacity for CPUs we know we're not interested in. Reviewed-by:
Paul Turner <pjt@google.com> Reviewed-by:
Preeti U Murthy <preeti@linux.vnet.ibm.com> Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-lppceyv6kb3a19g8spmrn20b@git.kernel.org Signed-off-by:
Ingo Molnar <mingo@kernel.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> backported for Linux 3.4 Conflicts: kernel/sched/fair.c
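A sketch of the idiom, assuming a find_busiest_queue()-style loop in kernel/sched/fair.c (variable names may differ in this tree):

```c
	/*
	 * Before: walk every CPU in the group and skip the ones not in
	 * env->cpus.  After: intersect the two masks up front, so the
	 * capacity/load computation only runs for CPUs we care about.
	 */
	for_each_cpu_and(i, sched_group_cpus(group), env->cpus) {
		/* ... compute capacity and load for CPU i ... */
	}
```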
-
hellsgod authored
By globally defining check_panic_on_oom(), the memcg oom handler can be moved entirely to mm/memcontrol.c. This removes the ugly #ifdef in the oom killer and cleans up the code. Signed-off-by:
David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> Conflicts: mm/oom_kill.c
-
David Rientjes authored
Since exiting tasks require write_lock_irq(&tasklist_lock) several times, try to reduce the amount of time the readside is held for oom kills. This makes the interface with the memcg oom handler more consistent since it now never needs to take tasklist_lock unnecessarily. The only time the oom killer now takes tasklist_lock is when iterating the children of the selected task, everything else is protected by rcu_read_lock(). This requires that a reference to the selected process, p, is grabbed before calling oom_kill_process(). It may release it and grab a reference on another one of p's threads if !p->mm, but it also guarantees that it will release the reference before returning. [hughd@google.com: fix duplicate put_task_struct()] Signed-off-by:
David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> Conflicts: mm/oom_kill.c
-
David Rientjes authored
The global oom killer is serialized by the per-zonelist try_set_zonelist_oom() which is used in the page allocator. Concurrent oom kills are thus a rare event and only occur in systems using mempolicies and with a large number of nodes. Memory controller oom kills, however, can frequently be concurrent since there is no serialization once the oom killer is called for oom conditions in several different memcgs in parallel. This creates a massive contention on tasklist_lock since the oom killer requires the readside for the tasklist iteration. If several memcgs are calling the oom killer, this lock can be held for a substantial amount of time, especially if threads continue to enter it as other threads are exiting. Since the exit path grabs the writeside of the lock with irqs disabled in a few different places, this can cause a soft lockup on cpus as a result of tasklist_lock starvation. The kernel lacks unfair writelocks, and successful calls to the oom killer usually result in at least one thread entering the exit path, so an alternative solution is needed. This patch introduces a separate oom handler for memcgs so that they do not require tasklist_lock for as much time. Instead, it iterates only over the threads attached to the oom memcg and grabs a reference to the selected thread before calling oom_kill_process() to ensure it doesn't prematurely exit. This still requires tasklist_lock for the tasklist dump, iterating children of the selected process, and killing all other threads on the system sharing the same memory as the selected victim. So while this isn't a complete solution to tasklist_lock starvation, it significantly reduces the amount of time that it is held. Acked-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Signed-off-by:
David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by:
Sha Zhengju <handai.szj@taobao.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
David Rientjes authored
This patch introduces a helper function to process each thread during the iteration over the tasklist. A new return type, enum oom_scan_t, is defined to determine the future behavior of the iteration:
- OOM_SCAN_OK: continue scanning the thread and find its badness,
- OOM_SCAN_CONTINUE: do not consider this thread for oom kill, it's ineligible,
- OOM_SCAN_ABORT: abort the iteration and return, or
- OOM_SCAN_SELECT: always select this thread with the highest badness possible.
There is no functional change with this patch. This new helper function will be used in the next patch in the memory controller. Reviewed-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Signed-off-by:
David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Reviewed-by:
Sha Zhengju <handai.szj@taobao.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
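The new return type, matching the values listed in the message:

```c
/* include/linux/oom.h -- controls what the tasklist iteration does next */
enum oom_scan_t {
	OOM_SCAN_OK,		/* scan this thread and compute its badness */
	OOM_SCAN_CONTINUE,	/* ineligible: do not consider it for oom kill */
	OOM_SCAN_ABORT,		/* abort the iteration and return */
	OOM_SCAN_SELECT,	/* always select this thread, highest badness possible */
};
```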
-
David Rientjes authored
mem_cgroup_out_of_memory() is defined in mm/oom_kill.c, so declare it in linux/oom.h rather than linux/memcontrol.h. Acked-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by:
Michal Hocko <mhocko@suse.cz> Signed-off-by:
David Rientjes <rientjes@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Paul Reioux <reioux@gmail.com> Conflicts: include/linux/memcontrol.h
-
David Rientjes authored
The divide in p->signal->oom_score_adj * totalpages / 1000 within oom_badness() was causing an overflow of the signed long data type. This adds both the root bias and p->signal->oom_score_adj before doing the normalization which fixes the issue and also cleans up the calculation. Tested-by:
Dave Jones <davej@redhat.com> Signed-off-by:
David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
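A sketch of the reworked tail of oom_badness() as described: the root bias and oom_score_adj are combined first and only then scaled, so the old p->signal->oom_score_adj * totalpages / 1000 divide (and its overflow) goes away. Variable names follow the 3.x heuristic and may differ here:

```c
	long adj = (long)p->signal->oom_score_adj;

	/* root bonus: the equivalent of lowering oom_score_adj by 30 (3%) */
	if (has_capability_noaudit(p, CAP_SYS_ADMIN))
		adj -= 30;

	/* normalize to badness units and fold into the score in one place */
	adj *= totalpages / 1000;
	points += adj;
```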
-
David Rientjes authored
If the privileges given to root threads (3% of allowable memory) or a negative value of /proc/pid/oom_score_adj happen to exceed the amount of rss of a thread, its badness score overflows as a result of commit a7f638f999ff ("mm, oom: normalize oom scores to oom_score_adj scale only for userspace"). Fix this by making the type signed and return 1, meaning the thread is still eligible for kill, if the value is negative. Reported-by:
Dave Jones <davej@redhat.com> Acked-by:
Oleg Nesterov <oleg@redhat.com> Signed-off-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
David Rientjes authored
The oom_score_adj scale ranges from -1000 to 1000 and represents the proportion of memory available to the process at allocation time. This means an oom_score_adj value of 300, for example, will bias a process as though it was using an extra 30.0% of available memory and a value of -350 will discount 35.0% of available memory from its usage. The oom killer badness heuristic also uses this scale to report the oom score for each eligible process in determining the "best" process to kill. Thus, it can only differentiate each process's memory usage by 0.1% of system RAM. On large systems, this can end up being a large amount of memory: 256MB on 256GB systems, for example. This can be fixed by having the badness heuristic to use the actual memory usage in scoring threads and then normalizing it to the oom_score_adj scale for userspace. This results in better comparison between eligible threads for kill and no change from the userspace perspective. Suggested-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Tested-by:
Dave Jones <davej@redhat.com> Signed-off-by:
David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Paul Reioux authored
This patch introduces three new sysctls to /proc/sys/vm: wmark_min_kbytes, wmark_low_kbytes and wmark_high_kbytes. Each entry is used to compute watermark[min], watermark[low] and watermark[high] for each zone. These parameters are also updated when min_free_kbytes is changed, because originally they are set based on min_free_kbytes. On the other hand, min_free_kbytes is updated when wmark_free_kbytes changes. By using these parameters one can adjust the differences among watermark[min], watermark[low] and watermark[high], and as a result one can tune the kernel reclaim behaviour to fit their requirements. Signed-off-by:
Satoru Moriya <satoru.moriya@hds.com> modified and tuned for Hammerhead Signed-off-by:
Paul Reioux <reioux@gmail.com>
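A sketch of what one of the new entries could look like in the vm sysctl table; the variable and handler names here are assumptions for illustration, not taken from the patch:

```c
/* kernel/sysctl.c, vm_table -- hypothetical entry; wmark_low_kbytes and
 * wmark_high_kbytes would follow the same pattern */
{
	.procname	= "wmark_min_kbytes",
	.data		= &wmark_min_kbytes,		/* assumed global, in kB */
	.maxlen		= sizeof(wmark_min_kbytes),
	.mode		= 0644,
	.proc_handler	= wmark_kbytes_sysctl_handler,	/* assumed: recomputes zone watermarks */
},
```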
-
- 31 Mar, 2014 6 commits
-
-
Francisco Franco authored
cpufreq: break earlier if target_freq is equal to current freq. Also fetch a fix from @imoseyon to update other cpuX nodes when cpu0 gets updated (max/min/gov). Signed-off-by:
Francisco Franco <franciscofranco.1990@gmail.com>
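A sketch of the early bail-out described above, placed in the frequency-target path (the exact location varies by tree):

```c
	/* nothing to do if we are already running at the requested frequency */
	if (target_freq == policy->cur)
		return 0;
```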
-
hellsgod authored
-
Stephen Boyd authored
Now that we have clk_prepare/unprepare we can make the RPM clocks sleepable. This allows us to move the sometimes costly busy wait that RPM clocks incur when enabling, disabling, or changing rates out of atomic context. CRs-Fixed: 552223 Change-Id: I8ac53c0b7fc79e56051b19fedb6910ac3f1cda42 Signed-off-by:
Stephen Boyd <sboyd@codeaurora.org> Git-commit: b500badb5dc821dd92f93833003170cb9ae106b0 Git-repo: https://android.googlesource.com/kernel/msm/ Signed-off-by:
Srinivasarao P <spathi@codeaurora.org>
-
myfluxi authored
Cyanogen's uevent commit (sent when the governor changes) was rendered non-working since KitKat (or so), as the uevent filter function caused the events to be dropped. Now hook into cpufreq_core_init() and cpufreq_add_dev_interface() to create our basic ksets for cpufreq and cpu devices. Also, we don't need to set environmental data, so clean it up a bit. This commit requires a change in ueventd.rc that adds rules for several files of interest. Change-Id: I3aafa0d4e18363e1d68535f513099ecd27024007
-
Steve Kondik authored
* Useful so userspace tools can reconfigure. Change-Id: Ib423910b8b9ac791ebe81a75bf399f58272f64f2
-
Venkatesh Yadav Abbarapu authored
Highlights of the changes are:
1. A workqueue is used to configure ADM only when the clocks need to be prepared & enabled. A function call is used to configure ADM in cases where the ADM driver is invoked for data transfer when the clock is already prepared & enabled.
2. Replaced threaded irqs with hard irqs.
Change-Id: Ifaa5efdde8150f932a12d4f9eeccfb36ecb8f88f Acked-by:
John Nicholas <jnichola@qti.qualcomm.com> Signed-off-by:
Venkatesh Yadav Abbarapu <quicvenkat@codeaurora.org>
-