- 06 Apr, 2014 4 commits
-
-
David Rientjes authored
The divide in p->signal->oom_score_adj * totalpages / 1000 within oom_badness() was causing an overflow of the signed long data type. This adds both the root bias and p->signal->oom_score_adj before doing the normalization which fixes the issue and also cleans up the calculation.
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
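For illustration, a small standalone C sketch of the arithmetic this message describes; the function and the 3% root-bias value are assumptions for the example, not the kernel's oom_badness():

    #include <stdbool.h>

    /* Fold the adjustments together first, then scale once, so no
     * intermediate product can overflow a signed long. */
    long badness_points(long rss_pages, long totalpages,
                        long oom_score_adj, bool is_root)
    {
        long adj = oom_score_adj;

        if (is_root)
            adj -= 30;    /* assumed 3% root bias on the 0..1000 scale */

        /* dividing totalpages first keeps adj * (totalpages / 1000) small */
        return rss_pages + adj * (totalpages / 1000);
    }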
-
David Rientjes authored
If the privileges given to root threads (3% of allowable memory) or a negative value of /proc/pid/oom_score_adj happen to exceed the amount of rss of a thread, its badness score overflows as a result of commit a7f638f999ff ("mm, oom: normalize oom scores to oom_score_adj scale only for userspace"). Fix this by making the type signed and returning 1, meaning the thread is still eligible for kill, if the value is negative.
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
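A minimal sketch of the clamping the message describes, assuming the score is computed as a signed long (names are illustrative):

    /* A negative score means the discounts exceeded the thread's rss;
     * report 1 so the thread stays eligible for kill. */
    long clamp_badness(long points)
    {
        if (points < 0)
            return 1;
        return points;
    }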
-
David Rientjes authored
The oom_score_adj scale ranges from -1000 to 1000 and represents the proportion of memory available to the process at allocation time. This means an oom_score_adj value of 300, for example, will bias a process as though it was using an extra 30.0% of available memory and a value of -350 will discount 35.0% of available memory from its usage. The oom killer badness heuristic also uses this scale to report the oom score for each eligible process in determining the "best" process to kill. Thus, it can only differentiate each process's memory usage by 0.1% of system RAM. On large systems, this can end up being a large amount of memory: 256MB on 256GB systems, for example. This can be fixed by having the badness heuristic use the actual memory usage in scoring threads and then normalizing it to the oom_score_adj scale for userspace. This results in better comparison between eligible threads for kill and no change from the userspace perspective.
Suggested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Tested-by: Dave Jones <davej@redhat.com>
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
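A rough sketch of the split described above: score internally in pages at full resolution, and only project onto the -1000..1000 scale when reporting to userspace (assumed names, not the kernel's proc code):

    /* Internal score: actual memory usage in pages. */
    long score_in_pages(long rss_pages, long swap_pages, long pte_pages)
    {
        return rss_pages + swap_pages + pte_pages;
    }

    /* Value reported to userspace: normalized to the oom_score_adj scale. */
    long score_for_userspace(long points, long totalpages, long oom_score_adj)
    {
        return points * 1000 / totalpages + oom_score_adj;
    }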
-
Paul Reioux authored
This patch introduces three new sysctls to /proc/sys/vm: wmark_min_kbytes, wmark_low_kbytes and wmark_high_kbytes. Each entry is used to compute watermark[min], watermark[low] and watermark[high] for each zone. These parameters are also updated when min_free_kbytes is changed, because originally they are set based on min_free_kbytes. On the other hand, min_free_kbytes is updated when wmark_min_kbytes changes. By using these parameters one can adjust the difference among watermark[min], watermark[low] and watermark[high], and as a result one can tune the kernel reclaim behaviour to fit their requirements.
Signed-off-by: Satoru Moriya <satoru.moriya@hds.com>
modified and tuned for Hammerhead
Signed-off-by: Paul Reioux <reioux@gmail.com>
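A rough standalone model of how per-zone watermarks could be derived from the three kbytes knobs, scaled by each zone's share of managed pages; this is purely illustrative and does not reproduce the patch's actual formula:

    struct zone_model {
        unsigned long managed_pages;
        unsigned long wmark_min, wmark_low, wmark_high;
    };

    /* 4K pages assumed: kbytes >> 2 converts kilobytes to pages. */
    #define KBYTES_TO_PAGES(kb) ((kb) >> 2)

    void set_zone_watermarks(struct zone_model *z, unsigned long total_managed,
                             unsigned long min_kb, unsigned long low_kb,
                             unsigned long high_kb)
    {
        /* each zone gets a share proportional to its managed pages */
        z->wmark_min  = KBYTES_TO_PAGES(min_kb)  * z->managed_pages / total_managed;
        z->wmark_low  = KBYTES_TO_PAGES(low_kb)  * z->managed_pages / total_managed;
        z->wmark_high = KBYTES_TO_PAGES(high_kb) * z->managed_pages / total_managed;
    }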
-
- 31 Mar, 2014 8 commits
-
-
Francisco Franco authored
cpufreq: break earlier if target_freq is equal to current freq. Also fetch a fix from @imoseyon to update other cpuX nodes when cpu0 gets updated (max/min/gov).
Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
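A hedged sketch of the early-exit check described; names and placement are assumptions, shown as it might look in a generic target() path:

    #include <stdbool.h>

    struct policy_model {
        unsigned int cur;    /* current frequency in kHz */
    };

    /* Returns true when the requested transition can be skipped entirely. */
    static bool transition_is_noop(const struct policy_model *policy,
                                   unsigned int target_freq)
    {
        return policy->cur == target_freq;
    }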
-
hellsgod authored
-
Stephen Boyd authored
Now that we have clk_prepare/unprepare we can make the RPM clocks sleepable. This allows us to move the sometimes costly busy wait that RPM clocks incur when enabling and disabling or changing rates.
CRs-Fixed: 552223
Change-Id: I8ac53c0b7fc79e56051b19fedb6910ac3f1cda42
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Git-commit: b500badb5dc821dd92f93833003170cb9ae106b0
Git-repo: https://android.googlesource.com/kernel/msm/
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
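For context, a kernel-style consumer sketch of the prepare/enable split this change relies on: clk_prepare()/clk_unprepare() may sleep (and can absorb a busy wait), while clk_enable()/clk_disable() must stay atomic. This uses the generic clk API only; it is not the patched RPM clock driver, and the clock name is made up:

    #include <linux/clk.h>
    #include <linux/err.h>

    static int example_power_on(struct device *dev)
    {
        struct clk *clk = clk_get(dev, "core_clk");    /* hypothetical name */
        int ret;

        if (IS_ERR(clk))
            return PTR_ERR(clk);

        ret = clk_prepare(clk);    /* may sleep: slow handshakes belong here */
        if (ret)
            return ret;

        ret = clk_enable(clk);     /* atomic: must not sleep */
        if (ret)
            clk_unprepare(clk);
        return ret;
    }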
-
myfluxi authored
Cyanogen's commit to send a uevent when the governor changes has been non-working since KitKat (or so), as the uevent filter function caused events to be dropped. Now hook into cpufreq_core_init() and cpufreq_add_dev_interface() to create our basic ksets for cpufreq and cpu devices. Also, we don't need to set environmental data, so clean it up a bit. This commit requires a change in ueventd.rc that adds rules for several files of interest.
Change-Id: I3aafa0d4e18363e1d68535f513099ecd27024007
-
Steve Kondik authored
* Useful so userspace tools can reconfigure.
Change-Id: Ib423910b8b9ac791ebe81a75bf399f58272f64f2
-
Venkatesh Yadav Abbarapu authored
Highlights of the changes are:
1. A workqueue is used to configure ADM only when the clocks need to be prepared & enabled. A function call is used to configure ADM in cases where the ADM driver is invoked for data transfer when the clock is already prepared & enabled.
2. Replaced threaded irqs with hard irqs.
Change-Id: Ifaa5efdde8150f932a12d4f9eeccfb36ecb8f88f
Acked-by: John Nicholas <jnichola@qti.qualcomm.com>
Signed-off-by: Venkatesh Yadav Abbarapu <quicvenkat@codeaurora.org>
-
Utsab Bose authored
Currently we are adding a dma command to the staged list within a spinlock and then adding it to the workqueue using queue_work after unlocking the spinlock. With this there is a chance of executing DMA commands out of order in the concurrency case below:

Thread1                              Thread2
__msm_dmov_enqueue_cmd_ext
  spin_lock_irqsave(..)
  list_add_tail(..)
  spin_unlock_irqrestore(..)
      --PREEMPT--
                                     __msm_dmov_enqueue_cmd_ext
                                       spin_lock_irqsave(..)
                                       list_add_tail(..)
                                       spin_unlock_irqrestore(..)
                                       queue_work()
  ..
  queue_work()

So calling queue_work within the spinlock will make sure that the work added to the workqueue is processed in the same order as the commands are added to the staged list.
CRs-Fixed: 423190
Change-Id: I2ffd1327fb5f0cd1f06db7de9c026d1c4997fe4d
Acked-by: Gopi Krishna Nedanuri <gnedanur@qti.qualcomm.com>
Signed-off-by: Utsab Bose <ubose@codeaurora.org>
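A hedged kernel-style sketch of the pattern described: queue the work while still holding the lock so list order and work order cannot diverge (structure and names are illustrative, not the msm_dmov code):

    #include <linux/list.h>
    #include <linux/spinlock.h>
    #include <linux/workqueue.h>

    struct cmd_queue {
        spinlock_t lock;
        struct list_head staged;
        struct workqueue_struct *wq;
        struct work_struct work;
    };

    static void enqueue_cmd(struct cmd_queue *q, struct list_head *cmd)
    {
        unsigned long flags;

        spin_lock_irqsave(&q->lock, flags);
        list_add_tail(cmd, &q->staged);
        /* queue while holding the lock: staged order == processing order */
        queue_work(q->wq, &q->work);
        spin_unlock_irqrestore(&q->lock, flags);
    }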
-
anarkia1976 authored
-
- 28 Mar, 2014 28 commits
-
-
hellsgod authored
-
Pranav Vashi authored
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Jason Baron authored
The EPOLL_CTL_DEL path of epoll contains a classic, ab-ba deadlock. That is, epoll_ctl(a, EPOLL_CTL_DEL, b, x) will deadlock with epoll_ctl(b, EPOLL_CTL_DEL, a, x). The deadlock was introduced with commit ("epoll: do not take global 'epmutex' for simple topologies"). The acquisition of the ep->mtx for the destination 'ep' was added such that a concurrent EPOLL_CTL_ADD operation would see the correct state of the ep (specifically, the check for '!list_empty(&f.file->f_ep_links')). However, by simply not acquiring the lock, we do not serialize behind the ep->mtx from the add path, and thus may perform a full path check when, if we had waited a little longer, it may not have been necessary. However, this is a transient state, and performing the full loop checking in this case is not harmful. The important point is that we wouldn't miss doing the full loop checking when required, since EPOLL_CTL_ADD always locks any 'ep's that it is operating upon. The reason we don't need to do lock ordering in the add path is that we are already holding the global 'epmutex' whenever we do the double lock. Further, the original posting of this patch, which was tested for the intended performance gains, did not perform this additional locking.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Nathan Zimmer <nzimmer@sgi.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Nelson Elhage <nelhage@nelhage.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Amit Pundir authored
Drop EPOLLWAKEUP from epoll events mask if CONFIG_PM_SLEEP is disabled.
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
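A small sketch of the behaviour described; the helper name is an assumption, only the masking itself comes from the message:

    #include <linux/eventpoll.h>

    /* Without PM sleep support EPOLLWAKEUP cannot do anything useful,
     * so silently clear it from the requested events. */
    static inline void mask_epollwakeup(struct epoll_event *epev)
    {
    #ifndef CONFIG_PM_SLEEP
        epev->events &= ~EPOLLWAKEUP;
    #endif
    }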
-
Jason Baron authored
When calling EPOLL_CTL_ADD for an epoll file descriptor that is attached directly to a wakeup source, we do not need to take the global 'epmutex', unless the epoll file descriptor is nested. The purpose of taking the 'epmutex' on add is to prevent complex topologies such as loops and deep wakeup paths from forming in parallel through multiple EPOLL_CTL_ADD operations. However, for the simple case of an epoll file descriptor attached directly to a wakeup source (with no nesting), we do not need to hold the 'epmutex'. This patch along with 'epoll: optimize EPOLL_CTL_DEL using rcu' improves scalability on larger systems. Quoting Nathan Zimmer's mail on SPECjbb performance: "On the 16 socket run the performance went from 35k jOPS to 125k jOPS. In addition the benchmark went from scaling well on 10 sockets to scaling well on just over 40 sockets. ... Currently the benchmark stops scaling at around 40-44 sockets but it seems like I found a second unrelated bottleneck."
[akpm@linux-foundation.org: use `bool' for boolean variables, remove unneeded/undesirable cast of void*, add missed ep_scan_ready_list() kerneldoc]
Signed-off-by: Jason Baron <jbaron@akamai.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Nelson Elhage <nelhage@nelhage.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Jason Baron authored
Nathan Zimmer found that once we get over 10+ cpus, the scalability of SPECjbb falls over due to the contention on the global 'epmutex', which is taken on EPOLL_CTL_ADD and EPOLL_CTL_DEL operations. Patch #1 removes the 'epmutex' lock completely from the EPOLL_CTL_DEL path by using rcu to guard against any concurrent traversals. Patch #2 removes the 'epmutex' lock from EPOLL_CTL_ADD operations for simple topologies, i.e. when adding a link from an epoll file descriptor to a wakeup source, where the epoll file descriptor is not nested. This patch (of 2): Optimize EPOLL_CTL_DEL such that it does not require the 'epmutex' by converting the file->f_ep_links list into an rcu one. In this way, we can traverse the epoll network on the add path in parallel with deletes. Since deletes can't create loops or worse wakeup paths, this is safe. This patch in combination with the patch "epoll: Do not take global 'epmutex' for simple topologies", shows a dramatic performance improvement in scalability for SPECjbb.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Tested-by: Nathan Zimmer <nzimmer@sgi.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Nelson Elhage <nelhage@nelhage.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
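A hedged kernel-style sketch of the RCU list pattern being applied: writers still serialize on a lock, while readers traverse under rcu_read_lock() without the global mutex (names are illustrative, not the eventpoll code):

    #include <linux/list.h>
    #include <linux/rculist.h>
    #include <linux/spinlock.h>

    struct link {
        struct list_head llink;
    };

    static DEFINE_SPINLOCK(links_lock);
    static LIST_HEAD(links);

    static void link_add(struct link *l)
    {
        spin_lock(&links_lock);
        list_add_tail_rcu(&l->llink, &links);    /* publish for RCU readers */
        spin_unlock(&links_lock);
    }

    static void link_del(struct link *l)
    {
        spin_lock(&links_lock);
        list_del_rcu(&l->llink);
        spin_unlock(&links_lock);
        /* free 'l' only after a grace period, e.g. via kfree_rcu() */
    }

    static void link_walk(void (*fn)(struct link *))
    {
        struct link *l;

        rcu_read_lock();
        list_for_each_entry_rcu(l, &links, llink)
            fn(l);
        rcu_read_unlock();
    }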
-
Eric Dumazet authored
ep_free() might iterate on a huge set of epitems and hold cpu too long. Add two cond_resched() in order to yield cpu to other tasks. This is safe as we only hold mutexes in this function.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Acked-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
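A minimal kernel-style sketch of the pattern: in a long teardown loop that holds only sleepable locks, call cond_resched() each iteration so other tasks get the cpu (illustrative names, not the ep_free() body):

    #include <linux/list.h>
    #include <linux/sched.h>

    static void free_all_items(struct list_head *items,
                               void (*free_one)(struct list_head *))
    {
        struct list_head *pos, *tmp;

        list_for_each_safe(pos, tmp, items) {
            list_del(pos);
            free_one(pos);
            cond_resched();    /* safe: only mutexes held, never spinlocks */
        }
    }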
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Pranav Vashi authored
sigprocmask() should die. None of the current callers actually need this strange interface. Change fs/eventpoll.c to use set_current_blocked(). This also means we should not worry about SIGKILL/SIGSTOP.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eric Wong <normalperson@yhbt.net>
Cc: Jason Baron <jbaron@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
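A hedged kernel-style sketch of swapping the blocked-signal mask around a wait with set_current_blocked(), the interface the message moves to; the wait itself is elided and the names are illustrative:

    #include <linux/sched.h>
    #include <linux/signal.h>

    static long wait_with_sigmask(sigset_t *new_mask, long (*do_wait)(void))
    {
        sigset_t saved = current->blocked;
        long ret;

        /* set_current_blocked() strips SIGKILL/SIGSTOP itself, so the
         * caller no longer has to worry about them. */
        set_current_blocked(new_mask);

        ret = do_wait();

        set_current_blocked(&saved);    /* restore the original mask */
        return ret;
    }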
-
Eric Wong authored
It is always safe to use RCU_INIT_POINTER to NULL a pointer. This results in slightly smaller/faster code.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
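A tiny kernel-style sketch of the substitution: rcu_assign_pointer() carries a publish barrier that is unnecessary when storing NULL, so RCU_INIT_POINTER() suffices (illustrative struct, not the epoll code):

    #include <linux/rcupdate.h>

    struct thing;    /* opaque payload type */

    struct holder {
        struct thing __rcu *ptr;
    };

    static void clear_thing(struct holder *h)
    {
        /* Publishing NULL needs no memory barrier, so RCU_INIT_POINTER()
         * is enough where rcu_assign_pointer() would add one. */
        RCU_INIT_POINTER(h->ptr, NULL);
    }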
-
hellsgod authored
This reduces the amount of code inside the ready list iteration loops for better readability IMHO.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Eric Wong authored
Technically we do not need to hold ep->mtx during ep_free since we are certain there are no other users of ep at that point. However, lockdep complains with a "suspicious rcu_dereference_check() usage!" message; so lock the mutex before ep_remove to silence the warning.
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Eric Wong authored
This prevents wakeup_source destruction when a user hits the item with EPOLL_CTL_MOD while ep_poll_callback is running. Tested with CONFIG_SPARSE_RCU_POINTER=y and "make fs/eventpoll.o C=2".
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: NeilBrown <neilb@suse.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Eric Wong authored
It is common for epoll users to have thousands of epitems, so saving a cache line on every allocation leads to large memory savings. Since epitem allocations are cache-aligned, reducing sizeof(struct epitem) from 136 bytes to 128 bytes will allow it to squeeze under a cache line boundary on x86_64. Via /sys/kernel/slab/eventpoll_epi, I see the following changes on my x86_64 Core2 Duo (which has 64-byte cache alignment):

  object_size  : 192 => 128
  objs_per_slab:  21 =>  32

Also, add a BUILD_BUG_ON() to check for future accidental breakage.
[akpm@linux-foundation.org: use __packed, for all architectures]
Signed-off-by: Eric Wong <normalperson@yhbt.net>
Cc: Davide Libenzi <davidel@xmailserver.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
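A hedged kernel-style sketch of the size guard described: a compile-time assertion that the packed structure still fits in 128 bytes (the struct shown is a stand-in, not struct epitem):

    #include <linux/bug.h>

    struct packed_item {
        unsigned long a, b;    /* real fields elided for the sketch */
    } __packed;                /* __packed drops tail padding on all arches */

    static inline void packed_item_size_check(void)
    {
        /* Break the build if a later change pushes the struct past the
         * 128-byte (two cache lines on x86_64) budget. */
        BUILD_BUG_ON(sizeof(void *) <= 8 && sizeof(struct packed_item) > 128);
    }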
-
Pranav Vashi authored
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Al Viro authored
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Al Viro authored
As soon as we'd installed the file into the descriptor table, it can get closed by another thread. Freeing ep in process...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Michael Kerrisk authored
As discussed in http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990 , the capability introduced in 4d7e30d98939a0340022ccd49325a3d70f7e0238 to govern EPOLLWAKEUP seems misnamed: this capability is about governing the ability to suspend the system, not using a particular API flag (EPOLLWAKEUP). We should make the name of the capability more general to encourage reuse in related cases. (Whether or not this capability should also be used to govern the use of /sys/power/wake_lock is a question that needs to be separately resolved.) This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure that the old capability name doesn't make it out into the wild, could you please apply and push up the tree to ensure that it is incorporated for the 3.5 release.
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Rafael J. Wysocki authored
Commit 324bd59a8d193cae6a4e0957ccc4460d606a9091 (epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready) caused some applications to malfunction, because they set the bit corresponding to the new EPOLLWAKEUP flag in their eventpoll flags and they don't have the new CAP_EPOLLWAKEUP capability. To prevent that from happening, change epoll_ctl() to clear EPOLLWAKEUP in epds.events if the caller doesn't have the CAP_EPOLLWAKEUP capability instead of failing and returning an error code, which allows the affected applications to function normally.
Reported-and-tested-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
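A hedged sketch of the check described, using the capability name as it stood before the CAP_BLOCK_SUSPEND rename above; the helper name and its exact placement inside epoll_ctl() are assumptions:

    #include <linux/capability.h>
    #include <linux/eventpoll.h>

    /* Silently degrade instead of returning an error, so unprivileged
     * callers that blindly pass EPOLLWAKEUP keep working. */
    static void drop_epollwakeup_if_not_allowed(struct epoll_event *epds)
    {
        if ((epds->events & EPOLLWAKEUP) && !capable(CAP_EPOLLWAKEUP))
            epds->events &= ~EPOLLWAKEUP;
    }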
-
Arve Hjønnevåg authored
When an epoll_event, that has the EPOLLWAKEUP flag set, is ready, a wakeup_source will be active to prevent suspend. This can be used to handle wakeup events from a driver that supports poll, e.g. input, if that driver wakes up the waitqueue passed to epoll before allowing suspend.
Signed-off-by: Arve Hjønnevåg <arve@android.com>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
hellsgod authored
This reverts commit a182b3b17f67027ba1d543312cf168cd62b10fe6.
-
hellsgod authored
This reverts commit 59fc9b6fb2418d28b35ba77d989339b65fc4a285.
-
hellsgod authored
This reverts commit 0d6c0e635bd4e8e1042acfadad96d7e008283008.
-
Guojian Chen authored
When drivers request a new latency requirement, it is not necessary to immediately wake up another cpu by sending a cross-cpu IPI; we can consider the new latency to take effect after the next wakeup from idle. This saves the unnecessary wakeup cost and reduces the risk that drivers may request latency in an irq-disabled context.

[<c08e0cc0>] (__irq_svc+0x40/0x70) from [<c00e801c>] (smp_call_function_single+0x16c/0x240)
[<c00e801c>] (smp_call_function_single+0x16c/0x240) from [<c00e855c>] (smp_call_function+0x40/0x6c)
[<c00e855c>] (smp_call_function+0x40/0x6c) from [<c0601c9c>] (cpuidle_latency_notify+0x18/0x20)
[<c0601c9c>] (cpuidle_latency_notify+0x18/0x20) from [<c00b7c28>] (blocking_notifier_call_chain+0x74/0x94)
[<c00b7c28>] (blocking_notifier_call_chain+0x74/0x94) from [<c00d563c>] (pm_qos_update_target+0xe0/0x128)
[<c00d563c>] (pm_qos_update_target+0xe0/0x128) from [<c0620d3c>] (msmsdcc_enable+0xac/0x158)
[<c0620d3c>] (msmsdcc_enable+0xac/0x158) from [<c06050e0>] (mmc_try_claim_host+0xb0/0xb8)
[<c06050e0>] (mmc_try_claim_host+0xb0/0xb8) from [<c0605318>] (mmc_start_bkops.part.15+0x50/0x2f4)
[<c0605318>] (mmc_start_bkops.part.15+0x50/0x2f4) from [<c00ab768>] (process_one_work+0x124/0x55c)
[<c00ab768>] (process_one_work+0x124/0x55c) from [<c00abfc8>] (worker_thread+0x178/0x45c)
[<c00abfc8>] (worker_thread+0x178/0x45c) from [<c00b0b24>] (kthread+0x84/0x90)
[<c00b0b24>] (kthread+0x84/0x90) from [<c000fdd4>] (kernel_thread_exit+0x0/0x8)
Disabling lock debugging due to kernel taint
coresight-etb coresight-etb.0: ETB aborted
Kernel panic - not syncing: softlockup: hung tasks

Signed-off-by: Guojian Chen <a21757@motorola.com>
Reviewed-on: http://gerrit.pcs.mot.com/532702
SLT-Approved: Slta Waiver <sltawvr@motorola.com>
Tested-by: Jira Key <jirakey@motorola.com>
Reviewed-by: Klocwork kwcheck <klocwork-kwcheck@sourceforge.mot.com>
Reviewed-by: Christopher Fries <qcf001@motorola.com>
Reviewed-by: David Ding <dding@motorola.com>
Submit-Approved: Jira Key <jirakey@motorola.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
-
Igor Kovalenko authored
The evdev driver calculates memory required to hold client data using power-of-2 math, and then adds the header size, which pushes the request over the power of 2 and requires a 2x larger allocation. Change the logic to daisy-chain the header and client array. That way, instead of a single order-4 allocation, we will make an order-0 + order-2 pair.
Signed-off-by: Igor Kovalenko <cik009@motorola.com>
SLT-Approved: Slta Waiver <sltawvr@motorola.com>
Tested-by: Jira Key <jirakey@motorola.com>
Reviewed-by: Christopher Fries <qcf001@motorola.com>
Reviewed-by: Jason Hrycay <jason.hrycay@motorola.com>
Reviewed-by: Klocwork kwcheck <klocwork-kwcheck@sourceforge.mot.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
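A hedged kernel-style sketch of the allocation split described: allocating the header and the power-of-two client buffer separately keeps each request at a natural size instead of rounding the combined size up to the next order (names are illustrative, not the evdev code):

    #include <linux/slab.h>

    struct client_hdr {
        void *buffer;    /* daisy-chained power-of-two event buffer */
        /* other bookkeeping elided */
    };

    static struct client_hdr *client_alloc(size_t bufsize_bytes)
    {
        struct client_hdr *hdr = kzalloc(sizeof(*hdr), GFP_KERNEL);

        if (!hdr)
            return NULL;

        /* bufsize_bytes is already a power of two; the header no longer
         * pushes the request into the next allocation order. */
        hdr->buffer = kzalloc(bufsize_bytes, GFP_KERNEL);
        if (!hdr->buffer) {
            kfree(hdr);
            return NULL;
        }
        return hdr;
    }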
-
Chris Fries authored
Currently, the busyloop waiting for a 2nd CPU to stop takes about 4 seconds. Adjust for the overhead of the loop by looping every 1ms instead of 1us.
Signed-off-by: Chris Fries <C.Fries@motorola.com>
Reviewed-on: http://gerrit.pcs.mot.com/537864
SLT-Approved: Slta Waiver <sltawvr@motorola.com>
Tested-by: Jira Key <jirakey@motorola.com>
Reviewed-by: Check Patch <CHEKPACH@motorola.com>
Reviewed-by: Klocwork kwcheck <klocwork-kwcheck@sourceforge.mot.com>
Reviewed-by: Igor Kovalenko <cik009@motorola.com>
Reviewed-by: Russell Knize <rknize2@motorola.com>
Submit-Approved: Jira Key <jirakey@motorola.com>
Signed-off-by: Pranav Vashi <neobuddy89@gmail.com>
Signed-off-by: franciscofranco <franciscofranco.1990@gmail.com>
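A hedged kernel-style sketch of the kind of wait loop being tuned: the same overall timeout, but polled in 1 ms steps so per-iteration overhead no longer dominates (the timeout value and names are illustrative):

    #include <linux/delay.h>
    #include <linux/errno.h>

    /* Wait up to ~1 second for the other CPU to report that it stopped. */
    static int wait_for_cpu_stop(volatile int *stopped)
    {
        int timeout_ms = 1000;

        while (!*stopped && timeout_ms--)
            mdelay(1);    /* was effectively udelay(1): far more loop overhead */

        return *stopped ? 0 : -ETIMEDOUT;
    }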
-
Tao Ma authored
commit 6f2e9f0e7d795214b9cf5a47724a273b705fd113 upstream. Now when we set the group inode free count, we don't have a proper group lock, so multiple threads may decrease the inode free count at the same time. And e2fsck will complain something like:

Free inodes count wrong for group #1 (1, counted=0). Fix? no
Free inodes count wrong for group #2 (3, counted=0). Fix? no
Directories count wrong for group #2 (780, counted=779). Fix? no
Free inodes count wrong for group #3 (2272, counted=2273). Fix? no

So this patch tries to protect it with the ext4_lock_group. btw, it is found by xfstests test case 269 and the volume is mkfsed with the parameter "-O ^resize_inode,^uninit_bg,extent,meta_bg,flex_bg,ext_attr" and I have run it 100 times and the error in e2fsck doesn't show up again.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
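A hedged kernel-style sketch of the locking pattern: the read-modify-write of the group descriptor's free inode count happens under ext4_lock_group() so concurrent updaters cannot lose each other's decrements (the helper and its placement in the allocator are assumptions):

    #include "ext4.h"    /* in-tree header providing ext4_lock_group() and friends */

    static void dec_group_free_inodes(struct super_block *sb, ext4_group_t group,
                                      struct ext4_group_desc *gdp)
    {
        ext4_lock_group(sb, group);
        ext4_free_inodes_set(sb, gdp,
                             ext4_free_inodes_count(sb, gdp) - 1);
        ext4_unlock_group(sb, group);
    }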
-
Ben Segall authored
commit f9f9ffc237dd924f048204e8799da74f9ecf40cf upstream. throttle_cfs_rq() doesn't check to make sure that period_timer is running, and while update_curr/assign_cfs_runtime does, a concurrently running period_timer on another cpu could cancel itself between this cpu's update_curr and throttle_cfs_rq(). If there are no other cfs_rqs running in the tg to restart the timer, this causes the cfs_rq to be stranded forever. Fix this by calling __start_cfs_bandwidth() in throttle if the timer is inactive. (Also add some sched_debug lines for cfs_bandwidth.) Tested: make a run/sleep task in a cgroup, loop switching the cgroup between 1ms/100ms quota and unlimited, checking for timer_active=0 and throttled=1 as a failure. With the throttle_cfs_rq() change commented out this fails, with the full patch it passes.
Signed-off-by: Ben Segall <bsegall@google.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/20131016181632.22647.84174.stgit@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Chris J Arges <chris.j.arges@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
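A hedged fragment-style sketch of the essence of the fix inside throttle_cfs_rq(): under the bandwidth lock, restart the period timer if it is not active so a throttled cfs_rq is never left with nobody to unthrottle it (field and helper names follow the message; the surrounding code is elided):

    raw_spin_lock(&cfs_b->lock);
    /* ... account the throttled runtime ... */

    /* A concurrently running period_timer may have just cancelled itself;
     * make sure someone will come back to unthrottle this cfs_rq. */
    if (!cfs_b->timer_active)
        __start_cfs_bandwidth(cfs_b);
    raw_spin_unlock(&cfs_b->lock);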
-