1. 15 Jan, 2009 1 commit
    • sched: prefer wakers · e52fb7c0
      Peter Zijlstra authored
      Prefer tasks that wake other tasks to preempt quickly. This improves
      performance because more work is available sooner.
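
      A minimal sketch of the idea in plain C (illustrative names only, not the
      actual kernel symbols or the patch itself):

       /*
        * A task that quickly wakes up other tasks gets a reduced preemption
        * granularity, so it gets back on the CPU sooner and the work it
        * produces becomes runnable sooner.
        */
       struct task {
               unsigned long avg_wakeup;  /* typical ns it runs before waking someone */
       };

       static unsigned long effective_wakeup_gran(const struct task *woken,
                                                  unsigned long base_gran)
       {
               /* If the task being woken is itself a frequent waker (small
                * avg_wakeup), let it preempt after a shorter grace period
                * than the default wakeup granularity. */
               return woken->avg_wakeup < base_gran ? woken->avg_wakeup
                                                    : base_gran;
       }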
      
      The workload that prompted this patch was a kernel build over NFS4 (for some
      curious and not yet understood reason we had to revert commit 18de9735 to make
      any progress at all).
      
      Without this patch a make -j8 bzImage (of x86-64 defconfig) would take
      3m30-ish; with this patch we're down to 2m50-ish.
      
      psql-sysbench/mysql-sysbench show a slight improvement in peak performance as
      well; tbench and vmark seemed not to care.
      
      It is possible to improve upon the build time (to 2m20-ish) but that seriously
      destroys other benchmarks (just shows that there's more room for tinkering).
      
      Many thanks to Mike, who put in a lot of effort to benchmark things and proved
      a worthy opponent with a competing patch.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  2. 14 Jan, 2009 1 commit
    • mutex: implement adaptive spinning · 0d66bf6d
      Peter Zijlstra authored
      Change mutex contention behaviour such that it will sometimes busy wait on
      acquisition - moving its behaviour closer to that of spinlocks.
      
      This concept got ported to mainline from the -rt tree, where it was originally
      implemented for rtmutexes by Steven Rostedt, based on work by Gregory Haskins.
      
      Testing with Ingo's test-mutex application (http://lkml.org/lkml/2006/1/8/50)
      gave a 345% boost for VFS scalability on my testbox:
      
       # ./test-mutex-shm V 16 10 | grep "^avg ops"
       avg ops/sec:               296604
      
       # ./test-mutex-shm V 16 10 | grep "^avg ops"
       avg ops/sec:               85870
      
      The key criterion for the busy wait is that the lock owner has to be running on
      a (different) CPU. The idea is that as long as the owner is running, there is a
      fair chance it'll release the lock soon, and thus we'll be better off spinning
      instead of blocking/scheduling.
      
      Since regular mutexes (as opposed to rtmutexes) do not atomically track the
      owner, we add the owner in a non-atomic fashion and deal with the races in
      the slowpath.
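
      As a rough user-space model of that spin-or-block decision (simplified and
      with made-up names; the real slowpath additionally handles preemption and
      the owner-tracking races mentioned above):

       #include <sched.h>      /* sched_yield() */
       #include <stdatomic.h>
       #include <stdbool.h>
       #include <stddef.h>

       /* Owner state, tracked only as a hint for the spin decision. */
       struct thread_info {
               atomic_bool on_cpu;                  /* running right now? */
       };

       struct adaptive_mutex {
               atomic_int locked;                   /* 0 = free, 1 = held */
               _Atomic(struct thread_info *) owner; /* set after acquisition, hence racy */
       };

       static bool owner_running(struct adaptive_mutex *m)
       {
               struct thread_info *owner = atomic_load(&m->owner);

               return owner && atomic_load(&owner->on_cpu);
       }

       void adaptive_lock(struct adaptive_mutex *m, struct thread_info *self)
       {
               int expected = 0;

               while (!atomic_compare_exchange_weak(&m->locked, &expected, 1)) {
                       expected = 0;
                       /*
                        * Spin only while the owner is on a CPU: a running
                        * owner will probably release the lock soon, so
                        * busy-waiting beats the cost of blocking and being
                        * woken again.  Otherwise give up the CPU (the kernel
                        * would block on the mutex wait list here).
                        */
                       if (!owner_running(m))
                               sched_yield();
               }
               atomic_store(&m->owner, self);
       }

       void adaptive_unlock(struct adaptive_mutex *m)
       {
               atomic_store(&m->owner, NULL);
               atomic_store(&m->locked, 0);
       }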
      
      Furthermore, to ease testing of the performance impact of this new code, there
      is a means to disable this behaviour at runtime (without having to reboot the
      system) when scheduler debugging is enabled (CONFIG_SCHED_DEBUG=y), by issuing
      the following command:
      
       # echo NO_OWNER_SPIN > /debug/sched_features
      
      This command re-enables spinning (which is also the default):
      
       # echo OWNER_SPIN > /debug/sched_features
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 05 Nov, 2008 1 commit
    • sched: backward looking buddy · 4793241b
      Peter Zijlstra authored
      
      Impact: improve/change/fix wakeup-buddy scheduling
      
      Currently we only have a forward-looking buddy; that is, we prefer to
      schedule the task we last woke up, under the presumption that it's
      going to consume the data we just produced, and therefore will have
      cache-hot benefits.
      
      This allows co-waking producer/consumer task pairs to run ahead of the
      pack for a little while, keeping their cache warm. Without this, we
      would interleave all pairs, utterly thrashing the cache.
      
      This patch introduces a backward-looking buddy. That is, suppose that in
      the above scenario the consumer preempts the producer before it can go
      to sleep; we will then miss the wakeup from consumer to producer (it's
      already running, after all), breaking the cycle and reverting to the
      cache-thrashing interleaved schedule pattern.
      
      The backward buddy will try to schedule back to the task that woke us
      up when the forward buddy is not available, under the assumption that,
      barring current, the task that woke us is the most cache-hot task
      around.
      
      This will basically allow a task to continue after it got preempted.
      
      In order to avoid starvation, we allow either buddy to get wakeup_gran
      ahead of the pack.
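
      A sketch of the resulting pick logic, with illustrative names rather than
      the kernel's actual data structures:

       struct entity {
               unsigned long long vruntime;
       };

       struct cfs_queue {
               struct entity *leftmost;  /* the "fair" pick from the runqueue     */
               struct entity *next;      /* forward buddy: the task we last woke  */
               struct entity *last;      /* backward buddy: the task that woke us */
       };

       /* A buddy is eligible only while it is at most wakeup_gran ahead of the
        * fair pick, which is what bounds the starvation mentioned above. */
       static int within_gran(const struct entity *buddy, const struct entity *fair,
                              unsigned long long wakeup_gran)
       {
               return buddy && buddy->vruntime <= fair->vruntime + wakeup_gran;
       }

       static struct entity *pick_next(struct cfs_queue *q,
                                       unsigned long long wakeup_gran)
       {
               struct entity *se = q->leftmost;

               if (within_gran(q->next, se, wakeup_gran))
                       se = q->next;  /* consumer of the data we just produced  */
               else if (within_gran(q->last, se, wakeup_gran))
                       se = q->last;  /* producer we preempted; still cache hot */

               return se;
       }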
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  4. 20 Oct, 2008 1 commit
  5. 22 Sep, 2008 2 commits
    • sched: turn off WAKEUP_OVERLAP · f681bbd6
      Ingo Molnar authored
      
      WAKEUP_OVERLAP is not a winner on a 16-way box, running psql+sysbench:
      
             .27-rc7-NO_WAKEUP_OVERLAP  .27-rc7-WAKEUP_OVERLAP
      -------------------------------------------------
          1:             694              811    +14.39%
          2:            1454             1427    -1.86%
          4:            3017             3070    +1.70%
          8:            5694             5808    +1.96%
         16:           10592            10612    +0.19%
         32:            9693             9647    -0.48%
         64:            8507             8262    -2.97%
        128:            8402             7087    -18.55%
        256:            8419             5124    -64.30%
        512:            7990             3671    -117.62%
      -------------------------------------------------
        SUM:           64466            55524    -16.11%
      
      ... so turn it off by default.
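
      The change itself is presumably just the default flip in the scheduler's
      feature table (a sketch, not the verbatim diff); the feature can still be
      re-enabled at runtime through the usual debugfs knob for testing:

       -SCHED_FEAT(WAKEUP_OVERLAP, 1)
       +SCHED_FEAT(WAKEUP_OVERLAP, 0)

        # echo WAKEUP_OVERLAP > /debug/sched_features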
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • sched: wakeup preempt when small overlap · 15afe09b
      Peter Zijlstra authored
      
      Lin Ming reported a 10% OLTP regression against 2.6.27-rc4.
      
      The difference seems to come from different preemption aggressiveness,
      which affects the cache footprint of the workload and its effective
      cache thrashing.
      
      Aggressively preempt a task if its avg overlap is very small; this should
      avoid the task going to sleep and finding it still running when we schedule
      back to it, saving a wakeup.
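
      A sketch of that check, with illustrative names and a made-up threshold
      (the kernel uses its own tunable for the cutoff):

       struct task {
               unsigned long avg_overlap;  /* avg ns it keeps running after a wakeup */
       };

       #define SMALL_OVERLAP_NS 500000UL   /* illustrative cutoff, ~0.5 ms */

       static int wakeup_should_preempt(const struct task *curr,
                                        const struct task *woken, int sync_wakeup)
       {
               /*
                * If both the waker and the woken task usually run only very
                * briefly around a wakeup, preempt right away: the woken task
                * will likely be done before the waker needs the CPU again,
                * keeping the pair's cache footprint tight and saving a later
                * wakeup.
                */
               return sync_wakeup &&
                      curr->avg_overlap < SMALL_OVERLAP_NS &&
                      woken->avg_overlap < SMALL_OVERLAP_NS;
       }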
      Reported-by: Lin Ming <ming.m.lin@intel.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  6. 21 Aug, 2008 1 commit
  7. 27 Jun, 2008 5 commits
  8. 10 Jun, 2008 1 commit
  9. 19 Apr, 2008 1 commit