1. 25 Apr, 2007 1 commit
  2. 23 Mar, 2007 2 commits
  3. 17 Mar, 2007 1 commit
    • Thomas Renninger's avatar
      ACPI: Only use IPI on known broken machines (AMD, Dothan/BaniasPentium M) · 25496cae
      Thomas Renninger authored
      
      Use IPI for blacklisted CPUs, add parameter IPI vs LAPIC
      
      Currently, Linux disables lapic timer for all machines with C2 and higher
      C-state support.
      
      According to Intel only specific Intel models (Banias/Dothan) are broken
      in respect of not waking up from C2 with lapic.
      
      However, I am not sure about the naming of the parameter and how it
      could/should get integrated into the dyntick part
      (CONFIG_GENERIC_CLOCKEVENTS). There, a more fine grained check (TSC
      still running?, ..) is needed? Does this make sense (always use
      CLOCK_EVT_NOTIFY_BROADCAST_ON, but use OFF if forced by use_ipi=0:
      clockevents_notify(use_ipi ? CLOCK_EVT_NOTIFY_BROADCAST_ON :
      CLOCK_EVT_NOTIFY_BROADCAST_OFF, &pr->id);
      Signed-off-by: default avatarThomas Renninger <trenn@suse.de>
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      25496cae
  4. 16 Feb, 2007 3 commits
  5. 15 Feb, 2007 1 commit
  6. 12 Feb, 2007 2 commits
  7. 02 Feb, 2007 4 commits
  8. 22 Dec, 2006 1 commit
    • Ingo Molnar's avatar
      [PATCH] sched: fix bad missed wakeups in the i386, x86_64, ia64, ACPI and APM idle code · 0888f06a
      Ingo Molnar authored
      Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
      xruns starting at the 2.6.18-rt kernel, and those problems persisted all
      until current -rt kernels. The latencies were serious and unjustified by
      system load, often in the milliseconds range.
      
      After a patient and heroic multi-month effort of Fernando, where he
      tested dozens of kernels, tried various configs, boot options,
      test-patches of mine and provided latency traces of those incidents, the
      following 'smoking gun' trace was captured by him:
      
                       _------=> CPU#
                      / _-----=> irqs-off
                     | / _----=> need-resched
                     || / _---=> hardirq/softirq
                     ||| / _--=> preempt-depth
                     |||| /
                     |||||     delay
         cmd     pid ||||| time  |   caller
            \   /    |||||   \   |   /
        IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
        IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup <<...>-5856> (37 0)
        IRQ_19-1479  1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
        IRQ_19-1479  1D..1    0us : resched_task (try_to_wake_up)
        IRQ_19-1479  1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
        ...
        <idle>-0     1...1   11us!: default_idle (cpu_idle)
        ...
        <idle>-0     0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
        ...
         <...>-5856  0D..2  618us : __switch_to (__schedule)
         <...>-5856  0D..2  618us : __schedule <<idle>-0> (20 162)
         <...>-5856  0D..2  619us : __spin_unlock_irq (__schedule)
         <...>-5856  0...1  619us : trace_stop_sched_switched (__schedule)
         <...>-5856  0D..1  619us : trace_stop_sched_switched <<...>-5856> (37 0)
      
      what is visible in this trace is that CPU#1 ran try_to_wake_up() for
      PID:5856, it placed PID:5856 on CPU#0's runqueue and ran resched_task()
      for CPU#0. But it decided to not send an IPI that no CPU - due to
      TS_POLLING. But CPU#0 never woke up after its NEED_RESCHED bit was set,
      and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
      result was a 600+ usecs latency and a missed wakeup!
      
      the bug turned out to be an idle-wakeup bug introduced into the mainline
      kernel this summer via an optimization in the x86_64 tree:
      
          commit 495ab9c0
      
      
          Author: Andi Kleen <ak@suse.de>
          Date:   Mon Jun 26 13:59:11 2006 +0200
      
          [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status
      
          During some profiling I noticed that default_idle causes a lot of
          memory traffic. I think that is caused by the atomic operations
          to clear/set the polling flag in thread_info. There is actually
          no reason to make this atomic - only the idle thread does it
          to itself, other CPUs only read it. So I moved it into ti->status.
      
      the problem is this type of change:
      
              if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
      -               clear_thread_flag(TIF_POLLING_NRFLAG);
      +               current_thread_info()->status &= ~TS_POLLING;
                      smp_mb__after_clear_bit();
                      while (!need_resched()) {
                              local_irq_disable();
      
      this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
      clear_thread_flag() is defined as:
      
              clear_bit(flag, &ti->flags);
      
      and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:
      
        static inline void clear_bit(int nr, volatile unsigned long * addr)
        {
                __asm__ __volatile__( LOCK_PREFIX
                        "btrl %1,%0"
      
      hence smp_mb__after_clear_bit() is defined as a simple compile barrier:
      
        #define smp_mb__after_clear_bit()       barrier()
      
      but the explicit TS_POLLING clearing introduced by the patch:
      
      +               current_thread_info()->status &= ~TS_POLLING;
      
      is not an atomic op! So the clearing of the TS_POLLING bit is freely
      reorderable with the reading of the NEED_RESCHED bit - and both now
      reside in different memory addresses.
      
      CPU idle wakeup very much depends on ordered memory ops, the clearing of
      the TS_POLLING flag must always be done before we test need_resched()
      and hit the idle instruction(s). [Symmetrically, the wakeup code needs
      to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
      ordering is paramount.]
      
      Fernando's dual-core Athlon64 system has a sufficiently advanced memory
      ordering model so that it triggered this scenario very often.
      
      ( And it also turned out that the reason why these latencies never
        triggered on my testsystems is that i routinely use idle=poll, which
        was the only idle variant not affected by this bug. )
      
      The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
      act as an absolute barrier between the TS_POLLING write and the
      NEED_RESCHED read. This affects almost all idling methods (default,
      ACPI, APM), on all 3 x86 architectures: i386, x86_64, ia64.
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Tested-by: default avatarFernando Lopez-Lezcano <nando@ccrma.Stanford.EDU>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0888f06a
  9. 20 Oct, 2006 1 commit
  10. 17 Oct, 2006 1 commit
  11. 14 Oct, 2006 3 commits
  12. 01 Oct, 2006 1 commit
    • Arjan van de Ven's avatar
      [PATCH] maximum latency tracking infrastructure · 5c87579e
      Arjan van de Ven authored
      
      Add infrastructure to track "maximum allowable latency" for power saving
      policies.
      
      The reason for adding this infrastructure is that power management in the
      idle loop needs to make a tradeoff between latency and power savings
      (deeper power save modes have a longer latency to running code again).  The
      code that today makes this tradeoff just does a rather simple algorithm;
      however this is not good enough: There are devices and use cases where a
      lower latency is required than that the higher power saving states provide.
       An example would be audio playback, but another example is the ipw2100
      wireless driver that right now has a very direct and ugly acpi hook to
      disable some higher power states randomly when it gets certain types of
      error.
      
      The proposed solution is to have an interface where drivers can
      
      * announce the maximum latency (in microseconds) that they can deal with
      * modify this latency
      * give up their constraint
      
      and a function where the code that decides on power saving strategy can
      query the current global desired maximum.
      
      This patch has a user of each side: on the consumer side, ACPI is patched
      to use this, on the producer side the ipw2100 driver is patched.
      
      A generic maximum latency is also registered of 2 timer ticks (more and you
      lose accurate time tracking after all).
      
      While the existing users of the patch are x86 specific, the infrastructure
      is not.  I'd like to ask the arch maintainers of other architectures if the
      infrastructure is generic enough for their use (assuming the architecture
      has such a tradeoff as concept at all), and the sound/multimedia driver
      owners to look at the driver facing API to see if this is something they
      can use.
      
      [akpm@osdl.org: cleanups]
      Signed-off-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: default avatarIngo Molnar <mingo@elte.hu>
      Acked-by: default avatarJesse Barnes <jesse.barnes@intel.com>
      Cc: "Brown, Len" <len.brown@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      5c87579e
  13. 10 Jul, 2006 1 commit
  14. 30 Jun, 2006 1 commit
  15. 28 Jun, 2006 5 commits
  16. 27 Jun, 2006 2 commits
  17. 26 Jun, 2006 3 commits
  18. 14 May, 2006 1 commit
  19. 25 Mar, 2006 2 commits
  20. 04 Feb, 2006 2 commits
  21. 11 Jan, 2006 1 commit
  22. 07 Jan, 2006 1 commit