1. 10 Mar, 2010 1 commit
  2. 26 Feb, 2010 1 commit
  3. 28 Oct, 2009 1 commit
  4. 27 Oct, 2009 1 commit
  5. 22 Sep, 2009 1 commit
    • perf_event, powerpc: Fix compilation after big perf_counter rename · a8f90e90
      Paul Mackerras authored
      This fixes two places in the powerpc perf_event (perf_counter) code
      where 'list_entry' needs to be changed to 'group_entry', but were
      missed in commit 65abc865 ("perf_counter: Rename list_entry ->
      group_entry, counter_list -> group_list").
      
      This also changes 'event' back to 'counter' in a couple of
      contexts:
      
      * Field and function names that deal with the limited-function
        counters: it's really the hardware counters whose function is
        limited, not the events that they count.  Hence:
      
        MAX_LIMITED_HWEVENTS -> MAX_LIMITED_HWCOUNTERS
        limited_event -> limited_counter
        freeze/thaw_limited_events -> freeze/thaw_limited_counters
      
      * The machine-specific PMU description struct (struct power_pmu): this
        renames 'n_event' back to 'n_counter' since it really describes how
        many hardware counters the machine has.  (Renaming this back avoids
        a compile error in each of the machine-specific PMU back-ends where
        they initialize their power_pmu struct.)
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: linuxppc-dev@ozlabs.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <19128.4280.813369.589704@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a8f90e90
  6. 21 Sep, 2009 3 commits
    • perf: Tidy up after the big rename · 57c0c15b
      Ingo Molnar authored
      
       - provide compatibility Kconfig entry for existing PERF_COUNTERS .config's
      
       - provide courtesy copy of old perf_counter.h, for user-space projects
      
       - small indentation fixups
      
       - fix up MAINTAINERS
      
       - fix small x86 printout fallout
      
       - fix up small PowerPC comment fallout (use 'counter' as in register)
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      57c0c15b
    • perf: Do the big rename: Performance Counters -> Performance Events · cdd6c482
      Ingo Molnar authored
      
      Bye-bye Performance Counters, welcome Performance Events!
      
      In the past few months the perfcounters subsystem has outgrown its
      initial role of counting hardware events, and has become (and is
      becoming) a much broader generic event enumeration, reporting, logging,
      monitoring and analysis facility.
      
      Naming its core object 'perf_counter' and naming the subsystem
      'perfcounters' has become more and more of a misnomer. With pending
      code like hw-breakpoints support the 'counter' name is less and
      less appropriate.
      
      All in all, we've decided to rename the subsystem to 'performance
      events' and to propagate this rename through all fields, variables
      and API names. (in an ABI compatible fashion)
      
      The word 'event' is also a bit shorter than 'counter' - which makes
      it slightly more convenient to write/handle as well.
      
      Thanks goes to Stephane Eranian who first observed this misnomer and
      suggested a rename.
      
      User-space tooling and ABI compatibility is not affected - this patch
      should be function-invariant. (Also, defconfigs were not touched to
      keep the size down.)
      
      This patch has been generated via the following script:
      
        FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')
      
        sed -i \
          -e 's/PERF_EVENT_/PERF_RECORD_/g' \
          -e 's/PERF_COUNTER/PERF_EVENT/g' \
          -e 's/perf_counter/perf_event/g' \
          -e 's/nb_counters/nb_events/g' \
          -e 's/swcounter/swevent/g' \
          -e 's/tpcounter_event/tp_event/g' \
          $FILES
      
        for N in $(find . -name perf_counter.[ch]); do
          M=$(echo $N | sed 's/perf_counter/perf_event/g')
          mv $N $M
        done
      
        FILES=$(find . -name perf_event.*)
      
        sed -i \
          -e 's/COUNTER_MASK/REG_MASK/g' \
          -e 's/COUNTER/EVENT/g' \
          -e 's/\<event\>/event_id/g' \
          -e 's/counter/event/g' \
          -e 's/Counter/Event/g' \
          $FILES
      
      ... to keep it as correct as possible. This script can also be
      used by anyone who has pending perfcounters patches - it converts
      a Linux kernel tree over to the new naming. We tried to time this
      change to the point in time where the amount of pending patches
      is the smallest: the end of the merge window.
      
      Namespace clashes were fixed up in a preparatory patch - and some
      stylistic fallout will be fixed up in a subsequent patch.
      
      ( NOTE: 'counters' are still the proper terminology when we deal
        with hardware registers - and these sed scripts are a bit
        over-eager in renaming them. I've undone some of that, but
        in case there's something left where 'counter' would be
        better than 'event' we can undo that on an individual basis
        instead of touching an otherwise nicely automated patch. )
      Suggested-by: Stephane Eranian <eranian@google.com>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Reviewed-by: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Kyle McMartin <kyle@mcmartin.ca>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: <linux-arch@vger.kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cdd6c482
    • perf_counter, powerpc, sparc: Fix compilation after perf_counter_overflow() change · cd74c86b
      Paul Mackerras authored
      Commit 5622f295 ("x86, perf_counter, bts: Optimize BTS overflow
      handling") removed the regs field from struct perf_sample_data and
      added a regs parameter to perf_counter_overflow().  This breaks the
      build on powerpc (and Sparc) as reported by Sachin Sant:
      
        arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
        arch/powerpc/kernel/perf_counter.c:1165: error: unknown field 'regs' specified in initializer
      
      This adjusts arch/powerpc/kernel/perf_counter.c to correspond with the
      new struct perf_sample_data and perf_counter_overflow().
      
      [ v2: also fix Sparc, Markus Metzger <markus.t.metzger@intel.com> ]
      Reported-by: Sachin Sant <sachinp@in.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: benh@kernel.crashing.org
      Cc: linuxppc-dev@ozlabs.org
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <19127.8400.376239.586120@drongo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      cd74c86b
  7. 10 Sep, 2009 2 commits
    • powerpc/perf_counters: Reduce stack usage of power_check_constraints · e51ee31e
      Paul Mackerras authored
      
      Michael Ellerman reported stack-frame size warnings being produced
      for power_check_constraints(), which uses an 8*8 array of u64 and
      two 8*8 arrays of unsigned long, which are currently allocated on the
      stack, along with some other smaller variables.  These arrays come
      to 1.5kB on 64-bit or 1kB on 32-bit, which is a bit too much for the
      stack.
      
      This fixes the problem by putting these arrays in the existing
      per-cpu cpu_hw_counters struct.  This is OK because two of the call
      sites have interrupts disabled already; for the third call site we
      use get_cpu_var, which disables preemption, so we know we won't
      get a context switch while we're in power_check_constraints().
      Note that power_check_constraints() can be called during context
      switch but is not called from interrupts.
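
      The same stack-relief pattern can be shown in user space: the big
      scratch arrays move out of the function's frame into per-thread
      storage, which is safe only because the function cannot be
      re-entered on the same thread (the kernel gets the equivalent
      guarantee from disabled interrupts or get_cpu_var(), which disables
      preemption).  A loose analogue, not the kernel code:

        #include <stdint.h>
        #include <stdio.h>

        #define MAX_HWCOUNTERS 8

        /* Per-thread scratch, standing in for the per-cpu cpu_hw_counters
         * struct: keeping these 8x8 arrays out of the stack frame avoids
         * the large-frame warning. */
        static _Thread_local uint64_t      masks[MAX_HWCOUNTERS][MAX_HWCOUNTERS];
        static _Thread_local unsigned long values[MAX_HWCOUNTERS][MAX_HWCOUNTERS];

        /* Toy constraint check: safe only because nothing re-enters it on
         * this thread while it runs. */
        static int check_constraints(int n_events)
        {
            for (int i = 0; i < n_events; i++)
                for (int j = 0; j < n_events; j++) {
                    masks[i][j]  = 0;
                    values[i][j] = ~0UL;
                }
            return 0;    /* pretend the events fit */
        }

        int main(void)
        {
            printf("constraints ok: %d\n", check_constraints(4) == 0);
            return 0;
        }
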
      Reported-by: Michael Ellerman <michael@ellerman.id.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      e51ee31e
    • powerpc: Fix bug where perf_counters breaks oprofile · a6dbf93a
      Paul Mackerras authored
      
      Currently there is a bug where if you use oprofile on a pSeries
      machine, then use perf_counters, then use oprofile again, oprofile
      will not work correctly; it will lose the PMU configuration the next
      time the hypervisor does a partition context switch, and thereafter
      won't count anything.
      
      Maynard Johnson identified the sequence causing the problem:
      - oprofile setup calls ppc_enable_pmcs(), which calls
        pseries_lpar_enable_pmcs, which tells the hypervisor that we want
        to use the PMU, and sets the "PMU in use" flag in the lppaca.
        This flag tells the hypervisor whether it needs to save and restore
        the PMU config.
      - The perf_counter code sets and clears the "PMU in use" flag directly
        as it context-switches the PMU between tasks, and leaves it clear
        when it finishes.
      - oprofile setup, called for a new oprofile run, calls ppc_enable_pmcs,
        which does nothing because it has already been called.  In particular
        it doesn't set the "PMU in use" flag.
      
      This fixes the problem by arranging for ppc_enable_pmcs to always set
      the "PMU in use" flag.  It makes the perf_counter code call
      ppc_enable_pmcs also rather than calling the lower-level function
      directly, and removes the setting of the "PMU in use" flag from
      pseries_lpar_enable_pmcs, since that is now done in its caller.
      
      This also removes the declaration of pasemi_enable_pmcs because it
      isn't defined anywhere.
      Reported-by: Maynard Johnson <mpjohn@us.ibm.com>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: <stable@kernel.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      a6dbf93a
  8. 09 Aug, 2009 1 commit
    • perf_counter/powerpc: Fix oops on cpus without perf_counter hardware support · f36a1a13
      Paul Mackerras authored
      
      If we have the powerpc perf_counter backend compiled in, but
      the cpu we are running on is one where we don't support the
      PMU, we currently oops in hw_perf_group_sched_in if we try to
      use any counters, because ppmu is NULL in that case, and we
      unconditionally dereference ppmu.
      
      This fixes the problem by adding a check if ppmu is NULL at the
      beginning of hw_perf_group_sched_in, and also at the beginning
      of the other functions that get called from the perf_counter
      core, i.e. hw_perf_disable, hw_perf_enable, and
      hw_perf_counter_setup.
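
      The guard pattern is small enough to show in a self-contained
      sketch; apart from 'ppmu' itself, the names below are stand-ins
      rather than the actual kernel symbols:

        #include <stddef.h>
        #include <stdio.h>

        /* Stand-in for the machine-specific PMU description. */
        struct power_pmu {
            const char *name;
            int n_counter;
        };

        /* NULL when the running CPU has no supported PMU back-end. */
        static struct power_pmu *ppmu;

        /* Each entry point called from the perf_counter core bails out
         * early instead of dereferencing a NULL ppmu. */
        static void hw_perf_disable(void)
        {
            if (!ppmu)
                return;
            /* ... freeze the hardware counters ... */
        }

        static int hw_perf_group_sched_in(void)
        {
            if (!ppmu)
                return 0;    /* nothing to schedule on this CPU */
            /* ... place the group's counters onto the PMU ... */
            return 1;
        }

        int main(void)
        {
            hw_perf_disable();    /* safe even though ppmu is NULL */
            printf("group scheduled: %d\n", hw_perf_group_sched_in());
            return 0;
        }
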
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: benh@kernel.crashing.org
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f36a1a13
  9. 18 Jun, 2009 3 commits
    • perf_counter: powerpc: Make powerpc perf_counter code safe for 32-bit kernels · 98fb1807
      Paul Mackerras authored
      
      This abstracts a few things in arch/powerpc/kernel/perf_counter.c
      that are specific to 64-bit kernels, and provides definitions for
      32-bit kernels.  In particular,
      
      * Only 64-bit has MMCRA and the bits in it that give information
        about a PMU interrupt (sampled PR, HV, slot number etc.)
      * Only 64-bit has the lppaca and the lppaca->pmcregs_in_use field
      * Use of SDAR is confined to 64-bit for now
      * Only 64-bit has soft/lazy interrupt disable and therefore
        pseudo-NMIs (interrupts that occur while interrupts are soft-disabled)
      * Only 64-bit has PMC7 and PMC8
      * Only 64-bit has the MSR_HV bit.
      
      This also fixes the types used in a couple of places, where we were
      using long types for things that need to be 64-bit.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55590.634126.876084@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      98fb1807
    • perf_counter: powerpc: Change how processor-specific back-ends get selected · 079b3c56
      Paul Mackerras authored
      
      At present, the powerpc generic (processor-independent) perf_counter
      code has list of processor back-end modules, and at initialization,
      it looks at the PVR (processor version register) and has a switch
      statement to select a suitable processor-specific back-end.
      
      This is going to become inconvenient as we add more processor-specific
      back-ends, so this inverts the order: now each back-end checks whether
      it applies to the current processor, and registers itself if so.
      Furthermore, instead of looking at the PVR, back-ends now check the
      cur_cpu_spec->oprofile_cpu_type string and match on that.
      
      Lastly, each back-end now specifies a name for itself so the core can
      print a nice message when a back-end registers itself.
      
      This doesn't provide any support for unregistering back-ends, but that
      wouldn't be hard to do and would allow back-ends to be modules.
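
      A sketch of the inverted scheme, with stub types standing in for
      cur_cpu_spec and register_power_pmu (the real kernel interfaces and
      the cpu-type string differ in detail):

        #include <stdio.h>
        #include <string.h>

        /* Stand-ins for the kernel's cpu_spec and power_pmu structures. */
        struct cpu_spec  { const char *oprofile_cpu_type; };
        struct power_pmu { const char *name; int n_counter; };

        static struct cpu_spec this_cpu = { "ppc64/power5+" };
        static struct cpu_spec *cur_cpu_spec = &this_cpu;

        static struct power_pmu *registered_pmu;

        static int register_power_pmu(struct power_pmu *pmu)
        {
            if (registered_pmu)
                return -1;    /* a back-end already claimed this CPU */
            registered_pmu = pmu;
            printf("registered PMU: %s\n", pmu->name);
            return 0;
        }

        static struct power_pmu power5p_pmu = { "POWER5+/++", 6 };

        /* Each back-end's init routine checks whether it matches the
         * running CPU and registers itself if so, instead of the core
         * switching on the PVR. */
        static int init_power5p_pmu(void)
        {
            if (!cur_cpu_spec->oprofile_cpu_type ||
                strcmp(cur_cpu_spec->oprofile_cpu_type, "ppc64/power5+"))
                return -1;    /* not this processor */
            return register_power_pmu(&power5p_pmu);
        }

        int main(void)
        {
            return init_power5p_pmu() ? 1 : 0;
        }
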
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55529.762227.518531@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      079b3c56
    • perf_counter: powerpc: Use unsigned long for register and constraint values · 448d64f8
      Paul Mackerras authored
      
      This changes the powerpc perf_counter back-end to use unsigned long
      types for hardware register values and for the value/mask pairs used
      in checking whether a given set of events fit within the hardware
      constraints.  This is in preparation for adding support for the PMU
      on some 32-bit powerpc processors.  On 32-bit processors the hardware
      registers are only 32 bits wide, and the PMU structure is generally
      simpler, so 32 bits should be ample for expressing the hardware
      constraints.  On 64-bit processors, unsigned long is 64 bits wide,
      so using unsigned long vs. u64 (unsigned long long) makes no actual
      difference.
      
      This makes some other very minor changes: adjusting whitespace to line
      things up in initialized structures, and simplifying some code in
      hw_perf_disable().
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: linuxppc-dev@ozlabs.org
      Cc: benh@kernel.crashing.org
      LKML-Reference: <19000.55473.26174.331511@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      448d64f8
  10. 15 Jun, 2009 1 commit
    • perf_counter: powerpc: Fix two compile warnings · 90c8f954
      Paul Mackerras authored
      
      This fixes a couple of compile warnings that crept into the powerpc
      perf_counter code recently:
      
         CC      arch/powerpc/kernel/perf_counter.o
       arch/powerpc/kernel/perf_counter.c: In function 'record_and_restart':
       arch/powerpc/kernel/perf_counter.c:1016: warning: unused variable 'addr'
       arch/powerpc/kernel/perf_counter.c: In function 'hw_perf_counter_init':
       arch/powerpc/kernel/perf_counter.c:891: warning: 'ev' may be used uninitialized in this function
      
      Stephen Rothwell reported this against linux-next as well.
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <18998.12884.787039.22202@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      90c8f954
  11. 11 Jun, 2009 2 commits
  12. 10 Jun, 2009 2 commits
    • perf_counter: Accurate period data · 9e350de3
      Peter Zijlstra authored
      
      We currently log hw.sample_period for PERF_SAMPLE_PERIOD, however this is
      incorrect. When we adjust the period, it will only take effect the next
      cycle but report it for the current cycle. So when we adjust the period
      for every cycle, we're always wrong.
      
      Solve this by keeping track of the last_period.
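
      The fix amounts to remembering the period that was actually in
      effect for the sample being reported.  A simplified sketch, not the
      kernel implementation:

        #include <stdint.h>
        #include <stdio.h>

        struct hw_counter {
            uint64_t sample_period;    /* period to program for the NEXT sample */
            uint64_t last_period;      /* period that produced the CURRENT sample */
        };

        /* Reprogramming only takes effect on the next overflow, so
         * remember the period that just expired for reporting. */
        static void adjust_period(struct hw_counter *hwc, uint64_t new_period)
        {
            hwc->last_period   = hwc->sample_period;
            hwc->sample_period = new_period;
        }

        static void report_sample(const struct hw_counter *hwc)
        {
            /* Logging sample_period here would be wrong whenever the
             * period was just adjusted; last_period is what this sample
             * actually ran with. */
            printf("PERF_SAMPLE_PERIOD = %llu\n",
                   (unsigned long long)hwc->last_period);
        }

        int main(void)
        {
            struct hw_counter hwc = { 10000, 10000 };

            adjust_period(&hwc, 12000);    /* frequency-based adjustment */
            report_sample(&hwc);           /* prints 10000, not 12000 */
            return 0;
        }
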
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9e350de3
    • perf_counter: Introduce struct for sample data · df1a132b
      Peter Zijlstra authored
      
      For easy extension of the sample data, put it in a structure.
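
      A sketch of the kind of structure this introduces; treat the field
      set as illustrative rather than the exact perf_sample_data layout:

        #include <stdint.h>
        #include <stdio.h>

        struct pt_regs;    /* opaque here */

        /* Bundling the sample payload in one struct means new fields can
         * be added later without touching every handler signature. */
        struct perf_sample_data {
            struct pt_regs *regs;
            uint64_t       addr;
            uint64_t       period;
        };

        static void counter_overflow(int nmi, struct perf_sample_data *data)
        {
            printf("overflow: addr=%#llx period=%llu (nmi=%d)\n",
                   (unsigned long long)data->addr,
                   (unsigned long long)data->period, nmi);
        }

        int main(void)
        {
            struct perf_sample_data data = { NULL, 0x1000, 4096 };

            counter_overflow(0, &data);
            return 0;
        }
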
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      df1a132b
  13. 06 Jun, 2009 1 commit
    • perf_counter: Separate out attr->type from attr->config · a21ca2ca
      Ingo Molnar authored
      
      Counter type is a frequently used value and we do a lot of
      bit juggling by encoding and decoding it from attr->config.
      
      Clean this up by creating a separate attr->type field.
      
      Also clean up the various similarly complex user-space bits
      all around counter attribute management.
      
      The net improvement is significant, and it will be easier
      to add a new major type (which is what triggered this cleanup).
      
      (This changes the ABI, all tools are adapted.)
      (PowerPC build-tested.)
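
      The shape of the change, sketched with stand-in definitions rather
      than the real ABI layout: instead of packing the event class into
      config's high bits, the attribute carries an explicit type field.

        #include <stdint.h>
        #include <stdio.h>

        /* Stand-in event classes, mirroring the idea of PERF_TYPE_*. */
        enum counter_type { TYPE_HARDWARE, TYPE_SOFTWARE, TYPE_TRACEPOINT };

        /* Before: the type squeezed into config's high bits, decoded with
         * shifts and masks everywhere (illustrative encoding only). */
        struct counter_attr_old {
            uint64_t config;    /* type in bits 63..56, event id below */
        };

        /* After: the frequently used type gets its own field. */
        struct counter_attr_new {
            uint32_t type;
            uint64_t config;    /* just the event id for that type */
        };

        int main(void)
        {
            struct counter_attr_old old = { ((uint64_t)TYPE_SOFTWARE << 56) | 3 };
            struct counter_attr_new cur = { TYPE_SOFTWARE, 3 };

            printf("old: type=%u id=%llu\n",
                   (unsigned)(old.config >> 56),
                   (unsigned long long)(old.config & ((1ULL << 56) - 1)));
            printf("new: type=%u id=%llu\n",
                   (unsigned)cur.type, (unsigned long long)cur.config);
            return 0;
        }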
      
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      a21ca2ca
  14. 04 Jun, 2009 1 commit
  15. 03 Jun, 2009 1 commit
    • perf_counter: powerpc: Fix race causing "oops trying to read PMC0" errors · dcd945e0
      Paul Mackerras authored
      
      When using interrupting counters and limited (non-interrupting)
      counters at the same time, it's possible that we get an
      interrupt in write_mmcr0() after writing MMCR0 but before we
      have set up the counters using limited PMCs.  What happens then
      is that we get into perf_counter_interrupt() with
      counter->hw.idx = 0 for the limited counters, leading to the
      "oops trying to read PMC0" error message being printed.
      
      This fixes the problem by making perf_counter_interrupt()
      robust against counter->hw.idx being zero (the counter is just
      ignored in that case) and also by changing write_mmcr0() to
      write MMCR0 initially with the counter overflow interrupt
      enable bits masked (set to 0).  If the MMCR0 value requested by
      the caller has either of those bits set, we write MMCR0 again
      with the requested value of those bits after setting up the
      limited counters properly.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Stephane Eranian <eranian@googlemail.com>
      LKML-Reference: <18982.17684.138182.954599@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dcd945e0
  16. 02 Jun, 2009 2 commits
    • perf_counter: Rename perf_counter_hw_event => perf_counter_attr · 0d48696f
      Peter Zijlstra authored
      
      The structure isn't hw only and when I read event, I think about those
      things that fall out the other end. Rename the thing.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      Cc: Stephane Eranian <eranian@googlemail.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0d48696f
    • perf_counter: Rename various fields · b23f3325
      Peter Zijlstra authored
      
      A few renames:
      
        s/irq_period/sample_period/
        s/irq_freq/sample_freq/
        s/PERF_RECORD_/PERF_SAMPLE_/
        s/record_type/sample_type/
      
      And change both the new sample_type and read_format to u64.
      Reported-by: Stephane Eranian <eranian@googlemail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Marcelo Tosatti <mtosatti@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Kacur <jkacur@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      b23f3325
  17. 26 May, 2009 1 commit
    • perf_counter: powerpc: Implement interrupt throttling · 8a7b8cb9
      Paul Mackerras authored
      
      This implements interrupt throttling on powerpc.  Since we don't have
      individual count enable/disable or interrupt enable/disable controls
      per counter, this simply sets the hardware counter to 0, meaning that
      it will not interrupt again until it has counted 2^31 counts, which
      will take at least 2^30 cycles assuming a maximum of 2 counts per
      cycle.  Also, we set counter->hw.period_left to the maximum possible
      value (2^63 - 1), so we won't report overflows for this counter for
      the foreseeable future.
      
      The unthrottle operation restores counter->hw.period_left and the
      hardware counter so that we will once again report a counter overflow
      after counter->hw.irq_period counts.
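
      The throttle/unthrottle bookkeeping can be sketched as below, with a
      stub in place of the real PMC write; the constants follow the
      reasoning above (a 32-bit PMC raises its interrupt condition after
      2^31 counts from zero):

        #include <stdint.h>
        #include <stdio.h>

        struct hw_counter {
            int      idx;
            uint64_t period_left;          /* counts until next reported overflow */
            uint64_t saved_period_left;    /* stashed across a throttle */
        };

        static uint32_t pmc[8];    /* stand-in for the hardware PMCs */

        static void write_pmc(int idx, uint32_t val) { pmc[idx] = val; }

        /* Throttle: no per-counter interrupt disable exists, so park the
         * PMC at 0 (2^31 counts until it next interrupts) and push the
         * software period out to the maximum so no overflow is reported. */
        static void throttle(struct hw_counter *hwc)
        {
            hwc->saved_period_left = hwc->period_left;
            hwc->period_left = ~0ULL >> 1;    /* 2^63 - 1 */
            write_pmc(hwc->idx, 0);
        }

        /* Unthrottle: restore the period and reprogram the PMC so the
         * next overflow fires after the usual number of counts (assumes
         * period_left fits in 31 bits, as a real period would). */
        static void unthrottle(struct hw_counter *hwc)
        {
            hwc->period_left = hwc->saved_period_left;
            write_pmc(hwc->idx, 0x80000000u - (uint32_t)hwc->period_left);
        }

        int main(void)
        {
            struct hw_counter hwc = { 1, 100000, 0 };

            throttle(&hwc);
            unthrottle(&hwc);
            printf("PMC1=%#x period_left=%llu\n", (unsigned)pmc[1],
                   (unsigned long long)hwc.period_left);
            return 0;
        }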
      
      [ Impact: new perfcounters robustness feature on PowerPC ]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18971.35823.643362.446774@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      8a7b8cb9
  18. 18 May, 2009 1 commit
    • perf_counter: powerpc: initialize cpuhw pointer before use · c0daaf3f
      Paul Mackerras authored
      Commit 9e35ad38 ("perf_counter: Rework the perf counter
      disable/enable") added code to the powerpc hw_perf_enable (renamed
      from hw_perf_restore) to test cpuhw->disabled and return immediately
      if it is not set (i.e. if the PMU is already enabled).
      
      Unfortunately the test got added before cpuhw was initialized,
      resulting in an oops the first time hw_perf_enable got called.
      This fixes it by moving the initialization of cpuhw to before
      cpuhw->disabled is tested.
      
      [ Impact: fix oops-causing bug on powerpc ]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <18960.56772.869734.304631@drongo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      c0daaf3f
  19. 15 May, 2009 4 commits
    • perf_counter: powerpc: supply more precise information on counter overflow events · 0bbd0d4b
      Paul Mackerras authored
      
      This uses values from the MMCRA, SIAR and SDAR registers on
      powerpc to supply more precise information for overflow events,
      including a data address when PERF_RECORD_ADDR is specified.
      
      Since POWER6 uses different bit positions in MMCRA from earlier
      processors, this converts the struct power_pmu limited_pmc5_6
      field, which only had 0/1 values, into a flags field and
      defines bit values for its previous use (PPMU_LIMITED_PMC5_6)
      and a new flag (PPMU_ALT_SIPR) to indicate that the processor
      uses the POWER6 bit positions rather than the earlier
      positions.  It also adds definitions in reg.h for the new and
      old positions of the bit that indicates that the SIAR and SDAR
      values come from the same instruction.
      
      For the data address, the SDAR value is supplied if we are not
      doing instruction sampling.  In that case there is no guarantee
      that the address given in the PERF_RECORD_ADDR subrecord will
      correspond to the instruction whose address is given in the
      PERF_RECORD_IP subrecord.
      
      If instruction sampling is enabled (e.g. because this counter
      is counting a marked instruction event), then we only supply
      the SDAR value for the PERF_RECORD_ADDR subrecord if it
      corresponds to the instruction whose address is in the
      PERF_RECORD_IP subrecord.  Otherwise we supply 0.
      
      [ Impact: support more PMU hardware features on PowerPC ]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18955.37028.48861.555309@drongo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      0bbd0d4b
    • perf_counter: powerpc: use u64 for event codes internally · ef923214
      Paul Mackerras authored
      
      Although the perf_counter API allows 63-bit raw event codes,
      internally in the powerpc back-end we had been using 32-bit
      event codes.  This expands them to 64 bits so that we can add
      bits for specifying threshold start/stop events and instruction
      sampling modes later.
      
      This also corrects the return value of can_go_on_limited_pmc;
      we were returning an event code rather than just a 0/1 value in
      some circumstances. That didn't particularly matter while event
      codes were 32-bit, but now that event codes are 64-bit it
      might, so this fixes it.
      
      [ Impact: extend PowerPC perfcounter interfaces from u32 to u64 ]
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <18955.36874.472452.353104@drongo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ef923214
    • perf_counter: frequency based adaptive irq_period · 60db5e09
      Peter Zijlstra authored
      
      Instead of specifying the irq_period for a counter, provide a target interrupt
      frequency and dynamically adapt the irq_period to match this frequency.
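
      The gist, as a toy calculation (the kernel's adjustment is more
      careful about averaging and clamping than this):

        #include <stdint.h>
        #include <stdio.h>

        /* Given how many events were counted over the last second, pick a
         * new irq_period so interrupts arrive at roughly sample_freq/sec. */
        static uint64_t adapt_period(uint64_t events_last_sec, uint64_t sample_freq)
        {
            uint64_t period = events_last_sec / sample_freq;

            return period ? period : 1;    /* never program a zero period */
        }

        int main(void)
        {
            uint64_t freq = 1000;    /* want ~1000 interrupts per second */

            /* As the event rate changes, the period tracks it. */
            printf("period = %llu\n",
                   (unsigned long long)adapt_period(2000000000ULL, freq));
            printf("period = %llu\n",
                   (unsigned long long)adapt_period(50000000ULL, freq));
            return 0;
        }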
      
      [ Impact: new perf-counter attribute/feature ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <20090515132018.646195868@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      60db5e09
    • perf_counter: Rework the perf counter disable/enable · 9e35ad38
      Peter Zijlstra authored
      
      The current disable/enable mechanism is:
      
      	token = hw_perf_save_disable();
      	...
      	/* do bits */
      	...
      	hw_perf_restore(token);
      
      This works well, provided that the use nests properly. Except we don't.
      
      x86 NMI/INT throttling has non-nested use of this, breaking things. Therefore
      provide a reference counter disable/enable interface, where the first disable
      disables the hardware, and the last enable enables the hardware again.
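
      A minimal sketch of the reference-counted interface that replaces
      the save/restore token; the per-CPU state and the real hardware
      accesses are stubbed out:

        #include <stdio.h>

        static int disable_depth;    /* per-CPU in the kernel */
        static int hw_enabled = 1;

        static void hw_really_disable(void) { hw_enabled = 0; }
        static void hw_really_enable(void)  { hw_enabled = 1; }

        /* The first disable touches the hardware; nested calls just count. */
        static void perf_disable(void)
        {
            if (disable_depth++ == 0)
                hw_really_disable();
        }

        /* The last enable touches the hardware again, so non-nested users
         * (e.g. NMI throttling) can no longer unbalance things. */
        static void perf_enable(void)
        {
            if (--disable_depth == 0)
                hw_really_enable();
        }

        int main(void)
        {
            perf_disable();
            perf_disable();    /* nested use */
            perf_enable();     /* hardware stays disabled here */
            printf("after inner enable: hw_enabled=%d\n", hw_enabled);
            perf_enable();     /* outermost enable re-enables */
            printf("after outer enable: hw_enabled=%d\n", hw_enabled);
            return 0;
        }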
      
      [ Impact: refactor, simplify the PMU disable/enable logic ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      9e35ad38
  20. 29 Apr, 2009 2 commits
    • perf_counter: powerpc: allow use of limited-function counters · ab7ef2e5
      Paul Mackerras authored
      
      POWER5+ and POWER6 have two hardware counters with limited functionality:
      PMC5 counts instructions completed in run state and PMC6 counts cycles
      in run state.  (Run state is the state when a hardware RUN bit is 1;
      the idle task clears RUN while waiting for work to do and sets it when
      there is work to do.)
      
      These counters can't be written to by the kernel, can't generate
      interrupts, and don't obey the freeze conditions.  That means we can
      only use them for per-task counters (where we know we'll always be in
      run state; we can't put a per-task counter on an idle task), and only
      if we don't want interrupts and we do want to count in all processor
      modes.
      
      Obviously some counters can't go on a limited hardware counter, but there
      are also situations where we can only put a counter on a limited hardware
      counter - if there are already counters on that exclude some processor
      modes and we want to put on a per-task cycle or instruction counter that
      doesn't exclude any processor mode, it could go on if it can use a
      limited hardware counter.
      
      To keep track of these constraints, this adds a flags argument to the
      processor-specific get_alternatives() functions, with three bits defined:
      one to say that we can accept alternative event codes that go on limited
      counters, one to say we only want alternatives on limited counters, and
      one to say that this is a per-task counter and therefore events that are
      gated by run state are equivalent to those that aren't (e.g. a "cycles"
      event is equivalent to a "cycles in run state" event).  These flags
      are computed for each counter and stored in the counter->hw.counter_base
      field (slightly wonky name for what it does, but it was an existing
      unused field).
      
      Since the limited counters don't freeze when we freeze the other counters,
      we need some special handling to avoid getting skew between things counted
      on the limited counters and those counted on normal counters.  To minimize
      this skew, if we are using any limited counters, we read PMC5 and PMC6
      immediately after setting and clearing the freeze bit.  This is done in
      a single asm in the new write_mmcr0() function.
      
      The code here is specific to PMC5 and PMC6 being the limited hardware
      counters.  Being more general (e.g. having a bitmap of limited hardware
      counter numbers) would have meant more complex code to read the limited
      counters when freezing and unfreezing the normal counters, with
      conditional branches, which would have increased the skew.  Since it
      isn't necessary for the code to be more general at this stage, it isn't.
      
      This also extends the back-ends for POWER5+ and POWER6 to be able to
      handle up to 6 counters rather than the 4 they previously handled.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Robert Richter <robert.richter@amd.com>
      LKML-Reference: <18936.19035.163066.892208@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ab7ef2e5
    • perfcounters: rename struct hw_perf_counter_ops into struct pmu · 4aeb0b42
      Robert Richter authored
      
      This patch renames struct hw_perf_counter_ops into struct pmu. It
      introduces a structure to describe a cpu specific pmu (performance
      monitoring unit). It may contain ops and data. The new name of the
      structure fits better, is shorter, and thus better to handle. Where it
      was appropriate, names of function and variable have been changed too.
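
      Roughly the shape of the renamed structure, sketched with stand-in
      ops; treat the exact member list as illustrative:

        #include <stdio.h>

        struct perf_counter;    /* opaque for the sketch */

        /* One instance per PMU implementation: ops plus, if needed, data. */
        struct pmu {
            int  (*enable)(struct perf_counter *counter);
            void (*disable)(struct perf_counter *counter);
            void (*read)(struct perf_counter *counter);
            void (*unthrottle)(struct perf_counter *counter);
        };

        static int  demo_enable(struct perf_counter *c)  { (void)c; puts("enable");  return 0; }
        static void demo_disable(struct perf_counter *c) { (void)c; puts("disable"); }
        static void demo_read(struct perf_counter *c)    { (void)c; puts("read"); }

        static const struct pmu demo_pmu = {
            .enable  = demo_enable,
            .disable = demo_disable,
            .read    = demo_read,
            /* .unthrottle left NULL: optional op */
        };

        int main(void)
        {
            demo_pmu.enable(NULL);
            demo_pmu.read(NULL);
            demo_pmu.disable(NULL);
            return 0;
        }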
      
      [ Impact: cleanup ]
      Signed-off-by: Robert Richter <robert.richter@amd.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1241002046-8832-7-git-send-email-robert.richter@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      4aeb0b42
  21. 09 Apr, 2009 1 commit
    • perf_counter: powerpc: add nmi_enter/nmi_exit calls · ca8f2d7f
      Paul Mackerras authored
      Impact: fix potential deadlocks on powerpc
      
      Now that the core is using in_nmi() (added in e30e08f6, "perf_counter:
      fix NMI race in task clock"), we need the powerpc perf_counter_interrupt
      to call nmi_enter() and nmi_exit() in those cases where the interrupt
      happens when interrupts are soft-disabled.
      
      If interrupts were soft-enabled, we can treat it as a regular interrupt
      and do irq_enter/irq_exit around the whole routine. This lets us get rid
      of the test_perf_counter_pending() call at the end of
      perf_counter_interrupt, thus simplifying things a little.
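
      The resulting control flow, sketched with stub entry/exit helpers
      and a hypothetical soft-enable flag (the real code inspects the
      powerpc soft-disable state of the interrupted context):

        #include <stdbool.h>
        #include <stdio.h>

        struct pt_regs { bool soft_enabled; };    /* stand-in */

        static void nmi_enter(void) { puts("nmi_enter"); }
        static void nmi_exit(void)  { puts("nmi_exit"); }
        static void irq_enter(void) { puts("irq_enter"); }
        static void irq_exit(void)  { puts("irq_exit"); }

        /* An interrupt taken while interrupts were soft-disabled must be
         * treated as an NMI; otherwise it is a regular interrupt. */
        static void perf_counter_interrupt(struct pt_regs *regs)
        {
            bool nmi = !regs->soft_enabled;

            if (nmi)
                nmi_enter();
            else
                irq_enter();

            /* ... read the PMCs, record samples, re-arm the counters ... */

            if (nmi)
                nmi_exit();
            else
                irq_exit();
        }

        int main(void)
        {
            struct pt_regs soft_off = { false };
            struct pt_regs soft_on  = { true };

            perf_counter_interrupt(&soft_off);    /* nmi_enter/nmi_exit */
            perf_counter_interrupt(&soft_on);     /* irq_enter/irq_exit */
            return 0;
        }
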
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <18909.31952.873098.336615@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      ca8f2d7f
  22. 08 Apr, 2009 3 commits
    • perf_counter: allow for data addresses to be recorded · 78f13e95
      Peter Zijlstra authored
      
      Paul suggested we allow for data addresses to be recorded along with
      the traditional IPs as power can provide these.
      
      For now, only the software pagefault events provide data addresses,
      but in the future power might provide them as well for some events.
      
      x86 doesn't seem capable of providing this atm.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <20090408130409.394816925@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      78f13e95
    • perf_counter: powerpc: set sample enable bit for marked instruction events · f708223d
      Paul Mackerras authored
      
      Impact: enable access to hardware feature
      
      POWER processors have the ability to "mark" a subset of the instructions
      and provide more detailed information on what happens to the marked
      instructions as they flow through the pipeline.  This marking is
      enabled by the "sample enable" bit in MMCRA, and there are
      synchronization requirements around setting and clearing the bit.
      
      This adds logic to the processor-specific back-ends so that they know
      which events relate to marked instructions and set the sampling enable
      bit if any event that we want to put on the PMU is a marked instruction
      event.  It also adds logic to the generic powerpc code to do the
      necessary synchronization if that bit is set.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <18908.31930.1024.228867@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f708223d
    • perf_counter: fix powerpc build · dc66270b
      Paul Mackerras authored
      Commit 4af4998b ("perf_counter: rework context time") changed struct
      perf_counter_context to have a 'time' field instead of a 'time_now'
      field, but neglected to fix the place in the powerpc perf_counter.c
      where the time_now field was accessed.  This fixes it.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <18908.31922.411398.147810@cargo.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      dc66270b
  23. 07 Apr, 2009 1 commit
    • perf_counter: theres more to overflow than writing events · f6c7d5fe
      Peter Zijlstra authored
      
      Prepare for more generic overflow handling. The new perf_counter_overflow()
      method will handle the generic bits of the counter overflow, and can return
      a !0 return value, in which case the counter should be (soft) disabled, so
      that it won't count until it's properly disabled.
      
      XXX: do powerpc and swcounter
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      LKML-Reference: <20090406094517.812109629@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      f6c7d5fe
  24. 06 Apr, 2009 3 commits
    • perf_counter: make it possible for hw_perf_counter_init to return error codes · d5d2bc0d
      Paul Mackerras authored
      
      Impact: better error reporting
      
      At present, if hw_perf_counter_init encounters an error, all it can do
      is return NULL, which causes sys_perf_counter_open to return an EINVAL
      error to userspace.  This isn't very informative for userspace; it means
      that userspace can't tell the difference between "sorry, oprofile is
      already using the PMU" and "we don't support this CPU" and "this CPU
      doesn't support the requested generic hardware event".
      
      This commit uses the PTR_ERR/ERR_PTR/IS_ERR set of macros to let
      hw_perf_counter_init return an error code on error rather than just NULL
      if it wishes.  If it does so, that error code will be returned from
      sys_perf_counter_open to userspace.  If it returns NULL, an EINVAL
      error will be returned to userspace, as before.
      
      This also adapts the powerpc hw_perf_counter_init to make use of this
      to return ENXIO, EINVAL, EBUSY, or EOPNOTSUPP as appropriate.  It would
      be good to add extra error numbers in future to allow userspace to
      distinguish the various errors that are currently reported as EINVAL,
      i.e. irq_period < 0, too many events in a group, conflict between
      exclude_* settings in a group, and PMU resource conflict in a group.
      
      [ v2: fix a bug pointed out by Corey Ashford where error returns from
            hw_perf_counter_init were not handled correctly in the case of
            raw hardware events.]
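
      A self-contained illustration of the ERR_PTR/IS_ERR/PTR_ERR
      convention; the macros are re-implemented here in simplified form
      (in the kernel they come from <linux/err.h>):

        #include <errno.h>
        #include <stdint.h>
        #include <stdio.h>

        /* Simplified versions of the kernel helpers: a small negative
         * errno is encoded in the (otherwise invalid) top of the pointer
         * range. */
        #define ERR_PTR(err)  ((void *)(intptr_t)(err))
        #define PTR_ERR(ptr)  ((long)(intptr_t)(ptr))
        #define IS_ERR(ptr)   ((uintptr_t)(ptr) >= (uintptr_t)-4095)

        struct counter { int dummy; };

        static struct counter the_counter;

        /* The back-end can now say WHY it failed, not just return NULL. */
        static struct counter *hw_counter_init(int event, int pmu_busy)
        {
            if (pmu_busy)
                return ERR_PTR(-EBUSY);    /* e.g. oprofile holds the PMU */
            if (event < 0)
                return ERR_PTR(-EINVAL);
            return &the_counter;
        }

        int main(void)
        {
            struct counter *c = hw_counter_init(1, /* pmu_busy= */ 1);

            if (IS_ERR(c))
                printf("init failed: errno %ld\n", -PTR_ERR(c));
            else
                puts("init ok");
            return 0;
        }
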
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Orig-LKML-Reference: <20090330171023.682428180@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      d5d2bc0d
    • perf_counter: powerpc: only reserve PMU hardware when we need it · 7595d63b
      Paul Mackerras authored
      
      Impact: cooperate with oprofile
      
      At present, on PowerPC, if you have perf_counters compiled in, oprofile
      doesn't work.  There is code to allow the PMU to be shared between
      competing subsystems, such as perf_counters and oprofile, but currently
      the perf_counter subsystem reserves the PMU for itself at boot time,
      and never releases it.
      
      This makes perf_counter play nicely with oprofile.  Now we keep a count
      of how many perf_counter instances are counting hardware events, and
      reserve the PMU when that count becomes non-zero, and release the PMU
      when that count becomes zero.  This means that it is possible to have
      perf_counters compiled in and still use oprofile, as long as there are
      no hardware perf_counters active.  This also means that if oprofile is
      active, sys_perf_counter_open will fail if the hw_event specifies a
      hardware event.
      
      To avoid races with other tasks creating and destroying perf_counters,
      we use a mutex.  We use atomic_inc_not_zero and atomic_add_unless to
      avoid having to take the mutex unless there is a possibility of the
      count going between 0 and 1.
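
      A user-space sketch of the counting scheme, with C11 atomics and a
      pthread mutex standing in for the kernel's atomic_t and mutex, and
      the actual PMU reservation stubbed:

        #include <pthread.h>
        #include <stdatomic.h>
        #include <stdio.h>

        static atomic_int num_counters;    /* active hardware counters */
        static pthread_mutex_t pmc_reserve_mutex = PTHREAD_MUTEX_INITIALIZER;

        static int  reserve_pmc_hardware(void) { puts("PMU reserved");  return 0; }
        static void release_pmc_hardware(void) { puts("PMU released"); }

        /* Equivalent of atomic_inc_not_zero(): only bump if already
         * non-zero, so the 0 -> 1 transition always happens under the
         * mutex. */
        static int inc_not_zero(atomic_int *v)
        {
            int old = atomic_load(v);

            while (old != 0)
                if (atomic_compare_exchange_weak(v, &old, old + 1))
                    return 1;
            return 0;
        }

        static int hw_counter_init(void)
        {
            int err = 0;

            if (!inc_not_zero(&num_counters)) {
                pthread_mutex_lock(&pmc_reserve_mutex);
                if (atomic_load(&num_counters) == 0 && reserve_pmc_hardware())
                    err = -1;    /* e.g. oprofile already owns the PMU */
                if (!err)
                    atomic_fetch_add(&num_counters, 1);
                pthread_mutex_unlock(&pmc_reserve_mutex);
            }
            return err;
        }

        static void hw_counter_destroy(void)
        {
            /* The kernel skips the mutex via atomic_add_unless() when the
             * count cannot hit zero; this sketch just always takes it. */
            pthread_mutex_lock(&pmc_reserve_mutex);
            if (atomic_fetch_sub(&num_counters, 1) == 1)
                release_pmc_hardware();
            pthread_mutex_unlock(&pmc_reserve_mutex);
        }

        int main(void)
        {
            hw_counter_init();       /* reserves the PMU */
            hw_counter_init();       /* just bumps the count */
            hw_counter_destroy();
            hw_counter_destroy();    /* last user: releases the PMU */
            return 0;
        }
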
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Orig-LKML-Reference: <20090330171023.627912475@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7595d63b
    • perf_counter: unify and fix delayed counter wakeup · 925d519a
      Peter Zijlstra authored
      
      While going over the wakeup code I noticed delayed wakeups only work
      for hardware counters but basically all software counters rely on
      them.
      
      This patch unifies and generalizes the delayed wakeup to fix this
      issue.
      
      Since we're dealing with NMI context bits here, use a cmpxchg() based
      single link list implementation to track counters that have pending
      wakeups.
      
      [ This should really be generic code for delayed wakeups, but since we
        cannot use cmpxchg()/xchg() in generic code, I've let it live in the
        perf_counter code. -- Eric Dumazet could use it to aggregate the
        network wakeups. ]
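
      The NMI-safe pending list amounts to a compare-and-swap push onto a
      single-linked list; a user-space sketch with C11 atomics (the kernel
      version uses cmpxchg() on the list head):

        #include <stdatomic.h>
        #include <stdio.h>

        /* A counter with a pending-wakeup link.  Pushing needs only a
         * single compare-and-swap on the list head, so it is safe from
         * NMI context where taking locks is not. */
        struct counter {
            const char     *name;
            struct counter *next_pending;
        };

        static _Atomic(struct counter *) pending_head;

        static void mark_pending(struct counter *c)
        {
            struct counter *old = atomic_load(&pending_head);

            do {
                c->next_pending = old;
            } while (!atomic_compare_exchange_weak(&pending_head, &old, c));
        }

        /* Later, in a safe (non-NMI) context, grab the whole list at once
         * and deliver the delayed wakeups. */
        static void run_pending_wakeups(void)
        {
            struct counter *c = atomic_exchange(&pending_head,
                                                (struct counter *)NULL);

            while (c) {
                printf("wakeup for %s\n", c->name);
                c = c->next_pending;
            }
        }

        int main(void)
        {
            struct counter a = { "a", NULL };
            struct counter b = { "b", NULL };

            mark_pending(&a);
            mark_pending(&b);
            run_pending_wakeups();    /* prints b, then a */
            return 0;
        }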
      
      Furthermore, the x86 method of using TIF flags was flawed in that it's
      quite possible to end up setting the bit on the idle task, losing the
      wakeup.
      
      The powerpc method uses per-cpu storage and does appear to be
      sufficient.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: Paul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090330171023.153932974@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      925d519a