1. 07 May, 2007 10 commits
    • slab allocators: Remove obsolete SLAB_MUST_HWCACHE_ALIGN · 5af60839
      Christoph Lameter authored
      
      This patch was recently posted to lkml and acked by Pekka.
      
      The flag SLAB_MUST_HWCACHE_ALIGN:
      
      1. is never checked by SLAB at all,
      
      2. is a duplicate of SLAB_HWCACHE_ALIGN for SLUB, and
      
      3. fulfills the role of SLAB_HWCACHE_ALIGN for SLOB.
      
      The only remaining users are sparc64 and ppc64, where the flag reflects
      some earlier role it may once have had. Wherever it is specified,
      SLAB_HWCACHE_ALIGN is also specified.
      
      The flag is confusing and inconsistent, and it has no purpose.
      
      Remove it.
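      
      As a minimal before/after sketch of what a caller looks like once the
      flag is gone (hypothetical cache name and size, written against the
      2.6.21-era six-argument kmem_cache_create()):
      
        /* before: both flags were passed, but only one ever mattered */
        cache = kmem_cache_create("example_cache", size, 0,
                                  SLAB_HWCACHE_ALIGN | SLAB_MUST_HWCACHE_ALIGN,
                                  NULL, NULL);
      
        /* after: SLAB_HWCACHE_ALIGN alone expresses the same intent */
        cache = kmem_cache_create("example_cache", size, 0,
                                  SLAB_HWCACHE_ALIGN,
                                  NULL, NULL);
      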
      Acked-by: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Quicklist support for sparc64 · 3a2cba99
      David Miller authored
      
      I ported this to sparc64 as per the patch below, tested on a UP
      SunBlade1500 and a 24-cpu Niagara T1000.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Andi Kleen <ak@suse.de>
      Cc: "Luck, Tony" <tony.luck@intel.com>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Make page->private usable in compound pages · d85f3385
      Christoph Lameter authored
      
      If we add a new flag so that we can distinguish between the first page
      and the tail pages, then we can avoid using page->private in the first
      page.  page->private == page for the first page, so there is no real
      information there.
      
      Freeing up page->private makes the use of compound pages more
      transparent.  They become usable more like real pages.  Right now we
      have to be careful, e.g. if we go beyond PAGE_SIZE allocations in the
      slab on i386, because we can then no longer use the private field.
      This is one of the issues that causes us not to support debugging for
      page-size slabs in SLAB.
      
      Having page->private available for SLUB would allow more meta information in
      the page struct.  I can probably avoid the 16 bit ints that I have in there
      right now.
      
      Also, if page->private is available then a compound page may be equipped
      with buffer heads.  This may clear the way for filesystems to support
      blocks larger than the page size.
      
      We add PageTail as an alias of PageReclaim.  Compound pages cannot currently
      be reclaimed.  Because of the alias one needs to check PageCompound first.
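      
      A minimal sketch of that check discipline (the helper names are
      illustrative, not part of the patch):
      
        /* PageTail is only meaningful on pages already known to be compound,
         * because it aliases PageReclaim on ordinary pages. */
        static inline int page_is_compound_tail(struct page *page)
        {
                return PageCompound(page) && PageTail(page);
        }
      
        /* Tail pages keep pointing at the head via page->private; it is only
         * the first page whose page->private this patch frees up. */
        static inline struct page *head_of(struct page *page)
        {
                if (page_is_compound_tail(page))
                        return (struct page *)page->private;
                return page;
        }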
      
      The RFC for this approach was discussed at
      http://marc.info/?t=117574302800001&r=1&w=2
      
      [nacc@us.ibm.com: fix hugetlbfs]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • PowerPC: Disable SLUB for configurations in which slab page structs are modified · 30520864
      Christoph Lameter authored
      
      PowerPC uses the slab allocator to manage the lowest level of the page
      table.  In high cpu configurations we also use the page struct to split the
      page table lock.  Disallow the selection of SLUB for that case.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • SLUB core · 81819f0f
      Christoph Lameter authored
      
      This is a new slab allocator which was motivated by the complexity of the
      existing code in mm/slab.c. It attempts to address a variety of concerns
      with the existing implementation.
      
      A. Management of object queues
      
         A particular concern was the complex management of the numerous object
         queues in SLAB. SLUB has no such queues. Instead we dedicate a slab for
         each allocating CPU and use objects from a slab directly instead of
         queueing them up.
      
      B. Storage overhead of object queues
      
         SLAB object queues exist per node and per CPU. The alien cache queue
         even has a queue array that contains a queue for each processor on
         each node. For very large systems the number of queues and the number
         of objects that may be caught in those queues grows exponentially. On
         our systems with 1k nodes / processors we have several gigabytes just
         tied up for storing references to objects in those queues.  This does
         not include the objects that could be on those queues. One fears that
         the whole memory of the machine could one day be consumed by those
         queues.
      
      C. SLAB meta data overhead
      
         SLAB has overhead at the beginning of each slab. This means that data
         cannot be naturally aligned at the beginning of a slab block. SLUB
         keeps all metadata in the corresponding page struct. Objects can
         therefore be naturally aligned in the slab. E.g. a 128 byte object
         will be aligned at 128 byte boundaries and can fit tightly into a 4k
         page with no bytes left over. SLAB cannot do this.
      
      D. SLAB has a complex cache reaper
      
         SLUB does not need a cache reaper for UP systems. On SMP systems
         the per-CPU slab may be pushed back onto the partial list, but that
         operation is simple and does not require an iteration over a list
         of objects. SLAB expires per-CPU, shared and alien object queues
         during cache reaping, which may cause strange hold-offs.
      
      E. SLAB has complex NUMA policy layer support
      
         SLUB pushes NUMA policy handling into the page allocator. This means
         that allocation is coarser (SLUB does interleave on a page level) but
         that situation was also present before 2.6.13. SLAB's application of
         memory policies to individual slab objects is certainly a performance
         concern due to the frequent references to memory policies, and it may
         lead to a sequence of objects coming from one node after another.
         SLUB will get a slab full of objects from one node and then switch to
         the next.
      
      F. Reduction of the size of partial slab lists
      
         SLAB has per-node partial lists. This means that over time a large
         number of partial slabs may accumulate on those lists. These can
         only be reused if allocations occur on those specific nodes. SLUB
         has a global pool of partial slabs and will consume slabs from that
         pool to decrease fragmentation.
      
      G. Tunables
      
         SLAB has sophisticated tuning abilities for each slab cache. One can
         manipulate the queue sizes in detail. However, filling the queues
         still requires the use of the spin lock to check out slabs. SLUB has
         a global parameter (slub_min_order) for tuning. Increasing the
         minimum slab order can decrease the locking overhead. The bigger the
         slab order, the fewer movements of pages between the per-CPU and
         partial lists occur, and the better SLUB will scale.
      
      H. Slab merging
      
         We often have slab caches with similar parameters. SLUB detects those
         at boot and merges them into the corresponding general caches. This
         leads to more effective memory use. About 50% of all caches can
         be eliminated through slab merging. This will also decrease
         slab fragmentation because partially allocated slabs can be filled
         up again. Slab merging can be switched off by specifying
         slub_nomerge at boot time.
      
         Note that merging can expose heretofore unknown bugs in the kernel
         because corrupted objects may now be placed differently and corrupt
         different neighboring objects. Enable sanity checks to find those.
      
      I. Diagnostics
      
         The current slab diagnostics are difficult to use and require a
         recompilation of the kernel. SLUB contains debugging code that
         is always available (but is kept out of the hot code paths).
         SLUB diagnostics can be enabled via the "slub_debug" option.
         Parameters can be specified to select a single slab cache or a
         group of slab caches for diagnostics. This means that the system
         runs with the usual performance, and it is much more likely that
         race conditions can be reproduced.
      
      J. Resiliency
      
         If basic sanity checks are on then SLUB is capable of detecting
         common error conditions and recovering as best as possible to allow
         the system to continue.
      
      K. Tracing
      
         Tracing can be enabled via the slub_debug=T,<slabcache> option
         during boot. SLUB will then log all actions on that slab cache
         and dump the object contents on free.
      
      L. On demand DMA cache creation
      
         Generally DMA caches are not needed. If a kmalloc is done with
         __GFP_DMA then just the single slab cache that is needed is created.
         For systems that have no ZONE_DMA requirement the support is
         completely eliminated.
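      
         As an illustration (not code from the patch), the first allocation
         of this kind is what would cause the matching DMA kmalloc cache to
         be created:
      
           /* buffer for a device that can only reach ZONE_DMA memory */
           void *buf = kmalloc(512, GFP_KERNEL | GFP_DMA);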
      
      M. Performance increase
      
         Some benchmarks have shown speed improvements on kernbench in the
         range of 5-10%. The locking overhead of SLUB is based on the
         underlying base allocation size. If we can reliably allocate
         larger order pages then it is possible to increase SLUB
         performance much further. The anti-fragmentation patches may
         enable further performance increases.
      
      Tested on:
      i386 UP + SMP, x86_64 UP + SMP + NUMA emulation, IA64 NUMA + Simulator
      
      SLUB Boot options
      
      slub_nomerge		Disable merging of slabs
      slub_min_order=x	Require a minimum order for slab caches. This
      			increases the managed chunk size and therefore
      			reduces metadata and locking overhead.
      slub_min_objects=x	Minimum objects per slab. Default is 8.
      slub_max_order=x	Avoid generating slabs larger than order specified.
      slub_debug		Enable all diagnostics for all caches
      slub_debug=<options>	Enable selective options for all caches
      slub_debug=<o>,<cache>	Enable selective options for a certain set of
      			caches
      
      Available Debug options
      F		Double Free checking, sanity and resiliency
      R		Red zoning
      P		Object / padding poisoning
      U		Track last free / alloc
      T		Trace all allocs / frees (only use for individual slabs).
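      
      As a hypothetical example combining the options above on the kernel
      command line (the cache name is only illustrative):
      
      	slub_min_order=1 slub_debug=FPU,kmalloc-64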
      
      To use SLUB: Apply this patch and then select SLUB as the default slab
      allocator.
      
      [hugh@veritas.com: fix an oops-causing locking error]
      [akpm@linux-foundation.org: various stupid cleanups and small fixes]
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Introduce CONFIG_HAS_DMA · 411f0f3e
      Heiko Carstens authored
      
      Architectures that don't support DMA can say so by adding a config
      NO_DMA to their Kconfig file.  This will prevent compilation of some
      DMA-specific driver code.  Also, dma-mapping-broken.h isn't needed
      anymore on at least s390.  This avoids compilation and linking of
      otherwise dead/broken code.
      
      Other architectures that include dma-mapping-broken.h are arm26, h8300,
      m68k, m68knommu and v850.  If these could be converted as well we could get
      rid of the header file.
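      
      A rough Kconfig sketch of how the pieces fit together (the driver
      symbol is hypothetical):
      
        # an architecture that cannot do DMA says so once ...
        config NO_DMA
        	bool
        	default y
      
        # ... generic code turns that into HAS_DMA ...
        config HAS_DMA
        	bool
        	depends on !NO_DMA
        	default y
      
        # ... and DMA-only driver code can simply depend on it.
        config EXAMPLE_DMA_DRIVER
        	tristate "Example driver that needs DMA"
        	depends on HAS_DMA
      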
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: Kyle McMartin <kyle@parisc-linux.org>
      Cc: <James.Bottomley@SteelEye.com>
      Cc: Tejun Heo <htejun@gmail.com>
      Cc: Jeff Garzik <jeff@garzik.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: <geert@linux-m68k.org>
      Cc: <zippel@linux-m68k.org>
      Cc: <spyro@f2s.com>
      Cc: <uclinux-v850@lsi.nec.co.jp>
      Cc: <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • serial: define FIXED_PORT flag for serial_core · abb4a239
      David Gibson authored
      
      At present, the serial core always allows setserial in userspace to change the
      port address, irq and base clock of any serial port.  That makes sense for
      legacy ISA ports, but not for (say) embedded ns16550 compatible serial ports
      at peculiar addresses.  In these cases, the kernel code configuring the ports
      must know exactly where they are, and their clocking arrangements (which can
      be unusual on embedded boards).  It doesn't make sense for userspace to change
      these settings.
      
      Therefore, this patch defines a UPF_FIXED_PORT flag for the uart_port
      structure.  If this flag is set when the serial port is configured, any
      attempts to alter the port's type, io address, irq or base clock with
      setserial are ignored.
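      
      A small sketch of the intended use from platform setup code (all values
      are made up; only the UPF_FIXED_PORT flag comes from this patch):
      
        static struct uart_port example_port = {
                .mapbase  = 0x1c000000,         /* hypothetical on-chip address */
                .irq      = 17,
                .uartclk  = 33333333,           /* unusual board clock */
                .iotype   = UPIO_MEM,
                .flags    = UPF_BOOT_AUTOCONF | UPF_FIXED_PORT,
        };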
      
      In addition this patch uses the new flag for on-chip serial ports probed in
      arch/powerpc/kernel/legacy_serial.c, and for other hard-wired serial ports
      probed by drivers/serial/of_serial.c.
      Signed-off-by: David Gibson <dwg@au1.ibm.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • RM9000 serial driver · bd71c182
      Thomas Koeller authored
      
      Add support for the integrated serial ports of the MIPS RM9122 processor
      and its relatives.
      
      The patch also does some whitespace cleanup.
      
      [akpm@linux-foundation.org: cleanups]
      Signed-off-by: Thomas Koeller <thomas.koeller@baslerweb.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • serial driver PMC MSP71xx · beab697a
      Marc St-Jean authored
      
      Serial driver patch for the PMC-Sierra MSP71xx devices.
      
      There are three different fixes:
      
      1 Fix for DesignWare APB THRE errata: In brief, this is a non-standard
        16550 in that the THRE interrupt will not re-assert itself simply by
        disabling and re-enabling the THRI bit in the IER; it is only
        re-asserted if a character is actually sent out.
      
        It appears that the "8250-uart-backup-timer.patch" in the "mm" tree
        also fixes it so we have dropped our initial workaround.  This patch now
        needs to be applied on top of that "mm" patch.
      
      2 Fix for Busy Detect on LCR write: The DesignWare APB UART has a feature
        which causes a new Busy Detect interrupt to be generated if it's busy
        when the LCR is written.  This fix saves the value of the LCR and
        rewrites it after clearing the interrupt.
      
      3 Workaround for interrupt/data concurrency issue: The SoC needs to
        ensure that writes that can cause interrupts to be cleared reach the UART
        before returning from the ISR.  This fix reads a non-destructive register
        on the UART so the read transaction completion ensures the previously
        queued write transaction has also completed.
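      
      A sketch of the kind of read-back described in fix 3 (the helper name is
      illustrative, not the driver's actual code):
      
        /* After a write that clears an interrupt source, read a harmless
         * register so the posted write is known to have reached the UART
         * before the ISR returns. */
        static inline void flush_posted_uart_write(struct uart_port *port)
        {
                (void)readb(port->membase + (UART_LCR << port->regshift));
        }
      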
      Signed-off-by: Marc St-Jean <Marc_St-Jean@pmc-sierra.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Revert "[PATCH] x86: __pa and __pa_symbol address space separation" · e3ebadd9
      Linus Torvalds authored
      
      This was broken.  It adds complexity, for no good reason.  Rather than
      separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
      and preferably __pa() too - and just use "virt_to_phys()" instead, which
      is more readable and has nicer semantics.
      
      However, right now, just undo the separation, and make __pa_symbol() be
      the exact same as __pa().  That fixes the bugs this patch introduced,
      and we can do the fairly obvious cleanups later.
      
      Do the new __phys_addr() function (which is now the actual workhorse for
      the unified __pa()/__pa_symbol()) as a real external function; that way
      all the potential issues with compile/link-time optimizations of
      constant symbol addresses go away, and we can also, if we choose to, add
      more sanity-checking of the argument.
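      
      A sketch of the resulting x86_64 arrangement (modulo the exact spelling
      in the tree):
      
        extern unsigned long __phys_addr(unsigned long);
      
        #define __pa(x)         __phys_addr((unsigned long)(x))
        #define __pa_symbol(x)  __pa(x)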
      
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 06 May, 2007 1 commit
  3. 05 May, 2007 13 commits
  4. 04 May, 2007 16 commits