1. 01 Jun, 2007 1 commit
  2. 02 May, 2007 1 commit
    • Ravikiran G Thirumalai's avatar
      [PATCH] x86-64: Set HASHDIST_DEFAULT to 1 for x86_64 NUMA · e073ae1b
      Ravikiran G Thirumalai authored
      
      Enable system hashtable memory to be distributed among nodes on x86_64 NUMA
      
      Forcing the kernel to use node interleaved vmalloc instead of bootmem for
      the system hashtable memory (alloc_large_system_hash) reduces the memory
      imbalance on node 0 by around 40MB on a 8 node x86_64 NUMA box:
      
      Before the following patch, on bootup of a 8 node box:
      
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3206296 kB
      Node 0 MemUsed:        201192 kB
      Node 0 Active:           7012 kB
      Node 0 Inactive:          512 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        1912 kB
      Node 0 Mapped:            420 kB
      Node 0 AnonPages:        5612 kB
      Node 0 PageTables:        468 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             5408 kB
      Node 0 SReclaimable:      644 kB
      Node 0 SUnreclaim:       4764 kB
      
      After the patch (or using hashdist=1 on the kernel command line):
      
      Node 0 MemTotal:      3407488 kB
      Node 0 MemFree:       3247608 kB
      Node 0 MemUsed:        159880 kB
      Node 0 Active:           3012 kB
      Node 0 Inactive:          616 kB
      Node 0 Dirty:               0 kB
      Node 0 Writeback:           0 kB
      Node 0 FilePages:        2424 kB
      Node 0 Mapped:            380 kB
      Node 0 AnonPages:        1200 kB
      Node 0 PageTables:        396 kB
      Node 0 NFS_Unstable:        0 kB
      Node 0 Bounce:              0 kB
      Node 0 Slab:             6304 kB
      Node 0 SReclaimable:     1596 kB
      Node 0 SUnreclaim:       4708 kB
      
      I guess it is a good idea to keep HASHDIST_DEFAULT "on" for x86_64 NUMA
      since x86_64 has no dearth of vmalloc space?  Or maybe enable hash
      distribution for all 64bit NUMA arches?  The following patch does it only
      for x86_64.
      
      I ran a HPC MPI benchmark -- 'Ansys wingsolid', which takes up quite a bit of
      memory and uses up tlb entries.  This was on a 4 way, 2 socket
      Tyan AMD box (non vsmp), with 8G total memory (4G pernode).
      
      The results with and without hash distribution are:
      
      1. Vanilla - runtime of 1188.000s
      2. With hashdist=1 runtime of 1154.000s
      
      Oprofile output for the duration of run is:
      
      1. Vanilla:
      PU: AMD64 processors, speed 2411.16 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      163054    6.5513  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      162061    6.5114  libansys3.so             blockSaxpy6L_fd
      162042    6.5107  libansys3.so             blockInnerProduct6L_fd
      156286    6.2794  libansys3.so             maxb33_
      87879     3.5309  libansys1.so             elmatrixmultpcg_
      84857     3.4095  libansys4.so             saxpy_pcg
      58637     2.3560  libansys4.so             .st4560
      46612     1.8728  libansys4.so             .st4282
      43043     1.7294  vmlinux-t                copy_user_generic_string
      41326     1.6604  libansys3.so             blockSaxpyBackSolve6L_fd
      41288     1.6589  libansys3.so             blockInnerProductBackSolve6L_fd
      
      2. With hashdist=1
      CPU: AMD64 processors, speed 2411.13 MHz (estimated)
      Counted L1_AND_L2_DTLB_MISSES events (L1 and L2 DTLB misses) with a unit
      mask of 0x00 (No unit mask) count 500
      samples  %        app name                 symbol name
      162993    6.9814  libansys1.so             MultiFront::decompose(int, int,
      Elemset *, int *, int, int, int)
      160799    6.8874  libansys3.so             blockInnerProduct6L_fd
      160459    6.8729  libansys3.so             blockSaxpy6L_fd
      156018    6.6826  libansys3.so             maxb33_
      84700     3.6279  libansys4.so             saxpy_pcg
      83434     3.5737  libansys1.so             elmatrixmultpcg_
      58074     2.4875  libansys4.so             .st4560
      46000     1.9703  libansys4.so             .st4282
      41166     1.7632  libansys3.so             blockSaxpyBackSolve6L_fd
      41033     1.7575  libansys3.so             blockInnerProductBackSolve6L_fd
      35762     1.5318  libansys1.so             inner_product_sub
      35591     1.5245  libansys1.so             inner_product_sub2
      28259     1.2104  libansys4.so             addVectors
      Signed-off-by: default avatarPravin B. Shelar <pravin.shelar@calsoftinc.com>
      Signed-off-by: default avatarRavikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: default avatarShai Fultheim <shai@scalex86.org>
      Signed-off-by: default avatarAndi Kleen <ak@suse.de>
      Acked-by: default avatarChristoph Lameter <clameter@engr.sgi.com>
      Cc: Andi Kleen <ak@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      e073ae1b
  3. 22 Mar, 2007 1 commit
  4. 07 Dec, 2006 1 commit
  5. 26 Sep, 2006 5 commits
  6. 22 Sep, 2006 1 commit
    • David S. Miller's avatar
      [XFRM]: Dynamic xfrm_state hash table sizing. · f034b5d4
      David S. Miller authored
      
      The grow algorithm is simple, we grow if:
      
      1) we see a hash chain collision at insert, and
      2) we haven't hit the hash size limit (currently 1*1024*1024 slots), and
      3) the number of xfrm_state objects is > the current hash mask
      
      All of this needs some tweaking.
      
      Remove __initdata from "hashdist" so we can use it safely at run time.
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f034b5d4
  7. 10 Jul, 2006 1 commit
    • David Howells's avatar
      [PATCH] FRV: Fix FRV arch compile errors · 9dec17eb
      David Howells authored
      
      Fix some FRV arch compile errors, including:
      
       (*) Marking nr_kernel_pages as __meminitdata so that references to it end up
           being properly calculated rather than being assumed to be in the small
           data section (and thus calculated wrt the GP register).  Not doing this
           causes the linker to emit errors as the offset is too big to fit into the
           load instruction.
      
       (*) Move pm_power_off into an unconditionally compiled .c file as it's now
           unconditionally accessed.
      
       (*) Declare frv_change_cmode() in a header file rather than in a .c file, and
           declare it asmlinkage.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      9dec17eb
  8. 23 Jun, 2006 1 commit
  9. 09 Apr, 2006 1 commit
    • Andi Kleen's avatar
      [PATCH] x86_64: Handle empty PXMs that only contain hotplug memory · a8062231
      Andi Kleen authored
      
      The node setup code would try to allocate the node metadata in the node
      itself, but that fails if there is no memory in there.
      
      This can happen with memory hotplug when the hotplug area defines an so
      far empty node.
      
      Now use bootmem to try to allocate the mem_map in other nodes.
      
      And if it fails don't panic, but just ignore the node.
      
      To make this work I added a new __alloc_bootmem_nopanic function that
      does what its name implies.
      
      TBD should try to use nearby nodes here.  Currently we just use any.
      It's hard to do it better because bootmem doesn't have proper fallback
      lists yet.
      Signed-off-by: default avatarAndi Kleen <ak@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      a8062231
  10. 27 Mar, 2006 1 commit
  11. 25 Mar, 2006 1 commit
  12. 06 Jan, 2006 1 commit
  13. 20 Oct, 2005 1 commit
    • Yasunori Goto's avatar
      [PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory · 281dd25c
      Yasunori Goto authored
      
      This introduces a limit parameter to the core bootmem allocator; The new
      parameter indicates that physical memory allocated by the bootmem
      allocator should be within the requested limit.
      
      We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
      alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
      is the only api used for swiotlb.
      
      The existing alloc_bootmem_low_pages() api could instead have been
      changed and made to pass right limit to the core allocator.  But that
      would make the patch more intrusive for 2.6.14, as other arches use
      alloc_bootmem_low_pages().  We may be done that post 2.6.14 as a
      cleanup.
      
      With this, swiotlb gets memory within 4G for both x86_64 and ia64
      arches.
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      281dd25c
  14. 25 Jun, 2005 1 commit
  15. 23 Jun, 2005 1 commit
  16. 16 Apr, 2005 1 commit
    • Linus Torvalds's avatar
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4