1. 26 Sep, 2006 7 commits
    • [PATCH] mm: balance dirty pages · edc79b2a
      Peter Zijlstra authored
      
      Now that we can detect writers of shared mappings, throttle them.  This
      avoids surprise OOMs.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: tracking shared dirty pages · d08b3851
      Peter Zijlstra authored
      
      Tracking of dirty pages in shared writeable mmap()s.
      
      The idea is simple: write-protect clean shared writeable pages, catch the
      write fault, make the page writeable, and mark it dirty.  On page
      write-back, clean all the PTE dirty bits and write-protect the pages once
      again.
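      Schematically, the fault side of that cycle amounts to the fragment
      below (a sketch modelled on the shared-writeable branch of do_wp_page();
      the real code differs in detail):

          /* shared writeable mapping: no COW needed, just make the
           * PTE writeable and dirty so write-back can find the page */
          entry = pte_mkyoung(orig_pte);
          entry = maybe_mkwrite(pte_mkdirty(entry), vma);
          ptep_set_access_flags(vma, address, page_table, entry, 1);
          update_mmu_cache(vma, address, entry);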
      
      The implementation is a tad harder, mainly because the default
      backing_dev_info capabilities were maintained too loosely.  Hence it is
      not enough to test the backing_dev_info for cap_account_dirty.
      
      The current heuristic is as follows; a VMA is eligible when:
       - it is shared and writeable
          (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
       - it is not a 'special' mapping
          (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
       - the backing_dev_info is cap_account_dirty
          mapping_cap_account_dirty(vma->vm_file->f_mapping)
       - f_op->mmap() didn't change the default page protection
      
      Pages from remap_pfn_range() are explicitly excluded because their COW
      semantics are already horrid enough (see vm_normal_page() in do_wp_page())
      and because they don't have a backing store anyway.
      
      mprotect() is taught about the new behaviour as well.  However, it
      overrides the last condition.
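      Put together, the eligibility test amounts to something like the sketch
      below (the helper name is illustrative; the protection check is shown
      against the default in protection_map):

          static int vma_accounts_dirty(struct vm_area_struct *vma)
          {
                  unsigned long flags = vma->vm_flags;

                  /* must be a shared writeable mapping */
                  if ((flags & (VM_WRITE|VM_SHARED)) != (VM_WRITE|VM_SHARED))
                          return 0;
                  /* 'special' mappings are excluded */
                  if (flags & (VM_PFNMAP|VM_INSERTPAGE))
                          return 0;
                  /* the backing device must account dirty pages */
                  if (!vma->vm_file ||
                      !mapping_cap_account_dirty(vma->vm_file->f_mapping))
                          return 0;
                  /* f_op->mmap() must not have changed the protection */
                  return pgprot_val(vma->vm_page_prot) ==
                         pgprot_val(protection_map[flags &
                                 (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
          }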
      
      Cleaning the pages on write-back is done with page_mkclean(), a new rmap
      call.  It can be called on any page, but is currently only implemented
      for mapped pages; if the page is found to belong to a VMA that accounts
      dirty pages, it will also write-protect the PTE.
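      On the write-back side the ordering is then roughly as follows (a
      schematic fragment, not the literal code):

          /* about to write the page back */
          if (page_mapped(page) && mapping_cap_account_dirty(mapping))
                  page_mkclean(page);     /* clear PTE dirty bits and
                                             write-protect the PTEs again */
          /* ...now clear PG_dirty and submit the page for IO... */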
      
      Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty()
      from under ->private_lock.  This seems to be safe, since ->private_lock
      is used to serialize access to the buffers, not the page itself.  This is
      needed because clear_page_dirty() will call into page_mkclean() and would
      thereby violate the locking order.
      
      [dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] mm: VM_BUG_ON · 725d704e
      Nick Piggin authored
      
      Introduce VM_BUG_ON, which is turned on with CONFIG_DEBUG_VM.  Use it in
      the lightweight, inline refcounting functions; in the PageLRU and
      PageActive checks in vmscan, because they are pretty well confined to
      vmscan; and in the page allocate/free fastpaths, which can be the hottest
      parts of the kernel for kbuilds.
      
      Unlike BUG_ON, VM_BUG_ON must not be used to execute statements with
      side-effects, and should not be used outside core mm code.
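      The definition is essentially a conditional wrapper, along these lines:

          #ifdef CONFIG_DEBUG_VM
          #define VM_BUG_ON(cond) BUG_ON(cond)
          #else
          /* 'cond' is never evaluated here, hence no side-effects allowed */
          #define VM_BUG_ON(cond) do { } while (0)
          #endif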
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Christoph Lameter <clameter@engr.sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] update to the kernel kmap/kunmap API · a6ca1b99
      James Bottomley authored
      
      Give non-highmem architectures access to the kmap API so that they can
      override it (this is what the attached patch does).
      
      The proposal is that we should now require all architectures with coherence
      issues to manage data coherence via the kmap/kunmap API.  Thus driver
      writers never have to write code like
      
          kmap(page)
          modify data in page
          flush_kernel_dcache_page(page)
          kunmap(page)
      
      Instead, kmap/kunmap manage the coherence, and driver (and filesystem)
      writers don't need to worry about how to flush between kmap and kunmap.
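      With the change, the same sequence shrinks to the following; the flush
      simply disappears from driver code:

          kmap(page)
          modify data in page
          kunmap(page)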
      
      For most architectures, the page only needs to be flushed if it was
      actually written to *and* there are user mappings of it, so the best
      implementation looks to be: clear the dirty PTE bit in the kernel page
      tables on kmap; on kunmap, check page->mapping for user maps and then the
      dirty bit, and only flush if the page both has user mappings and is
      dirty.
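      A sketch of that scheme, where kernel_pte_mkclean() and
      kernel_pte_dirty() are hypothetical helpers named here purely for
      illustration:

          void *kmap(struct page *page)
          {
                  void *vaddr = page_address(page);   /* non-highmem case */

                  kernel_pte_mkclean(page);   /* hypothetical: start clean */
                  return vaddr;
          }

          void kunmap(struct page *page)
          {
                  /* flush only when user space can see the page and the
                     kernel actually dirtied it through this mapping */
                  if (page_mapped(page) && kernel_pte_dirty(page))
                          flush_kernel_dcache_page(page);
          }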
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] jbd: fix commit of ordered data buffers · 3998b930
      Jan Kara authored
      
      The original commit code assumes that when a buffer on the BJ_SyncData
      list is locked, it is being written to disk.  But this is not true, and
      hence it can lead to data loss on crash.  The code also didn't account
      for the fact that journal_dirty_data() can steal buffers from the
      committing transaction, and hence could write buffers that no longer
      belong to the committing transaction.  Finally, it could happen that we
      tried writing out one buffer several times.
      
      The patch below solves these problems by a complete rewrite of the data
      commit code.  We go through the buffers on t_sync_datalist, lock the
      buffers needing write-out, and store them in an array.  Buffers are also
      immediately refiled to the BJ_Locked list or unfiled (if write-out has
      completed).  When the array is full or we have to block on a buffer lock,
      we submit all the accumulated buffers for IO.
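      In outline, the new loop looks like this (heavily simplified; the real
      code in fs/jbd/commit.c also juggles j_list_lock, buffer reference
      counts and a trylock-before-blocking step, and MAX_WBUF stands in for
      the real array size):

          while ((jh = commit_transaction->t_sync_datalist) != NULL) {
                  struct buffer_head *bh = jh2bh(jh);

                  if (buffer_locked(bh)) {
                          /* already under IO: just keep track of it */
                          __journal_file_buffer(jh, commit_transaction,
                                                BJ_Locked);
                  } else if (buffer_dirty(bh)) {
                          lock_buffer(bh);
                          wbuf[bufs++] = bh;   /* collect for one batch */
                          __journal_file_buffer(jh, commit_transaction,
                                                BJ_Locked);
                  } else {
                          /* write-out already complete */
                          __journal_unfile_buffer(jh);
                  }
                  if (bufs == MAX_WBUF) {
                          /* array full: submit everything accumulated */
                          while (bufs)
                                  submit_bh(WRITE, wbuf[--bufs]);
                  }
          }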
      
      [suitable for 2.6.18.x around the 2.6.19-rc2 timeframe]
      Signed-off-by: Jan Kara <jack@suse.cz>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] trigger a syntax error if percpu macros are incorrectly used · 632bbfee
      Jan Blunck authored
      
      get_cpu_var()/per_cpu()/__get_cpu_var() arguments must be simple
      identifiers.  Otherwise the arch-dependent implementations might break.
      
      This patch enforces the correct usage of the macros by producing a syntax
      error if the variable is not a simple identifier.
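      The trick is token pasting: the macro glues a prefix onto its argument,
      so anything other than a plain identifier fails to form a valid token
      and the compiler rejects it (a schematic example; the real macros in the
      per-arch percpu headers differ):

          #define per_cpu_var(var)  per_cpu__##var

          /* per_cpu_var(counter)  ->  per_cpu__counter     : fine         */
          /* per_cpu_var(*ptr)     ->  per_cpu__ ## *ptr    : syntax error */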
      Signed-off-by: Jan Blunck <jblunck@suse.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix longstanding load balancing bug in the scheduler · 0a2966b4
      Christoph Lameter authored
      The scheduler will stop load balancing if the busiest processor contains
      processes pinned via processor affinity.
      
      The scheduler currently does only one search for the busiest cpu.  If it
      cannot pull any tasks away from the busiest cpu because they are pinned,
      the scheduler goes into a corner and sulks, leaving the idle processors
      idle.
      
      For example: if processor 0 is busy running four tasks pinned via
      taskset, processor 1 has none, and someone just started two processes on
      processor 2, then the scheduler will not move one of those two processes
      away from processor 2.
      
      This patch fixes that issue by forcing the scheduler to come out of its
      corner and retrying the load balancing by considering other processors for
      load balancing.
      
      This patch was originally developed by John Hawkes and discussed at:

          http://marc.theaimsgroup.com/?l=linux-kernel&m=113901368523205&w=2
      
      I have removed extraneous material and gone back to equipping struct rq
      with the cpu the queue is associated with, since this makes the patch
      much easier and it is likely that others in the future will have the
      same difficulty figuring out which processor owns which runqueue.
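      The retry logic amounts to roughly the following (a sketch of the idea,
      not the literal scheduler code; elided parameters are marked with ...):

          cpumask_t cpus = CPU_MASK_ALL;   /* candidate 'busiest' cpus */

      redo:
          busiest = find_busiest_queue(..., &cpus);
          if (busiest && !move_tasks(this_rq, this_cpu, busiest, ...)) {
                  /* every movable task was pinned: exclude this queue's
                     cpu (hence the new rq->cpu field) and search again */
                  cpu_clear(busiest->cpu, cpus);
                  if (!cpus_empty(cpus))
                          goto redo;
          }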
      
      The overhead added through these patches is a single word on the stack if
      the kernel is configured to support 32 cpus or fewer (32 bit).  For 32
      bit environments the maximum number of cpus that can be configured is
      255, which would result in an additional 32 bytes on the stack.  On IA64
      up to 1k cpus can be configured, which will result in the use of 128
      additional bytes on the stack.  The maximum additional cache footprint is
      one cacheline.  Typically memory use will be much less than a cacheline,
      and the additional cpumask will be placed on the stack in a cacheline
      that already contains other local variables.
      Signed-off-by: Christoph Lameter <clameter@sgi.com>
      Cc: John Hawkes <hawkes@sgi.com>
      Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Cc: Peter Williams <pwil3058@bigpond.net.au>
      Cc: <stable@kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 25 Sep, 2006 27 commits
  3. 24 Sep, 2006 6 commits