1. 07 May, 2009 1 commit
  2. 06 May, 2009 1 commit
    • nommu: make the initial mmap allocation excess behaviour Kconfig configurable · fc4d5c29
      David Howells authored
      
      NOMMU mmap() has an option controlled by a sysctl variable that determines
      whether the allocations made by do_mmap_private() should have the excess
      space trimmed off and returned to the allocator.  Make the initial setting
      of this variable a Kconfig configuration option.
      
      The reason there can be excess space is that the allocator only allocates
      in power-of-2 size chunks, but mmap()'s can be made in sizes that aren't a
      power of 2.
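To make the source of the excess concrete, here is a minimal sketch in plain C (not kernel code) of how a power-of-2 page allocation overshoots a request. The 4 KiB page size and the helper names pages_needed(), alloc_order() and excess_pages() are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/* Number of whole pages needed to cover len bytes (rounded up). */
static unsigned long pages_needed(size_t len)
{
	return (len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages cover the request. */
static unsigned int alloc_order(unsigned long pages)
{
	unsigned int order = 0;

	assert(pages > 0);
	while ((1UL << order) < pages)
		order++;
	return order;
}

/* Pages handed out by a power-of-2 allocator beyond what was asked for. */
static unsigned long excess_pages(size_t len)
{
	unsigned long need = pages_needed(len);

	return (1UL << alloc_order(need)) - need;
}
```

Under these assumptions a 5-page request is served from an order-3 (8-page) block, leaving 3 pages of excess that are either kept as dead space or trimmed.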
      
      There are two alternatives:
      
       (1) Keep the excess as dead space.  The dead space then remains unused for the
           lifetime of the mapping.  Mappings of shared objects such as libc, ld.so
           or busybox's text segment may retain their dead space forever.
      
       (2) Return the excess to the allocator.  This means that the dead space is
           limited to less than a page per mapping, but it means that for a transient
           process, there's more chance of fragmentation as the excess space may be
           reused fairly quickly.
      
      During the boot process, a lot of transient processes are created, and
      this can cause a lot of fragmentation as the pagecache and various slabs
      grow greatly during this time.
      
      By turning off the trimming of excess space during boot and disabling
      batching of frees, Coldfire can manage to boot.
      
      A better way of doing things might be to have /sbin/init turn this option
      off.  By that point libc, ld.so and init - which are all long-duration
      processes - have all been loaded and trimmed.
      Reported-by: Lanttor Guo <lanttor.guo@freescale.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Lanttor Guo <lanttor.guo@freescale.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. 02 May, 2009 1 commit
    • mm: fix Committed_AS underflow on large NR_CPUS environment · 00a62ce9
      KOSAKI Motohiro authored
      
      The Committed_AS field can underflow in certain situations:
      
      >         # while true; do cat /proc/meminfo  | grep _AS; sleep 1; done | uniq -c
      >               1 Committed_AS: 18446744073709323392 kB
      >              11 Committed_AS: 18446744073709455488 kB
      >               6 Committed_AS:    35136 kB
      >               5 Committed_AS: 18446744073709454400 kB
      >               7 Committed_AS:    35904 kB
      >               3 Committed_AS: 18446744073709453248 kB
      >               2 Committed_AS:    34752 kB
      >               9 Committed_AS: 18446744073709453248 kB
      >               8 Committed_AS:    34752 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               7 Committed_AS: 18446744073709454080 kB
      >               3 Committed_AS: 18446744073709320960 kB
      >               5 Committed_AS: 18446744073709454080 kB
      >               6 Committed_AS: 18446744073709320960 kB
      
      This happens because NR_CPUS can be greater than 1000 and
      meminfo_proc_show() does not check for underflow.
      
      Scaling the calculation by NR_CPUS is not a good approach.  In general,
      the likelihood of lock contention is proportional to the number of online
      CPUs, not the theoretical maximum number of CPUs (NR_CPUS).
      
      The current kernel has a generic percpu_counter facility, and using it is
      the right fix.  It simplifies the code, and percpu_counter_read_positive()
      does not suffer from the underflow issue.
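The huge values in the report are what a small negative count looks like when it is printed as an unsigned 64-bit number. The sketch below is plain C, not kernel code. It assumes 4 KiB pages and uses hypothetical helper names. It clamps to zero to show the idea of a "read positive" accessor, and the exact value the kernel helper returns for a negative count is not asserted here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_KB_PER_PAGE 4	/* assumes 4 KiB pages */

/* Convert pages to kB without checking the sign: a negative count wraps. */
static uint64_t pages_to_kb_unchecked(int64_t pages)
{
	return (uint64_t)pages * SKETCH_KB_PER_PAGE;
}

/* Hypothetical "read positive" accessor: never reports a negative count. */
static int64_t read_positive(int64_t count)
{
	int64_t ret = count < 0 ? 0 : count;

	assert(ret >= 0);
	return ret;
}

/* Convert pages to kB after clamping the count. */
static uint64_t pages_to_kb_clamped(int64_t pages)
{
	return (uint64_t)read_positive(pages) * SKETCH_KB_PER_PAGE;
}
```

With these assumptions, a count of -57056 pages reproduces the first value in the report (18446744073709323392 kB), while the clamped conversion yields 0 kB.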
      Reported-by: Dave Hansen <dave@linux.vnet.ibm.com>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Eric B Munson <ebmunson@us.ibm.com>
      Cc: Mel Gorman <mel@csn.ul.ie>
      Cc: Christoph Lameter <cl@linux-foundation.org>
      Cc: <stable@kernel.org>		[All kernel versions]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  4. 02 Apr, 2009 1 commit
    • nommu: fix a number of issues with the per-MM VMA patch · 33e5d769
      David Howells authored
      
      Fix a number of issues with the per-MM VMA patch:
      
       (1) Make mmap_pages_allocated an atomic_long_t, just in case this is used on
           a NOMMU system with more than 2G pages.  Makes no difference on a 32-bit
           system.
      
       (2) Report vma->vm_pgoff * PAGE_SIZE as a 64-bit value, not a 32-bit value,
           lest it overflow.
      
       (3) Move the allocation of the vm_area_struct slab back to fork.c.
      
       (4) Use KMEM_CACHE() for both vm_area_struct and vm_region slabs.
      
       (5) Use BUG_ON() rather than if () BUG().
      
       (6) Make the default validate_nommu_regions() a static inline rather than a
           #define.
      
       (7) Make free_page_series()'s objection to pages with a refcount != 1 more
           informative.
      
       (8) Adjust the __put_nommu_region() banner comment to indicate that the
           semaphore must be held for writing.
      
       (9) Limit the number of warnings about munmaps of non-mmapped regions.
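Item (2) above concerns a multiplication that can exceed 32 bits. As a rough illustration in plain C (not the kernel code itself), the sketch below compares computing a byte offset in 32-bit and in 64-bit arithmetic. The 4 KiB page size and the function names are assumptions for this example.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size */

/* Byte offset computed in 32 bits: wraps modulo 2^32. */
static uint32_t offset_32bit(uint32_t pgoff)
{
	return pgoff * SKETCH_PAGE_SIZE;
}

/* Byte offset computed in 64 bits: widen before multiplying. */
static uint64_t offset_64bit(uint32_t pgoff)
{
	uint64_t off = (uint64_t)pgoff * SKETCH_PAGE_SIZE;

	/* Dividing back must recover the page offset if nothing was lost. */
	assert(off / SKETCH_PAGE_SIZE == pgoff);
	return off;
}
```

A page offset of 0x100000 corresponds to a 4 GiB byte offset, which the 32-bit version reports as 0.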
      Reported-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Cc: Greg Ungerer <gerg@snapgear.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  5. 27 Jan, 2009 1 commit
  6. 21 Jan, 2009 1 commit
  7. 14 Jan, 2009 2 commits
  8. 08 Jan, 2009 4 commits
    • NOMMU: Teach kobjsize() about VMA regions. · ab2e83ea
      Paul Mundt authored
      
      Now that we no longer use compound pages for all large allocations,
      kobjsize() actively breaks things like binfmt_flat by always handing
      back PAGE_SIZE for mmap'ed regions. Fix this up by looking up the
      VMA region for non-compounds.
      
      Ideally binfmt_flat wants to get rid of kobjsize() completely, but
      this is an incremental step.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Mike Frysinger <vapier.adi@gmail.com>
    • NOMMU: Make mmap allocation page trimming behaviour configurable. · dd8632a1
      Paul Mundt authored
      
      NOMMU mmap allocates a piece of memory for an mmap that's rounded up in size to
      the nearest power-of-2 number of pages.  Currently it then discards the excess
      pages back to the page allocator, making that memory available for use by other
      things.  This can, however, cause a greater amount of fragmentation.
      
      To counter this, a sysctl is added in order to fine-tune the trimming
      behaviour.  The default behaviour remains to trim pages aggressively, while
      this can either be disabled completely or set to a higher page-granular
      watermark in order to have finer-grained control.
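The trimming decision described above can be sketched as a small predicate. This is plain C for illustration, not the code in mm/nommu.c. The parameter trim_watermark is a hypothetical stand-in for the sysctl value, read here as 0 = never trim, 1 = trim aggressively, N = trim only when at least N pages are in excess.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether excess pages should go back to the page allocator.
 * allocated_pages is the power-of-2 block size, used_pages is what the
 * mapping actually needs, trim_watermark is the (hypothetical) tunable.
 */
static bool should_trim(unsigned long allocated_pages,
			unsigned long used_pages,
			unsigned long trim_watermark)
{
	assert(used_pages <= allocated_pages);

	if (trim_watermark == 0)
		return false;	/* trimming disabled completely */

	return allocated_pages - used_pages >= trim_watermark;
}
```

For an 8-page block backing a 5-page mapping, a watermark of 1 or 3 trims the excess, while a watermark of 0 or 4 leaves it in place.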
      
      vm region vm_top bits taken from an earlier patch by David Howells.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Mike Frysinger <vapier.adi@gmail.com>
    • NOMMU: Make VMAs per MM as for MMU-mode linux · 8feae131
      David Howells authored
      
      Make VMAs per mm_struct as for MMU-mode linux.  This solves two problems:
      
       (1) In SYSV SHM where nattch for a segment does not reflect the number of
           shmat's (and forks) done.
      
       (2) In mmap() where the VMA's vm_mm is set to point to the parent mm by an
           exec'ing process when VM_EXECUTABLE is specified, regardless of the fact
           that a VMA might be shared and already have its vm_mm assigned to another
           process or a dead process.
      
      A new struct (vm_region) is introduced to track a mapped region and to remember
      the circumstances under which it may be shared and the vm_list_struct structure
      is discarded as it's no longer required.
      
      This patch makes the following additional changes:
      
       (1) Regions are now allocated with alloc_pages() rather than kmalloc() and
           with no recourse to __GFP_COMP, so the pages are not composite.  Instead,
           each page has a reference on it held by the region.  Anything else that is
           interested in such a page will have to get a reference on it to retain it.
           When the pages are released due to unmapping, each page is passed to
           put_page() and will be freed when the page usage count reaches zero.
      
       (2) Excess pages are trimmed after an allocation as the allocation must be
           made as a power-of-2 quantity of pages.
      
       (3) VMAs are added to the parent MM's R/B tree and mmap lists.  As an MM may
           end up with overlapping VMAs within the tree, the VMA struct address is
           appended to the sort key.
      
       (4) Non-anonymous VMAs are now added to the backing inode's prio list.
      
       (5) Holes may be punched in anonymous VMAs with munmap(), releasing parts of
           the backing region.  The VMA and region structs will be split if
           necessary.
      
       (6) sys_shmdt() only releases one attachment to a SYSV IPC shared memory
           segment instead of all the attachments at that address.  Multiple
           shmat()'s return the same address under NOMMU-mode instead of different
           virtual addresses as under MMU-mode.
      
       (7) Core dumping for ELF-FDPIC requires fewer exceptions for NOMMU-mode.
      
       (8) /proc/maps is now the global list of mapped regions, and may list bits
           that aren't actually mapped anywhere.
      
       (9) /proc/meminfo gains a line (tagged "MmapCopy") that indicates the amount
           of RAM currently allocated by mmap to hold mappable regions that can't be
           mapped directly.  These are copies of the backing device or file if not
           anonymous.
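Item (3) above says overlapping VMAs are kept distinct in the tree by appending the struct address to the sort key. One way such an ordering could look is sketched below in plain C. The struct, the function name, and the choice of start address then end address as the leading key components are assumptions for illustration, not a copy of the kernel's insertion code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-in for a VMA: just the address range. */
struct sketch_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/*
 * Total order over VMAs: start address, then end address, then the
 * address of the struct itself, so that two VMAs covering exactly the
 * same range still compare as different keys.
 */
static int sketch_vma_cmp(const struct sketch_vma *a,
			  const struct sketch_vma *b)
{
	assert(a && b);

	if (a->vm_start != b->vm_start)
		return a->vm_start < b->vm_start ? -1 : 1;
	if (a->vm_end != b->vm_end)
		return a->vm_end < b->vm_end ? -1 : 1;
	if ((uintptr_t)a != (uintptr_t)b)
		return (uintptr_t)a < (uintptr_t)b ? -1 : 1;
	return 0;
}
```

With this ordering, two VMAs that map the same shared segment at the same address can both be inserted into one tree without colliding.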
      
      These changes make NOMMU mode more similar to MMU mode.  The downside is that
      NOMMU mode requires some extra memory to track things over NOMMU without this
      patch (VMAs are no longer shared, and there are now region structs).
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Mike Frysinger <vapier.adi@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
    • NOMMU: Delete askedalloc and realalloc variables · 41836382
      David Howells authored
      
      Delete the askedalloc and realalloc variables as nothing actually uses the
      value calculated.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: Mike Frysinger <vapier.adi@gmail.com>
      Acked-by: Paul Mundt <lethal@linux-sh.org>
  9. 05 Jan, 2009 1 commit
    • inode->i_op is never NULL · acfa4380
      Al Viro authored
      
      We used to have rather schizophrenic set of checks for NULL ->i_op even
      though it had been eliminated years ago.  You'd need to go out of your
      way to set it to NULL explicitly _and_ a bunch of code would die on
      such inodes anyway.  After killing two remaining places that still
      did that bogosity, all that crap can go away.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  10. 30 Oct, 2008 1 commit
    • nfsd: fix vm overcommit crash · 731572d3
      Alan Cox authored
      
      Junjiro R.  Okajima reported a problem where knfsd crashes if you are
      using it to export shmemfs objects and run strict overcommit.  In this
      situation the current->mm based modifier to the overcommit goes through a
      NULL pointer.
      
      We could simply check for NULL and skip the modifier but we've caught
      other real bugs in the past from mm being NULL here - cases where we did
      need a valid mm set up (eg the exec bug about a year ago).
      
      To preserve the checks and get the logic we want, shuffle the checking
      around and add a new helper to the vm_ security wrappers.
      
      Also fix a current->mm reference in nommu that should use the passed mm.
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix build]
      Reported-by: Junjiro R. Okajima <hooanon05@yahoo.co.jp>
      Acked-by: James Morris <jmorris@namei.org>
      Signed-off-by: Alan Cox <alan@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. 20 Oct, 2008 1 commit
    • mlock: mlocked pages are unevictable · b291f000
      Nick Piggin authored
      
      Make sure that mlocked pages also live on the unevictable LRU, so kswapd
      will not scan them over and over again.
      
      This is achieved through various strategies:
      
      1) add yet another page flag--PG_mlocked--to indicate that
         the page is locked for efficient testing in vmscan and,
         optionally, fault path.  This allows early culling of
         unevictable pages, preventing them from getting to
         page_referenced()/try_to_unmap().  Also allows separate
         accounting of mlock'd pages, as Nick's original patch
         did.
      
         Note:  Nick's original mlock patch used a PG_mlocked
         flag.  I had removed this in favor of the PG_unevictable
         flag + an mlock_count [new page struct member].  I
         restored the PG_mlocked flag to eliminate the new
         count field.
      
      2) add the mlock/unevictable infrastructure to mm/mlock.c,
         with internal APIs in mm/internal.h.  This is a rework
         of Nick's original patch to these files, taking into
         account that mlocked pages are now kept on unevictable
         LRU list.
      
      3) update vmscan.c:page_evictable() to check PageMlocked()
         and, if vma passed in, the vm_flags.  Note that the vma
         will only be passed in for new pages in the fault path;
         and then only if the "cull unevictable pages in fault
         path" patch is included.
      
      4) add try_to_unlock() to rmap.c to walk a page's rmap and
         ClearPageMlocked() if no other vmas have it mlocked.
         Reuses as much of try_to_unmap() as possible.  This
         effectively replaces the use of one of the lru list links
         as an mlock count.  If this mechanism lets pages in mlocked
         vmas leak through w/o PG_mlocked set [I don't know that it
         does], we should catch them later in try_to_unmap().  One
         hopes this will be rare, as it will be relatively expensive.
      
      Original mm/internal.h, mm/rmap.c and mm/mlock.c changes:
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      
      splitlru: introduce __get_user_pages():
      
        The new munlock processing needs GUP_FLAGS_IGNORE_VMA_PERMISSIONS
        because the current get_user_pages() can't grab PROT_NONE pages, and
        therefore PROT_NONE pages can't be munlocked.
      
      [akpm@linux-foundation.org: fix this for pagemap-pass-mm-into-pagewalkers.patch]
      [akpm@linux-foundation.org: untangle patch interdependencies]
      [akpm@linux-foundation.org: fix things after out-of-order merging]
      [hugh@veritas.com: fix page-flags mess]
      [lee.schermerhorn@hp.com: fix munlock page table walk - now requires 'mm']
      [kosaki.motohiro@jp.fujitsu.com: build fix]
      [kosaki.motohiro@jp.fujitsu.com: fix truncate race and several comments]
      [kosaki.motohiro@jp.fujitsu.com: splitlru: introduce __get_user_pages()]
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Signed-off-by: Rik van Riel <riel@redhat.com>
      Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com>
      Cc: Nick Piggin <npiggin@suse.de>
      Cc: Dave Hansen <dave@linux.vnet.ibm.com>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  12. 04 Aug, 2008 1 commit
  13. 26 Jul, 2008 1 commit
  14. 12 Jun, 2008 1 commit
    • nommu: Correct kobjsize() page validity checks. · 5a1603be
      Paul Mundt authored
      This implements a few changes on top of the recent kobjsize() refactoring
      introduced by commit 6cfd53fc.
      
      As Christoph points out:
      
      	virt_to_head_page cannot return NULL. virt_to_page also
      	does not return NULL. pfn_valid() needs to be used to
      	figure out if a page is valid.  Otherwise the page struct
      	reference that was returned may have PageReserved() set
      	to indicate that it is not a valid page.
      
      As discussed further in the thread, virt_addr_valid() is the preferable
      way to validate the object pointer in this case. In addition to fixing
      up the reserved page case, it also has the benefit of encapsulating the
      hack introduced by commit 4016a139 on the impacted platforms, allowing
      us to get rid of the extra checking in kobjsize() for the platforms that
      don't perform this type of bizarre memory_end abuse (every nommu
      platform that isn't blackfin). If blackfin decides to get in line with
      every other platform and use PageReserved for the DMA pages in question,
      kobjsize() will also continue to work fine.
      
      It also turns out that compound_order() will give us back 0-order for
      non-head pages, so we can get rid of the PageCompound check and just
      use compound_order() directly. Clean that up while we're at it.
      Signed-off-by: Paul Mundt <lethal@linux-sh.org>
      Reviewed-by: Christoph Lameter <clameter@sgi.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  15. 06 Jun, 2008 1 commit
  16. 24 May, 2008 1 commit
  17. 29 Apr, 2008 1 commit
    • procfs task exe symlink · 925d1c40
      Matt Helsley authored
      
      The kernel implements readlink of /proc/pid/exe by getting the file from
      the first executable VMA.  Then the path to the file is reconstructed and
      reported as the result.
      
      Because of the VMA walk the code is slightly different on nommu systems.
      This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      walking the VMAs to find the first executable file-backed VMA we store a
      reference to the exec'd file in the mm_struct.
      
      That reference would prevent the filesystem holding the executable file
      from being unmounted even after unmapping the VMAs.  So we track the number
      of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      unmapped.  This avoids pinning the mounted filesystem.
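The counting scheme in the paragraph above can be illustrated with a toy model in plain C. The struct and function names are hypothetical, and a string pointer stands in for the kernel's file reference. The sketch shows only the "drop the reference when the last executable VMA goes away" rule, not the real locking or reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the per-mm bookkeeping. */
struct sketch_mm {
	const char *exe_file;		/* stand-in for a file reference */
	unsigned long num_exe_vmas;	/* count of executable VMAs */
};

/* Record the exec'd file; no executable VMAs are mapped yet. */
static void sketch_set_exe(struct sketch_mm *mm, const char *file)
{
	mm->exe_file = file;
	mm->num_exe_vmas = 0;
}

/* An executable VMA was mapped. */
static void sketch_added_exe_vma(struct sketch_mm *mm)
{
	mm->num_exe_vmas++;
}

/* An executable VMA was unmapped; drop the file on the last one. */
static void sketch_removed_exe_vma(struct sketch_mm *mm)
{
	assert(mm->num_exe_vmas > 0);
	if (--mm->num_exe_vmas == 0)
		mm->exe_file = NULL;	/* the real code would release its reference */
}
```

Once the count reaches zero, nothing in this model still refers to the executable, which is the condition that lets the filesystem be unmounted.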
      
      [akpm@linux-foundation.org: improve comments]
      [yamamoto@valinux.co.jp: fix dup_mmap]
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  18. 28 Apr, 2008 1 commit
    • mm/nommu.c: return 0 from kobjsize with invalid objects · 4016a139
      Michael Hennerich authored
      
      Don't perform kobjsize operations on objects the kernel doesn't manage.
      
      On Blackfin, drivers can get dma coherent memory by calling a function
      dma_alloc_coherent(). We do this in nommu by configuring a chunk of uncached
      memory at the top of memory.
      
      Since we don't want the kernel to use the uncached memory, we lie to the
      kernel and tell it that its max memory is between 0 and the start of the
      uncached dma coherent section.
      
      This all works well until this memory gets exposed to userspace (with a
      frame buffer).  When you look at the process's maps, it shows the framebuf:
      
      root:/proc> cat maps
      [snip]
      03f0ef00-03f34700 rw-p 00000000 1f:00 192        /dev/fb0
      root:/proc>
      
      This is outside the "normal" range for the kernel. When the kernel tries to
      find the size of this object (when you run ps), it dies in nommu.c in
      kobjsize.
      
      BUG_ON(page->index >= MAX_ORDER);
      
      since the page we are referring to is outside what the kernel thinks is its
      max valid memory.
      
      root:~> while [ 1 ]; do ps > /dev/null; done
      kernel BUG at mm/nommu.c:119!
      Kernel panic - not syncing: BUG!
      
      We fixed this by adding a check to reject out of range object pointers as it
      already does that for NULL pointers.
      Signed-off-by: Michael Hennerich <Michael.Hennerich@analog.com>
      Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org>
      Acked-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  19. 05 Feb, 2008 2 commits
  20. 05 Dec, 2007 1 commit
    • Security: round mmap hint address above mmap_min_addr · 7cd94146
      Eric Paris authored
      
      If mmap_min_addr is set and a process attempts to mmap (not fixed) with a
      non-null hint address less than mmap_min_addr the mapping will fail the
      security checks.  Since this is just a hint address this patch will round
      such a hint address above mmap_min_addr.
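The rounding described above might look roughly like the following plain C sketch. It is an approximation for illustration rather than the patch's code. The 4 KiB page size, the helper names, and passing the minimum address as a parameter are assumptions. A hint of 0 is treated as "no hint" and left alone.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round an address up to the next page boundary. */
static unsigned long page_align_up(unsigned long addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
}

/* Move a non-null hint that lies below min_addr up to min_addr. */
static unsigned long round_hint(unsigned long hint, unsigned long min_addr)
{
	unsigned long rounded = hint & SKETCH_PAGE_MASK;

	if (rounded != 0 && rounded < min_addr)
		rounded = page_align_up(min_addr);

	/* Either "no hint" or an address the minimum-address check accepts. */
	assert(rounded == 0 || rounded >= min_addr);
	return rounded;
}
```

With a minimum address of 64 KiB, a hint of 8 KiB (the kind gcj was observed to pass) becomes 64 KiB, while a null hint and hints at or above the minimum are unchanged.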
      
      gcj was found to try to be very frugal with vm usage and give hint addresses
      in the 8k-32k range.  Without this patch all such programs failed and with
      the patch they happily get a higher address.
      
      This patch is wrapped in CONFIG_SECURITY since mmap_min_addr doesn't exist
      without it and there would be no security check possible no matter what.  So
      we should not bother compiling in this rounding if it is just a waste of
      time.
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  21. 29 Oct, 2007 1 commit
  22. 19 Oct, 2007 1 commit
  23. 17 Oct, 2007 1 commit
  24. 22 Aug, 2007 1 commit
    • fix NULL pointer dereference in __vm_enough_memory() · 34b4e4aa
      Alan Cox authored
      
      The new exec code inserts an accounted vma into an mm struct which is not
      current->mm.  The existing memory check code has a hard coded assumption
      that this does not happen as does the security code.
      
      As the correct mm is known we pass the mm to the security method and the
      helper function.  A new security test is added for the case where we need
      to pass the mm and the existing one is modified to pass current->mm to
      avoid the need to change large amounts of code.
      
      (Thanks to Tobias for fixing rejects and testing)
      Signed-off-by: Alan Cox <alan@redhat.com>
      Cc: WU Fengguang <wfg@mail.ustc.edu.cn>
      Cc: James Morris <jmorris@redhat.com>
      Cc: Tobias Diedrich <ranma+kernel@tdiedrich.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  25. 21 Jul, 2007 1 commit
  26. 19 Jul, 2007 2 commits
    • mm: fault feedback #1 · d0217ac0
      Nick Piggin authored
      
      Change ->fault prototype.  We now return an int, which contains a
      VM_FAULT_xxx code in the low byte, and a FAULT_RET_xxx code in the next
      byte.  The FAULT_RET_ code tells the VM whether a page was found, whether
      it has been locked, and potentially other things.  This is not quite the
      way he wanted it yet, but that's changed in the next patch (which requires
      changes to arch code).
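The two-byte layout of the return value can be illustrated with a small packing sketch in plain C. The constant values and helper names below are hypothetical and do not match the kernel's real VM_FAULT_xxx or FAULT_RET_xxx definitions. Unsigned arithmetic is used here to keep the bit operations simple, whereas the patch returns an int.

```c
#include <assert.h>

/* Hypothetical codes, chosen only to make the layout visible. */
#define SK_VM_FAULT_MINOR   0x02u	/* lives in the low byte */
#define SK_FAULT_RET_LOCKED 0x01u	/* lives in the next byte */

/* Combine a fault code (low byte) with return flags (second byte). */
static unsigned int pack_fault(unsigned int vm_code, unsigned int ret_flags)
{
	assert(vm_code <= 0xffu && ret_flags <= 0xffu);
	return (ret_flags << 8) | vm_code;
}

/* Extract the fault code from the low byte. */
static unsigned int fault_vm_code(unsigned int ret)
{
	return ret & 0xffu;
}

/* Extract the return flags from the second byte. */
static unsigned int fault_ret_flags(unsigned int ret)
{
	return (ret >> 8) & 0xffu;
}
```

A caller can then test the flags byte (for example, whether the page came back locked) independently of the fault code.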
      
      This means we no longer set VM_CAN_INVALIDATE in the vma in order to say
      that a page is locked which requires filemap_nopage to go away (because we
      can no longer remain backward compatible without that flag), but we were
      going to do that anyway.
      
      struct fault_data is renamed to struct vm_fault as Linus asked. address
      is now a void __user * that we should firmly encourage drivers not to use
      without really good reason.
      
      The page is now returned via a page pointer in the vm_fault struct.
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: merge populate and nopage into fault (fixes nonlinear) · 54cb8821
      Nick Piggin authored
      
      Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
      the virtual address -> file offset differently from linear mappings.
      
      ->populate is a layering violation because the filesystem/pagecache code
      should not need to know anything about the virtual memory mapping.  The
      hitch here is that the ->nopage handler didn't pass down enough information
      (ie.  pgoff).  But it is more logical to pass pgoff rather than have the
      ->nopage function calculate it itself anyway (because that's a similar
      layering violation).
      
      Having the populate handler install the pte itself is likewise a nasty thing
      to be doing.
      
      This patch introduces a new fault handler that replaces ->nopage and
      ->populate and (later) ->nopfn.  Most of the old mechanism is still in place
      so there is a lot of duplication and nice cleanups that can be removed if
      everyone switches over.
      
      The rationale for doing this in the first place is that nonlinear mappings are
      subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
      to duplicate the synchronisation logic rather than just consolidate the two.
      
      After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
      pagecache.  Seems like a fringe functionality anyway.
      
      NOPAGE_REFAULT is removed.  This should be implemented with ->fault, and no
      users have hit mainline yet.
      
      [akpm@linux-foundation.org: cleanup]
      [randy.dunlap@oracle.com: doc. fixes for readahead]
      [akpm@linux-foundation.org: build fix]
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
      Cc: Mark Fasheh <mark.fasheh@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  27. 16 Jul, 2007 1 commit
  28. 11 Jul, 2007 1 commit
    • security: Protection for exploiting null dereference using mmap · ed032189
      Eric Paris authored
      
      Add a new security check on mmap operations to see if the user is attempting
      to mmap to a low area of the address space.  The amount of space protected is
      indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
      0, preserving existing behavior.
      
      This patch uses a new SELinux security class "memprotect."  Policy already
      contains a number of allow rules like a_t self:process * (unconfined_t being
      one of them) which mean that putting this check in the process class (its
      best current fit) would make it useless as all user processes, which we also
      want to protect against, would be allowed. By taking the memprotect name of
      the new class it will also make it possible for us to move some of the other
      memory protect permissions out of 'process' and into the new class next time
      we bump the policy version number (which I also think is a good future idea).
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: Eric Paris <eparis@redhat.com>
      Signed-off-by: James Morris <jmorris@namei.org>
  29. 08 May, 2007 1 commit
    • move die notifier handling to common code · 1eeb66a1
      Christoph Hellwig authored
      
      This patch moves the die notifier handling to common code.  Previously,
      various architectures had exactly the same code for it.  Note that the new
      code is compiled unconditionally; this should be understood as an appeal to
      the other architecture maintainers to implement support for it as well (aka
      sprinkling a notify_die or two in the proper place).
      
      arm had a notify_die that did something totally different; I renamed it to
      arm_notify_die as part of the patch and made it static to the file it's
      declared and used in.  avr32 used to pass slightly less information through
      this interface and I brought it into line with the other architectures.
      
      [akpm@linux-foundation.org: build fix]
      [akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
      [bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Signed-off-by: Bryan Wu <bryan.wu@analog.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  30. 12 Apr, 2007 1 commit
  31. 22 Mar, 2007 2 commits
    • [PATCH] NOMMU: make SYSV SHM nattch work correctly · 165b2392
      David Howells authored
      Make the SYSV SHM nattch counter work correctly by forcing multiple VMAs to
      be produced to represent MAP_SHARED segments, even if they overlap exactly.
      
      Using this test program:
      
      	http://people.redhat.com/~dhowells/doshm.c
      
      Run as:
      
      	doshm sysv
      
      I can see nattch going from one before the patch:
      
      	# /doshm sysv
      	Command: sysv
      	shmid: 65536
      	memory: 0xc3700000
      	c0b00000-c0b04000 rw-p 00000000 00:00 0
      	c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
      	c3180000-c31dede4 r-xs 00000000 00:0b 14582179  /lib/libuClibc-0.9.28.so
      	c3520000-c352278c rw-p 00000000 00:0b 13763417  /doshm
      	c3584000-c35865e8 r-xs 00000000 00:0b 13763417  /doshm
      	c3588000-c358aa00 rw-p 00008000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
      	c3590000-c359b6c0 rw-p 00000000 00:00 0
      	c3620000-c3640000 rwxp 00000000 00:00 0
      	c3700000-c37fa000 rw-S 00000000 00:06 1411      /SYSV00000000 (deleted)
      	c3700000-c37fa000 rw-S 00000000 00:06 1411      /SYSV00000000 (deleted)
      	nattch 1
      
      To two after the patch:
      
      	# /doshm sysv
      	Command: sysv
      	shmid: 0
      	memory: 0xc3700000
      	c0bb0000-c0bba788 r-xs 00000000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
      	c3180000-c31dede4 r-xs 00000000 00:0b 14582179  /lib/libuClibc-0.9.28.so
      	c3320000-c3340000 rwxp 00000000 00:00 0
      	c3530000-c35325e8 r-xs 00000000 00:0b 13763417  /doshm
      	c3534000-c353678c rw-p 00000000 00:0b 13763417  /doshm
      	c3538000-c353aa00 rw-p 00008000 00:0b 14582157  /lib/ld-uClibc-0.9.28.so
      	c3590000-c359b6c0 rw-p 00000000 00:00 0
      	c35a4000-c35a8000 rw-p 00000000 00:00 0
      	c3700000-c37fa000 rw-S 00000000 00:06 1369      /SYSV00000000 (deleted)
      	c3700000-c37fa000 rw-S 00000000 00:06 1369      /SYSV00000000 (deleted)
      	nattch 2
      
      That's +1 to nattch for each shmat() made.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • [PATCH] NOMMU: supply get_unmapped_area() to fix NOMMU SYSV SHM · d56e03cd
      David Howells authored
      
      Supply a get_unmapped_area() to fix NOMMU SYSV SHM support.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Adam Litke <agl@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  32. 08 Dec, 2006 1 commit
  33. 07 Dec, 2006 1 commit