1. 25 Jul, 2008 8 commits
  2. 24 Jul, 2008 1 commit
  3. 10 Jul, 2008 1 commit
    • Hugh Dickins's avatar
      exec: fix stack excutability without PT_GNU_STACK · 96a8e13e
      Hugh Dickins authored
      
      Kernel Bugzilla #11063 points out that on some architectures (e.g. x86_32)
      exec'ing an ELF without a PT_GNU_STACK program header should default to an
      executable stack; but this got broken by the unlimited argv feature because
      stack vma is now created before the right personality has been established:
      so breaking old binaries using nested function trampolines.
      
      Therefore re-evaluate VM_STACK_FLAGS in setup_arg_pages, where stack
      vm_flags used to be set, before the mprotect_fixup.  Checking through
      our existing VM_flags, none would have changed since insert_vm_struct:
      so this seems safer than finding a way through the personality labyrinth.
      
      Reported-by: pageexec@freemail.hu
      Signed-off-by: default avatarHugh Dickins <hugh@veritas.com>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      96a8e13e
  4. 16 Jun, 2008 1 commit
  5. 26 May, 2008 1 commit
  6. 16 May, 2008 1 commit
  7. 13 May, 2008 1 commit
  8. 01 May, 2008 1 commit
  9. 30 Apr, 2008 2 commits
  10. 29 Apr, 2008 3 commits
    • Matt Helsley's avatar
      procfs task exe symlink · 925d1c40
      Matt Helsley authored
      
      The kernel implements readlink of /proc/pid/exe by getting the file from
      the first executable VMA.  Then the path to the file is reconstructed and
      reported as the result.
      
      Because of the VMA walk the code is slightly different on nommu systems.
      This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      walking the VMAs to find the first executable file-backed VMA we store a
      reference to the exec'd file in the mm_struct.
      
      That reference would prevent the filesystem holding the executable file
      from being unmounted even after unmapping the VMAs.  So we track the number
      of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      unmapped.  This avoids pinning the mounted filesystem.
      
      [akpm@linux-foundation.org: improve comments]
      [yamamoto@valinux.co.jp: fix dup_mmap]
      Signed-off-by: default avatarMatt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc:"Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: default avatarYAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      925d1c40
    • Balbir Singh's avatar
      cgroups: add an owner to the mm_struct · cf475ad2
      Balbir Singh authored
      
      Remove the mem_cgroup member from mm_struct and instead adds an owner.
      
      This approach was suggested by Paul Menage.  The advantage of this approach
      is that, once the mm->owner is known, using the subsystem id, the cgroup
      can be determined.  It also allows several control groups that are
      virtually grouped by mm_struct, to exist independent of the memory
      controller i.e., without adding mem_cgroup's for each controller, to
      mm_struct.
      
      A new config option CONFIG_MM_OWNER is added and the memory resource
      controller selects this config option.
      
      This patch also adds cgroup callbacks to notify subsystems when mm->owner
      changes.  The mm_cgroup_changed callback is called with the task_lock() of
      the new task held and is called just prior to changing the mm->owner.
      
      I am indebted to Paul Menage for the several reviews of this patchset and
      helping me make it lighter and simpler.
      
      This patch was tested on a powerpc box, it was compiled with both the
      MM_OWNER config turned on and off.
      
      After the thread group leader exits, it's moved to init_css_state by
      cgroup_exit(), thus all future charges from runnings threads would be
      redirected to the init_css_set's subsystem.
      Signed-off-by: default avatarBalbir Singh <balbir@linux.vnet.ibm.com>
      Cc: Pavel Emelianov <xemul@openvz.org>
      Cc: Hugh Dickins <hugh@veritas.com>
      Cc: Sudhir Kumar <skumar@linux.vnet.ibm.com>
      Cc: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Cc: Hirokazu Takahashi <taka@valinux.co.jp>
      Cc: David Rientjes <rientjes@google.com>,
      Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
      Acked-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarPekka Enberg <penberg@cs.helsinki.fi>
      Reviewed-by: default avatarPaul Menage <menage@google.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      cf475ad2
    • Tetsuo Handa's avatar
      exec: remove argv_len from struct linux_binprm · 175a06ae
      Tetsuo Handa authored
      
      I noticed that 2.6.24.2 calculates bprm->argv_len at do_execve().  But it
      doesn't update bprm->argv_len after "remove_arg_zero() +
      copy_strings_kernel()" at load_script() etc.
      
      audit_bprm() is called from search_binary_handler() and
      search_binary_handler() is called from load_script() etc.  Thus, I think the
      condition check
      
        if (bprm->argv_len > (audit_argv_kb << 10))
                return -E2BIG;
      
      in audit_bprm() might return wrong result when strlen(removed_arg) !=
      strlen(spliced_args).  Why not update bprm->argv_len at load_script() etc.  ?
      
      By the way, 2.6.25-rc3 seems to not doing the condition check.  Is the field
      bprm->argv_len no longer needed?
      Signed-off-by: default avatarTetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Ollie Wild <aaw@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      175a06ae
  11. 25 Apr, 2008 2 commits
    • Al Viro's avatar
      [PATCH] sanitize unshare_files/reset_files_struct · 3b125388
      Al Viro authored
      
      * let unshare_files() give caller the displaced files_struct
      * don't bother with grabbing reference only to drop it in the
        caller if it hadn't been shared in the first place
      * in that form unshare_files() is trivially implemented via
        unshare_fd(), so we eliminate the duplicate logics in fork.c
      * reset_files_struct() is not just only called for current;
        it will break the system if somebody ever calls it for anything
        else (we can't modify ->files of somebody else).  Lose the
        task_struct * argument.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      3b125388
    • Al Viro's avatar
      [PATCH] sanitize handling of shared descriptor tables in failing execve() · fd8328be
      Al Viro authored
      
      * unshare_files() can fail; doing it after irreversible actions is wrong
        and de_thread() is certainly irreversible.
      * since we do it unconditionally anyway, we might as well do it in do_execve()
        and save ourselves the PITA in binfmt handlers, etc.
      * while we are at it, binfmt_som actually leaked files_struct on failure.
      
      As a side benefit, unshare_files(), put_files_struct() and reset_files_struct()
      become unexported.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      fd8328be
  12. 03 Mar, 2008 1 commit
    • Linus Torvalds's avatar
      Allow ARG_MAX execve string space even with a small stack limit · a64e715f
      Linus Torvalds authored
      The new code that removed the limitation on the execve string size
      (which was historically 32 pages) replaced it with a much softer limit
      based on RLIMIT_STACK which is usually much larger than the traditional
      limit.  See commit b6a2fea3
      
       ("mm:
      variable length argument support") for details.
      
      However, if you have a small stack limit (perhaps because you need lots
      of stacks in a threaded environment), the new heuristic of allowing up
      to 1/4th of RLIMIT_STACK to be used for argument and environment strings
      could actually be smaller than the old limit.
      
      So just say that it's ok to have up to ARG_MAX strings regardless of the
      value of RLIMIT_STACK, and check the rlimit only when going over that
      traditional limit.
      
      (Of course, if you actually have a *really* small stack limit, the whole
      stack itself will be limited before you hit ARG_MAX, but that has always
      been true and is clearly the right behaviour anyway).
      Acked-by: default avatarCarlos O'Donell <carlos@codesourcery.com>
      Cc: Michael Kerrisk <michael.kerrisk@googlemail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ollie Wild <aaw@google.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a64e715f
  13. 15 Feb, 2008 2 commits
  14. 08 Feb, 2008 3 commits
  15. 05 Feb, 2008 2 commits
  16. 28 Nov, 2007 1 commit
  17. 12 Nov, 2007 1 commit
    • Roland McGrath's avatar
      core dump: remain dumpable · 00ec99da
      Roland McGrath authored
      
      The coredump code always calls set_dumpable(0) when it starts (even
      if RLIMIT_CORE prevents any core from being dumped).  The effect of
      this (via task_dumpable) is to make /proc/pid/* files owned by root
      instead of the user, so the user can no longer examine his own
      process--in a case where there was never any privileged data to
      protect.  This affects e.g. auxv, environ, fd; in Fedora (execshield)
      kernels, also maps.  In practice, you can only notice this when a
      debugger has requested PTRACE_EVENT_EXIT tracing.
      
      set_dumpable was only used in do_coredump for synchronization and not
      intended for any security purpose.  (It doesn't secure anything that wasn't
      already unsecured when a process dies by SIGTERM instead of SIGQUIT.)
      
      This changes do_coredump to check the core_waiters count as the means of
      synchronization, which is sufficient.  Now we leave the "dumpable" bits alone.
      Signed-off-by: default avatarRoland McGrath <roland@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      00ec99da
  18. 19 Oct, 2007 6 commits
  19. 17 Oct, 2007 2 commits