1. 16 Jul, 2007 1 commit
  2. 09 Jul, 2007 3 commits
  3. 08 Jun, 2007 1 commit
    • Alexey Kuznetsov's avatar
      pi-futex: fix exit races and locking problems · 778e9a9c
      Alexey Kuznetsov authored
      
      1. New entries can be added to tsk->pi_state_list after task completed
         exit_pi_state_list(). The result is memory leakage and deadlocks.
      
      2. handle_mm_fault() is called under spinlock. The result is obvious.
      
      3. results in self-inflicted deadlock inside glibc.
         Sometimes futex_lock_pi returns -ESRCH, when it is not expected
         and glibc enters to for(;;) sleep() to simulate deadlock. This problem
         is quite obvious and I think the patch is right. Though it looks like
         each "if" in futex_lock_pi() got some stupid special case "else if". :-)
      
      4. sometimes futex_lock_pi() returns -EDEADLK,
         when nobody has the lock. The reason is also obvious (see comment
         in the patch), but correct fix is far beyond my comprehension.
         I guess someone already saw this, the chunk:
      
                              if (rt_mutex_trylock(&q.pi_state->pi_mutex))
                                      ret = 0;
      
         is obviously from the same opera. But it does not work, because the
         rtmutex is really taken at this point: wake_futex_pi() of previous
         owner reassigned it to us. My fix works. But it looks very stupid.
         I would think about removal of shift of ownership in wake_futex_pi()
         and making all the work in context of process taking lock.
      
      From: Thomas Gleixner <tglx@linutronix.de>
      
      Fix 1) Avoid the tasklist lock variant of the exit race fix by adding
          an additional state transition to the exit code.
      
          This fixes also the issue, when a task with recursive segfaults
          is not able to release the futexes.
      
      Fix 2) Cleanup the lookup_pi_state() failure path and solve the -ESRCH
          problem finally.
      
      Fix 3) Solve the fixup_pi_state_owner() problem which needs to do the fixup
          in the lock protected section by using the in_atomic userspace access
          functions.
      
          This removes also the ugly lock drop / unqueue inside of fixup_pi_state()
      
      Fix 4) Fix a stale lock in the error path of futex_wake_pi()
      
      Added some error checks for verification.
      
      The -EDEADLK problem is solved by the rtmutex fixups.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      778e9a9c
  4. 23 May, 2007 1 commit
    • Roland McGrath's avatar
      recalc_sigpending_tsk fixes · 7bb44ade
      Roland McGrath authored
      
      Steve Hawkes discovered a problem where recalc_sigpending_tsk was called in
      do_sigaction but no signal_wake_up call was made, preventing later signals
      from waking up blocked threads with TIF_SIGPENDING already set.
      
      In fact, the few other calls to recalc_sigpending_tsk outside the signals
      code are also subject to this problem in other race conditions.
      
      This change makes recalc_sigpending_tsk private to the signals code.  It
      changes the outside calls, as well as do_sigaction, to use the new
      recalc_sigpending_and_wake instead.
      Signed-off-by: default avatarRoland McGrath <roland@redhat.com>
      Cc: <Steve.Hawkes@motorola.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7bb44ade
  5. 11 May, 2007 3 commits
    • Davide Libenzi's avatar
      signal/timer/event: signalfd core · fba2afaa
      Davide Libenzi authored
      This patch series implements the new signalfd() system call.
      
      I took part of the original Linus code (and you know how badly it can be
      broken :), and I added even more breakage ;) Signals are fetched from the same
      signal queue used by the process, so signalfd will compete with standard
      kernel delivery in dequeue_signal().  If you want to reliably fetch signals on
      the signalfd file, you need to block them with sigprocmask(SIG_BLOCK).  This
      seems to be working fine on my Dual Opteron machine.  I made a quick test
      program for it:
      
      http://www.xmailserver.org/signafd-test.c
      
      
      
      The signalfd() system call implements signal delivery into a file descriptor
      receiver.  The signalfd file descriptor if created with the following API:
      
      int signalfd(int ufd, const sigset_t *mask, size_t masksize);
      
      The "ufd" parameter allows to change an existing signalfd sigmask, w/out going
      to close/create cycle (Linus idea).  Use "ufd" == -1 if you want a brand new
      signalfd file.
      
      The "mask" allows to specify the signal mask of signals that we are interested
      in.  The "masksize" parameter is the size of "mask".
      
      The signalfd fd supports the poll(2) and read(2) system calls.  The poll(2)
      will return POLLIN when signals are available to be dequeued.  As a direct
      consequence of supporting the Linux poll subsystem, the signalfd fd can use
      used together with epoll(2) too.
      
      The read(2) system call will return a "struct signalfd_siginfo" structure in
      the userspace supplied buffer.  The return value is the number of bytes copied
      in the supplied buffer, or -1 in case of error.  The read(2) call can also
      return 0, in case the sighand structure to which the signalfd was attached,
      has been orphaned.  The O_NONBLOCK flag is also supported, and read(2) will
      return -EAGAIN in case no signal is available.
      
      If the size of the buffer passed to read(2) is lower than sizeof(struct
      signalfd_siginfo), -EINVAL is returned.  A read from the signalfd can also
      return -ERESTARTSYS in case a signal hits the process.  The format of the
      struct signalfd_siginfo is, and the valid fields depends of the (->code &
      __SI_MASK) value, in the same way a struct siginfo would:
      
      struct signalfd_siginfo {
      	__u32 signo;	/* si_signo */
      	__s32 err;	/* si_errno */
      	__s32 code;	/* si_code */
      	__u32 pid;	/* si_pid */
      	__u32 uid;	/* si_uid */
      	__s32 fd;	/* si_fd */
      	__u32 tid;	/* si_fd */
      	__u32 band;	/* si_band */
      	__u32 overrun;	/* si_overrun */
      	__u32 trapno;	/* si_trapno */
      	__s32 status;	/* si_status */
      	__s32 svint;	/* si_int */
      	__u64 svptr;	/* si_ptr */
      	__u64 utime;	/* si_utime */
      	__u64 stime;	/* si_stime */
      	__u64 addr;	/* si_addr */
      };
      
      [akpm@linux-foundation.org: fix signalfd_copyinfo() on i386]
      Signed-off-by: default avatarDavide Libenzi <davidel@xmailserver.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fba2afaa
    • Sukadev Bhattiprolu's avatar
      attach_pid() with struct pid parameter · e713d0da
      Sukadev Bhattiprolu authored
      
      attach_pid() currently takes a pid_t and then uses find_pid() to find the
      corresponding struct pid.  Sometimes we already have the struct pid.  We can
      then skip find_pid() if attach_pid() were to take a struct pid parameter.
      Signed-off-by: default avatarSukadev Bhattiprolu <sukadev@us.ibm.com>
      Cc: Cedric Le Goater <clg@fr.ibm.com>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Cc: Serge Hallyn <serue@us.ibm.com>
      Cc: <containers@lists.osdl.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e713d0da
    • Eric Dumazet's avatar
      getrusage(): fill ru_inblock and ru_oublock fields if possible · 6eaeeaba
      Eric Dumazet authored
      
      If CONFIG_TASK_IO_ACCOUNTING is defined, we update io accounting counters for
      each task.
      
      This patch permits reporting of values using the well known getrusage()
      syscall, filling ru_inblock and ru_oublock instead of null values.
      
      As TASK_IO_ACCOUNTING currently counts bytes counts, we approximate blocks
      count doing : nr_blocks = nr_bytes / 512
      
      Example of use :
      ----------------------
      After patch is applied, /usr/bin/time command can now give a good
      approximation of IO that the process had to do.
      
      $ /usr/bin/time grep tototo /usr/include/*
      Command exited with non-zero status 1
      0.00user 0.02system 0:02.11elapsed 1%CPU (0avgtext+0avgdata 0maxresident)k
      24288inputs+0outputs (0major+259minor)pagefaults 0swaps
      
      $ /usr/bin/time dd if=/dev/zero of=/tmp/testfile count=1000
      1000+0 enregistrements lus
      1000+0 enregistrements écrits
      512000 octets (512 kB) copiés, 0,00326601 seconde, 157 MB/s
      0.00user 0.00system 0:00.00elapsed 80%CPU (0avgtext+0avgdata 0maxresident)k
      0inputs+3000outputs (0major+299minor)pagefaults 0swaps
      Signed-off-by: default avatarEric Dumazet <dada1@cosmosbay.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6eaeeaba
  6. 09 May, 2007 2 commits
  7. 08 May, 2007 1 commit
  8. 07 May, 2007 1 commit
  9. 29 Mar, 2007 1 commit
  10. 12 Feb, 2007 4 commits
  11. 11 Feb, 2007 1 commit
  12. 30 Jan, 2007 3 commits
  13. 31 Dec, 2006 1 commit
    • Oleg Nesterov's avatar
      [PATCH] restore ->pdeath_signal behaviour · 241ceee0
      Oleg Nesterov authored
      Commit b2b2cbc4
      
       introduced a user-
      visible change: ->pdeath_signal is sent only when the entire thread
      group exits.
      
      While this change is imho good, it may break things.  So restore the
      old behaviour for now.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      To: Albert Cahalan <acahalan@gmail.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Andrew Morton <akpm@osdl.org>
      Cc: Linus Torvalds <torvalds@osdl.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Qi Yong <qiyong@fc-cn.com>
      Cc: Roland McGrath <roland@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      241ceee0
  14. 22 Dec, 2006 2 commits
    • Eric W. Biederman's avatar
      [PATCH] Fix reparenting to the same thread group. (take 2) · b2b2cbc4
      Eric W. Biederman authored
      
      This patch fixes the case when we reparent to a different thread in the
      same thread group.  This modifies the code so that we do not send
      signals and do not change the signal to send to SIGCHLD unless we have
      change the thread group of our parents.  It also suppresses sending
      pdeath_sig in this cas as well since the result of geppid doesn't
      change.
      
      Thanks to Oleg for spotting my bug of only fixing this for non-ptraced
      tasks.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Albert Cahalan <acahalan@gmail.com>
      Cc: Andrew Morton <akpm@osdl.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Coywolf Qi Hunt <qiyong@fc-cn.com>
      Acked-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b2b2cbc4
    • Vadim Lobanov's avatar
      [PATCH] fdtable: Provide free_fdtable() wrapper · 01b2d93c
      Vadim Lobanov authored
      
      Christoph Hellwig has expressed concerns that the recent fdtable changes
      expose the details of the RCU methodology used to release no-longer-used
      fdtable structures to the rest of the kernel.  The trivial patch below
      addresses these concerns by introducing the appropriate free_fdtable()
      calls, which simply wrap the release RCU usage.  Since free_fdtable() is a
      one-liner, it makes sense to promote it to an inline helper.
      Signed-off-by: default avatarVadim Lobanov <vlobanov@speakeasy.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      01b2d93c
  15. 10 Dec, 2006 2 commits
    • Vadim Lobanov's avatar
      [PATCH] fdtable: Remove the free_files field · 4fd45812
      Vadim Lobanov authored
      
      An fdtable can either be embedded inside a files_struct or standalone (after
      being expanded).  When an fdtable is being discarded after all RCU references
      to it have expired, we must either free it directly, in the standalone case,
      or free the files_struct it is contained within, in the embedded case.
      
      Currently the free_files field controls this behavior, but we can get rid of
      it entirely, as all the necessary information is already recorded.  We can
      distinguish embedded and standalone fdtables using max_fds, and if it is
      embedded we can divine the relevant files_struct using container_of().
      Signed-off-by: default avatarVadim Lobanov <vlobanov@speakeasy.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      4fd45812
    • Vadim Lobanov's avatar
      [PATCH] fdtable: Make fdarray and fdsets equal in size · bbea9f69
      Vadim Lobanov authored
      
      Currently, each fdtable supports three dynamically-sized arrays of data: the
      fdarray and two fdsets.  The code allows the number of fds supported by the
      fdarray (fdtable->max_fds) to differ from the number of fds supported by each
      of the fdsets (fdtable->max_fdset).
      
      In practice, it is wasteful for these two sizes to differ: whenever we hit a
      limit on the smaller-capacity structure, we will reallocate the entire fdtable
      and all the dynamic arrays within it, so any delta in the memory used by the
      larger-capacity structure will never be touched at all.
      
      Rather than hogging this excess, we shouldn't even allocate it in the first
      place, and keep the capacities of the fdarray and the fdsets equal.  This
      patch removes fdtable->max_fdset.  As an added bonus, most of the supporting
      code becomes simpler.
      Signed-off-by: default avatarVadim Lobanov <vlobanov@speakeasy.net>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      bbea9f69
  16. 08 Dec, 2006 7 commits
  17. 07 Dec, 2006 1 commit
  18. 28 Oct, 2006 1 commit
    • Oleg Nesterov's avatar
      [PATCH] taskstats_tgid_free: fix usage · 093a8e8a
      Oleg Nesterov authored
      
      taskstats_tgid_free() is called on copy_process's error path. This is wrong.
      
      	IF (clone_flags & CLONE_THREAD)
      		We should not clear ->signal->taskstats, current uses it,
      		it probably has a valid accumulated info.
      	ELSE
      		taskstats_tgid_init() set ->signal->taskstats = NULL,
      		there is nothing to free.
      
      Move the callsite to __exit_signal(). We don't need any locking, entire
      thread group is exiting, nobody should have a reference to soon to be
      released ->signal.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jay Lan <jlan@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      093a8e8a
  19. 02 Oct, 2006 3 commits
  20. 01 Oct, 2006 1 commit
    • Jay Lan's avatar
      [PATCH] csa: convert CONFIG tag for extended accounting routines · 8f0ab514
      Jay Lan authored
      
      There were a few accounting data/macros that are used in CSA but are #ifdef'ed
      inside CONFIG_BSD_PROCESS_ACCT.  This patch is to change those ifdef's from
      CONFIG_BSD_PROCESS_ACCT to CONFIG_TASK_XACCT.  A few defines are moved from
      kernel/acct.c and include/linux/acct.h to kernel/tsacct.c and
      include/linux/tsacct_kern.h.
      Signed-off-by: default avatarJay Lan <jlan@sgi.com>
      Cc: Shailabh Nagar <nagar@watson.ibm.com>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Jes Sorensen <jes@sgi.com>
      Cc: Chris Sturtivant <csturtiv@sgi.com>
      Cc: Tony Ernst <tee@sgi.com>
      Cc: Guillaume Thouvenin <guillaume.thouvenin@bull.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      8f0ab514