1. 17 Aug, 2005 3 commits
  2. 10 Aug, 2005 1 commit
    • [PATCH] remove name length check in a workqueue · 60686744
      James Bottomley authored
      
      We have a check in there to make sure that the name won't overflow
      task_struct.comm[], but it's triggering for scsi with lots of HBAs.
      However, scsi is using single-threaded workqueues, which don't append
      the "/%d" anyway.
      
      All too hard.  Just kill the BUG_ON.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      
      [ kthread_create() uses vsnprintf() and limits the thing, so no
        overflow can actually happen regardless ]
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
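The bracketed note in the workqueue commit above relies on vsnprintf() bounding the write. A minimal userspace sketch of that behaviour follows. `format_comm` and `COMM_LEN` are hypothetical names introduced for illustration; 16 is assumed to mirror the size of task_struct.comm[], and this is not the kernel's kthread_create() code.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumption: 16 mirrors the size of task_struct.comm[]. */
#define COMM_LEN 16

/*
 * Hypothetical stand-in for the name formatting step: format into a
 * fixed-size buffer and report how long the full name would have been.
 */
static int format_comm(char comm[COMM_LEN], const char *fmt, ...)
{
    va_list args;
    int needed;

    va_start(args, fmt);
    /* vsnprintf() stores at most COMM_LEN - 1 characters plus a NUL. */
    needed = vsnprintf(comm, COMM_LEN, fmt, args);
    va_end(args);

    /* The return value is the untruncated length, so callers can
     * detect truncation by comparing it against COMM_LEN. */
    return needed;
}
```

A caller that passes an over-long name gets a truncated, NUL-terminated string rather than a buffer overrun, which is why a length check in front of this step is not needed for memory safety.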
  3. 09 Aug, 2005 1 commit
    • [PATCH] cpuset release ABBA deadlock fix · 3077a260
      Paul Jackson authored
      
      Fix possible cpuset_sem ABBA deadlock if 'notify_on_release' set.
      
      For a particular usage pattern, creating and destroying cpusets fairly
      frequently using notify_on_release, on a very large system, this deadlock
      can be seen every few days.  If you are not using the cpuset
      notify_on_release feature, you will never see this deadlock.
      
      The existing code, on task exit (or cpuset deletion) did:
      
        get cpuset_sem
        if cpuset marked notify_on_release and is ready to release:
          compute cpuset path relative to /dev/cpuset mount point
          call_usermodehelper() forks /sbin/cpuset_release_agent with path
        drop cpuset_sem
      
      Unfortunately, the fork in call_usermodehelper can allocate memory, and
      allocating memory can require cpuset_sem, if the mems_generation values
      changed in the interim.  This results in an ABBA deadlock, trying to obtain
      cpuset_sem when it is already held by the current task.
      
      To fix this, I put the cpuset path (which must be computed while holding
      cpuset_sem) in a temporary buffer, to be used in the call_usermodehelper
      call of /sbin/cpuset_release_agent only _after_ dropping cpuset_sem.
      
      So the new logic is:
      
        get cpuset_sem
        if cpuset marked notify_on_release and is ready to release:
          compute cpuset path relative to /dev/cpuset mount point
          stash path in kmalloc'd buffer
        drop cpuset_sem
        call_usermodehelper() forks /sbin/cpuset_release_agent with path
        free path
      
      The sharp-eyed reader might notice that this patch does not contain any
      calls to kmalloc.  The existing code in the check_for_release() routine was
      already kmalloc'ing a buffer to hold the cpuset path.  In the old code, it
      just held the buffer for a few lines, over the cpuset_release_agent() call
      that in turn invoked call_usermodehelper().  In the new code, with the
      application of this patch, it returns that buffer via the new char
      **ppathbuf parameter, for later use and freeing in cpuset_release_agent(),
      which is called after cpuset_sem is dropped.  Whereas the old code had just
      one call to cpuset_release_agent(), right in the check_for_release()
      routine, the new code has three calls to cpuset_release_agent(), from the
      various places that a cpuset can be released.
      
      This patch has been built and booted on SN2, and passed a stress test that
      previously hit the deadlock within a few seconds.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
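The new locking order described in the cpuset commit above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel patch: a pthread mutex stands in for cpuset_sem, malloc() for kmalloc(), and a printf() for the call_usermodehelper() invocation. `struct cpuset_stub`, `release_agent` and `cpuset_exit` are hypothetical names; only check_for_release() and the ppathbuf parameter come from the commit message.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for cpuset_sem. */
static pthread_mutex_t cpuset_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical, heavily reduced cpuset: just what the sketch needs. */
struct cpuset_stub {
    const char *path;        /* path relative to the mount point */
    int notify_on_release;   /* flag set by the user */
    int in_use;              /* non-zero while tasks or children remain */
};

/* Must be called with cpuset_lock held: only computes and stashes the path. */
static void check_for_release(const struct cpuset_stub *cs, char **ppathbuf)
{
    if (cs->notify_on_release && !cs->in_use) {
        size_t len = strlen(cs->path) + 1;
        char *buf = malloc(len);

        if (buf) {
            memcpy(buf, cs->path, len);
            *ppathbuf = buf;
        }
    }
}

/* Must be called WITHOUT cpuset_lock held: may allocate, so may need the lock. */
static int release_agent(char *pathbuf)
{
    if (!pathbuf)
        return 0;
    /* Stand-in for forking the release agent with the stashed path. */
    printf("release agent would run for %s\n", pathbuf);
    free(pathbuf);
    return 1;
}

/* Returns 1 if the agent was invoked, 0 otherwise. */
static int cpuset_exit(const struct cpuset_stub *cs)
{
    char *pathbuf = NULL;

    pthread_mutex_lock(&cpuset_lock);
    check_for_release(cs, &pathbuf);    /* stash path under the lock */
    pthread_mutex_unlock(&cpuset_lock);
    return release_agent(pathbuf);      /* use and free it after dropping */
}
```

The point of the split is that everything that can allocate memory on the helper's behalf happens after the unlock, so a nested attempt to take the same lock can no longer occur on this path.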
  4. 04 Aug, 2005 2 commits
    • [PATCH] revert "timer exit cleanup" · c3068951
      Andrew Morton authored
      
      Revert this June 17 patch: it broke persistence of timers across execve().
      
      Cc: Roland McGrath <roland@redhat.com>
      Cc: george anzinger <george@mvista.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Remove suspend() calls from shutdown path · c36f19e0
      Benjamin Herrenschmidt authored
      
      This removes the calls to device_suspend() from the shutdown path that
      were added sometime during 2.6.13-rc*.  They aren't working properly on
      a number of configs (I got reports from both ppc powerbook users and x86
      users), causing the system to no longer shut down.
      
      I think it isn't the right approach at the moment anyway.  We already
      have a shutdown() callback for the drivers that actually care about
      shutdown, and the suspend() code isn't yet in good enough shape to be
      generalized this far.  Also, the semantics of suspend and shutdown are
      slightly different on a number of setups, and the way this was patched in
      provides little way for drivers to cleanly differentiate.  It should
      have been at least a different message.
      
      For 2.6.13, I think we should revert to 2.6.12 behaviour and have a
      working suspend back.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  5. 02 Aug, 2005 1 commit
  6. 01 Aug, 2005 1 commit
    • [PATCH] remove sys_set_zone_reclaim() · 6cb54819
      Ingo Molnar authored
      
      This removes sys_set_zone_reclaim() for now.  While I'm sure Martin is
      trying to solve a real problem, we must not hard-code an incomplete and
      insufficient approach into a syscall, because syscalls are pretty much
      for eternity.  I am quite strongly convinced that this syscall must not
      hit v2.6.13 in its current form.
      
      Firstly, the syscall lacks basic syscall design: e.g. it allows the
      global setting of VM policy for unprivileged users. (!) [ Imagine an
      Oracle installation and a SAP installation on the same NUMA box fighting
      over the 'optimal' setting for this flag. What will they do? Will they
      try to set the flag to their own preferred value every second or so? ]
      
      Secondly, it was added based on a single datapoint from Martin:
      
       http://marc.theaimsgroup.com/?l=linux-mm&m=111763597218177&w=2
      
      where Martin characterizes the numbers the following way:
      
       ' Run-to-run variability for "make -j" is huge, so these numbers aren't
         terribly useful except to see that with reclaim the benchmark still
         finishes in a reasonable amount of time. '
      
      In other words: the fundamental problem has likely not been solved; only
      a tendency in the right direction has been observed, and a
      handful of numbers were picked out of a set of hugely variable results,
      without showing the variability data. How much variance is there
      run-to-run?
      
      I'd really suggest first walking the walk and seeing what's needed to get
      stable & predictable kernel compilation numbers on that NUMA box, before
      adding random syscalls to tune a particular aspect of the VM ... which
      approach might not even matter once the whole picture has been analyzed
      and understood!
      
      The third, most important point is that the syscall exposes VM tuning
      internals in a completely unstructured way. What sense does it make to
      have a _GLOBAL_ per-node setting for 'should we go to another node for
      reclaim'? If anything, it might make sense to do this per-app, via numalib
      or so.
      
      The change is minimalistic in that it doesn't remove the syscall and the
      underlying infrastructure changes, only the user-visible changes.  We
      could perhaps add a CAP_SYS_ADMIN-only sysctl for this hack, a la
      /proc/sys/vm/swappiness, but even that looks quite counterproductive
      when the generic approach is that we are trying to reduce the number of
      external factors in the VM balance picture.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 30 Jul, 2005 1 commit
  8. 29 Jul, 2005 3 commits
  9. 27 Jul, 2005 8 commits
  10. 26 Jul, 2005 9 commits
  11. 18 Jul, 2005 1 commit
  12. 15 Jul, 2005 1 commit
  13. 14 Jul, 2005 1 commit
  14. 13 Jul, 2005 4 commits
  15. 12 Jul, 2005 3 commits
    • [PATCH] inotify · 0eeca283
      Robert Love authored
      
      inotify is intended to correct the deficiencies of dnotify, particularly
      its inability to scale and its terrible user interface:
      
              * dnotify requires the opening of one fd for each directory
                that you intend to watch. This quickly results in too many
                open files and pins removable media, preventing unmount.
              * dnotify is directory-based. You only learn about changes to
                directories. Sure, a change to a file in a directory affects
                the directory, but you are then forced to keep a cache of
                stat structures.
              * dnotify's interface to user-space is awful.  Signals?
      
      inotify provides a more usable, simple, powerful solution to file change
      notification:
      
              * inotify's interface is a system call that returns a fd, not SIGIO.
                You get a single fd, which is select()-able.
              * inotify has an event that says "the filesystem that the item
                you were watching is on was unmounted."
              * inotify can watch directories or files.
      
      Inotify is currently used by Beagle (a desktop search infrastructure),
      Gamin (a FAM replacement), and other projects.
      
      See Documentation/filesystems/inotify.txt.
      Signed-off-by: Robert Love <rml@novell.com>
      Cc: John McCutchan <ttb@tentacle.dhs.org>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] lower VM_DONTCOPY total_vm · 3b6bfcdb
      Hugh Dickins authored
      
      dup_mmap of a VM_DONTCOPY vma forgot to lower the child's total_vm.  (But
      no way does this account for the recent report of total_vm seen too low.)
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] name_to_dev_t warning fix · d53d9f16
      Andrew Morton authored
      
      kernel/power/disk.c needs a declaration of name_to_dev_t() in scope.  mount.h
      seems like an appropriate choice.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>