1. 26 Jun, 2006 1 commit
  2. 25 Jun, 2006 1 commit
  3. 23 Jun, 2006 1 commit
    • David Howells's avatar
      [PATCH] VFS: Permit filesystem to override root dentry on mount · 454e2398
      David Howells authored
      
      Extend the get_sb() filesystem operation to take an extra argument that
      permits the VFS to pass in the target vfsmount that defines the mountpoint.
      
      The filesystem is then required to manually set the superblock and root dentry
      pointers.  For most filesystems, this should be done with simple_set_mnt()
      which will set the superblock pointer and then set the root dentry to the
      superblock's s_root (as per the old default behaviour).
      
      The get_sb() op now returns an integer as there's now no need to return the
      superblock pointer.
      
      This patch permits a superblock to be implicitly shared amongst several mount
      points, such as can be done with NFS to avoid potential inode aliasing.  In
      such a case, simple_set_mnt() would not be called, and instead the mnt_root
      and mnt_sb would be set directly.
      
      The patch also makes the following changes:
      
       (*) the get_sb_*() convenience functions in the core kernel now take a vfsmount
           pointer argument and return an integer, so most filesystems have to change
           very little.
      
       (*) If one of the convenience function is not used, then get_sb() should
           normally call simple_set_mnt() to instantiate the vfsmount. This will
           always return 0, and so can be tail-called from get_sb().
      
       (*) generic_shutdown_super() now calls shrink_dcache_sb() to clean up the
           dcache upon superblock destruction rather than shrink_dcache_anon().
      
           This is required because the superblock may now have multiple trees that
           aren't actually bound to s_root, but that still need to be cleaned up. The
           currently called functions assume that the whole tree is rooted at s_root,
           and that anonymous dentries are not the roots of trees which results in
           dentries being left unculled.
      
           However, with the way NFS superblock sharing are currently set to be
           implemented, these assumptions are violated: the root of the filesystem is
           simply a dummy dentry and inode (the real inode for '/' may well be
           inaccessible), and all the vfsmounts are rooted on anonymous[*] dentries
           with child trees.
      
           [*] Anonymous until discovered from another tree.
      
       (*) The documentation has been adjusted, including the additional bit of
           changing ext2_* into foo_* in the documentation.
      
      [akpm@osdl.org: convert ipath_fs, do other stuff]
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nathan Scott <nathans@sgi.com>
      Cc: Roland Dreier <rolandd@cisco.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      454e2398
  4. 22 Jun, 2006 2 commits
    • Andrew Morton's avatar
      [PATCH] prune_one_dentry() tweaks · d702ccb3
      Andrew Morton authored
      
      - Add description of d_lock handling to comments over prune_one_dentry().
      
      - It has three callsites - uninline it, saving 200 bytes of text.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Cc: Kirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Cc: Balbir Singh <balbir@in.ibm.com>
      Cc: Neil Brown <neilb@suse.de>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      d702ccb3
    • NeilBrown's avatar
      [PATCH] Fix dcache race during umount · 0feae5c4
      NeilBrown authored
      
      The race is that the shrink_dcache_memory shrinker could get called while a
      filesystem is being unmounted, and could try to prune a dentry belonging to
      that filesystem.
      
      If it does, then it will call in to iput on the inode while the dentry is
      no longer able to be found by the umounting process.  If iput takes a
      while, generic_shutdown_super could get all the way though
      shrink_dcache_parent and shrink_dcache_anon and invalidate_inodes without
      ever waiting on this particular inode.
      
      Eventually the superblock gets freed anyway and if the iput tried to touch
      it (which some filesystems certainly do), it will lose.  The promised
      "Self-destruct in 5 seconds" doesn't lead to a nice day.
      
      The race is closed by holding s_umount while calling prune_one_dentry on
      someone else's dentry.  As a down_read_trylock is used,
      shrink_dcache_memory will no longer try to prune the dentry of a filesystem
      that is being unmounted, and unmount will not be able to start until any
      such active prune_one_dentry completes.
      
      This requires that prune_dcache *knows* which filesystem (if any) it is
      doing the prune on behalf of so that it can be careful of other
      filesystems.  shrink_dcache_memory isn't called it on behalf of any
      filesystem, and so is careful of everything.
      
      shrink_dcache_anon is now passed a super_block rather than the s_anon list
      out of the superblock, so it can get the s_anon list itself, and can pass
      the superblock down to prune_dcache.
      
      If prune_dcache finds a dentry that it cannot free, it leaves it where it
      is (at the tail of the list) and exits, on the assumption that some other
      thread will be removing that dentry soon.  To try to make sure that some
      work gets done, a limited number of dnetries which are untouchable are
      skipped over while choosing the dentry to work on.
      
      I believe this race was first found by Kirill Korotaev.
      
      Cc: Jan Blunck <jblunck@suse.de>
      Acked-by: default avatarKirill Korotaev <dev@openvz.org>
      Cc: Olaf Hering <olh@suse.de>
      Acked-by: default avatarBalbir Singh <balbir@in.ibm.com>
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Signed-off-by: default avatarBalbir Singh <balbir@in.ibm.com>
      Acked-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      0feae5c4
  5. 31 Mar, 2006 2 commits
  6. 26 Mar, 2006 3 commits
  7. 25 Mar, 2006 3 commits
    • Kirill Korotaev's avatar
      [PATCH] Reduce sched latency in shrink_dcache_sb() · 2ab13460
      Kirill Korotaev authored
      
      This patch reduces scheduling latency in shrink_dcache_sb() noticed during
      remounting of big partitions with many cached dentries.  The same latency
      fix was applied to select_parent() long ago.
      Signed-off-by: default avatarDenis Lunev <den@sw.ru>
      Signed-off-by: default avatarPavel Emelianov <xemul@sw.ru>
      Signed-off-by: default avatarKirill Korotaev <dev@openvz.org>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      2ab13460
    • Nick Piggin's avatar
      [PATCH] inotify: lock avoidance with parent watch status in dentry · c32ccd87
      Nick Piggin authored
      
      Previous inotify work avoidance is good when inotify is completely unused,
      but it breaks down if even a single watch is in place anywhere in the
      system.  Robin Holt notices that udev is one such culprit - it slows down a
      512-thread application on a 512 CPU system from 6 seconds to 22 minutes.
      
      Solve this by adding a flag in the dentry that tells inotify whether or not
      its parent inode has a watch on it.  Event queueing to parent will skip
      taking locks if this flag is cleared.  Setting and clearing of this flag on
      all child dentries versus event delivery: this is no in terms of race
      cases, and that was shown to be equivalent to always performing the check.
      
      The essential behaviour is that activity occuring _after_ a watch has been
      added and _before_ it has been removed, will generate events.
      Signed-off-by: default avatarNick Piggin <npiggin@suse.de>
      Cc: Robert Love <rml@novell.com>
      Cc: John McCutchan <ttb@tentacle.dhs.org>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c32ccd87
    • David Howells's avatar
      [PATCH] Optimise d_find_alias() · 214fda1f
      David Howells authored
      
      The attached patch optimises d_find_alias() to only take the spinlock if
      there's anything in the the inode's alias list.  If there isn't, it returns
      NULL immediately.
      
      With respect to the superblock sharing patch, this should reduce by one the
      number of times the dcache_lock is taken by nfs_lookup() for ordinary
      directory lookups.
      
      Only in the case where there's already a dentry for particular directory inode
      (such as might happen when another mountpoint is rooted at that dentry) will
      the lock then be taken the extra time.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      214fda1f
  8. 24 Mar, 2006 1 commit
    • Paul Jackson's avatar
      [PATCH] cpuset memory spread slab cache hooks · b0196009
      Paul Jackson authored
      
      Change the kmem_cache_create calls for certain slab caches to support cpuset
      memory spreading.
      
      See the previous patches, cpuset_mem_spread, for an explanation of cpuset
      memory spreading, and cpuset_mem_spread_slab_cache for the slab cache support
      for memory spreading.
      
      The slab caches marked for now are: dentry_cache, inode_cache, some xfs slab
      caches, and buffer_head.  This list may change over time.  In particular,
      other file system types that are used extensively on large NUMA systems may
      want to allow for spreading their directory and inode slab cache entries.
      Signed-off-by: default avatarPaul Jackson <pj@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b0196009
  9. 08 Mar, 2006 1 commit
    • Dipankar Sarma's avatar
      [PATCH] fix file counting · 529bf6be
      Dipankar Sarma authored
      
      I have benchmarked this on an x86_64 NUMA system and see no significant
      performance difference on kernbench.  Tested on both x86_64 and powerpc.
      
      The way we do file struct accounting is not very suitable for batched
      freeing.  For scalability reasons, file accounting was
      constructor/destructor based.  This meant that nr_files was decremented
      only when the object was removed from the slab cache.  This is susceptible
      to slab fragmentation.  With RCU based file structure, consequent batched
      freeing and a test program like Serge's, we just speed this up and end up
      with a very fragmented slab -
      
      llm22:~ # cat /proc/sys/fs/file-nr
      587730  0       758844
      
      At the same time, I see only a 2000+ objects in filp cache.  The following
      patch I fixes this problem.
      
      This patch changes the file counting by removing the filp_count_lock.
      Instead we use a separate percpu counter, nr_files, for now and all
      accesses to it are through get_nr_files() api.  In the sysctl handler for
      nr_files, we populate files_stat.nr_files before returning to user.
      
      Counting files as an when they are created and destroyed (as opposed to
      inside slab) allows us to correctly count open files with RCU.
      Signed-off-by: default avatarDipankar Sarma <dipankar@in.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      529bf6be
  10. 03 Feb, 2006 1 commit
  11. 14 Jan, 2006 1 commit
  12. 10 Jan, 2006 1 commit
  13. 08 Jan, 2006 1 commit
    • Eric Dumazet's avatar
      [PATCH] shrink dentry struct · 5160ee6f
      Eric Dumazet authored
      
      Some long time ago, dentry struct was carefully tuned so that on 32 bits
      UP, sizeof(struct dentry) was exactly 128, ie a power of 2, and a multiple
      of memory cache lines.
      
      Then RCU was added and dentry struct enlarged by two pointers, with nice
      results for SMP, but not so good on UP, because breaking the above tuning
      (128 + 8 = 136 bytes)
      
      This patch reverts this unwanted side effect, by using an union (d_u),
      where d_rcu and d_child are placed so that these two fields can share their
      memory needs.
      
      At the time d_free() is called (and d_rcu is really used), d_child is known
      to be empty and not touched by the dentry freeing.
      
      Lockless lookups only access d_name, d_parent, d_lock, d_op, d_flags (so
      the previous content of d_child is not needed if said dentry was unhashed
      but still accessed by a CPU because of RCU constraints)
      
      As dentry cache easily contains millions of entries, a size reduction is
      worth the extra complexity of the ugly C union.
      Signed-off-by: default avatarEric Dumazet <dada1@cosmosbay.com>
      Cc: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: Maneesh Soni <maneesh@in.ibm.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Paul Jackson <pj@sgi.com>
      Cc: Al Viro <viro@ftp.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Cc: Neil Brown <neilb@cse.unsw.edu.au>
      Cc: James Morris <jmorris@namei.org>
      Cc: Stephen Smalley <sds@epoch.ncsc.mil>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      5160ee6f
  14. 07 Nov, 2005 1 commit
  15. 28 Oct, 2005 1 commit
    • Al Viro's avatar
      [PATCH] gfp_t: fs/* · 27496a8c
      Al Viro authored
      
       - ->releasepage() annotated (s/int/gfp_t), instances updated
       - missing gfp_t in fs/* added
       - fixed misannotation from the original sweep caught by bitwise checks:
         XFS used __nocast both for gfp_t and for flags used by XFS allocator.
         The latter left with unsigned int __nocast; we might want to add a
         different type for those but for now let's leave them alone.  That,
         BTW, is a case when __nocast use had been actively confusing - it had
         been used in the same code for two different and similar types, with
         no way to catch misuses.  Switch of gfp_t to bitwise had caught that
         immediately...
      
      One tricky bit is left alone to be dealt with later - mapping->flags is
      a mix of gfp_t and error indications.  Left alone for now.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      27496a8c
  16. 19 Sep, 2005 1 commit
  17. 10 Sep, 2005 1 commit
  18. 08 Aug, 2005 1 commit
    • John McCutchan's avatar
      [PATCH] fsnotify_name/inoderemove · 7a91bf7f
      John McCutchan authored
      
      The patch below unhooks fsnotify from vfs_unlink & vfs_rmdir.  It
      introduces two new fsnotify calls, that are hooked in at the dcache
      level.  This not only more closely matches how the VFS layer works, it
      also avoids the problem with locking and inode lifetimes.
      
      The two functions are
      
       - fsnotify_nameremove -- called when a directory entry is going away.
         It notifies the PARENT of the deletion.  This is called from
         d_delete().
      
       - inoderemove -- called when the files inode itself is going away.  It
         notifies the inode that is being deleted.  This is called from
         dentry_iput().
      Signed-off-by: default avatarJohn McCutchan <ttb@tentacle.dhs.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      7a91bf7f
  19. 05 May, 2005 1 commit
  20. 16 Apr, 2005 1 commit
    • Linus Torvalds's avatar
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4