1. 14 May, 2009 2 commits
  2. 14 Feb, 2009 1 commit
    • ext4: New rec_len encoding for very large blocksizes · 3d0518f4
      Wei Yongjun authored
      
      The rec_len field in the directory entry is 16 bits, so encoding
      blocksizes larger than 64k becomes problematic.  This patch allows us
      to support block sizes up to 256k, by using the low 2 bits to extend
      the range of rec_len to 2**18-1 (since valid rec_len sizes must be a
      multiple of 4).  We use the convention that a rec_len of 0 or 65535
      means the filesystem block size, for compatibility with older kernels.
      
      It's unlikely we'll see VM pages of up to 256k, but at some point we
      might find that the Linux VM has been enhanced to support filesystem
      block sizes greater than the VM page size, at which point it might be useful
      for some applications to allow very large filesystem block sizes.
      Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
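The encoding described in this commit message can be sketched in userspace C as follows. This is a minimal illustration, not the kernel's code: the `sketch_` names and constants are hypothetical, and the sketch ignores the little-endian byte order of the on-disk field. It assumes, as the message states, that valid lengths are multiples of 4, which frees the low 2 bits of the 16-bit field to carry bits 16 and 17 of the length.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_REC_LEN   65535u      /* largest value a 16-bit field holds */
#define SKETCH_MAX_BLOCKSIZE (1u << 18)  /* 256k */

/* Encode an in-memory record length into the 16-bit on-disk field. */
uint16_t sketch_rec_len_to_disk(uint32_t len, uint32_t blocksize)
{
    /* Lengths are multiples of 4 and never exceed the block size. */
    assert(len <= blocksize);
    assert(blocksize <= SKETCH_MAX_BLOCKSIZE);
    assert((len & 3u) == 0);

    if (len < 65536u)
        return (uint16_t)len;            /* fits as-is */
    if (len == blocksize)
        /* "Whole block": 65535 for 64k blocks, 0 for larger ones. */
        return blocksize == 65536u ? SKETCH_MAX_REC_LEN : 0;
    /* Keep bits 2..15 in place and tuck bits 16..17 into the low 2 bits. */
    return (uint16_t)((len & 65532u) | ((len >> 16) & 3u));
}

/* Decode the 16-bit on-disk field back into a record length. */
uint32_t sketch_rec_len_from_disk(uint16_t dlen, uint32_t blocksize)
{
    uint32_t len = dlen;

    if (len == SKETCH_MAX_REC_LEN || len == 0)
        return blocksize;                /* convention: whole block */
    return (len & 65532u) | ((len & 3u) << 16);
}
```

Because 0 and 65535 both decode to the block size, a length that would encode to one of those values can only be represented when it equals the block size.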
  3. 06 Jan, 2009 1 commit
  4. 05 Nov, 2008 1 commit
    • ext4: Change unsigned long to unsigned int · 498e5f24
      Theodore Ts'o authored
      
      Convert the unsigned longs that are most responsible for bloating the
      stack usage on 64-bit systems.
      
      Nearly every place in the ext3/4 code which uses "unsigned long" is
      probably a bug, since on 32-bit systems a ulong is 32 bits, which means
      we are wasting stack space on 64-bit systems.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
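The stack-space argument in this commit can be illustrated with a small sketch. The two structs are hypothetical stand-ins for a function's local variables; nothing here is taken from the ext4 source. On a typical 64-bit Linux target `unsigned long` is 8 bytes and `unsigned int` is 4, so the narrower version halves the footprint, while on a 32-bit target both are the same size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-call locals, as they might look before the change. */
struct locals_ulong {
    unsigned long block;
    unsigned long offset;
    unsigned long count;
};

/* The same locals after narrowing to unsigned int. */
struct locals_uint {
    unsigned int block;
    unsigned int offset;
    unsigned int count;
};

/* Bytes saved per frame by the narrower types (0 where the sizes match). */
size_t stack_bytes_saved(void)
{
    /* unsigned long is never narrower than unsigned int. */
    assert(sizeof(struct locals_ulong) >= sizeof(struct locals_uint));
    return sizeof(struct locals_ulong) - sizeof(struct locals_uint);
}
```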
  5. 25 Oct, 2008 1 commit
  6. 09 Oct, 2008 1 commit
    • ext4: Avoid printk floods in the face of directory corruption · 9d9f1775
      Eric Sandeen authored
      
      Note: some people think this represents a security bug, since it
      might make the system go away while it is printing a large number of
      console messages, especially if a serial console is involved.  Hence,
      it has been assigned CVE-2008-3528, but it requires that the attacker
      either has physical access to your machine to insert a USB disk with a
      corrupted filesystem image (at which point why not just hit the power
      button), or is otherwise able to convince the system administrator to
      mount an arbitrary filesystem image (at which point why not just
      include a setuid shell or world-writable hard disk device file or some
      such).  Me, I think they're just being silly. --tytso
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Cc: linux-ext4@vger.kernel.org
      Cc: Eugene Teo <eugeneteo@kernel.sg>
  7. 08 Sep, 2008 2 commits
  8. 19 Aug, 2008 1 commit
  9. 14 Jul, 2008 1 commit
    • ext4: delayed allocation ENOSPC handling · d2a17637
      Mingming Cao authored
      
      This patch does block reservation for delayed
      allocation, to avoid ENOSPC later at page flush time.
      
      Blocks (data and metadata) are reserved at da_write_begin()
      time, the freeblocks counter is updated at that point, and the number of
      reserved blocks is stored in a per-inode counter.
      
      At writepage time, the unused reserved metadata blocks are returned.
      At unlink/truncate time, reserved blocks are properly released.
      
      Updated with a fix from Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      that corrects the old allocator's block reservation accounting with delalloc, adds
      a lock to guard the counters, and fixes the reservation for metadata blocks.
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
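The reservation lifecycle in this commit message (reserve at write-begin, return unused metadata blocks at writepage, release at unlink/truncate) can be sketched as simple counter accounting. All names below are hypothetical, and the sketch is single-threaded: it omits the lock that the message says guards the counters in the real patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the filesystem-wide and per-inode counters. */
struct sketch_fs    { uint64_t free_blocks; };
struct sketch_inode { uint32_t reserved_data; uint32_t reserved_meta; };

/* Write-begin time: take data and metadata blocks out of the free pool now. */
int sketch_reserve(struct sketch_fs *fs, struct sketch_inode *ino,
                   uint32_t data, uint32_t meta)
{
    uint64_t want = (uint64_t)data + meta;

    if (want > fs->free_blocks)
        return -ENOSPC;              /* fail early, not at page flush time */
    fs->free_blocks -= want;
    ino->reserved_data += data;
    ino->reserved_meta += meta;
    return 0;
}

/* Writepage time: consume what was really allocated, return unused metadata. */
void sketch_allocated(struct sketch_fs *fs, struct sketch_inode *ino,
                      uint32_t data_used, uint32_t meta_used,
                      uint32_t meta_unused)
{
    assert(data_used <= ino->reserved_data);
    assert(meta_used + meta_unused <= ino->reserved_meta);
    ino->reserved_data -= data_used;
    ino->reserved_meta -= meta_used + meta_unused;
    fs->free_blocks += meta_unused;  /* over-reserved metadata goes back */
}

/* Unlink/truncate time: give every still-reserved block back. */
void sketch_release_all(struct sketch_fs *fs, struct sketch_inode *ino)
{
    fs->free_blocks += (uint64_t)ino->reserved_data + ino->reserved_meta;
    ino->reserved_data = 0;
    ino->reserved_meta = 0;
}
```

The point of the design is that a write either fails with ENOSPC immediately or is guaranteed to have blocks available when its pages are later flushed.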
  10. 11 Jul, 2008 1 commit
  11. 29 Apr, 2008 2 commits
  12. 25 Feb, 2008 1 commit
  13. 28 Jan, 2008 2 commits
    • ext4: Introduce ext4_lblk_t · 725d26d3
      Aneesh Kumar K.V authored
      
      This patch adds a new data type ext4_lblk_t to represent
      the logical file blocks.
      
      This is the preparatory patch to support large files in ext4.
      The follow-up patch will convert the ext4_inode i_blocks to
      represent the number of blocks in file system block size. This
      change makes it possible to have a block number 2**32 - 1, which
      will result in overflow if the block number is represented by
      signed long. This patch converts all the block numbers to type
      ext4_lblk_t, which is a typedef for __u32.
      
      Also remove dead code ext4_ext_walk_space
      Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
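A short sketch of the overflow concern in the commit above: a logical block number of 2**32 - 1 does not fit in a signed 32-bit variable (which is what `long` is on 32-bit systems) but does fit in an unsigned 32-bit type. The `sketch_` names are hypothetical; only the choice of an unsigned 32-bit typedef comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the typedef described above; the sketch_ prefix marks it as ours. */
typedef uint32_t sketch_lblk_t;

/* Would this block number survive in a signed 32-bit variable? */
int sketch_fits_signed32(uint64_t blk)
{
    return blk <= (uint64_t)INT32_MAX;
}

/* Would it survive in the unsigned 32-bit logical block type? */
int sketch_fits_lblk(uint64_t blk)
{
    return blk <= (uint64_t)UINT32_MAX;
}

/* Narrow to the logical block type, asserting that nothing is lost. */
sketch_lblk_t sketch_to_lblk(uint64_t blk)
{
    assert(sketch_fits_lblk(blk));
    return (sketch_lblk_t)blk;
}
```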
    • ext4: Avoid rec_len overflow with 64KB block size · a72d7f83
      Jan Kara authored
      
      With 64KB blocksize, a directory entry can have size 64KB, which does not fit
      into the 16 bits we have for entry length. So we store 0xffff instead and convert
      the value when read from / written to disk. The patch also converts some places
      to use ext4_next_entry() when we are changing them anyway.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
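The conversion this commit describes is small enough to show in full as a sketch. The function names are hypothetical and byte order is ignored; the only facts taken from the message are that a 64KB length is stored as 0xffff and translated on read and write. The sentinel cannot collide with a real length as long as valid lengths are multiples of 4.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_64K_SENTINEL 0xffffu   /* on-disk stand-in for 65536 */

/* In-memory length to on-disk value. */
uint16_t sketch64k_to_disk(uint32_t len)
{
    assert(len <= 65536u);            /* nothing larger exists at this block size */
    return len == 65536u ? (uint16_t)SKETCH_64K_SENTINEL : (uint16_t)len;
}

/* On-disk value back to in-memory length. */
uint32_t sketch64k_from_disk(uint16_t dlen)
{
    return dlen == SKETCH_64K_SENTINEL ? 65536u : (uint32_t)dlen;
}
```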
  14. 17 Oct, 2007 2 commits
  15. 16 Oct, 2007 1 commit
  16. 19 Jul, 2007 2 commits
  17. 08 May, 2007 1 commit
  18. 08 Dec, 2006 1 commit
  19. 07 Dec, 2006 1 commit
    • [PATCH] handle ext4 directory corruption better · e6c40211
      Eric Sandeen authored
      I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
      http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz
      
      
      
      Basically it makes a filesystem, splats some random bits over it, then
      tries to mount it and do some simple filesystem actions.
      
      At best, the filesystem catches the corruption gracefully.  At worst,
      things spin out of control.
      
      As you might guess, we found a couple places in ext4 where things spin out
      of control :)
      
      First, we had a corrupted directory that was never checked for
      consistency...  it was corrupt, and pointed to another bad "entry" of
      length 0.  The for() loop looped forever, since the length of
      ext4_next_entry(de) was 0, and we kept looking at the same pointer over and
      over and over and over...  I modeled this check and subsequent action on
      what is done for other directory types in ext4_readdir...
      
      (adding this check adds some computational expense; I am testing a followup
      patch to reduce the number of times we check and re-check these directory
      entries, in all cases.  Thanks for the idea, Andreas).
      
      Next we had a root directory inode which had a corrupted size, claimed to
      be > 200M on a 4M filesystem.  There was only really 1 block in the
      directory, but because the size was so large, readdir kept coming back for
      more, spewing thousands of printk's along the way.
      
      Per Andreas' suggestion, if we're in this read error condition and we're
      trying to read an offset which is greater than i_blocks worth of bytes,
      stop trying, and break out of the loop.
      
      With these two changes fsfuzz test survives quite well on ext4.
      Signed-off-by: Eric Sandeen <sandeen@redhat.com>
      Cc: <linux-ext4@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
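Both failure modes in this commit message can be sketched in userspace C. The first function walks the entries of one directory block and stops on an implausible length instead of looping forever on a zero-length entry. The second is the readdir guard against an inflated directory size. Everything here is an assumption for illustration: the names are hypothetical, rec_len is read in host byte order from offset 4 of each entry, 12 bytes is taken as the smallest valid entry, and `i_blocks` is taken to count 512-byte units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MIN_ENTRY 12u   /* assumed smallest valid directory entry */

/* Read the 16-bit rec_len stored at offset 4 of the entry at `off`. */
uint16_t sketch_rec_len_at(const uint8_t *block, size_t off)
{
    uint16_t v;

    memcpy(&v, block + off + 4, sizeof v);
    return v;
}

/*
 * Walk the entries in one directory block. Returns the number of entries
 * visited, or -1 as soon as an entry's length is implausible. Without the
 * check, a rec_len of 0 would leave `off` unchanged and spin forever.
 */
int sketch_walk_block(const uint8_t *block, size_t blocksize)
{
    size_t off = 0;
    int count = 0;

    assert(block != NULL);
    while (off < blocksize) {
        uint16_t rec_len;

        if (blocksize - off < SKETCH_MIN_ENTRY)
            return -1;               /* no room left for a valid entry */
        rec_len = sketch_rec_len_at(block, off);
        if (rec_len < SKETCH_MIN_ENTRY || rec_len % 4 != 0 ||
            rec_len > blocksize - off)
            return -1;               /* corrupt: stop instead of spinning */
        off += rec_len;
        count++;
    }
    return count;
}

/* readdir guard: an offset past i_blocks worth of bytes cannot hold data. */
int sketch_offset_past_allocation(uint64_t offset, uint64_t i_blocks)
{
    return offset > (i_blocks << 9); /* assumes 512-byte units */
}
```

With the second check, a directory that claims to be 200M but owns only a few blocks stops being read once the offset passes the allocated range, rather than producing an error message for every missing block.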
  20. 11 Oct, 2006 5 commits
  21. 30 Sep, 2006 1 commit
  22. 27 Sep, 2006 3 commits
  23. 21 Apr, 2006 1 commit
  24. 28 Mar, 2006 1 commit
  25. 26 Mar, 2006 1 commit
    • [PATCH] ext3_get_blocks: Mapping multiple blocks at a once · 89747d36
      Mingming Cao authored
      Currently ext3_get_block() only maps or allocates one block at a time.  This
      is quite inefficient for sequential IO workload.
      
      I have posted an early implementation of simple multiple block mapping and
      allocation with current ext3.  The basic idea is allocating the 1st block in the existing
      way, and attempting to allocate the next adjacent blocks on a best effort
      basis.  More description of the implementation can be found here:
      http://marc.theaimsgroup.com/?l=ext2-devel&m=112162230003522&w=2
      
      The following is the latest version of the patch: it breaks the original patch into 5
      patches, reworks some logic, and fixes some bugs.  The breakdown is:
      
       [patch 1] Adding map multiple blocks at a time in ext3_get_blocks()
       [patch 2] Extend ext3_get_blocks() to support multiple block allocation
       [patch 3] Implement multiple block allocation in ext3-try-to-allocate
                 (called via ext3_new_block()).
       [patch 4] Proper accounting updates in ext3_new_blocks()
       [patch 5] Adjust reservation window size properly (by the given number
                 of blocks to allocate) before block allocation to increase the
                 possibility of allocating multiple blocks in a single call.
      
      Tests done so far include fsx, tiobench and dbench.  The following numbers,
      collected from Direct IO tests (1G file creation/read), show that the system time
      has been greatly reduced (more than 50% on my 8 cpu system) with the patches.
      
       1G file DIO write:
               2.6.15       2.6.15+patches
       real    0m31.275s    0m31.161s
       user    0m0.000s     0m0.000s
       sys     0m3.384s     0m0.564s
      
       1G file DIO read:
               2.6.15       2.6.15+patches
       real    0m30.733s    0m30.624s
       user    0m0.000s     0m0.004s
       sys     0m0.748s     0m0.380s
      
      Some previous tests we did on buffered IO using multiple block allocation
      and delayed allocation showed noticeable improvement in throughput and system
      time.
      
      This patch:
      
      Add support of mapping multiple blocks in one call.
      
      This is useful for DIO reads and re-writes (where blocks are already
      allocated), and is also in line with Christoph's proposal of using getblocks() in
      mpage_readpage() or mpage_readpages().
      Signed-off-by: Mingming Cao <cmm@us.ibm.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
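The idea of mapping several already-allocated blocks in one call can be sketched as a lookup that returns a run of physically contiguous blocks. This is only an illustration of the concept: the flat `phys` array and the function name are hypothetical stand-ins for ext3's real block map, and the sketch covers lookup only, not allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical block map: phys[i] is the physical block backing logical
 * block i, or 0 if that logical block is not allocated yet.
 *
 * Maps up to `maxblocks` logical blocks starting at `first`. Returns how
 * many were mapped as one contiguous physical run and stores the first
 * physical block number in *first_phys.
 */
unsigned sketch_map_blocks(const uint32_t *phys, size_t nblocks,
                           size_t first, unsigned maxblocks,
                           uint32_t *first_phys)
{
    unsigned count;

    assert(phys != NULL && first_phys != NULL);
    if (first >= nblocks || maxblocks == 0 || phys[first] == 0)
        return 0;                    /* nothing mapped at `first` */

    *first_phys = phys[first];
    count = 1;
    /* Extend the run while the next logical block is physically adjacent. */
    while (count < maxblocks && first + count < nblocks &&
           phys[first + count] == *first_phys + count)
        count++;
    return count;
}
```

A caller doing sequential IO can then issue one request for the whole run instead of calling the mapping routine once per block.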
  26. 23 Mar, 2006 1 commit
    • [PATCH] ext3_readdir: use generic readahead · d8733c29
      Andrew Morton authored
      
      Linus points out that ext3_readdir's readahead only cuts in when
      ext3_readdir() is operating at the very start of the directory.  So for large
      directories we end up performing no readahead at all and we suck.
      
      So take it all out and use the core VM's page_cache_readahead().  This means
      that ext3 directory reads will use all of readahead's dynamic sizing goop.
      
      Note that we're using the directory's filp->f_ra to hold the readahead state,
      but readahead is actually being performed against the underlying blockdev's
      address_space.  Fortunately the readahead code is all set up to handle this.
      
      Tested with printk.  It works.  I was struggling to find a real workload which
      actually cared.
      
      (The patch also exports page_cache_readahead() to GPL modules)
      
      Cc: "Stephen C. Tweedie" <sct@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  27. 16 Apr, 2005 1 commit
    • Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!