- 30 Mar, 2009 4 commits
-
-
Andre Noll authored
This patch renames the "size" field of struct mddev_s to "dev_sectors" and stores the number of 512-byte sectors instead of the number of 1K-blocks in it. All users of that field, including raid levels 1,4-6,10, are adjusted accordingly. This simplifies the code a bit because it allows to get rid of a couple of divisions/multiplications by two. In order to make checkpatch happy, some minor coding style issues have also been addressed. In particular, size_store() now uses strict_strtoull() instead of simple_strtoull(). Signed-off-by:
Andre Noll <maan@systemlinux.org> Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
It really is nicer to keep related code together.. Signed-off-by:
NeilBrown <neilb@suse.de>
-
NeilBrown authored
This makes the includes more explicit, and is preparation for moving md_k.h to drivers/md/md.h Remove include/raid/md.h as its only remaining use was to #include other files. Signed-off-by:
NeilBrown <neilb@suse.de>
-
Christoph Hellwig authored
Move the headers with the local structures for the disciplines and bitmap.h into drivers/md/ so that they are more easily grepable for hacking and not far away. md.h is left where it is for now as there are some uses from the outside. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- 08 Jan, 2009 1 commit
-
-
Cheng Renquan authored
The rdev_for_each macro defined in <linux/raid/md_k.h> is identical to list_for_each_entry_safe, from <linux/list.h>, it should be defined to use list_for_each_entry_safe, instead of reinventing the wheel. But some calls to each_entry_safe don't really need a safe version, just a direct list_for_each_entry is enough, this could save a temp variable (tmp) in every function that used rdev_for_each. In this patch, most rdev_for_each loops are replaced by list_for_each_entry, totally save many tmp vars; and only in the other situations that will call list_del to delete an entry, the safe version is used. Signed-off-by:
Cheng Renquan <crquan@gmail.com> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- 13 Oct, 2008 1 commit
-
-
Mike Christie authored
Multipath is best at handling transport errors. If it gets a device error then there is not much the multipath layer can do. It will just access the same device but from a different path. This patch breaks up failfast into device, transport and driver errors. The multipath layers (md and dm mutlipath) only ask the lower levels to fast fail transport errors. The user of failfast, read ahead, will ask to fast fail on all errors. Note that blk_noretry_request will return true if any failfast bit is set. This allows drivers that do not support the multipath failfast bits to continue to fail on any failfast error like before. Drivers like scsi that are able to fail fast specific errors can check for the specific fail fast type. In the next patch I will convert scsi. Signed-off-by:
Mike Christie <michaelc@cs.wisc.edu> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 12 Oct, 2008 1 commit
-
-
NeilBrown authored
A lot of cruft has gathered over the years. Time to remove it. Signed-off-by:
NeilBrown <neilb@suse.de>
-
- 09 Oct, 2008 2 commits
-
-
Tejun Heo authored
Move stats related fields - stamp, in_flight, dkstats - from disk to part0 and unify stat handling such that... * part_stat_*() now updates part0 together if the specified partition is not part0. ie. part_stat_*() are now essentially all_stat_*(). * {disk|all}_stat_*() are gone. * part_round_stats() is updated similary. It handles part0 stats automatically and disk_round_stats() is killed. * part_{inc|dec}_in_fligh() is implemented which automatically updates part0 stats for parts other than part0. * disk_map_sector_rcu() is updated to return part0 if no part matches. Combined with the above changes, this makes NULL special case handling in callers unnecessary. * Separate stats show code paths for disk are collapsed into part stats show code paths. * Rename disk_stat_lock/unlock() to part_stat_lock/unlock() While at it, reposition stat handling macros a bit and add missing parentheses around macro parameters. Signed-off-by:
Tejun Heo <tj@kernel.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
Tejun Heo authored
There are two variants of stat functions - ones prefixed with double underbars which don't care about preemption and ones without which disable preemption before manipulating per-cpu counters. It's unclear whether the underbarred ones assume that preemtion is disabled on entry as some callers don't do that. This patch unifies diskstats access by implementing disk_stat_lock() and disk_stat_unlock() which take care of both RCU (for partition access) and preemption (for per-cpu counter access). diskstats access should always be enclosed between the two functions. As such, there's no need for the versions which disables preemption. They're removed and double underbars ones are renamed to drop the underbars. As an extra argument is added, there's no danger of using the old version unconverted. disk_stat_lock() uses get_cpu() and returns the cpu index and all diskstat functions which access per-cpu counters now has @cpu argument to help RT. This change adds RCU or preemption operations at some places but also collapses several preemption ops into one at others. Overall, the performance difference should be negligible as all involved ops are very lightweight per-cpu ones. Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 21 Jul, 2008 1 commit
-
-
Andre Noll authored
This patch renames the array_size field of struct mddev_s to array_sectors and converts all instances to use units of 512 byte sectors instead of 1k blocks. Signed-off-by:
Andre Noll <maan@systemlinux.org> Signed-off-by:
NeilBrown <neilb@suse.de>
-
- 27 Jun, 2008 2 commits
-
-
Neil Brown authored
For all array types but linear, ->hot_add_disk returns 1 on success, 0 on failure. For linear, it returns 0 on success and -errno on failure. This doesn't cause a functional problem because the ->hot_add_disk function of linear is used quite differently to the others. However it is confusing. So convert all to return 0 for success or -errno on failure and fix call sites to match. Signed-off-by:
Neil Brown <neilb@suse.de>
-
Neil Brown authored
i.e. extend the 'md/dev-XXX/slot' attribute so that you can tell a device to fill an vacant slot in an and md array. Signed-off-by:
Neil Brown <neilb@suse.de>
-
- 24 May, 2008 1 commit
-
-
NeilBrown authored
When we get any IO error during a recovery (rebuilding a spare), we abort the recovery and restart it. For RAID6 (and multi-drive RAID1) it may not be best to restart at the beginning: when multiple failures can be tolerated, the recovery may be able to continue and re-doing all that has already been done doesn't make sense. We already have the infrastructure to record where a recovery is up to and restart from there, but it is not being used properly. This is because: - We sometimes abort with MD_RECOVERY_ERR rather than just MD_RECOVERY_INTR, which causes the recovery not be be checkpointed. - We remove spares and then re-added them which loses important state information. The distinction between MD_RECOVERY_ERR and MD_RECOVERY_INTR really isn't needed. If there is an error, the relevant drive will be marked as Faulty, and that is enough to ensure correct handling of the error. So we first remove MD_RECOVERY_ERR, changing some of the uses of it to MD_RECOVERY_INTR. Then we cause the attempt to remove a non-faulty device from an array to fail (unless recovery is impossible as the array is too degraded). Then when remove_and_add_spares attempts to remove the devices on which recovery can continue, it will fail, they will remain in place, and recovery will continue on them as desired. Issue: If we are halfway through rebuilding a spare and another drive fails, and a new spare is immediately available, do we want to: 1/ complete the current rebuild, then go back and rebuild the new spare or 2/ restart the rebuild from the start and rebuild both devices in parallel. Both options can be argued for. The code currently takes option 2 as a/ this requires least code change b/ this results in a minimally-degraded array in minimal time. Cc: "Eivind Sarto" <ivan@kasenna.com> Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 May, 2008 1 commit
-
-
Neil Brown authored
As setting and clearing queue flags now requires that we hold a spinlock on the queue, and as blk_queue_stack_limits is called without that lock, get the lock inside blk_queue_stack_limits. For blk_queue_stack_limits to be able to find the right lock, each md personality needs to set q->queue_lock to point to the appropriate lock. Those personalities which didn't previously use a spin_lock, us q->__queue_lock. So always initialise that lock when allocated. With this in place, setting/clearing of the QUEUE_FLAG_PLUGGED bit will no longer cause warnings as it will be clear that the proper lock is held. Thanks to Dan Williams for review and fixing the silly bugs. Signed-off-by:
NeilBrown <neilb@suse.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Alistair John Strachan <alistair@devzero.co.uk> Cc: Nick Piggin <npiggin@suse.de> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Jacek Luczak <difrost.kernel@gmail.com> Cc: Prakas...
-
- 28 Apr, 2008 1 commit
-
-
Nick Andrew authored
MD drivers use one printk() call to print 2 log messages and the second line may be prefixed by a TAB character. It may also output a trailing space before newline. klogd (I think) turns the TAB character into the 2 characters '^I' when logging to a file. This looks ugly. Instead of a leading TAB to indicate continuation, prefix both output lines with 'raid:' or similar. Also remove any trailing space in the vicinity of the affected code and consistently end the sentences with a period. Signed-off-by:
Nick Andrew <nick@nick-andrew.net> Cc: Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 06 Feb, 2008 1 commit
-
-
NeilBrown authored
As this is more in line with common practice in the kernel. Also swap the args around to be more like list_for_each. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 09 Nov, 2007 1 commit
-
-
Alan D. Brunelle authored
Added blk_unplug interface, allowing all invocations of unplugs to result in a generated blktrace UNPLUG. Signed-off-by:
Alan D. Brunelle <Alan.Brunelle@hp.com> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 16 Oct, 2007 1 commit
-
-
Jens Axboe authored
Then we can get rid of ->issue_flush_fn() and all the driver private implementations of that. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 10 Oct, 2007 1 commit
-
-
NeilBrown authored
As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 24 Jul, 2007 1 commit
-
-
Jens Axboe authored
Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com>
-
- 28 Oct, 2006 1 commit
-
-
NeilBrown authored
A recent fix which made sure ->degraded was initialised properly exposed a second bug - ->degraded wasn't been updated when drives failed or were hot-added. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 21 Oct, 2006 1 commit
-
-
NeilBrown authored
Two less-used md personalities have bugs in the calculation of ->degraded (the extent to which the array is degraded). Signed-off-by:
Neil Brown <neilb@suse.de> Cc: <stable@kernel.org> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 03 Oct, 2006 2 commits
-
-
NeilBrown authored
raid1, raid10 and multipath don't report their 'congested' status through bdi_*_congested, but should. This patch adds the appropriate functions which just check the 'congested' status of all active members (with appropriate locking). raid1 read_balance should be modified to prefer devices where bdi_read_congested returns false. Then we could use the '&' branch rather than the '|' branch. However that should would need some benchmarking first to make sure it is actually a good idea. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
NeilBrown authored
Instead of magic numbers (0,1,2,3) in sb_dirty, we have some flags instead: MD_CHANGE_DEVS Some device state has changed requiring superblock update on all devices. MD_CHANGE_CLEAN The array has transitions from 'clean' to 'dirty' or back, requiring a superblock update on active devices, but possibly not on spares MD_CHANGE_PENDING A superblock update is underway. We wait for an update to complete by waiting for all flags to be clear. A flag can be set at any time, even during an update, without risk that the change will be lost. Stop exporting md_update_sb - isn't needed. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 26 Mar, 2006 1 commit
-
-
Matthew Dobson authored
This patch changes a mempool user, which is basically just a wrapper around kzalloc(), to use the common mempool_kmalloc/kfree, rather than its own wrapper function, removing duplicated code. Signed-off-by:
Matthew Dobson <colpatch@us.ibm.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 10 Jan, 2006 1 commit
-
-
Jesper Juhl authored
Decrease the number of pointer derefs in drivers/md/multipath.c Benefits of the patch: - Fewer pointer dereferences should make the code slightly faster. - Size of generated code is smaller - improved readability Signed-off-by:
Jesper Juhl <jesper.juhl@gmail.com> Acked-by:
Ingo Molnar <mingo@elte.hu> Cc: Alasdair G Kergon <agk@redhat.com> Acked-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 06 Jan, 2006 3 commits
-
-
NeilBrown authored
Signed-off-by:
Neil Brown <neilb@suse.de> Acked-by:
Greg KH <greg@kroah.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
NeilBrown authored
md supports multiple different RAID level, each being implemented by a 'personality' (which is often in a separate module). These personalities have fairly artificial 'numbers'. The numbers are use to: 1- provide an index into an array where the various personalities are recorded 2- identify the module (via an alias) which implements are particular personality. Neither of these uses really justify the existence of personality numbers. The array can be replaced by a linked list which is searched (array lookup only happens very rarely). Module identification can be done using an alias based on level rather than 'personality' number. The current 'raid5' modules support two level (4 and 5) but only one personality. This slight awkwardness (which was handled in the mapping from level to personality) can be better handled by allowing raid5 to register 2 personalities. With this change in place, the core md module does not need to have an exhaustive list of all possible personalities, so other personalities can be added independently. This patch also moves the check for chunksize being non-zero into the ->run routines for the personalities that need it, rather than having it in core-md. This has a side effect of allowing 'faulty' and 'linear' not to have a chunk-size set. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
NeilBrown authored
Replace multiple kmalloc/memset pairs with kzalloc calls. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 09 Nov, 2005 2 commits
-
-
NeilBrown authored
This has the advantage of removing the confusion caused by 'rdev_t' and 'mddev_t' both having 'in_sync' fields. Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
Suzanne Wood authored
Acked-by: <paulmck@us.ibm.com> Signed-off-by:
Suzanne Wood <suzannew@cs.pdx.edu> Signed-off-by:
Neil Brown <neilb@suse.de> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 01 Nov, 2005 1 commit
-
-
Jens Axboe authored
Instead of having ->read_sectors and ->write_sectors, combine the two into ->sectors[2] and similar for the other fields. This saves a branch several places in the io path, since we don't have to care for what the actual io direction is. On my x86-64 box, that's 200 bytes less text in just the core (not counting the various drivers). Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 08 Oct, 2005 1 commit
-
-
Al Viro authored
- added typedef unsigned int __nocast gfp_t; - replaced __nocast uses for gfp flags with gfp_t - it gives exactly the same warnings as far as sparse is concerned, doesn't change generated code (from gcc point of view we replaced unsigned int with typedef) and documents what's going on far better. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 09 Sep, 2005 1 commit
-
-
NeilBrown authored
md does not yet support BIO_RW_BARRIER, so be honest about it and fail (-EOPNOTSUPP) any such requests. Signed-off-by:
Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 21 Jun, 2005 1 commit
-
-
Jesper Juhl authored
This patch removes some unneeded checks of pointers being NULL before calling kfree() on them. kfree() handles NULL pointers just fine, checking first is pointless. Signed-off-by:
Jesper Juhl <juhl-lkml@dif.dk> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 17 May, 2005 1 commit
-
-
NeilBrown authored
We we set the too early, they may still be in place and possibly get called even though the array didn't get set up properly. Signed-off-by:
Neil Brown <neilb@cse.unsw.edu.au> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 05 May, 2005 1 commit
-
-
Adrian Bunk authored
This patch makes some needlessly global identifiers static. Signed-off-by:
Adrian Bunk <bunk@stusta.de> Acked-by:
Arjan van de Ven <arjanv@infradead.org> Acked-by:
Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 01 May, 2005 1 commit
-
-
Paul E. McKenney authored
This patch changes calls to synchronize_kernel(), deprecated in the earlier "Deprecate synchronize_kernel, GPL replacement" patch to instead call the new synchronize_rcu() and synchronize_sched() APIs. Signed-off-by:
Paul E. McKenney <paulmck@us.ibm.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 16 Apr, 2005 1 commit
-
-
Linus Torvalds authored
Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip!
-