- 30 Sep, 2006 12 commits
-
-
Jens Axboe authored
Currently it scales with number of processes in that priority group, which is potentially not very nice as it's called quite often. Basically we always need to do tail inserts, except for the case of a new process. So just mark/detect a queue as such. Signed-off-by:
Jens Axboe <axboe@suse.de>
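The tail-insert rule described in this commit can be sketched in userspace C as follows. All names (`struct queue`, `is_new`, `rr_add`) are hypothetical, and placing a new queue at the very front is an assumption made for illustration; the message only says that new processes are the exception to tail insertion.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified queue node; not the kernel's struct cfq_queue. */
struct queue {
    struct queue *prev, *next;
    int id;
    int is_new;  /* assumed flag: set while the queue has never been serviced */
};

/* Circular doubly linked list with a sentinel head. */
struct rr_list {
    struct queue head;
};

static void rr_init(struct rr_list *l)
{
    l->head.prev = l->head.next = &l->head;
}

static void insert_between(struct queue *q, struct queue *prev, struct queue *next)
{
    q->prev = prev;
    q->next = next;
    prev->next = q;
    next->prev = q;
}

/* O(1) in both branches: no walk over the other queues in the group. */
static void rr_add(struct rr_list *l, struct queue *q)
{
    assert(l != NULL && q != NULL);
    if (q->is_new) {
        q->is_new = 0;  /* only the first insertion is special */
        insert_between(q, &l->head, l->head.next);   /* front (assumption) */
    } else {
        insert_between(q, l->head.prev, &l->head);   /* tail */
    }
}
```

The point of the flag is that the insertion cost no longer depends on how many queues share the priority group.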
-
Jens Axboe authored
Some were kmalloc_node(), some were still kmalloc(). Change them all to kmalloc_node(). Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
Kill a few inlines that bring in too much code to more than one location. Shrinks kernel text by about 300 bytes on 32-bit x86. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
It's ok if the read path is a lot more costly, as long as inc/dec is really cheap. The inc/dec will happen for each created/freed io context, while the reading only happens when a disk queue exits. Signed-off-by:
Jens Axboe <axboe@suse.de>
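One common way to get cheap increments with an expensive read is a counter split into per-CPU slots. The sketch below illustrates that trade-off only; `split_counter`, `NR_SLOTS` and the explicit `cpu` argument are assumptions, and the commit message does not say which mechanism was actually used.

```c
#include <assert.h>

#define NR_SLOTS 4  /* stand-in for the number of CPUs (assumption) */

/* One slot per CPU; each CPU would only ever touch its own slot. */
struct split_counter {
    long slot[NR_SLOTS];
};

/* Hot path: a plain add on a private slot, no shared cache line. */
static void counter_inc(struct split_counter *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    c->slot[cpu]++;
}

static void counter_dec(struct split_counter *c, int cpu)
{
    assert(cpu >= 0 && cpu < NR_SLOTS);
    c->slot[cpu]--;
}

/* Cold path: walks every slot, so it costs O(NR_SLOTS). */
static long counter_read(const struct split_counter *c)
{
    long sum = 0;
    int i;

    for (i = 0; i < NR_SLOTS; i++)
        sum += c->slot[i];
    return sum;
}
```

An individual slot may go negative when a context is created on one CPU and freed on another; only the sum is meaningful.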
-
Jens Axboe authored
cfq_exit_lock is protecting two things now:
  - The per-ioc rbtree of cfq_io_contexts
  - The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, as it is (by definition) per cfqd as the queue lock is.
The per-ioc rbtree is mainly used and updated by the process itself only. The only outside use is the io priority changing. If we move the priority changing to not browsing the rbtree, we can remove any locking from the rbtree updates and lookup completely. Let the sys_ioprio syscall just mark processes as having the iopriority changed and lazily update the private cfq io contexts the next time io is queued, and we can remove this locking as well.
Signed-off-by:
Jens Axboe <axboe@suse.de>
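The lazy update described in this commit can be sketched as a "changed" flag that the owning process consumes the next time it queues io. This is a single-threaded illustration; the struct and function names are hypothetical, and the memory-ordering details a real implementation would need are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process io context. */
struct io_ctx {
    int ioprio;
    int ioprio_changed;  /* set by the syscall side, cleared by the owner */
};

/* Hypothetical scheduler-private state derived from the io context. */
struct sched_ctx {
    struct io_ctx *ioc;
    int queue_prio;
};

/* Syscall side: record the new value and mark it, touch nothing else. */
static void ctx_set_ioprio(struct io_ctx *ioc, int prio)
{
    assert(ioc != NULL);
    ioc->ioprio = prio;
    ioc->ioprio_changed = 1;
}

/* Owner side: called when io is queued; applies a pending change once. */
static int ctx_queue_io(struct sched_ctx *sc)
{
    assert(sc != NULL && sc->ioc != NULL);
    if (sc->ioc->ioprio_changed) {
        sc->queue_prio = sc->ioc->ioprio;
        sc->ioc->ioprio_changed = 0;
    }
    return sc->queue_prio;
}
```

Because only the owning process walks its private structures, the syscall path never needs to take a lock on them.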
-
Jens Axboe authored
A collection of little fixes and cleanups:
  - We don't use the 'queued' sysfs exported attribute, since the may_queue() logic was rewritten. So kill it.
  - Remove dead defines.
  - cfq_set_active_queue() can be rewritten cleaner with else if conditions.
  - Several places had cfq_exit_cfqq() like logic, abstract that out and use that.
  - Annotate the cfqq kmem_cache_alloc() so the allocator knows that this is a repeat allocation if it fails with __GFP_WAIT set. Allows the allocator to start freeing some memory, if needed. CFQ already loops for this condition, so might as well pass the hint down.
  - Remove cfqd->rq_starved logic. It's not needed anymore after we dropped the crq allocation in cfq_set_request().
  - Remove unneeded parameter passing.
Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
It's not needed for anything, so kill the bio passing. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
Get rid of the cfq_rq request type. With the added elevator_private2, we have enough room in struct request to get rid of any crq allocation/free for each request. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
There's just one flag currently (SYNC), and that one can be grabbed from the request. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
This removes the rbtree handling from CFQ. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
Right now, every IO scheduler implements its own backmerging (except for noop, which does no merging). That results in duplicated code for essentially the same operation, which is never a good thing. This patch moves the backmerging out of the io schedulers and into the elevator core. We save 1.6kb of text and as a bonus get backmerging for noop as well. Win-win! Signed-off-by:
Jens Axboe <axboe@suse.de>
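A back merge appends new io to an existing request whose last sector is immediately followed by the new io's first sector. The sketch below shows only that condition; the types and names are hypothetical, and a linear scan stands in for whatever lookup structure the elevator core actually uses.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long sector64;  /* local typedef for this sketch */

struct io_req   { sector64 start; unsigned nr; };  /* queued request */
struct io_chunk { sector64 start; unsigned nr; };  /* incoming io */

/* First sector past the end of the request. */
static sector64 req_end(const struct io_req *r)
{
    return r->start + r->nr;
}

/* Find a request that ends exactly where the chunk begins. */
static struct io_req *find_back_merge(struct io_req *reqs, size_t n,
                                      const struct io_chunk *c)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (req_end(&reqs[i]) == c->start)
            return &reqs[i];
    return NULL;
}

/* Returns 1 if the chunk was appended to a request, 0 otherwise. */
static int try_back_merge(struct io_req *reqs, size_t n,
                          const struct io_chunk *c)
{
    struct io_req *r;

    assert(reqs != NULL || n == 0);
    r = find_back_merge(reqs, n, c);
    if (r == NULL)
        return 0;
    r->nr += c->nr;
    return 1;
}
```

Since the check depends only on sector ranges, it can live in one shared place rather than in every scheduler.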
-
- 21 Aug, 2006 1 commit
-
-
Oleg Nesterov authored
Obviously, cfq_cic_link() shouldn't free a just allocated cfq_io_context? The dead key is from __cic, so drop that. Signed-off-by:
Oleg Nesterov <oleg@tv-sign.ru> Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 25 Jul, 2006 1 commit
-
-
Jens Axboe authored
The CIC_SEEKY() test really wants to use the minimum of either:
  - 2 msecs (not jiffies)
  - or, the pending slice time
So code it like that. Signed-off-by:
Jens Axboe <axboe@suse.de>
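The distinction between milliseconds and jiffies matters because the length of a tick depends on the configured tick rate. A minimal sketch of the capped idle time, assuming a tick rate of 1000 per second and hypothetical helper names:

```c
#include <assert.h>

#define TICKS_PER_SEC 1000UL  /* assumed tick rate for this sketch */

/* Convert milliseconds to ticks, rounding up. */
static unsigned long ms_to_ticks(unsigned long ms)
{
    return (ms * TICKS_PER_SEC + 999UL) / 1000UL;
}

/*
 * Idle time for a queue: seeky queues get at most 2 ms worth of ticks,
 * and never more than the pending slice time.
 */
static unsigned long idle_ticks(unsigned long pending_slice_ticks, int seeky)
{
    unsigned long cap;

    assert(seeky == 0 || seeky == 1);
    if (!seeky)
        return pending_slice_ticks;
    cap = ms_to_ticks(2);
    return pending_slice_ticks < cap ? pending_slice_ticks : cap;
}
```

With a lower tick rate the same 2 ms maps to fewer ticks, which is why a hard-coded tick count would behave differently across configurations.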
-
- 30 Jun, 2006 1 commit
-
-
Jörn Engel authored
Signed-off-by:
Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by:
Adrian Bunk <bunk@stusta.de>
-
- 23 Jun, 2006 6 commits
-
-
Jens Axboe authored
They all duplicate macros to check for empty root and/or node, and clearing a node. So put those in rbtree.h. Signed-off-by:
Jens Axboe <axboe@suse.de>
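The three helpers this commit mentions (empty root, empty node, clear node) could look roughly like the sketch below. It uses hypothetical lower-case names and assumes the convention that a node not on any tree has its parent pointer pointing back at itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified tree node and root for illustration only. */
struct tnode {
    struct tnode *parent, *left, *right;
};

struct troot {
    struct tnode *top;
};

/* A root with no top node holds nothing. */
static int root_is_empty(const struct troot *r)
{
    assert(r != NULL);
    return r->top == NULL;
}

/* Mark a node as "not on any tree" (assumed self-parent convention). */
static void node_clear(struct tnode *n)
{
    assert(n != NULL);
    n->parent = n;
}

/* True if the node was cleared and has not been linked since. */
static int node_is_empty(const struct tnode *n)
{
    assert(n != NULL);
    return n->parent == n;
}
```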
-
Jens Axboe authored
  - Remember to set ->last_sector so that the cfq_choose_req() logic works correctly.
  - Remove redundant call to cfq_choose_req()
Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
This is a collection of patches that greatly improve CFQ performance in some circumstances.
  - Change the idling logic to only kick in after a request is done and we are deciding what to do. Before, the idling included the request service time, so it was hard to adjust. Now it's true think/idle time.
  - Take advantage of TCQ/NCQ/queueing for seeky sync workloads, but keep it in control for sync and sequential (or close to) workloads.
  - Expire queues immediately and move on to other busy queues, if we are not going to idle after the current one finishes.
  - Don't rearm idle timer if there are no busy queues. Just leave the system idle.
Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
Patch originally from Vasily Tarasov <vtaras@sw.ru>
If you set the io-priority of process 1 using the sys_ioprio_set system call from another process 2 (as ionice does), then the cfq_init_prio_data() function sets the priority of process 2 (current) on the queue of process 1 and clears the flag that designates a change of ioprio. So process 1 will work as if it had the priority of process 2. I propose not to call cfq_init_prio_data() on io-priority change, but only to mark the queue as a queue with changed priority. Every time a new request comes in, the cfq scheduler checks for this flag and automatically changes the priority of the queue to the new value. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
A process flag to indicate whether we are doing sync io is incredibly ugly. It also causes performance problems when one does a lot of async io and then proceeds to sync it. Part of the io will go out as async, and the other part as sync. This causes a disconnect between the previously submitted io and the synced io. For io schedulers such as CFQ, this will cause us lost merges and suboptimal behaviour in scheduling. Remove PF_SYNCWRITE completely from the fsync/msync paths, and let the O_DIRECT path just directly indicate that the writes are sync by using WRITE_SYNC instead. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
We cannot update them if the user changes nr_requests, so don't set it in the first place. The gains are pretty questionable as well. The batching loss has been shown to decrease throughput. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 20 Jun, 2006 1 commit
-
-
Linus Torvalds authored
The color is now in the low bits of the parent pointer, and initializing it to 0 happens as part of the whole memset above, so just remove the unnecessary RB_CLEAR_COLOR. Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
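Storing the color in the low bits of the parent pointer works because node addresses are aligned, so those bits are otherwise always zero. The sketch below uses hypothetical names and assumes at least 4-byte node alignment and an all-zero-bits NULL pointer; it shows why zeroing the node already yields a NULL parent and color 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NODE_RED   0u
#define NODE_BLACK 1u

/* Hypothetical node layout: parent pointer and color share one word. */
struct cnode {
    uintptr_t parent_color;
    struct cnode *left, *right;
};

/* Mask off the low bits to recover the parent pointer. */
static struct cnode *cnode_parent(const struct cnode *n)
{
    return (struct cnode *)(n->parent_color & ~(uintptr_t)3);
}

/* The color lives in bit 0. */
static unsigned cnode_color(const struct cnode *n)
{
    return (unsigned)(n->parent_color & 1u);
}

/* Replace the pointer bits, keep the low bits. */
static void cnode_set_parent(struct cnode *n, struct cnode *p)
{
    assert(((uintptr_t)p & 3) == 0);  /* alignment assumption */
    n->parent_color = (n->parent_color & 3) | (uintptr_t)p;
}

/* Replace bit 0, keep the pointer bits. */
static void cnode_set_color(struct cnode *n, unsigned color)
{
    n->parent_color = (n->parent_color & ~(uintptr_t)1) | (color & 1u);
}

/* Zeroing the node sets parent to NULL and color to 0 in one go. */
static void cnode_init(struct cnode *n)
{
    memset(n, 0, sizeof(*n));
}
```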
-
- 14 Jun, 2006 1 commit
-
-
Jens Axboe authored
We don't clear the seek stat values in cfq_alloc_io_context(), and if ->seek_mean is unlucky enough to be set to -36 by chance, the first invocation of cfq_update_io_seektime() will oops with a divide by zero in do_div(). Just memset the entire cic instead of filling individual values independently. Signed-off-by:
Jens Axboe <axboe@suse.de> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
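The fix pattern here, zeroing the whole structure instead of assigning fields one by one, can be sketched as below. The struct layout and names are hypothetical; the point is that a field added later cannot be left uninitialized.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context with seek statistics. */
struct seek_ctx {
    void *key;
    unsigned long last_end;
    unsigned seek_samples;
    unsigned long long seek_total;
    unsigned long long seek_mean;
};

/* Allocate and zero everything; the caller owns the result. */
static struct seek_ctx *seek_ctx_alloc(void)
{
    struct seek_ctx *c = malloc(sizeof(*c));

    if (c != NULL)
        memset(c, 0, sizeof(*c));  /* covers every field, present and future */
    return c;
}

/* Mean that is safe on a fresh context: no samples means no division. */
static unsigned long long seek_ctx_mean(const struct seek_ctx *c)
{
    assert(c != NULL);
    if (c->seek_samples == 0)
        return 0;
    return c->seek_total / c->seek_samples;
}
```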
-
- 08 Jun, 2006 1 commit
-
-
Jens Axboe authored
There's a race between shutting down one io scheduler and firing up the next, in which a new io could enter and cause the io scheduler to be invoked with bad or NULL data. To fix this, we need to maintain the queue lock for a bit longer. Unfortunately we cannot do that, since the elevator init requires to be run without the lock held. This isn't easily fixable, without also changing the mempool API. So split the initialization into two parts, an alloc-init operation and an attach operation. Then we can preallocate the io scheduler and related structures, and run the attach inside the lock after we detach the old one. This patch has survived 30 minutes of 1 second io scheduler switching with a very busy io load. Signed-off-by:
Jens Axboe <axboe@suse.de> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
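The two-phase split can be sketched as follows: everything that may block happens before the lock is taken, and the swap under the lock is a pointer exchange. The names are hypothetical and the `locked` flag merely stands in for the queue lock so the sketch can assert on the ordering.

```c
#include <assert.h>
#include <stdlib.h>

struct sched {
    int id;
};

struct disk_queue {
    int locked;             /* stand-in for the queue lock */
    struct sched *elevator; /* currently attached scheduler */
};

/* Phase 1: allocate and initialise. May block, so the lock must not be held. */
static struct sched *sched_alloc_init(const struct disk_queue *q, int id)
{
    struct sched *s;

    assert(!q->locked);
    s = malloc(sizeof(*s));
    if (s != NULL)
        s->id = id;
    return s;
}

/* Phase 2: attach. No allocation; runs under the lock and returns the old one. */
static struct sched *sched_attach(struct disk_queue *q, struct sched *new_sched)
{
    struct sched *old;

    assert(q->locked);
    old = q->elevator;
    q->elevator = new_sched;
    return old;
}

/* Switch schedulers so the queue never observes a missing scheduler. */
static int sched_switch(struct disk_queue *q, int id)
{
    struct sched *new_sched = sched_alloc_init(q, id);
    struct sched *old;

    if (new_sched == NULL)
        return -1;
    q->locked = 1;
    old = sched_attach(q, new_sched);
    q->locked = 0;
    free(old);  /* free(NULL) is a no-op */
    return 0;
}
```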
-
- 01 Jun, 2006 5 commits
-
-
Jens Axboe authored
Now that we select busy_rr for possible service, insert entries at the back of that list instead of at the front. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
There's a small window from when the timer is entered and we grab the queue lock, where cfq_set_active_queue() could be rearming the timer for us. Seen in the wild on a 12-way ppc box. Fix this by just using mod_timer(), which will do the right thing for us. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
If the hardware is doing real queueing, decide that it's worthless to idle the hardware. It does reasonable simultaneous io in that case anyways, and the idling hurts some work loads. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
If we are anticipating a sync request from this process and we are waiting for that and see an async request come in, expire that slice and move on. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Jens Axboe authored
For just one busy queue (like async write out), we often overlooked that we could queue more io and decided we were idle instead. This causes us quite a bit of performance loss. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 30 May, 2006 1 commit
-
-
Jens Axboe authored
  - Drop cic from the list when seen as dead.
  - Fixup the locking, just use a simple spinlock.
Signed-off-by:
Jens Axboe <axboe@suse.de> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 21 Apr, 2006 1 commit
-
-
David Woodhouse authored
They were abusing the rb_color field to mark nodes which weren't currently on the tree. Fix that to use the same method as eventpoll did -- setting the parent pointer to point back to itself. And use the appropriate accessor macros for setting and reading the parent. Signed-off-by:
David Woodhouse <dwmw2@infradead.org>
-
- 18 Apr, 2006 3 commits
-
-
OGAWA Hirofumi authored
In current code, we are re-reading cic->key after the dead cic->key check. So, in theory, it may really re-read *after* cfq_exit_queue() set it to NULL. To avoid the race, we copy it to the stack, then use it. With this change, I guess gcc will assign cic->key to a register or stack, and it wouldn't be re-read. Signed-off-by:
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by:
Jens Axboe <axboe@suse.de>
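The idea of checking and using a single snapshot of the key can be sketched as below. Note that this sketch swaps in a C11 relaxed atomic load to guarantee exactly one read, whereas the commit relies on a plain local copy; the names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical context entry whose key is set to NULL when its queue dies. */
struct ctx_entry {
    _Atomic(void *) key;
    int payload;
};

static void entry_set_key(struct ctx_entry *e, void *key)
{
    atomic_store_explicit(&e->key, key, memory_order_relaxed);
}

/*
 * Returns -1 if the entry is dead, 1 if its key matches, 0 otherwise.
 * The key is read once; both the dead check and the comparison use
 * that one snapshot, so a concurrent NULL store cannot slip in between.
 */
static int entry_match(struct ctx_entry *e, const void *wanted)
{
    void *k;

    assert(e != NULL);
    k = atomic_load_explicit(&e->key, memory_order_relaxed);
    if (k == NULL)
        return -1;
    return k == wanted;
}
```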
-
OGAWA Hirofumi authored
When a queue dies, we set cic->key=NULL as a dead mark. So, when we traverse an rbtree, we must check whether the key is still valid. If it was invalidated, drop it, then restart the traversal from the top. Signed-off-by:
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by:
Jens Axboe <axboe@suse.de>
-
OGAWA Hirofumi authored
On the rmmod path, cfq/as waits to make sure all io-contexts were freed. However, it's using complete(), not wait_for_completion(). I think barrier() is not enough here. To avoid the following case, this patch replaces barrier() with smp_wmb().

| cpu0 | visibility | cpu1 |
|---|---|---|
| | [ioc_gone=NULL,ioc_count=1] | |
| ioc_gone = &all_gone | NULL,ioc_count=1 | |
| atomic_read(&ioc_count) | NULL,ioc_count=1 | |
| wait_for_completion() | NULL,ioc_count=0 | atomic_sub_and_test() |
| | NULL,ioc_count=0 | if ( && ioc_gone) [ioc_gone==NULL, so doesn't call complete()] |
| | &all_gone,ioc_count=0 | |

Signed-off-by:
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 28 Mar, 2006 3 commits
-
-
Jens Axboe authored
Detect whether a given process is seeky and, if so, (mostly) disable the idle window. We still allow just a little idle time, just enough to allow that process to submit a new request. That is needed to maintain fairness across priority groups. In some cases, we could set up several async queues. This is not optimal from a performance POV, since we want all async io in one queue to perform good sorting on it. It also impacted sync queues, as async io got too much slice time. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
Andreas Mohr authored
This is a small optimization to cfq_choose_req() in the CFQ I/O scheduler (this function is a semi-often invoked candidate in an oprofile log): by using a bit mask variable, we can use a simple switch() to check the various cases instead of having to query two variables for each check. Benefit: 251 vs. 285 bytes footprint of cfq_choose_req(). Also, common case 0 (no request wrapping) is now checked first in code. Signed-off-by:
Andreas Mohr <andi@lisas.de> Signed-off-by:
Jens Axboe <axboe@suse.de>
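The bit-mask-plus-switch pattern can be sketched as below. This is a simplified stand-in, not the real cfq_choose_req(): the names are hypothetical, "wrapped" is assumed to mean the request lies behind the current head position, and any cost weighting for backward seeks is left out.

```c
#include <assert.h>
#include <stddef.h>

#define RQ1_WRAP 0x01u  /* request 1 lies behind the head position */
#define RQ2_WRAP 0x02u  /* request 2 lies behind the head position */

typedef unsigned long long sector64;  /* local typedef for this sketch */

struct rq {
    sector64 sector;
};

/* Pick which of two requests to serve next, given the head position. */
static const struct rq *choose_req(const struct rq *r1, const struct rq *r2,
                                   sector64 last)
{
    sector64 d1 = 0, d2 = 0;
    unsigned wrap = 0;

    assert(r1 != NULL && r2 != NULL);

    if (r1->sector >= last)
        d1 = r1->sector - last;
    else
        wrap |= RQ1_WRAP;

    if (r2->sector >= last)
        d2 = r2->sector - last;
    else
        wrap |= RQ2_WRAP;

    /* One switch on the mask replaces repeated tests of two flags. */
    switch (wrap) {
    case 0:          /* common case: neither wraps, take the closer one */
        return d1 <= d2 ? r1 : r2;
    case RQ2_WRAP:   /* only r2 is behind the head */
        return r1;
    case RQ1_WRAP:   /* only r1 is behind the head */
        return r2;
    default:         /* both behind: start from the lower sector */
        return r1->sector <= r2->sector ? r1 : r2;
    }
}
```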
-
Jens Axboe authored
On setups with many disks, we spend a considerable amount of time looking up the process-disk mapping on each queue of io. Testing with a NULL based block driver, this costs a 40-50% reduction in throughput for 1000 disks. Signed-off-by:
Jens Axboe <axboe@suse.de>
-
- 26 Mar, 2006 1 commit
-
-
Matthew Dobson authored
Modify well over a dozen mempool users to call mempool_create_slab_pool() rather than calling mempool_create() with extra arguments, saving about 30 lines of code and increasing readability. Signed-off-by:
Matthew Dobson <colpatch@us.ibm.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 18 Mar, 2006 2 commits