Commits · da775265021b61d5eb81df155e36cb0810f6df53 · matisse / android_kernel_samsung_matisse

20 Dec, 2006 1 commit

[PATCH] cfq-iosched: don't allow sync merges across queues · da775265

Jens Axboe authored 18 years ago


Currently we allow any merge, even if the io originates from different
processes. This can cause really bad starvation and unfairness, if those
ios happen to be synchronous (reads or direct writes).

So add an allow_merge hook to the io scheduler ops, so an io scheduler can
help decide whether a bio/process combination may be merged with an
existing request.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

da775265
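
A minimal user-space sketch of the kind of decision an allow_merge hook could make for da775265, assuming async io may merge freely while sync io may only merge within its own queue. All `toy_*` names are hypothetical stand-ins, not the kernel's types or the actual CFQ logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the per-process queue, request and bio. */
struct toy_cfqq { int owner_pid; };

struct toy_request {
    struct toy_cfqq *cfqq; /* queue the request was issued from */
    bool sync;
};

struct toy_bio {
    struct toy_cfqq *cfqq; /* queue the submitting process maps to */
    bool sync;             /* read or direct write */
};

/*
 * Return true if the bio may be merged into the request.
 * Assumption: async io merges freely, sync io only within the same queue.
 */
static bool toy_allow_merge(const struct toy_request *rq,
                            const struct toy_bio *bio)
{
    assert(rq != NULL && bio != NULL);

    if (!bio->sync)
        return true;

    return bio->cfqq == rq->cfqq;
}
```

With two queues A and B, a sync bio from B is refused a merge into a request owned by A, which is the cross-process case the commit message describes.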

13 Dec, 2006 1 commit

[PATCH] Propagate down request sync flag · 7749a8d4

Jens Axboe authored 18 years ago

We need to do this, otherwise the io schedulers don't get access to the
sync flag. Then they cannot tell the difference between a regular write
and an O_DIRECT write, which can cause a performance loss.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

7749a8d4

07 Dec, 2006 1 commit

[PATCH] slab: remove kmem_cache_t · e18b890b

Christoph Lameter authored 18 years ago


Replace all uses of kmem_cache_t with struct kmem_cache.

The patch was generated using the following script:

	#!/bin/sh
	#
	# Replace one string by another in all the kernel sources.
	#

	set -e

	for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
		quilt add $file
		sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
		mv /tmp/$$ $file
		quilt refresh
	done

The script was run like this

	sh replace kmem_cache_t "struct kmem_cache"
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

e18b890b

01 Dec, 2006 1 commit

[BLOCK] Cleanup unused variable passing · bb37b94c

Jens Axboe authored 18 years ago


- ->init_queue() does not need the elevator passed in
- ->put_request() is a hot path and need not have the queue passed in
- cfq_update_io_seektime() does not need cfqd passed in
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>

bb37b94c

22 Nov, 2006 1 commit

WorkStruct: Pass the work_struct pointer instead of context data · 65f27f38

David Howells authored 18 years ago

Pass the work_struct pointer to the work function rather than context data.
The work function can use container_of() to work out the data.

For the cases where the container of the work_struct may go away the moment the
pending bit is cleared, it is made possible to defer the release of the
structure by deferring the clearing of the pending bit.

To make this work, an extra flag is introduced into the management side of the
work_struct. This governs auto-release of the structure upon execution.

Ordinarily, the work queue executor would release the work_struct for further
scheduling or deallocation by clearing the pending bit prior to jumping to the
work function. This means that, unless the driver makes some guarantee itself
that the work_struct won't go away, the work function may not access anything
else in the work_struct or its container lest they be deallocated. This is a
problem if the auxiliary data is taken away (as done by the last patch).

However, if the pending bit is *not* cleared before jumping to the work
function, then the work function *may* access the work_struct and its container
with no problems. But then the work function must itself release the
work_struct by calling work_release().

In most cases, automatic release is fine, so this is the default. Special
initiators exist for the non-auto-release case (ending in _NAR).
Signed-Off-By: David Howells <dhowells@redhat.com>

65f27f38
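
The container_of() pattern that 65f27f38 relies on can be illustrated in user space. This is a sketch only: `toy_container_of`, `toy_work` and `toy_device` are hypothetical names, and the auto-release and pending-bit handling from the commit message is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() macro. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical work item: only a function pointer, no context data. */
struct toy_work {
    void (*func)(struct toy_work *work);
};

/* Hypothetical driver state that embeds its work item. */
struct toy_device {
    int events_handled;
    struct toy_work work;
};

/* The work function recovers its device from the embedded member. */
static void toy_device_work(struct toy_work *work)
{
    struct toy_device *dev = toy_container_of(work, struct toy_device, work);

    dev->events_handled++;
}

/* Stand-in for the work queue executor: it passes the work pointer itself. */
static void toy_run_work(struct toy_work *work)
{
    assert(work != NULL && work->func != NULL);
    work->func(work);
}
```

Because the executor passes only the work pointer, the work function needs no separate context argument; it subtracts the member offset to find the enclosing structure.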

31 Oct, 2006 1 commit

[PATCH] CFQ: request <-> request merging rr_list fixup · 5fccbf61

Jens Axboe authored 18 years ago


In very rare circumstances we would prune a merged request and at
the same time delete the implicated cfqq from the rr_list, and not re-add
it when the merged request got added. This could cause io stalls until
that process issued io again.

Fix it up by putting the rr_list add handling into cfq_add_rq_rb(),
identical to how pruning is handled in cfq_del_rq_rb(). This fixes a
hang reproducible with fsx-linux.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

5fccbf61

30 Oct, 2006 2 commits

[PATCH] CFQ: bad locking in changed_ioprio() · c1b707d2

Jens Axboe authored 18 years ago


When the ioprio code recently got juggled a bit, a bug was introduced.
changed_ioprio() is no longer called with interrupts disabled, so using
plain spin_lock() on the queue_lock is a bug.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

c1b707d2

[PATCH] CFQ: use irq safe locking in cfq_cic_link() · 0261d688

Jens Axboe authored 18 years ago


If cfq_set_request() is called for a new process AND a non-fs io
request (so that __GFP_WAIT may not be set), cfq_cic_link() may
use spin_lock_irq() and spin_unlock_irq() with interrupts already
disabled.

The fix is to always use irq safe locking in cfq_cic_link().
Acked-By: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

0261d688
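
The two locking fixes from 30 Oct (c1b707d2 and 0261d688) both concern the interrupt state around a lock. The toy model below is not kernel code: it tracks one boolean for "interrupts enabled" and shows why an unconditional re-enable on unlock is wrong when the caller had interrupts disabled already, and how saving and restoring the state avoids that. All `toy_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the local interrupt-enable state; not real kernel code. */
static bool toy_irqs_enabled = true;

/* Models spin_lock_irq()/spin_unlock_irq(): unlock always re-enables. */
static void toy_lock_irq(void)   { toy_irqs_enabled = false; }
static void toy_unlock_irq(void) { toy_irqs_enabled = true; }

/* Models spin_lock_irqsave(): remember the previous state, then disable. */
static bool toy_lock_irqsave(void)
{
    bool saved = toy_irqs_enabled;

    toy_irqs_enabled = false;
    return saved;
}

/* Models spin_unlock_irqrestore(): put back whatever the caller had. */
static void toy_unlock_irqrestore(bool saved)
{
    assert(!toy_irqs_enabled); /* must still be inside the critical section */
    toy_irqs_enabled = saved;
}
```

If a caller enters with interrupts disabled, the `toy_lock_irq()`/`toy_unlock_irq()` pair leaves them enabled on return, while the save/restore pair leaves them disabled as the caller expects.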

01 Oct, 2006 1 commit

[PATCH] completions: lockdep annotate on stack completions · 6e9a4738

Peter Zijlstra authored 18 years ago


All on stack DECLARE_COMPLETIONs should be replaced by:
DECLARE_COMPLETION_ONSTACK
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6e9a4738

30 Sep, 2006 17 commits

[PATCH] Update axboe@suse.de email address · 0fe23479

Jens Axboe authored 18 years ago


As people often look for the copyright in files to see who to mail,
update the link to a neutral one.
Signed-off-by: Jens Axboe <axboe@kernel.dk>

0fe23479

[PATCH] cfq-iosched: use metadata read flag · 374f84ac

Jens Axboe authored 18 years ago


Give meta data reads preference over regular reads, as the process
often needs to get that out of the way to do the io it was actually
interested in.
Signed-off-by: Jens Axboe <axboe@suse.de>

374f84ac

[PATCH] cfq-iosched: improve queue preemption · bf572256

Jens Axboe authored 18 years ago


Don't touch the current queues, just make sure that the wanted queue
is selected next. Simplifies the logic.
Signed-off-by: Jens Axboe <axboe@suse.de>

bf572256

[PATCH] Add blk_start_queueing() helper · dc72ef4a

Jens Axboe authored 18 years ago


CFQ implements this on its own now, but it's really block layer
knowledge. Tells a device queue to start dispatching requests to
the driver, taking care to unplug if needed. Also fixes the issue
where as/cfq will invoke a stopped queue, which we really don't
want.
Signed-off-by: Jens Axboe <axboe@suse.de>

dc72ef4a

[PATCH] cfq-iosched: kill the empty_list · 981a7973

Jens Axboe authored 18 years ago


No point in having a place holder list just for empty queues, so remove
it. It's not used for anything other than to keep ->cfq_list busy.
Signed-off-by: Jens Axboe <axboe@suse.de>

981a7973

[PATCH] cfq-iosched: Kill O(N) runtime of cfq_resort_rr_list() · 53b03744

Jens Axboe authored 18 years ago


Currently it scales with number of processes in that priority group,
which is potentially not very nice as it's called quite often.
Basically we always need to do tail inserts, except for the case of a
new process. So just mark/detect a queue as such.
Signed-off-by: Jens Axboe <axboe@suse.de>

53b03744

[PATCH] Make sure all block/io scheduler setups are node aware · b5deef90

Jens Axboe authored 18 years ago


Some were kmalloc_node(), some were still kmalloc(). Change them all to
kmalloc_node().
Signed-off-by: Jens Axboe <axboe@suse.de>

b5deef90

[PATCH] Audit block layer inlines · 1ea25ecb

Jens Axboe authored 18 years ago


Kill a few inlines that bring in too much code to more than one location.
Shrinks kernel text by about 300 bytes on 32-bit x86.
Signed-off-by: Jens Axboe <axboe@suse.de>

1ea25ecb

[PATCH] cfq-iosched: use new io context counting mechanism · 4050cf16

Jens Axboe authored 18 years ago


It's ok if the read path is a lot more costly, as long as inc/dec is
really cheap. The inc/dec will happen for each created/freed io context,
while the reading only happens when a disk queue exits.
Signed-off-by: Jens Axboe <axboe@suse.de>

4050cf16
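
One common way to get the trade-off described in 4050cf16 (very cheap inc/dec, costlier read) is a counter split into per-CPU slots. The sketch below assumes that design; the commit message itself does not spell out the mechanism, and `TOY_NR_CPUS` and the `toy_count_*` helpers are hypothetical.

```c
#include <assert.h>

#define TOY_NR_CPUS 4 /* hypothetical CPU count */

/* One counter slot per CPU; a writer only touches its own slot. */
static long toy_ioc_count[TOY_NR_CPUS];

/* Cheap path: runs for every created io context. */
static void toy_count_inc(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_ioc_count[cpu]++;
}

/* Cheap path: runs for every freed io context. */
static void toy_count_dec(int cpu)
{
    assert(cpu >= 0 && cpu < TOY_NR_CPUS);
    toy_ioc_count[cpu]--;
}

/* The rare, expensive path: walk every slot and sum. */
static long toy_count_read(void)
{
    long sum = 0;

    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
        sum += toy_ioc_count[cpu];
    return sum;
}
```

An individual slot may go negative when a context is created on one CPU and freed on another; only the sum is meaningful.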

[PATCH] cfq-iosched: kill cfq_exit_lock · fc46379d

Jens Axboe authored 18 years ago


cfq_exit_lock is protecting two things now:

- The per-ioc rbtree of cfq_io_contexts

- The per-cfqd linked list of cfq_io_contexts

The per-cfqd linked list can be protected by the queue lock, since it is (by
definition) per cfqd, just as the queue lock is.

The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>

fc46379d

[PATCH] cfq-iosched: cleanups, fixes, dead code removal · 89850f7e

Jens Axboe authored 18 years ago


A collection of little fixes and cleanups:

- We don't use the 'queued' sysfs exported attribute, since the
  may_queue() logic was rewritten. So kill it.

- Remove dead defines.

- cfq_set_active_queue() can be rewritten cleaner with else if conditions.

- Several places had cfq_exit_cfqq() like logic, abstract that out and
  use that.

- Annotate the cfqq kmem_cache_alloc() so the allocator knows that this
  is a repeat allocation if it fails with __GFP_WAIT set. Allows the
  allocator to start freeing some memory, if needed. CFQ already loops for
  this condition, so might as well pass the hint down.

- Remove cfqd->rq_starved logic. It's not needed anymore after we dropped
  the crq allocation in cfq_set_request().

- Remove unneeded parameter passing.
Signed-off-by: Jens Axboe <axboe@suse.de>

89850f7e

[PATCH] Drop useless bio passing in may_queue/set_request API · cb78b285

Jens Axboe authored 18 years ago


It's not needed for anything, so kill the bio passing.
Signed-off-by: Jens Axboe <axboe@suse.de>

cb78b285

[PATCH] cfq-iosched: kill crq · 5e705374

Jens Axboe authored 18 years ago


Get rid of the cfq_rq request type. With the added elevator_private2, we
have enough room in struct request to get rid of any crq allocation/free
for each request.
Signed-off-by: Jens Axboe <axboe@suse.de>

5e705374

[PATCH] cfq-iosched: remove the crq flag functions/variable · 5380a101

Jens Axboe authored 18 years ago


There's just one flag currently (SYNC), and that one can be grabbed from
the request.
Signed-off-by: Jens Axboe <axboe@suse.de>

5380a101

[PATCH] cfq-iosched: convert to using the FIFO elevator defines · 95e8810b

Jens Axboe authored 18 years ago

Signed-off-by: Jens Axboe <axboe@suse.de>

95e8810b

[PATCH] cfq-iosched: migrate to using the elevator rb functions · 21183b07

Jens Axboe authored 18 years ago


This removes the rbtree handling from CFQ.
Signed-off-by: Jens Axboe <axboe@suse.de>

21183b07

[PATCH] elevator: move the backmerging logic into the elevator core · 9817064b

Jens Axboe authored 18 years ago


Right now, every IO scheduler implements its own backmerging (except for
noop, which does no merging). That results in duplicated code for
essentially the same operation, which is never a good thing. This patch
moves the backmerging out of the io schedulers and into the elevator
core. We save 1.6kb of text and as a bonus get backmerging for noop as
well. Win-win!
Signed-off-by: Jens Axboe <axboe@suse.de>

9817064b
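
For readers unfamiliar with the term in 9817064b: a back merge appends new io to the tail of an existing request. The sketch below shows only that contiguity test and the merge itself, using hypothetical `toy_*` types; how the elevator core looks up the candidate request is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long long toy_sector_t;

/* Hypothetical request and bio: a start sector and a length in sectors. */
struct toy_rq  { toy_sector_t start, nr_sectors; };
struct toy_bio { toy_sector_t start, nr_sectors; };

/* A bio is a back-merge candidate if it begins where the request ends. */
static bool toy_can_back_merge(const struct toy_rq *rq,
                               const struct toy_bio *bio)
{
    return rq->start + rq->nr_sectors == bio->start;
}

/* Append the bio to the tail of the request. */
static void toy_back_merge(struct toy_rq *rq, const struct toy_bio *bio)
{
    assert(toy_can_back_merge(rq, bio));
    rq->nr_sectors += bio->nr_sectors;
}
```

Since the test depends only on sector arithmetic and not on any scheduler policy, it is plausible to share it across schedulers, which is the point of moving it into the core.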

21 Aug, 2006 1 commit

[PATCH] cfq_cic_link: fix usage of wrong cfq_io_context · be33c3a6

Oleg Nesterov authored 18 years ago


Obviously, cfq_cic_link() shouldn't free a just-allocated cfq_io_context.
The dead key is from __cic, so drop that.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Jens Axboe <axboe@suse.de>

be33c3a6

25 Jul, 2006 1 commit

[PATCH] cfq-iosched: don't use a hard jiffies value, translate from msecs · 44eb1231

Jens Axboe authored 18 years ago


The CIC_SEEKY() test really wants to use the minimum of either:

- 2 msecs (not jiffies)

- or, the pending slice time

So code it like that.
Signed-off-by: Jens Axboe <axboe@suse.de>

44eb1231
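
The difference between a hard jiffies value and one translated from msecs (44eb1231) depends on the tick rate. The sketch below assumes a hypothetical `TOY_HZ` of 250 and a round-up conversion; `toy_msecs_to_jiffies` and `toy_seeky_threshold` are illustrative helpers, not the kernel's implementation of CIC_SEEKY().

```c
#include <assert.h>

#define TOY_HZ 250 /* hypothetical tick rate; the real HZ is a build-time setting */

/* Convert milliseconds to timer ticks, rounding up. */
static unsigned long toy_msecs_to_jiffies(unsigned int msecs)
{
    return ((unsigned long)msecs * TOY_HZ + 999) / 1000;
}

/* Threshold: the smaller of 2 ms (in ticks) and the pending slice time. */
static unsigned long toy_seeky_threshold(unsigned long slice_left)
{
    unsigned long two_ms = toy_msecs_to_jiffies(2);

    assert(two_ms > 0);
    return slice_left < two_ms ? slice_left : two_ms;
}
```

At this tick rate one jiffy is 4 ms, so a hard-coded value of 2 jiffies would mean 8 ms rather than the intended 2 ms.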

30 Jun, 2006 1 commit

Remove obsolete #include <linux/config.h> · 6ab3d562

Jörn Engel authored 18 years ago

Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>

6ab3d562

23 Jun, 2006 6 commits

[PATCH] rbtree: support functions used by the io schedulers · dd67d051

Jens Axboe authored 18 years ago


They all duplicate macros to check for empty root and/or node, and
clearing a node. So put those in rbtree.h.
Signed-off-by: Jens Axboe <axboe@suse.de>

dd67d051

[PATCH] cfq-iosched: rq update fixes · fd61af03

Jens Axboe authored 18 years ago


- Remember to set ->last_sector so that the cfq_choose_req() logic
  works correctly.

- Remove redundant call to cfq_choose_req()
Signed-off-by: Jens Axboe <axboe@suse.de>

fd61af03

[PATCH] cfq-iosched: many performance fixes · caaa5f9f

Jens Axboe authored 18 years ago


This is a collection of patches that greatly improve CFQ performance
in some circumstances.

- Change the idling logic to only kick in after a request is done and we
  are deciding what to do. Previously the idling included the request service
  time, so it was hard to adjust. Now it's true think/idle time.

- Take advantage of TCQ/NCQ/queueing for seeky sync workloads, but keep
  it in control for sync and sequential (or close to) workloads.

- Expire queues immediately and move on to other busy queues, if we are
  not going to idle after the current one finishes.

- Don't rearm idle timer if there are no busy queues. Just leave the
  system idle.
Signed-off-by: Jens Axboe <axboe@suse.de>

caaa5f9f

[PATCH] cfq-iosched: correctly set ioprio on both targets · 35e6077c

Jens Axboe authored 18 years ago


Patch originally from Vasily Tarasov <vtaras@sw.ru>

If you set the io-priority of process 1 from another process 2 using the
sys_ioprio_set system call (as ionice does), the cfq_init_prio_data()
function sets the priority of process 2 (current) on the queue of process 1
and clears the flag that designates a change of ioprio. So process 1 will
run as if it had the priority of process 2.

I propose not to call cfq_init_prio_data() on io-priority change, but
only to mark the queue as having a changed priority. Every time a new
request comes in, the cfq scheduler checks for this flag and automatically
changes the priority of the queue to the new value.
Signed-off-by: Jens Axboe <axboe@suse.de>

35e6077c
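
The mark-now, apply-later approach proposed in 35e6077c can be sketched as follows. The structures and helpers are hypothetical simplifications: the setter only records the new value and raises a flag, and the queue picks it up the next time its owner queues io.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-process io context. */
struct toy_ioc {
    int ioprio;          /* priority requested for the owning process */
    bool ioprio_changed; /* set by the priority-changing caller */
};

/* Hypothetical scheduler queue belonging to that process. */
struct toy_cfqq { int ioprio; };

/* Called from any process: record the request, touch no queue state. */
static void toy_set_ioprio(struct toy_ioc *ioc, int ioprio)
{
    ioc->ioprio = ioprio;
    ioc->ioprio_changed = true;
}

/* Called in the owner's context when it next queues io. */
static void toy_queue_request(struct toy_ioc *ioc, struct toy_cfqq *cfqq)
{
    assert(ioc != 0 && cfqq != 0);
    if (ioc->ioprio_changed) {
        cfqq->ioprio = ioc->ioprio;
        ioc->ioprio_changed = false;
    }
}
```

Because the queue is only updated from the owning process's own io path, the caller's identity never leaks into the target's queue.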

[PATCH] Kill PF_SYNCWRITE flag · b31dc66a

Jens Axboe authored 18 years ago

A process flag to indicate whether we are doing sync io is incredibly
ugly. It also causes performance problems when one does a lot of async
io and then proceeds to sync it. Part of the io will go out as async,
and the other part as sync. This causes a disconnect between the
previously submitted io and the synced io. For io schedulers such as CFQ,
this will cause us lost merges and suboptimal behaviour in scheduling.

Remove PF_SYNCWRITE completely from the fsync/msync paths, and let
the O_DIRECT path just directly indicate that the writes are sync
by using WRITE_SYNC instead.
Signed-off-by: Jens Axboe <axboe@suse.de>

b31dc66a

[PATCH] cfq-iosched: Don't set the queue batching limits · 271f18f1

Jens Axboe authored 18 years ago


We cannot update them if the user changes nr_requests, so don't
set it in the first place. The gains are pretty questionable as
well. The batching loss has been shown to decrease throughput.
Signed-off-by: Jens Axboe <axboe@suse.de>

271f18f1

20 Jun, 2006 1 commit

Fix up CFQ scheduler for recent rbtree node shrinkage · 6b41fd17

Linus Torvalds authored 18 years ago


The color is now in the low bits of the parent pointer, and initializing
it to 0 happens as part of the whole memset above, so just remove the
unnecessary RB_CLEAR_COLOR.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

6b41fd17
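
To illustrate the layout 6b41fd17 refers to, the sketch below packs a color bit into the low bits of a parent pointer, relying on node alignment to keep those bits free. The `toy_rb_*` names and the red = 0 encoding are assumptions for illustration, not the kernel's rbtree code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_RED   0u
#define TOY_BLACK 1u

/* Hypothetical node: parent pointer and color share one word. */
struct toy_rb_node {
    uintptr_t parent_color;
    struct toy_rb_node *left, *right;
};

/* Zeroing the node clears parent and color in one go. */
static void toy_rb_init(struct toy_rb_node *n)
{
    memset(n, 0, sizeof(*n));
}

static struct toy_rb_node *toy_rb_parent(const struct toy_rb_node *n)
{
    return (struct toy_rb_node *)(n->parent_color & ~(uintptr_t)3);
}

static unsigned int toy_rb_color(const struct toy_rb_node *n)
{
    return (unsigned int)(n->parent_color & 1);
}

static void toy_rb_set_parent(struct toy_rb_node *n, struct toy_rb_node *p)
{
    assert(((uintptr_t)p & 3) == 0); /* relies on pointer alignment */
    n->parent_color = (n->parent_color & 3) | (uintptr_t)p;
}

static void toy_rb_set_color(struct toy_rb_node *n, unsigned int color)
{
    n->parent_color = (n->parent_color & ~(uintptr_t)1) | (color & 1);
}
```

With this layout a memset of the containing structure already leaves the color bit at 0, so a separate color-clearing step after it adds nothing.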

14 Jun, 2006 1 commit

[PATCH] cfq-iosched: fix crash in do_div() · 553698f9

Jens Axboe authored 18 years ago


We don't clear the seek stat values in cfq_alloc_io_context(), and if
->seek_mean is unlucky enough to be set to -36 by chance, the first
invocation of cfq_update_io_seektime() will oops with a divide by zero
in do_div().

Just memset the entire cic instead of filling individual values
independently.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

553698f9

08 Jun, 2006 1 commit

[PATCH] elevator switching race · bc1c1169

Jens Axboe authored 18 years ago


There's a race between shutting down one io scheduler and firing up the
next, in which a new io could enter and cause the io scheduler to be
invoked with bad or NULL data.

To fix this, we need to maintain the queue lock for a bit longer.
Unfortunately we cannot do that, since the elevator init must be
run without the lock held.  This isn't easily fixable without also
changing the mempool API.  So split the initialization into two parts,
an alloc-init operation and an attach operation.  Then we can
preallocate the io scheduler and related structures, and run the attach
inside the lock after we detach the old one.

This patch has survived 30 minutes of 1 second io scheduler switching
with a very busy io load.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

bc1c1169
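
The two-step scheme from bc1c1169 (allocate and initialise outside the lock, then swap under it) can be sketched in user space. Everything here is a hypothetical stand-in: a C11 `atomic_flag` plays the role of the queue lock and `toy_sched` the io scheduler instance.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical scheduler instance; not the kernel's type. */
struct toy_sched { int id; };

/* Hypothetical device queue. */
struct toy_blk_queue {
    atomic_flag lock;        /* stand-in for the queue lock */
    struct toy_sched *sched; /* currently attached scheduler */
};

static void toy_lock(struct toy_blk_queue *q)
{
    while (atomic_flag_test_and_set(&q->lock))
        ; /* spin */
}

static void toy_unlock(struct toy_blk_queue *q)
{
    atomic_flag_clear(&q->lock);
}

/* Part 1: allocate and initialise; may block, so runs without the lock. */
static struct toy_sched *toy_sched_alloc_init(int id)
{
    struct toy_sched *s = malloc(sizeof(*s));

    if (s)
        s->id = id;
    return s;
}

/* Part 2: detach old and attach new under the lock; returns the old one. */
static struct toy_sched *toy_sched_attach(struct toy_blk_queue *q,
                                          struct toy_sched *new_sched)
{
    struct toy_sched *old;

    assert(new_sched != NULL); /* never expose a NULL scheduler to new io */
    toy_lock(q);
    old = q->sched;
    q->sched = new_sched;
    toy_unlock(q);
    return old;
}
```

Io that takes the lock sees either the old scheduler or the fully initialised new one, never a half-built or missing one. The caller frees the returned old instance after the swap.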

01 Jun, 2006 2 commits

[PATCH] cfq-iosched: busy_rr fairness fix · b52a8348

Jens Axboe authored 18 years ago


Now that we select busy_rr for possible service, insert entries at the
back of that list instead of at the front.
Signed-off-by: Jens Axboe <axboe@suse.de>

b52a8348

[PATCH] cfq-iosched: fix bug in timer handling for the idle class · ae818a38

Jens Axboe authored 18 years ago


There's a small window from when the timer is entered and we grab
the queue lock, where cfq_set_active_queue() could be rearming the
timer for us. Seen in the wild on a 12-way ppc box. Fix this by
just using mod_timer(), which will do the right thing for us.
Signed-off-by: Jens Axboe <axboe@suse.de>

ae818a38