Commits · 539d8264093560b917ee3afe4c7f74e5da09d6a5 · matisse / android_kernel_samsung_matisse

31 Jul, 2008 1 commit

[PATCH 2/2] ocfs2: Fix race between mount and recovery · 539d8264

Sunil Mushran authored 16 years ago

As the fs recovery is asynchronous, there is a small chance that another
node can mount (and thus recover) the slot before the recovery thread
gets to it.

If this happens, the recovery thread will block indefinitely on the
journal/slot lock as that lock will be held for the duration of the mount
(by design) by the node assigned to that slot.

The solution implemented is to keep track of the journal replays using
a recovery generation in the journal inode, which will be incremented by the
thread replaying that journal. The recovery thread, before attempting the
blocking lock on the journal/slot lock, will compare the generation on disk
with what it has cached and skip recovery if it does not match.
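
The generation check described above can be sketched in user-space C. This is a minimal illustration, not the code from fs/ocfs2: the names `journal_dinode`, `replay_journal` and `slot_needs_recovery` are hypothetical, and the real code reads the generation from the journal inode on disk.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the on-disk journal inode of one slot. */
struct journal_dinode {
    uint32_t recovery_generation; /* bumped by whoever replays the journal */
};

/* Any thread that replays the journal increments the on-disk generation. */
static void replay_journal(struct journal_dinode *disk)
{
    /* ... the actual journal replay would happen here ... */
    disk->recovery_generation++;
}

/*
 * Checked by the recovery thread before it attempts the blocking
 * journal/slot lock. A mismatch means someone else already replayed
 * the journal after the generation was cached, so recovery is skipped.
 */
static bool slot_needs_recovery(const struct journal_dinode *disk,
                                uint32_t cached_generation)
{
    return disk->recovery_generation == cached_generation;
}
```

In this sketch a caller caches `recovery_generation` when the dead node is noticed and calls `slot_needs_recovery()` just before taking the lock; if another node mounted and replayed the slot in between, the function returns false.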

This bug appears to have been inadvertently introduced during the mount/umount
vote removal by mainline commit 34d024f8. In the mount voting scheme, the
messaging would indirectly indicate that the slot was being recovered.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

539d8264

14 Jul, 2008 1 commit

ocfs2: Fix CONFIG_OCFS2_DEBUG_FS #ifdefs · e407e397

Joel Becker authored 17 years ago


A couple places use OCFS2_DEBUG_FS where they really mean
CONFIG_OCFS2_DEBUG_FS.
Reported-by: Robert P. J. Day <rpjday@crashcourse.ca>
Signed-off-by: Joel Becker <joel.becker@oracle.com>
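
Why the missing prefix matters can be shown with a small sketch. Kconfig options reach C code as macros carrying the CONFIG_ prefix, so a guard on the unprefixed name is never true and the guarded code is silently compiled out. The two function names below are hypothetical, and the `#define` stands in for what the build system would provide.

```c
/* Stand-in for the build system: as if the option were enabled in .config. */
#define CONFIG_OCFS2_DEBUG_FS 1

/* Correct guard: tests the macro that is actually defined. */
static int prefixed_guard_enabled(void)
{
#ifdef CONFIG_OCFS2_DEBUG_FS
    return 1;
#else
    return 0;
#endif
}

/* Misspelled guard: OCFS2_DEBUG_FS is never defined, so this is always 0. */
static int unprefixed_guard_enabled(void)
{
#ifdef OCFS2_DEBUG_FS
    return 1;
#else
    return 0;
#endif
}
```

No compiler warning is produced for the second function, which is why this kind of mistake tends to be found by inspection.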

e407e397

18 Apr, 2008 5 commits

ocfs2: Use BUG_ON · b1f3550f

Julia Lawall authored 17 years ago

if (...) BUG(); should be replaced with BUG_ON(...) when the test has no
side-effects to allow a definition of BUG_ON that drops the code completely.
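
A before/after illustration of the conversion, as a user-space sketch. The `BUG()` and `BUG_ON()` macros here are simplified stand-ins rather than the kernel definitions, and the two functions are hypothetical.

```c
#include <stdlib.h>

/* Simplified user-space stand-ins for the kernel macros. */
#define BUG()        abort()
#define BUG_ON(cond) do { if (cond) BUG(); } while (0)

/* Before: open-coded test followed by BUG(). */
static int checked_len_old(int len)
{
    if (len < 0)
        BUG();
    return len;
}

/* After: the same check expressed with BUG_ON(). */
static int checked_len_new(int len)
{
    BUG_ON(len < 0);
    return len;
}
```

Because `len < 0` has no side effects, a build that defines `BUG_ON()` as a no-op could drop the test entirely without changing behaviour on valid input.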

The semantic patch that makes this change is as follows:
(http://www.emn.fr/x-info/coccinelle/)

// <smpl>
@ disable unlikely @ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (unlikely(E)) { BUG(); }
+ BUG_ON(E);
)

@@ expression E,f; @@

(
  if (<... f(...) ...>) { BUG(); }
|
- if (E) { BUG(); }
+ BUG_ON(E);
)
// </smpl>
Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

b1f3550f

ocfs2: De-magic the in-memory slot map. · fc881fa0

Joel Becker authored 17 years ago


The in-memory slot map uses the same magic as the on-disk one.  There is
a special value to mark a slot as invalid.  It relies on the size of
certain types and so on.

Write a new in-memory map that keeps validity as a separate field.  Outside
of the I/O functions, OCFS2_INVALID_SLOT now means what it is supposed to.
It also is no longer tied to the type size.

This also means that only the I/O functions refer to 16bit quantities.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>
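
The idea of keeping validity as a separate field can be sketched as follows. This is an illustrative model only: `INVALID_SLOT`, `struct slot`, `struct slot_map` and `node_to_slot` are hypothetical names, and the fixed array size is an assumption made to keep the example short.

```c
#include <stdbool.h>

#define INVALID_SLOT (-1)  /* hypothetical; plays the role of OCFS2_INVALID_SLOT */

/* Validity is its own field, so no node number has to be reserved as magic. */
struct slot {
    bool valid;
    unsigned int node_num;
};

struct slot_map {
    int num_slots;
    struct slot slots[8];
};

/* Return the slot owned by the given node, or INVALID_SLOT if it has none. */
static int node_to_slot(const struct slot_map *map, unsigned int node)
{
    for (int i = 0; i < map->num_slots; i++)
        if (map->slots[i].valid && map->slots[i].node_num == node)
            return i;
    return INVALID_SLOT;
}
```

With this layout an unused slot whose `node_num` happens to be 0 is not mistaken for node 0, and the width of the node number type no longer affects how "invalid" is represented.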

fc881fa0

ocfs2: Change the recovery map to an array of node numbers. · 553abd04

Joel Becker authored 17 years ago


The old recovery map was a bitmap of node numbers.  This was sufficient
for the maximum node number of 254.  Going forward, we want node numbers
to be UINT32.  Thus, we need a new recovery map.

Note that we can't keep track of slots here.  We must write down the
node number to recover *before* we get the locks needed to convert a
node number into a slot number.

The recovery map is now an array of unsigned ints, max_slots in size.
It moves to journal.c with the rest of recovery.
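
A recovery map held as a small array of node numbers might look like the sketch below. The structure and function names are hypothetical, and `MAX_SLOTS` is a fixed constant here only for brevity; the commit sizes the array by max_slots.

```c
#include <stdbool.h>

#define MAX_SLOTS 4  /* hypothetical fixed size for the sketch */

struct recovery_map {
    unsigned int used;               /* number of valid entries */
    unsigned int entries[MAX_SLOTS]; /* node numbers awaiting recovery */
};

static bool recovery_map_contains(const struct recovery_map *rm,
                                  unsigned int node)
{
    for (unsigned int i = 0; i < rm->used; i++)
        if (rm->entries[i] == node)
            return true;
    return false;
}

/* Record a node for recovery; false if already present or the map is full. */
static bool recovery_map_set(struct recovery_map *rm, unsigned int node)
{
    if (recovery_map_contains(rm, node) || rm->used == MAX_SLOTS)
        return false;
    rm->entries[rm->used++] = node;
    return true;
}

/* Remove a node once it has been recovered, closing the gap it leaves. */
static void recovery_map_clear(struct recovery_map *rm, unsigned int node)
{
    for (unsigned int i = 0; i < rm->used; i++) {
        if (rm->entries[i] == node) {
            for (unsigned int j = i + 1; j < rm->used; j++)
                rm->entries[j - 1] = rm->entries[j];
            rm->used--;
            return;
        }
    }
}
```

Unlike a 255-bit bitmap, this representation places no limit on the value of a node number, only on how many nodes can be pending at once.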

Because it needs to be initialized, we move all of recovery initialization
into a new function, ocfs2_recovery_init().  This actually cleans up
ocfs2_initialize_super() a little as well.  Following on, recovery cleanup
becomes part of ocfs2_recovery_exit().

A number of node map functions are rendered obsolete and are removed.

Finally, waiting on recovery is wrapped in a function rather than naked
checks on the recovery_event.  This is a cleanup from Mark.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

553abd04

ocfs2: Make ocfs2_slot_info private. · d85b20e4

Joel Becker authored 17 years ago


Just use osb_lock around the ocfs2_slot_info data.  This allows us to
take the ocfs2_slot_info structure private in slot_map.c.  All access
is now via accessors.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

d85b20e4

ocfs2: Move slot map access into slot_map.c · 8e8a4603

Mark Fasheh authored 17 years ago

journal.c and dlmglue.c would refresh the slot map by hand. Instead, have
the update and clear functions do the work inside slot_map.c. The eventual
goal is to define ocfs2_slot_info privately in slot_map.c.
Signed-off-by: Joel Becker <joel.becker@oracle.com>
Signed-off-by: Mark Fasheh <mfasheh@suse.com>

8e8a4603

25 Jan, 2008 4 commits

ocfs2: Silence false lockdep warnings · 5fa0613e

Jan Kara authored 17 years ago


Create separate lockdep lock classes for system file's i_mutexes. They are
used to guard allocations and similar things and thus rank differently
than i_mutex of a regular file or directory.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

5fa0613e

ocfs2: Support commit= mount option · d147b3d6

Mark Fasheh authored 17 years ago

Mostly taken from ext3. This allows the user to set the jbd commit interval,
in seconds. The default of 5 seconds stays the same, but now users can
easily increase the commit interval. Typically, this would be increased in
order to benefit performance at the expense of data-safety.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

d147b3d6

ocfs2: Rename ocfs2_meta_[un]lock · e63aecb6

Mark Fasheh authored 17 years ago


Call this the "inode_lock" now, since it covers both data and meta data.
This patch makes no functional changes.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e63aecb6

ocfs2: Remove mount/unmount votes · 34d024f8

Mark Fasheh authored 17 years ago

The node maps that are set/unset by these votes are no longer relevant, thus
we can remove the mount and umount votes. Since those are the last two
remaining votes, we can also remove the entire vote infrastructure.

The vote thread has been renamed to the downconvert thread, and the small
amount of functionality related to managing it has been moved into
fs/ocfs2/dlmglue.c. All references to votes have been removed or updated.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

34d024f8

17 Dec, 2007 3 commits

ocfs2: Re-journal buffers after transaction extend · e8aed345

Mark Fasheh authored 17 years ago

ocfs2_extend_trans() might call journal_restart() which will commit dirty
buffers and then restart the transaction. This means that any buffers which
still need changes should be passed to journal_access() again. Some paths
during extend weren't doing this right.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

e8aed345

ocfs2: Allow for debugging of transaction extends · 0879c584

Mark Fasheh authored 17 years ago

The nastiest cases of transaction extends are also the rarest. We can expose
them more quickly at the expense of performance by going straight to the
journal_restart() in ocfs2_extend_trans(). Wrap things in OCFS2_DEBUG_FS so
that we only do this when "expensive debugging" is turned on.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

0879c584

ocfs2: fix exit-while-locked bug in ocfs2_queue_orphans() · a86370fb

Mark Fasheh authored 17 years ago


We're holding the cluster lock when a failure might happen in
ocfs2_dir_foreach() so it needs to be released.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
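
The shape of this kind of fix is the usual single-exit unlock path. The sketch below uses a toy counting lock and a hypothetical `scan_dir()` helper in place of the cluster lock and ocfs2_dir_foreach(); it shows the pattern, not the actual patch.

```c
/* Toy lock that only counts how many times it is currently held. */
struct toy_lock {
    int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held++; }
static void toy_lock_release(struct toy_lock *l) { l->held--; }

/* Hypothetical stand-in for a directory scan that may fail. */
static int scan_dir(int should_fail)
{
    return should_fail ? -5 : 0;
}

static int queue_orphans_sketch(struct toy_lock *l, int should_fail)
{
    int status;

    toy_lock_acquire(l);

    status = scan_dir(should_fail);
    if (status < 0)
        goto out_unlock; /* a bare "return status" here would leak the lock */

    /* ... further work that needs the lock ... */

out_unlock:
    toy_lock_release(l);
    return status;
}
```

Routing every failure after the acquire through `out_unlock` keeps the lock balanced on both the success and error paths.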

a86370fb

12 Oct, 2007 2 commits

ocfs2: Remove open coded readdir() · 5eae5b96

Mark Fasheh authored 17 years ago


ocfs2_queue_orphans() has an open coded readdir loop which can easily just
use a directory accessor function.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

5eae5b96

ocfs2: Move directory manipulation code into dir.c · 316f4b9f

Mark Fasheh authored 17 years ago

The code for adding, removing, deleting directory entries was splattered all
over namei.c. I'd rather have this all centralized, so that it's easier to
make changes for inline dir data, and eventually indexed directories.

None of the code in any of the functions was changed. I only removed the
static keyword from some prototypes so that they could be exported.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
Reviewed-by: Joel Becker <joel.becker@oracle.com>

316f4b9f

10 Jul, 2007 1 commit

[PATCH] ocfs2: use list_for_each_entry where benefical · 800deef3

Christoph Hellwig authored 18 years ago

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

800deef3

02 May, 2007 1 commit

ocfs2: fix sparse warnings in fs/ocfs2 · 1ca1a111

Mark Fasheh authored 18 years ago


None of these are actually harmful, but the noise makes looking for real
problems difficult.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

1ca1a111

26 Apr, 2007 5 commits

ocfs2: Fix up i_blocks calculation to know about holes · 8110b073

Mark Fasheh authored 18 years ago


Older file systems which didn't support holes did a dumb calculation of
i_blocks based on i_size. This is no longer accurate, so fix things up to
take actual allocation into account.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
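
The difference between the two calculations can be illustrated with a short sketch. It assumes i_blocks is counted in 512-byte sectors and that the cluster size is a power of two of at least 512 bytes; both function names and the `cluster_bits` parameter are hypothetical.

```c
#include <stdint.h>

/* Size-based estimate: treats every byte up to i_size as allocated. */
static uint64_t sectors_from_size(uint64_t i_size)
{
    return (i_size + 511) >> 9;
}

/* Allocation-based count: only clusters that actually exist are counted. */
static uint64_t sectors_from_clusters(uint32_t clusters,
                                      unsigned int cluster_bits)
{
    return (uint64_t)clusters << (cluster_bits - 9);
}
```

For a sparse 1 MiB file with a single 4 KiB cluster allocated, the size-based estimate gives 2048 sectors while the allocation-based count gives 8.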

8110b073

ocfs2: Fix extent lookup to return true size of holes · 4f902c37

Mark Fasheh authored 18 years ago

Initially, we had wired things to report every hole as having a size of '1'.
Cook up a small amount of code to find the next extent and calculate the
number of clusters between the virtual offset and the next allocated extent.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
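
The hole-size calculation can be sketched over a flat, sorted array of extents. The real lookup walks the extent tree; `struct extent` and `hole_size` here are hypothetical, and returning `UINT32_MAX` for a hole with no extent after it is a choice made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* One allocated extent; the array is sorted by cpos and non-overlapping. */
struct extent {
    uint32_t cpos;     /* first virtual cluster covered */
    uint32_t clusters; /* length in clusters */
};

/*
 * Number of clusters in the hole that starts at v_cpos.
 * Returns 0 if v_cpos is allocated, and UINT32_MAX if no extent
 * follows v_cpos (the hole is unbounded in this sketch).
 */
static uint32_t hole_size(const struct extent *ext, size_t n, uint32_t v_cpos)
{
    for (size_t i = 0; i < n; i++) {
        if (v_cpos < ext[i].cpos)
            return ext[i].cpos - v_cpos; /* distance to the next extent */
        if (v_cpos < ext[i].cpos + ext[i].clusters)
            return 0;                    /* inside an allocated extent */
    }
    return UINT32_MAX;
}
```

With extents covering clusters 0-3 and 10-11, a lookup at cluster 4 reports a hole of 6 clusters rather than 1, so a caller can skip the whole hole in one step.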

4f902c37

ocfs2: Read from an unwritten extent returns zeros · 49cb8d2d

Mark Fasheh authored 18 years ago


Return an optional extent flags field from our lookup functions and wire up
callers to treat unwritten regions as holes for the purpose of returning
zeros to the user.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

49cb8d2d

ocfs2: temporarily remove extent map caching · 363041a5

Mark Fasheh authored 18 years ago

The code in extent_map.c is not prepared to deal with a subtree being
rotated between lookups. This can happen when filling holes in sparse files.
Instead of a lengthy patch to update the code (which would likely lose the
benefit of caching subtree roots), we remove most of the algorithms and
implement a simple path based lookup. A less ambitious extent caching scheme
will be added in a later patch.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

363041a5

ocfs2: Remove delete inode vote · 50008630

Tiger Yang authored 18 years ago

Ocfs2 currently does cluster-wide node messaging to check the open state of
an inode during delete. This patch removes that mechanism in favor of an
inode cluster lock which is taken at shared read when an inode is first read
and dropped in clear_inode(). This allows a deleting node to test the
liveness of an inode by attempting to take an exclusive lock.
Signed-off-by: Tiger Yang <tiger.yang@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
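
The liveness test can be modelled in a single process with a toy shared/exclusive lock. This sketch only illustrates the idea that an exclusive attempt fails while any shared holder exists; the types and functions are hypothetical, and the real lock is a cluster-wide DLM lock.

```c
#include <stdbool.h>

/* Toy shared/exclusive lock standing in for the per-inode cluster lock. */
struct open_lock {
    int shared_holders;
    bool exclusive;
};

/* Taken when an inode is first read; fails if someone holds it exclusively. */
static bool open_lock_shared(struct open_lock *l)
{
    if (l->exclusive)
        return false;
    l->shared_holders++;
    return true;
}

/* Dropped when the inode is cleared. */
static void open_unlock_shared(struct open_lock *l)
{
    l->shared_holders--;
}

/* Non-blocking exclusive attempt; succeeds only if nobody holds the lock. */
static bool open_lock_try_exclusive(struct open_lock *l)
{
    if (l->exclusive || l->shared_holders > 0)
        return false;
    l->exclusive = true;
    return true;
}

/* A deleting node asks: does anyone still have this inode open? */
static bool inode_in_use(struct open_lock *l)
{
    if (open_lock_try_exclusive(l)) {
        l->exclusive = false; /* release immediately; we only wanted the answer */
        return false;
    }
    return true;
}
```

The point of the design is that no messages are exchanged to answer the question; the lock state itself encodes whether the inode is open anywhere.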

50008630

07 Dec, 2006 1 commit

ocfs2: local mounts · c271c5c2

Sunil Mushran authored 18 years ago

This allows users to format an ocfs2 file system with a special flag,
OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it
will not use any cluster services, nor will it require a cluster
configuration, thus acting like a 'local' file system.
Signed-off-by: Sunil Mushran <sunil.mushran@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c271c5c2

01 Dec, 2006 11 commits

ocfs2: Remove struct ocfs2_journal_handle in favor of handle_t · 1fabe148

Mark Fasheh authored 18 years ago


This is mostly a search and replace as ocfs2_journal_handle is now no more
than a container for a handle_t pointer.

ocfs2_commit_trans() becomes very straightforward, and we remove some
out-of-date comments / code.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

1fabe148

ocfs2: remove handle argument to ocfs2_start_trans() · 65eff9cc

Mark Fasheh authored 18 years ago


All callers either pass in NULL directly, or a local variable that is
already set to NULL.

The internals of ocfs2_start_trans() get a nice cleanup as a result.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

65eff9cc

ocfs2: remove ocfs2_journal_handle journal field · dae85832

Mark Fasheh authored 18 years ago

It is no longer used.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

dae85832

ocfs2: pass ocfs2_super * into ocfs2_commit_trans() · 02dc1af4

Mark Fasheh authored 18 years ago

This sets us up to remove handle->journal.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

02dc1af4

ocfs2: remove unused handle argument from ocfs2_meta_lock_full() · 4bcec184

Mark Fasheh authored 18 years ago


Now that this is unused and all callers pass NULL, we can safely remove it.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

4bcec184

ocfs2: make ocfs2_alloc_handle() static · a301a27d

Mark Fasheh authored 18 years ago


This is no longer used outside of journal.c
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

a301a27d

ocfs2: remove unused ocfs2_handle_add_lock() · daf29e9c

Mark Fasheh authored 18 years ago


This gets us rid of a slab we no longer need, as well as removing the
majority of what's left on ocfs2_journal_handle.

ocfs2_commit_unstarted_handle() has no more real work to do, so remove that
function too.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

daf29e9c

ocfs2: remove unused ocfs2_handle_add_inode() · 02928a71

Mark Fasheh authored 18 years ago

We can also delete the unused infrastructure which was once in place to
support this functionality. ocfs2_inode_private loses ip_handle and
ip_handle_list. ocfs2_journal_handle loses handle_list.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

02928a71

ocfs2: remove ocfs2_journal_handle flags field · c161f89b

Mark Fasheh authored 18 years ago


Callers can set h_sync directly on the handle_t. Whether a transaction has
been started or not can be determined via the existence of the handle_t on
the struct ocfs2_journal_handle.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

c161f89b

ocfs2: have ocfs2_extend_trans() take handle_t · 1fc58146

Mark Fasheh authored 18 years ago


No reason to use our wrapper struct in this function, so take the handle_t
directly.

Also fixes a bug where we were incorrectly setting the handle to NULL in
case of a failure from journal_restart()
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

1fc58146

ocfs2: remove unused ocfs2_journal_handle field · 01ddf1e1

Mark Fasheh authored 18 years ago


max_buffs was just being set and not actually used.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

01ddf1e1

22 Nov, 2006 1 commit

WorkStruct: make allyesconfig · c4028958

David Howells authored 18 years ago


Fix up for make allyesconfig.
Signed-Off-By: David Howells <dhowells@redhat.com>

c4028958

24 Sep, 2006 1 commit

ocfs2: Remove i_generation from inode lock names · 24c19ef4

Mark Fasheh authored 18 years ago

OCFS2 puts inode meta data in the "lock value block" provided by the DLM.
Typically, i_generation is encoded in the lock name so that a deleted inode
and a new one in the same block don't share the same lvb.

Unfortunately, that scheme means that the read in ocfs2_read_locked_inode()
is potentially thrown away as soon as the meta data lock is taken - we
cannot encode the lock name without first knowing i_generation, which
requires a disk read.

This patch encodes i_generation in the inode meta data lvb, and removes the
value from the inode meta data lock name. This way, the read can be covered
by a lock, and at the same time we can distinguish between an up to date and
a stale LVB.

This will help cold-cache stat(2) performance in particular.

Since this patch changes the protocol version, we take the opportunity to do
a minor re-organization of two of the LVB fields.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

24c19ef4

29 Jun, 2006 1 commit

ocfs2: clean up some osb fields · 78427043

Mark Fasheh authored 19 years ago


Get rid of osb->uuid, osb->proc_sub_dir, and osb->osb_id. Those fields were
unused, or could easily be removed. As a result, we also no longer need
MAX_OSB_ID or ocfs2_globals_lock.
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>

78427043

27 Jun, 2006 1 commit

[PATCH] spin/rwlock init cleanups · 34af946a

Ingo Molnar authored 18 years ago


locking init cleanups:

 - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
 - convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

 - cleanliness
 - lockdep needs control of lock initialization, which the open-coded
   variants do not give
 - it's also useful for -rt and for lock debugging in general
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

34af946a

26 Jun, 2006 1 commit

[PATCH] fs: use list_move() · f116629d

Akinobu Mita authored 18 years ago


This patch converts the combination of list_del(A) and list_add(A, B) to
list_move(A, B) under fs/.
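
The equivalence can be seen in a minimal user-space list. This is a simplified re-implementation for illustration, not the kernel's <linux/list.h>; `list_init` is a hypothetical helper name.

```c
/* Minimal circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink entry from whatever list it is on. */
static void list_del(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
}

/* The combined operation: unlink from the old list, add to the new one. */
static void list_move(struct list_head *entry, struct list_head *head)
{
    list_del(entry);
    list_add(entry, head);
}
```

A call such as `list_move(&item, &done)` therefore replaces the two-line `list_del(&item); list_add(&item, &done);` sequence with no change in behaviour.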

Cc: Ian Kent <raven@themaw.net>
Acked-by: Joel Becker <joel.becker@oracle.com>
Cc: Neil Brown <neilb@cse.unsw.edu.au>
Cc: Hans Reiser <reiserfs-dev@namesys.com>
Cc: Urban Widmark <urban@teststation.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Mark Fasheh <mark.fasheh@oracle.com>
Signed-off-by: Akinobu Mita <mita@miraclelinux.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

f116629d