- 31 Jul, 2008 1 commit
-
-
Sunil Mushran authored
As the fs recovery is asynchronous, there is a small chance that another node can mount (and thus recover) the slot before the recovery thread gets to it. If this happens, the recovery thread will block indefinitely on the journal/slot lock as that lock will be held for the duration of the mount (by design) by the node assigned to that slot. The solution implemented is to keep track of the journal replays using a recovery generation in the journal inode, which will be incremented by the thread replaying that journal. The recovery thread, before attempting the blocking lock on the journal/slot lock, will compare the generation on disk with what it has cached and skip recovery if it does not match. This bug appears to have been inadvertently introduced during the mount/umount vote removal by mainline commit 34d024f8 . In the mount voting scheme, the messaging would indirectly indicate that the slot was being recovered. Signed-off-by:
Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
- 14 Jul, 2008 1 commit
-
-
Joel Becker authored
A couple places use OCFS2_DEBUG_FS where they really mean CONFIG_OCFS2_DEBUG_FS. Reported-by:
Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by:
Joel Becker <joel.becker@oracle.com>
-
- 18 Apr, 2008 5 commits
-
-
Julia Lawall authored
if (...) BUG(); should be replaced with BUG_ON(...) when the test has no side-effects to allow a definition of BUG_ON that drops the code completely. The semantic patch that makes this change is as follows: (http://www.emn.fr/x-info/coccinelle/ ) // <smpl> @ disable unlikely @ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (unlikely(E)) { BUG(); } + BUG_ON(E); ) @@ expression E,f; @@ ( if (<... f(...) ...>) { BUG(); } | - if (E) { BUG(); } + BUG_ON(E); ) // </smpl> Signed-off-by:
Julia Lawall <julia@diku.dk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
The in-memory slot map uses the same magic as the on-disk one. There is a special value to mark a slot as invalid. It relies on the size of certain types and so on. Write a new in-memory map that keeps validity as a separate field. Outside of the I/O functions, OCFS2_INVALID_SLOT now means what it is supposed to. It also is no longer tied to the type size. This also means that only the I/O functions refer to 16bit quantities. Signed-off-by:
Joel Becker <joel.becker@oracle.com> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
The old recovery map was a bitmap of node numbers. This was sufficient for the maximum node number of 254. Going forward, we want node numbers to be UINT32. Thus, we need a new recovery map. Note that we can't keep track of slots here. We must write down the node number to recovery *before* we get the locks needed to convert a node number into a slot number. The recovery map is now an array of unsigned ints, max_slots in size. It moves to journal.c with the rest of recovery. Because it needs to be initialized, we move all of recovery initialization into a new function, ocfs2_recovery_init(). This actually cleans up ocfs2_initialize_super() a little as well. Following on, recovery cleaup becomes part of ocfs2_recovery_exit(). A number of node map functions are rendered obsolete and are removed. Finally, waiting on recovery is wrapped in a function rather than naked checks on the recovery_event. This is a cleanup from Mark. Signed-off-by:
Joel Becker <joel.becker@oracle.com> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
Joel Becker authored
Just use osb_lock around the ocfs2_slot_info data. This allows us to take the ocfs2_slot_info structure private in slot_info.c. All access is now via accessors. Signed-off-by:
Joel Becker <joel.becker@oracle.com> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
Mark Fasheh authored
journal.c and dlmglue.c would refresh the slot map by hand. Instead, have the update and clear functions do the work inside slot_map.c. The eventual result is to make ocfs2_slot_info defined privately in slot_map.c Signed-off-by:
Joel Becker <joel.becker@oracle.com> Signed-off-by:
Mark Fasheh <mfasheh@suse.com>
-
- 25 Jan, 2008 4 commits
-
-
Jan Kara authored
Create separate lockdep lock classes for system file's i_mutexes. They are used to guard allocations and similar things and thus rank differently than i_mutex of a regular file or directory. Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Mostly taken from ext3. This allows the user to set the jbd commit interval, in seconds. The default of 5 seconds stays the same, but now users can easily increase the commit interval. Typically, this would be increased in order to benefit performance at the expense of data-safety. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Call this the "inode_lock" now, since it covers both data and meta data. This patch makes no functional changes. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
The node maps that are set/unset by these votes are no longer relevant, thus we can remove the mount and umount votes. Since those are the last two remaining votes, we can also remove the entire vote infrastructure. The vote thread has been renamed to the downconvert thread, and the small amount of functionality related to managing it has been moved into fs/ocfs2/dlmglue.c. All references to votes have been removed or updated. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 17 Dec, 2007 3 commits
-
-
Mark Fasheh authored
ocfs2_extend_trans() might call journal_restart() which will commit dirty buffers and then restart the transaction. This means that any buffers which still need changes should be passed to journal_access() again. Some paths during extend weren't doing this right. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
The nastiest cases of transaction extends are also the rarest. We can expose them more quickly at the expense of performance by going straight to the journal_restart() in ocfs2_extend_trans(). Wrap things in OCFS2_DEBUG_FS so that we only do this when "expensive debugging" is turned on. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
We're holding the cluster lock when a failure might happen in ocfs2_dir_foreach() so it needs to be released. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 12 Oct, 2007 2 commits
-
-
Mark Fasheh authored
ocfs2_queue_orphans() has an open coded readdir loop which can easily just use a directory accessor function. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by:
Joel Becker <joel.becker@oracle.com>
-
Mark Fasheh authored
The code for adding, removing, deleting directory entries was splattered all over namei.c. I'd rather have this all centralized, so that it's easier to make changes for inline dir data, and eventually indexed directories. None of the code in any of the functions was changed. I only removed the static keyword from some prototypes so that they could be exported. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com> Reviewed-by:
Joel Becker <joel.becker@oracle.com>
-
- 10 Jul, 2007 1 commit
-
-
Christoph Hellwig authored
Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 02 May, 2007 1 commit
-
-
Mark Fasheh authored
None of these are actually harmful, but the noise makes looking for real problems difficult. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 26 Apr, 2007 5 commits
-
-
Mark Fasheh authored
Older file systems which didn't support holes did a dumb calculation of i_blocks based on i_size. This is no longer accurate, so fix things up to take actual allocation into account. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Initially, we had wired things to return a size '1' of holes. Cook up a small amount of code to find the next extent and calculate the number of clusters between the virtual offset and the next allocated extent. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Return an optional extent flags field from our lookup functions and wire up callers to treat unwritten regions as holes for the purpose of returning zeros to the user. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
The code in extent_map.c is not prepared to deal with a subtree being rotated between lookups. This can happen when filling holes in sparse files. Instead of a lengthy patch to update the code (which would likely lose the benefit of caching subtree roots), we remove most of the algorithms and implement a simple path based lookup. A less ambitious extent caching scheme will be added in a later patch. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Tiger Yang authored
Ocfs2 currently does cluster-wide node messaging to check the open state of an inode during delete. This patch removes that mechanism in favor of an inode cluster lock which is taken at shared read when an inode is first read and dropped in clear_inode(). This allows a deleting node to test the liveness of an inode by attempting to take an exclusive lock. Signed-off-by:
Tiger Yang <tiger.yang@oracle.com> Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 07 Dec, 2006 1 commit
-
-
Sunil Mushran authored
This allows users to format an ocfs2 file system with a special flag, OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT. When the file system sees this flag, it will not use any cluster services, nor will it require a cluster configuration, thus acting like a 'local' file system. Signed-off-by:
Sunil Mushran <sunil.mushran@oracle.com> Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 01 Dec, 2006 11 commits
-
-
Mark Fasheh authored
This is mostly a search and replace as ocfs2_journal_handle is now no more than a container for a handle_t pointer. ocfs2_commit_trans() becomes very straight forward, and we remove some out of date comments / code. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
All callers either pass in NULL directly, or a local variable that is already set to NULL. The internals of ocfs2_start_trans() get a nice cleanup as a result. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
It is no longer used. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
This sets us up to remove handle->journal. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Now that this is unused and all callers pass NULL, we can safely remove it. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
This is no longer used outside of journal.c Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
This gets us rid of a slab we no longer need, as well as removing the majority of what's left on ocfs2_journal_handle. ocfs2_commit_unstarted_handle() has no more real work to do, so remove that function too. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
We can also delete the unused infrastructure which was once in place to support this functionality. ocfs2_inode_private loses ip_handle and ip_handle_list. ocfs2_journal_handle loses handle_list. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
Callers can set h_sync directly on the handle_t, whether a transaction has been started or not can be determined via the existence of the handle_t on the struct ocfs2_journal_handle. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
No reason to use our wrapper struct in this function, so take the handle_t directly. Also fixes a bug where we were incorrectly setting the handle to NULL in case of a failure from journal_restart() Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
Mark Fasheh authored
max_buffs was just being set and not actually used. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 22 Nov, 2006 1 commit
-
-
David Howells authored
Fix up for make allyesconfig. Signed-Off-By:
David Howells <dhowells@redhat.com>
-
- 24 Sep, 2006 1 commit
-
-
Mark Fasheh authored
OCFS2 puts inode meta data in the "lock value block" provided by the DLM. Typically, i_generation is encoded in the lock name so that a deleted inode on and a new one in the same block don't share the same lvb. Unfortunately, that scheme means that the read in ocfs2_read_locked_inode() is potentially thrown away as soon as the meta data lock is taken - we cannot encode the lock name without first knowing i_generation, which requires a disk read. This patch encodes i_generation in the inode meta data lvb, and removes the value from the inode meta data lock name. This way, the read can be covered by a lock, and at the same time we can distinguish between an up to date and a stale LVB. This will help cold-cache stat(2) performance in particular. Since this patch changes the protocol version, we take the opportunity to do a minor re-organization of two of the LVB fields. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 29 Jun, 2006 1 commit
-
-
Mark Fasheh authored
Get rid of osb->uuid, osb->proc_sub_dir, and osb->osb_id. Those fields were unused, or could easily be removed. As a result, we also no longer need MAX_OSB_ID or ocfs2_globals_lock. Signed-off-by:
Mark Fasheh <mark.fasheh@oracle.com>
-
- 27 Jun, 2006 1 commit
-
-
Ingo Molnar authored
locking init cleanups: - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK() - convert rwlocks in a similar manner this patch was generated automatically. Motivation: - cleanliness - lockdep needs control of lock initialization, which the open-coded variants do not give - it's also useful for -rt and for lock debugging in general Signed-off-by:
Ingo Molnar <mingo@elte.hu> Signed-off-by:
Arjan van de Ven <arjan@linux.intel.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-
- 26 Jun, 2006 1 commit
-
-
Akinobu Mita authored
This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under fs/. Cc: Ian Kent <raven@themaw.net> Acked-by:
Joel Becker <joel.becker@oracle.com> Cc: Neil Brown <neilb@cse.unsw.edu.au> Cc: Hans Reiser <reiserfs-dev@namesys.com> Cc: Urban Widmark <urban@teststation.com> Acked-by:
David Howells <dhowells@redhat.com> Acked-by:
Mark Fasheh <mark.fasheh@oracle.com> Signed-off-by:
Akinobu Mita <mita@miraclelinux.com> Signed-off-by:
Andrew Morton <akpm@osdl.org> Signed-off-by:
Linus Torvalds <torvalds@osdl.org>
-