- 06 Apr, 2009 10 commits
-
-
Jean Delvare authored
The i2c-algo-sgi code was merged into the vino driver, so we can delete it now. Signed-off-by:
Jean Delvare <khali@linux-fr.org>
-
Jean Delvare authored
Delete many unused I2C driver IDs. We should be able to get rid of i2c_driver.id pretty soon now. Signed-off-by:
Jean Delvare <khali@linux-fr.org>
-
Jean Delvare authored
The new i2c binding model makes the client_register and client_unregister methods of struct i2c_adapter useless, so we can remove them with the rest of the legacy model. Signed-off-by:
Jean Delvare <khali@linux-fr.org>
-
Kim Kyuwon authored
ROHM BD2802GU is a RGB LED controller attached to i2c bus and specifically engineered for decoration purposes. This RGB controller incorporates lighting patterns and illuminates. This driver is designed to minimize power consumption, so when there is no emitting LED, it enters to reset state. And because the BD2802GU has lots of features that can't be covered by the current LED framework, it provides Advanced Configuration Function(ADF) mode, so that user applications can set registers of BD2802GU directly. Here are basic usage examples : ; to turn on LED (not blink) $ echo 1 > /sys/class/leds/led1_R/brightness ; to blink LED $ echo timer > /sys/class/leds/led1_R/trigger $ echo 1 > /sys/class/leds/led1_R/delay_on $ echo 1 > /sys/class/leds/led1_R/delay_off ; to turn off LED $ echo 0 > /sys/class/leds/led1_R/brightness [akpm@linux-foundation.org: coding-style fixes] Signed-off-by:
Kim Kyuwon <chammoru@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Richard Purdie <rpurdie@linux.intel.com>
-
Richard Purdie authored
Add an option to preserve LED state when suspending/resuming to the LED gpio driver. Based on a suggestion from Robert Jarzmik. Tested-by:
Robert Jarzmik <robert.jarzmik@free.fr> Signed-off-by:
Richard Purdie <rpurdie@linux.intel.com>
-
Luotao Fu authored
Add a simple driver for pwm driver LEDs. pwm_id and period can be defined in board file. It is developed for pxa, however it is probably generic enough to be used on other platforms with pwm. Signed-off-by:
Luotao Fu <l.fu@pengutronix.de> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Richard Purdie <rpurdie@linux.intel.com>
-
Guennadi Liakhovetski authored
This patch allows drivers to override the default maximum brightness value of 255. We take care to preserve backwards-compatibility as much as possible, so that user-space ABI doesn't change for existing drivers. LED trigger code has also been updated to use the per-LED maximum. Signed-off-by:
Guennadi Liakhovetski <lg@denx.de> Signed-off-by:
Richard Purdie <rpurdie@linux.intel.com>
-
Jens Axboe authored
By default, CFQ will anticipate more IO from a given io context if the previously completed IO was sync. This used to be fine, since the only sync IO was reads and O_DIRECT writes. But with more "normal" sync writes being used now, we don't want to anticipate for those. Add a bio/request flag that informs the IO scheduler that this is a sync request that we should not idle for. Introduce WRITE_ODIRECT specifically for O_DIRECT writes, and make sure that the other sync writes set this flag. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Jens Axboe authored
(S)WRITE_SYNC always unplugs the device right after IO submission. Sometimes we want to build up a queue before doing so, so add variants that explicitly DON'T unplug the queue. The caller must then do that after submitting all the IO. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
Jens Axboe authored
This makes sure that we never wait on async IO for sync requests, instead of doing the split on writes vs reads. Signed-off-by:
Jens Axboe <jens.axboe@oracle.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 05 Apr, 2009 1 commit
-
-
Bjorn Helgaas authored
This patch adds support for ACPI device driver .notify() methods. If such a method is present, Linux/ACPI installs a handler for device notifications (but not for system notifications such as Bus Check, Device Check, etc). When a device notification occurs, Linux/ACPI passes it on to the driver's .notify() method. In most cases, this removes the need for drivers to install their own handlers for device-specific notifications. For fixed hardware devices like some power and sleep buttons, there's no notification value because there's no control method to execute a Notify opcode. When a fixed hardware device generates an event, we handle it the same as a regular device notification, except we send a ACPI_FIXED_HARDWARE_EVENT value. This is outside the normal 0x0-0xff range used by Notify opcodes. Several drivers install their own handlers for system Bus Check and Device Check notifications so they can support hot-plug. This patch doesn't affect that usage. Signed-off-by:
Bjorn Helgaas <bjorn.helgaas@hp.com> Reviewed-by:
Alex Chiang <achiang@hp.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
- 04 Apr, 2009 5 commits
-
-
Philipp Zabel authored
This driver requests a clock that usually is supplied by the MFD in which the DS1WM is contained. Currently, it is impossible for a MFD to register their clocks with the generic clock API due to different implementations across architectures. For now, this patch removes the clock handling from DS1WM altogether, trusting that the MFD enable/disable functions will switch the clock if needed. The clock rate is obtained from a new parameter in driver_data. Signed-off-by:
Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by:
Samuel Ortiz <sameo@openedhand.com>
-
Philipp Zabel authored
Removes the now-unused bus_shift field from pasic3_platform_data. Signed-off-by:
Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by:
Samuel Ortiz <sameo@openedhand.com>
-
Philipp Zabel authored
This patch converts the DS1WM driver into an MFD cell. It also calculates the bus_shift parameter from the memory resource size. Signed-off-by:
Philipp Zabel <philipp.zabel@gmail.com> Signed-off-by:
Samuel Ortiz <sameo@openedhand.com>
-
Mark Brown authored
Signed-off-by:
Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by:
Samuel Ortiz <sameo@openedhand.com>
-
Linus Torvalds authored
Instead of always splitting the file offset into 32-bit 'high' and 'low' parts, just split them into the largest natural word-size - which in C terms is 'unsigned long'. This allows 64-bit architectures to avoid the unnecessary 32-bit shifting and masking for native format (while the compat interfaces will obviously always have to do it). This also changes the order of 'high' and 'low' to be "low first". Why? Because when we have it like this, the 64-bit system calls now don't use the "pos_high" argument at all, and it makes more sense for the native system call to simply match the user-mode prototype. This results in a much more natural calling convention, and allows the compiler to generate much more straightforward code. On x86-64, we now generate testq %rcx, %rcx # pos_l js .L122 #, movq %rcx, -48(%rbp) # pos_l, pos from the C source loff_t pos = pos_from_hilo(pos_h, pos_l); ... if (pos < 0) return -EINVAL; and the 'pos_h' register isn't even touched. It used to generate code like mov %r8d, %r8d # pos_low, pos_low salq $32, %rcx #, tmp71 movq %r8, %rax # pos_low, pos.386 orq %rcx, %rax # tmp71, pos.386 js .L122 #, movq %rax, -48(%rbp) # pos.386, pos which isn't _that_ horrible, but it does show how the natural word size is just a more sensible interface (same arguments will hold in the user level glibc wrapper function, of course, so the kernel side is just half of the equation!) Note: in all cases the user code wrapper can again be the same. You can just do #define HALF_BITS (sizeof(unsigned long)*4) __syscall(PWRITEV, fd, iov, count, offset, (offset >> HALF_BITS) >> HALF_BITS); or something like that. That way the user mode wrapper will also be nicely passing in a zero (it won't actually have to do the shifts, the compiler will understand what is going on) for the last argument. And that is a good idea, even if nobody will necessarily ever care: if we ever do move to a 128-bit lloff_t, this particular system call might be left alone. Of course, that will be the least of our worries if we really ever need to care, so this may not be worth really caring about. [ Fixed for lost 'loff_t' cast noticed by Andrew Morton ] Acked-by:
Gerd Hoffmann <kraxel@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-api@vger.kernel.org Cc: linux-arch@vger.kernel.org Cc: Ingo Molnar <mingo@elte.hu> Cc: Ralf Baechle <ralf@linux-mips.org>> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- 03 Apr, 2009 24 commits
-
-
Jan Engelhardt authored
Signed-off-by:
Jan Engelhardt <jengelh@medozas.de> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Suresh Siddha authored
Signed-off-by:
Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Suresh Siddha authored
All logical processors with APIC ID values of 255 and greater will have their APIC reported through Processor X2APIC structure (type-9 entry type) and all logical processors with APIC ID less than 255 will have their APIC reported through legacy Processor Local APIC (type-0 entry type) only. This is the same case even for NMI structure reporting. The Processor X2APIC Affinity structure provides the association between the X2APIC ID of a logical processor and the proximity domain to which the logical processor belongs. For OSPM, Procssor IDs outside the 0-254 range are to be declared as Device() objects in the ACPI namespace. Signed-off-by:
Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by:
Len Brown <len.brown@intel.com>
-
Evgeniy Polyakov authored
This patch contains DST core files, which introduce block layer, connector and sysfs registration glue and main headers. Connector is used for the configuration of the node (its type, address, device name and so on). Sysfs provides bits of information about running devices in the following format: +/* + * DST sysfs tree for device called 'storage': + * + * /sys/bus/dst/devices/storage/ + * /sys/bus/dst/devices/storage/type : 192.168.4.80:1025 + * /sys/bus/dst/devices/storage/size : 800 + * /sys/bus/dst/devices/storage/name : storage + */ DST header contains structure definitions and protocol command description. Signed-off-by:
Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de>
-
Kumar Gala authored
Commit f4112de6 ("mm: introduce debug_kmap_atomic") broke PPC builds with CONFIG_HIGHMEM=y: CC init/main.o In file included from include/linux/highmem.h:25, from include/linux/pagemap.h:11, from include/linux/mempolicy.h:63, from init/main.c:53: arch/powerpc/include/asm/highmem.h: In function 'kmap_atomic_prot': arch/powerpc/include/asm/highmem.h:98: error: implicit declaration of function 'debug_kmap_atomic' In file included from include/linux/pagemap.h:11, from include/linux/mempolicy.h:63, from init/main.c:53: include/linux/highmem.h: At top level: include/linux/highmem.h:196: warning: conflicting types for 'debug_kmap_atomic' include/linux/highmem.h:196: error: static declaration of 'debug_kmap_atomic' follows non-static declaration include/asm/highmem.h:98: error: previous implicit declaration of 'debug_kmap_atomic' was here make[1]: *** [init/main.o] Error 1 make: *** [init] Error 2 Signed-off-by:
Kumar Gala <galak@kernel.crashing.org> Acked-by:
Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
David Howells authored
nfs_readpage_async() needs to be non-static so that it can be used as a fallback for the local on-disk caching should an EIO crop up when reading the cache. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add some new NFS I/O counters for FS-Cache doing things for NFS. A new line is emitted into /proc/pid/mountstats if caching is enabled that looks like: fsc: <rok> <rfl> <wok> <wfl> <unc> Where <rok> is the number of pages read successfully from the cache, <rfl> is the number of failed page reads against the cache, <wok> is the number of successful page writes to the cache, <wfl> is the number of failed page writes to the cache, and <unc> is the number of NFS pages that have been disconnected from the cache. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Bind data storage objects in the local cache to NFS inodes. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Define and create superblock-level cache index objects (as managed by nfs_server structs). Each superblock object is created in a server level index object and is itself an index into which inode-level objects are inserted. Ideally there would be one superblock-level object per server, and the former would be folded into the latter; however, since the "nosharecache" option exists this isn't possible. The superblock object key is a sequence consisting of: (1) Certain superblock s_flags. (2) Various connection parameters that serve to distinguish superblocks for sget(). (3) The volume FSID. (4) The security flavour. (5) The uniquifier length. (6) The uniquifier text. This is normally an empty string, unless the fsc=xyz mount option was used to explicitly specify a uniquifier. The key blob is of variable length, depending on the length of (6). The superblock object is given no coherency data to carry in the auxiliary data permitted by the cache. It is assumed that the superblock is always coherent. This patch also adds uniquification handling such that two otherwise identical superblocks, at least one of which is marked "nosharecache", won't end up trying to share the on-disk cache. It will be possible to manually provide a uniquifier through a mount option with a later patch to avoid the error otherwise produced. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Define and create server-level cache index objects (as managed by nfs_client structs). Each server object is created in the NFS top-level index object and is itself an index into which superblock-level objects are inserted. Ideally there would be one superblock-level object per server, and the former would be folded into the latter; however, since the "nosharecache" option exists this isn't possible. The server object key is a sequence consisting of: (1) NFS version (2) Server address family (eg: AF_INET or AF_INET6) (3) Server port. (4) Server IP address. The key blob is of variable length, depending on the length of (4). The server object is given no coherency data to carry in the auxiliary data permitted by the cache. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add FS-Cache option bit to nfs_server struct. This is set to indicate local on-disk caching is enabled for a particular superblock. Also add debug bit for local caching operations. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add a function to install a monitor on the page lock waitqueue for a particular page, thus allowing the page being unlocked to be detected. This is used by CacheFiles to detect read completion on a page in the backing filesystem so that it can then copy the data to the waiting netfs page. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Rik van Riel <riel@redhat.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Implement the data I/O part of the FS-Cache netfs API. The documentation and API header file were added in a previous patch. This patch implements the following functions for the netfs to call: (*) fscache_attr_changed(). Indicate that the object has changed its attributes. The only attribute currently recorded is the file size. Only pages within the set file size will be stored in the cache. This operation is submitted for asynchronous processing, and will return immediately. It will return -ENOMEM if an out of memory error is encountered, -ENOBUFS if the object is not actually cached, or 0 if the operation is successfully queued. (*) fscache_read_or_alloc_page(). (*) fscache_read_or_alloc_pages(). Request data be fetched from the disk, and allocate internal metadata to track the netfs pages and reserve disk space for unknown pages. These operations perform semi-asynchronous data reads. Upon returning they will indicate which pages they think can be retrieved from disk, and will have set in progress attempts to retrieve those pages. These will return, in order of preference, -ENOMEM on memory allocation error, -ERESTARTSYS if a signal interrupted proceedings, -ENODATA if one or more requested pages are not yet cached, -ENOBUFS if the object is not actually cached or if there isn't space for future pages to be cached on this object, or 0 if successful. In the case of the multipage function, the pages for which reads are set in progress will be removed from the list and the page count decreased appropriately. If any read operations should fail, the completion function will be given an error, and will also be passed contextual information to allow the netfs to fall back to querying the server for the absent pages. For each successful read, the page completion function will also be called. Any pages subsequently tracked by the cache will have PG_fscache set upon them on return. fscache_uncache_page() must be called for such pages. If supplied by the netfs, the mark_pages_cached() cookie op will be invoked for any pages now tracked. (*) fscache_alloc_page(). Allocate internal metadata to track a netfs page and reserve disk space. This will return -ENOMEM on memory allocation error, -ERESTARTSYS on signal, -ENOBUFS if the object isn't cached, or there isn't enough space in the cache, or 0 if successful. Any pages subsequently tracked by the cache will have PG_fscache set upon them on return. fscache_uncache_page() must be called for such pages. If supplied by the netfs, the mark_pages_cached() cookie op will be invoked for any pages now tracked. (*) fscache_write_page(). Request data be stored to disk. This may only be called on pages that have been read or alloc'd by the above three functions and have not yet been uncached. This will return -ENOMEM on memory allocation error, -ERESTARTSYS on signal, -ENOBUFS if the object isn't cached, or there isn't immediately enough space in the cache, or 0 if successful. On a successful return, this operation will have queued the page for asynchronous writing to the cache. The page will be returned with PG_fscache_write set until the write completes one way or another. The caller will not be notified if the write fails due to an I/O error. If that happens, the object will become available and all pending writes will be aborted. Note that the cache may batch up page writes, and so it may take a while to get around to writing them out. The caller must assume that until PG_fscache_write is cleared the page is use by the cache. Any changes made to the page may be reflected on disk. The page may even be under DMA. (*) fscache_uncache_page(). Indicate that the cache should stop tracking a page previously read or alloc'd from the cache. If the page was alloc'd only, but unwritten, it will not appear on disk. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Implement the cookie management part of the FS-Cache netfs client API. The documentation and API header file were added in a previous patch. This patch implements the following three functions: (1) fscache_acquire_cookie(). Acquire a cookie to represent an object to the netfs. If the object in question is a non-index object, then that object and its parent indices will be created on disk at this point if they don't already exist. Index creation is deferred because an index may reside in multiple caches. (2) fscache_relinquish_cookie(). Retire or release a cookie previously acquired. At this point, the object on disk may be destroyed. (3) fscache_update_cookie(). Update the in-cache representation of a cookie. This is used to update the auxiliary data for coherency management purposes. With this patch it is possible to have a netfs instruct a cache backend to look up, validate and create metadata on disk and to destroy it again. The ability to actually store and retrieve data in the objects so created is added in later patches. Note that these functions will never return an error. _All_ errors are handled internally to FS-Cache. The worst that can happen is that fscache_acquire_cookie() may return a NULL pointer - which is considered a negative cookie pointer and can be passed back to any function that takes a cookie without harm. A negative cookie pointer merely suppresses caching at that level. The stub in linux/fscache.h will detect inline the negative cookie pointer and abort the operation as fast as possible. This means that the compiler doesn't have to set up for a call in that case. See the documentation in Documentation/filesystems/caching/netfs-api.txt for more information. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add functions to register and unregister a network filesystem or other client of the FS-Cache service. This allocates and releases the cookie representing the top-level index for a netfs, and makes it available to the netfs. If the FS-Cache facility is disabled, then the calls are optimised away at compile time. Note that whilst this patch may appear to work with FS-Cache enabled and a netfs attempting to use it, it will leak the cookie it allocates for the netfs as fscache_relinquish_cookie() is implemented in a later patch. This will cause the slab code to emit a warning when the module is removed. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Implement two features of FS-Cache: (1) The ability to request and release cache tags - names by which a cache may be known to a netfs, and thus selected for use. (2) An internal function by which a cache is selected by consulting the netfs, if the netfs wishes to be consulted. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Make FS-Cache create its /proc interface and present various statistical information through it. Also provide the functions for updating this information. These features are enabled by: CONFIG_FSCACHE_PROC CONFIG_FSCACHE_STATS CONFIG_FSCACHE_HISTOGRAM The /proc directory for FS-Cache is also exported so that caching modules can add their own statistics there too. The FS-Cache module is loadable at this point, and the statistics files can be examined by userspace: cat /proc/fs/fscache/stats cat /proc/fs/fscache/histogram Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add the API for a generic facility (FS-Cache) by which caches may declare them selves open for business, and may obtain work to be done from network filesystems. The header file is included by: #include <linux/fscache-cache.h> Documentation for the API is also added to: Documentation/filesystems/caching/backend-api.txt This API is not usable without the implementation of the utility functions which will be added in further patches. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Add the API for a generic facility (FS-Cache) by which filesystems (such as AFS or NFS) may call on local caching capabilities without having to know anything about how the cache works, or even if there is a cache: +---------+ | | +--------------+ | NFS |--+ | | | | | +-->| CacheFS | +---------+ | +----------+ | | /dev/hda5 | | | | | +--------------+ +---------+ +-->| | | | | | |--+ | AFS |----->| FS-Cache | | | | |--+ +---------+ +-->| | | | | | | +--------------+ +---------+ | +----------+ | | | | | | +-->| CacheFiles | | ISOFS |--+ | /var/cache | | | +--------------+ +---------+ General documentation and documentation of the netfs specific API are provided in addition to the header files. As this patch stands, it is possible to build a filesystem against the facility and attempt to use it. All that will happen is that all requests will be immediately denied as if no cache is present. Further patches will implement the core of the facility. The facility will transfer requests from networking filesystems to appropriate caches if possible, or else gracefully deny them. If this facility is disabled in the kernel configuration, then all its operations will trivially reduce to nothing during compilation. WHY NOT I_MAPPING? ================== I have added my own API to implement caching rather than using i_mapping to do this for a number of reasons. These have been discussed a lot on the LKML and CacheFS mailing lists, but to summarise the basics: (1) Most filesystems don't do hole reportage. Holes in files are treated as blocks of zeros and can't be distinguished otherwise, making it difficult to distinguish blocks that have been read from the network and cached from those that haven't. (2) The backing inode must be fully populated before being exposed to userspace through the main inode because the VM/VFS goes directly to the backing inode and does not interrogate the front inode's VM ops. Therefore: (a) The backing inode must fit entirely within the cache. (b) All backed files currently open must fit entirely within the cache at the same time. (c) A working set of files in total larger than the cache may not be cached. (d) A file may not grow larger than the available space in the cache. (e) A file that's open and cached, and remotely grows larger than the cache is potentially stuffed. (3) Writes go to the backing filesystem, and can only be transferred to the network when the file is closed. (4) There's no record of what changes have been made, so the whole file must be written back. (5) The pages belong to the backing filesystem, and all metadata associated with that page are relevant only to the backing filesystem, and not anything stacked atop it. OVERVIEW ======== FS-Cache provides (or will provide) the following facilities: (1) Caches can be added / removed at any time, even whilst in use. (2) Adds a facility by which tags can be used to refer to caches, even if they're not available yet. (3) More than one cache can be used at once. Caches can be selected explicitly by use of tags. (4) The netfs is provided with an interface that allows either party to withdraw caching facilities from a file (required for (1)). (5) A netfs may annotate cache objects that belongs to it. This permits the storage of coherency maintenance data. (6) Cache objects will be pinnable and space reservations will be possible. (7) The interface to the netfs returns as few errors as possible, preferring rather to let the netfs remain oblivious. (8) Cookies are used to represent indices, files and other objects to the netfs. The simplest cookie is just a NULL pointer - indicating nothing cached there. (9) The netfs is allowed to propose - dynamically - any index hierarchy it desires, though it must be aware that the index search function is recursive, stack space is limited, and indices can only be children of indices. (10) Indices can be used to group files together to reduce key size and to make group invalidation easier. The use of indices may make lookup quicker, but that's cache dependent. (11) Data I/O is effectively done directly to and from the netfs's pages. The netfs indicates that page A is at index B of the data-file represented by cookie C, and that it should be read or written. The cache backend may or may not start I/O on that page, but if it does, a netfs callback will be invoked to indicate completion. The I/O may be either synchronous or asynchronous. (12) Cookies can be "retired" upon release. At this point FS-Cache will mark them as obsolete and the index hierarchy rooted at that point will get recycled. (13) The netfs provides a "match" function for index searches. In addition to saying whether a match was made or not, this can also specify that an entry should be updated or deleted. FS-Cache maintains a virtual index tree in which all indices, files, objects and pages are kept. Bits of this tree may actually reside in one or more caches. FSDEF | +------------------------------------+ | | NFS AFS | | +--------------------------+ +-----------+ | | | | homedir mirror afs.org redhat.com | | | +------------+ +---------------+ +----------+ | | | | | | 00001 00002 00007 00125 vol00001 vol00002 | | | | | +---+---+ +-----+ +---+ +------+------+ +-----+----+ | | | | | | | | | | | | | PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak | | PG0 +-------+ | | 00001 00003 | +---+---+ | | | PG0 PG1 PG2 In the example above, two netfs's can be seen to be backed: NFS and AFS. These have different index hierarchies: (*) The NFS primary index will probably contain per-server indices. Each server index is indexed by NFS file handles to get data file objects. Each data file objects can have an array of pages, but may also have further child objects, such as extended attributes and directory entries. Extended attribute objects themselves have page-array contents. (*) The AFS primary index contains per-cell indices. Each cell index contains per-logical-volume indices. Each of volume index contains up to three indices for the read-write, read-only and backup mirrors of those volumes. Each of these contains vnode data file objects, each of which contains an array of pages. The very top index is the FS-Cache master index in which individual netfs's have entries. Any index object may reside in more than one cache, provided it only has index children. Any index with non-index object children will be assumed to only reside in one cache. The FS-Cache overview can be found in: Documentation/filesystems/caching/fscache.txt The netfs API to FS-Cache can be found in: Documentation/filesystems/caching/netfs-api.txt Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Recruit a page flag to aid in cache management. The following extra flag is defined: (1) PG_fscache (PG_private_2) The marked page is backed by a local cache and is pinning resources in the cache driver. If PG_fscache is set, then things that checked for PG_private will now also check for that. This includes things like truncation and page invalidation. The function page_has_private() had been added to make the checks for both PG_private and PG_private_2 at the same time. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Rik van Riel <riel@redhat.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
The attached patch causes read_cache_pages() to release page-private data on a page for which add_to_page_cache() fails. If the filler function fails, then the problematic page is left attached to the pagecache (with appropriate flags set, one presumes) and the remaining to-be-attached pages are invalidated and discarded. This permits pages with caching references associated with them to be cleaned up. The invalidatepage() address space op is called (indirectly) to do the honours. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Rik van Riel <riel@redhat.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Document the slow work thread pool. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Make the slow work pool configurable through /proc/sys/kernel/slow-work. (*) /proc/sys/kernel/slow-work/min-threads The minimum number of threads that should be in the pool as long as it is in use. This may be anywhere between 2 and max-threads. (*) /proc/sys/kernel/slow-work/max-threads The maximum number of threads that should in the pool. This may be anywhere between min-threads and 255 or NR_CPUS * 2, whichever is greater. (*) /proc/sys/kernel/slow-work/vslow-percentage The percentage of active threads in the pool that may be used to execute very slow work items. This may be between 1 and 99. The resultant number is bounded to between 1 and one fewer than the number of active threads. This ensures there is always at least one thread that can process very slow work items, and always at least one thread that won't. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Serge Hallyn <serue@us.ibm.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-
David Howells authored
Create a dynamically sized pool of threads for doing very slow work items, such as invoking mkdir() or rmdir() - things that may take a long time and may sleep, holding mutexes/semaphores and hogging a thread, and are thus unsuitable for workqueues. The number of threads is always at least a settable minimum, but more are started when there's more work to do, up to a limit. Because of the nature of the load, it's not suitable for a 1-thread-per-CPU type pool. A system with one CPU may well want several threads. This is used by FS-Cache to do slow caching operations in the background, such as looking up, creating or deleting cache objects. Signed-off-by:
David Howells <dhowells@redhat.com> Acked-by:
Serge Hallyn <serue@us.ibm.com> Acked-by:
Steve Dickson <steved@redhat.com> Acked-by:
Trond Myklebust <Trond.Myklebust@netapp.com> Acked-by:
Al Viro <viro@zeniv.linux.org.uk> Tested-by:
Daire Byrne <Daire.Byrne@framestore.com>
-