- 21 Feb, 2016 40 commits
-
Nikola Majkić authored
This reverts commit e03742da.
-
Dave Chinner authored
When sync does its WB_SYNC_ALL writeback, it issues data IO and then immediately waits for IO completion. This is done in the context of the flusher thread, and hence completely ties up the flusher thread for the backing device until all the dirty inodes have been synced. On filesystems that are dirtying inodes constantly and quickly, this means the flusher thread can be tied up for minutes per sync call and hence badly affect system-level write IO performance as the page cache cannot be cleaned quickly. We already have a wait loop for IO completion for sync(2), so cut this out of the flusher thread and delegate it to wait_sb_inodes(). Hence we can do rapid IO submission, and then wait for it all to complete. Effect of sync on fsmark before the patch:

FSUse%  Count    Size  Files/sec  App Overhead
.....
0        640000  4096  35154.6    1026984
0        720000  4096  36740.3    1023844
0        800000  4096  36184.6     916599
0        880000  4096   1282.7    1054367
0        960000  4096   3951.3     918773
0       1040000  4096  40646.2     996448
0       1120000  4096  43610.1     895647
0       1200000  4096  40333.1     921048

And a single sync pass took:

real    0m52.407s
user    0m0.000s
sys     0m0.090s

After the patch, there is no impact on fsmark results, and each individual sync(2) operation run concurrently with the same fsmark workload takes roughly 7s:

real    0m6.930s
user    0m0.000s
sys     0m0.039s

IOWs, sync is 7-8x faster on a busy filesystem and does not have an adverse impact on ongoing async data write operations. Signed-off-by:
Dave Chinner <dchinner@redhat.com> Reviewed-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Change-Id: I9e55d65f5ecb2305497711d4688f0647d9346035
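A minimal sketch of the submission/wait split described above, assuming the fs-writeback helpers writeback_inodes_sb() and wait_sb_inodes() of that era; this is an illustration of the idea, not the actual patch:

```c
/* Sketch only: sync(2) hands the submission to the flusher thread without
 * blocking per inode, then performs a single wait pass over the
 * superblock's inodes once everything has been queued. */
static void sync_inodes_sb_sketch(struct super_block *sb)
{
	/* kick writeback submission for this superblock (the real patch
	 * queues WB_SYNC_ALL work here and returns without waiting) */
	writeback_inodes_sb(sb, WB_REASON_SYNC);

	/* one wait loop for all of the IO that was just submitted */
	wait_sb_inodes(sb);
}
```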
-
Jan Kara authored
In the case when the system contains no dirty pages, wakeup_flusher_threads() will submit WB_SYNC_NONE writeback for 0 pages, so wb_writeback() exits immediately without doing anything. Thus sync(1) will write all the dirty inodes from a WB_SYNC_ALL writeback pass, which is slow. Fix the problem by using get_nr_dirty_pages() in wakeup_flusher_threads() instead of calculating the number of dirty pages manually. That function also takes the number of dirty inodes into account. Change-Id: I458027ae08d9a5a93202a7b97ace1f8da7a18a07 CC: stable@vger.kernel.org Reported-by:
Paul Taysom <taysom@chromium.org> Signed-off-by:
Jan Kara <jack@suse.cz>
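A hedged sketch of the shape of the fix, assuming the fs-writeback helper get_nr_dirty_pages() (which counts dirty pages plus dirty inodes); treat the placement as illustrative rather than the exact diff:

```c
/* Sketch: when the caller does not say how much to write back, size the
 * WB_SYNC_NONE pass from the real dirty state instead of a hand-computed
 * page count, so a "0 dirty pages" estimate does not make the pre-sync
 * flush a no-op. */
void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
{
	if (!nr_pages)
		nr_pages = get_nr_dirty_pages();	/* dirty pages + dirty inodes */

	/* ... then kick per-bdi writeback for nr_pages as before ... */
}
```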
-
VijayKumar Gn authored
The front camera in Thea P1B uses MCLK1, whereas Thea P1A & Titan use MCLK0. Provide a facility to introduce overriding logic. Change-Id: Iac48eda90e7b74229d81d88991e7cce606c42f20 Signed-off-by:
VijayKumar Gn <gxnf38@motorola.com> Reviewed-on: http://gerrit.mot.com/663248 Tested-by:
Jira Key <jirakey@motorola.com> Reviewed-by:
Sudharsan Yettapu <sudharsan.yettapu@motorola.com> Reviewed-by:
Jian-Jun Fan <stevenfan@motorola.com> Reviewed-by:
Christopher Fries <cfries@motorola.com> SLTApproved: Christopher Fries <cfries@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com>
-
Alok Chauhan authored
The mutex is not released while checking for null pdata, which can lead to a possible deadlock condition. Change-Id: I1dc3efd725df5bb75c96434a81d9a8f3f868cd0a Signed-off-by:
Alok Chauhan <alokc@codeaurora.org> Reviewed-on: http://gerrit.mot.com/681902 Submit-Approved: Jira Key <jirakey@motorola.com> Tested-by:
Jira Key <jirakey@motorola.com> SLTApproved: Slta Waiver <sltawvr@motorola.com> Reviewed-by:
Yi-Wei Zhao <gbjc64@motorola.com> Reviewed-by:
Lian-Wei Wang <lian-wei.wang@motorola.com>
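The fix follows the usual early-return locking pattern; a generic sketch with hypothetical names (example_dev, pdata), not the driver's actual symbols:

```c
/* Sketch with hypothetical names: release the lock on the error path
 * instead of returning with it held, which would deadlock the next caller. */
static int example_check_pdata(struct example_dev *dev)
{
	mutex_lock(&dev->lock);

	if (!dev->pdata) {
		mutex_unlock(&dev->lock);	/* previously missing */
		return -EINVAL;
	}

	/* ... normal path ... */
	mutex_unlock(&dev->lock);
	return 0;
}
```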
-
Andrew Wheeler authored
The phone will not boot with this change. This reverts commit 8016dca8. Conflicts: arch/arm/mach-msm/acpuclock-8226.c Change-Id: I0d99de60ba139d819a9521a2f2a9f606ed716b7d Reviewed-on: http://gerrit.pcs.mot.com/547213 SLT-Approved: Slta Waiver <sltawvr@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com> Tested-by:
Jira Key <jirakey@motorola.com> Reviewed-by:
Andrew Wheeler <thf876@motorola.com>
-
Naveen Ramaraj authored
Certain clients of ocmem always have exclusive regions set aside in the OCMEM memory map. Allocation requests for such clients do not have to wait for any pending evictions in progress to complete, nor will they trigger any new evictions, and can be safely scheduled right away. CRs-Fixed: 599002 Change-Id: I54369bca9d26378a9aa70db83d35ab94866a3ff1 Signed-off-by:
Naveen Ramaraj <nramaraj@codeaurora.org> Signed-off-by:
Neeti Desai <neetid@codeaurora.org> Reviewed-on: http://gerrit.mot.com/681940 Submit-Approved: Jira Key <jirakey@motorola.com> Tested-by:
Jira Key <jirakey@motorola.com> SLTApproved: Slta Waiver <sltawvr@motorola.com> Reviewed-by:
Lian-Wei Wang <lian-wei.wang@motorola.com> Reviewed-by:
Yi-Wei Zhao <gbjc64@motorola.com>
-
Yuanyuan Zhong authored
Three timespec tv_nsec values are added together when calculating the monotonic boottime. They may overflow 32 bits before being added to the s64 nsecs. Users will then observe the monotonic boottime going backwards. Cast the tv_nsec value to s64 before doing the math. Change-Id: Iddd476bf26bc60e2830c5e90ccc4747790ac03a3 Signed-off-by:
Yuanyuan Zhong <zyy@motorola.com> Reviewed-on: http://gerrit.mot.com/688924 Tested-by:
Jira Key <jirakey@motorola.com> Reviewed-by:
Igor Kovalenko <igork@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com> SLTApproved: Christopher Fries <cfries@motorola.com>
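A self-contained userspace illustration of the overflow (not the kernel patch): three tv_nsec values, each legal on its own, exceed the signed 32-bit range when summed, so each term must be widened to 64 bits before the addition.

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	/* three in-range tv_nsec values (each below NSEC_PER_SEC) */
	int32_t a = 900000000, b = 800000000, c = 700000000;

	/* 32-bit wraparound, made well defined here via unsigned arithmetic */
	int32_t bad = (int32_t)((uint32_t)a + (uint32_t)b + (uint32_t)c);

	/* widening each term first keeps the 2.4e9 sum exact */
	int64_t good = (int64_t)a + b + c;

	printf("32-bit sum: %d (time appears to go backwards)\n", (int)bad);
	printf("64-bit sum: %lld\n", (long long)good);
	return 0;
}
```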
-
Sunil Khatri authored
In the existing code we calculate nbytes based on a byte boundary, but genalloc uses a bitmap aligned to long for maintaining the memory allocation. So while calculating nbytes we end up getting the wrong nbytes. For example, nbytes comes to 9 bytes for 70 bits when byte-aligned, but if long-aligned we will have 3 long words, i.e. 12 bytes. This difference may lead to choosing the wrong API for freeing the memory, i.e. between kfree() and vfree(). The fix was inspired by upstream commit eedce141cd2dad8d0cefc5468ef41898949a7031, bringing the same fix into the gen_pool_destroy path. Change-Id: I942caf59e25515c780896b328b912604df9e10bf Signed-off-by:
Hareesh Gundu <hareeshg@codeaurora.org> Signed-off-by:
Sunil Khatri <sunilkh@codeaurora.org>
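A small userspace illustration of the size mismatch described above, using a 4-byte long (as on 32-bit ARM) so the 70-bit example gives 9 bytes versus 12:

```c
#include <stdio.h>

/* mirror of the kernel's BITS_TO_LONGS() rounding, for a 4-byte long */
#define LONG_BYTES		4
#define LONG_BITS		(LONG_BYTES * 8)
#define BITS_TO_LONGS(n)	(((n) + LONG_BITS - 1) / LONG_BITS)

int main(void)
{
	unsigned int nbits = 70;

	unsigned int byte_aligned = (nbits + 7) / 8;			/* 9 bytes  */
	unsigned int long_aligned = BITS_TO_LONGS(nbits) * LONG_BYTES;	/* 12 bytes */

	/* genalloc allocates its bitmap long-aligned, so sizing the free path
	 * by bytes undercounts and can pick kfree() where vfree() is needed */
	printf("byte-aligned: %u, long-aligned: %u\n", byte_aligned, long_aligned);
	return 0;
}
```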
-
Lianwei Wang authored
commit upstream 22939ea455c4290b32853c2e4fd79794e5336ddf. We saw the idle time reported in /proc/stat take a weird jump just after cpu hotplug: User 150 + Nice 0 + Sys 141 + Idle -44061 + IOW -1492 + IRQ 0 + SIRQ 4 = -45258. There are two causes of this issue. 1. All the ts history data are cleaned by the patch below: 4b0c0f2 "tick: Cleanup NOHZ per cpu data on cpu down". We don't need to clean all the ts data to fix the cpu hotplug race, but just clean inidle and tick_stopped, and keep all the other ts data to record the historical idle/iowait time. 2. The idle/iowait time is read from a different place by the patch below: 7386cdb "nohz: Fix idle ticks in cpu summary line of /proc/stat". The idle/iowait time is read from get_cpu_idle/iowait_time_us when the cpu is online, but from cpustat when the cpu is offline. So we may compare data with different bases and get a wrong result. Since the off time is no longer counted as idle time, it is safe to revert it now. Change-Id: I828caf375f882fdc92c41fe65ebd13a6ffbbd92e Signed-off-by:
Lianwei Wang <a22439@motorola.com> Reviewed-on: http://gerrit.mot.com/696792 Tested-by:
Jira Key <jirakey@motorola.com> Reviewed-by:
Igor Kovalenko <igork@motorola.com> Submit-Approved: Jira Key <jirakey@motorola.com> SLTApproved: Christopher Fries <cfries@motorola.com>
-
Jan Kara authored
commit ad56edad089b56300fd13bb9eeb7d0424d978239 upstream. jbd2_journal_dirty_metadata() didn't get a reference to the journal_head it was working with. This is OK in most cases since the journal head should be attached to a transaction, but in rare occasions when we are journalling data, __ext4_journalled_writepage() can race with jbd2_journal_invalidatepage() stripping buffers from a page, and thus the journal head can be freed under the hands of jbd2_journal_dirty_metadata(). Fix the problem by getting our own journal head reference in jbd2_journal_dirty_metadata() (and also in jbd2_journal_set_triggers() which can possibly have the same issue). Reported-by:
Zheng Liu <gnehzuil.liu@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
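A hedged sketch of the refcounting pattern added by the fix, using the real jbd2 helpers jbd2_journal_grab_journal_head()/jbd2_journal_put_journal_head(); the surrounding logic is elided and this is not the exact upstream diff:

```c
/* Sketch: pin the journal_head for the duration of the call so a racing
 * jbd2_journal_invalidatepage() cannot free it underneath us. */
int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
{
	struct journal_head *jh;
	int ret = 0;

	jh = jbd2_journal_grab_journal_head(bh);
	if (!jh) {
		ret = -EUCLEAN;		/* buffer no longer journalled */
		goto out;
	}

	/* ... existing dirty-metadata work on jh ... */

	jbd2_journal_put_journal_head(jh);
out:
	return ret;
}
```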
-
Eric Sandeen authored
commit eeecef0af5ea4efd763c9554cf2bd80fc4a0efd3 upstream. This sequence:

# truncate --size=1g fsfile
# mkfs.ext4 -F fsfile
# mount -o loop,ro fsfile /mnt
# umount /mnt
# dmesg | tail

results in an IO error when unmounting the RO filesystem:

[ 318.020828] Buffer I/O error on device loop1, logical block 196608
[ 318.027024] lost page write due to I/O error on loop1
[ 318.032088] JBD2: Error -5 detected when updating journal superblock for loop1-8.

This was a regression introduced by commit 24bcc89c: "jbd2: split updating of journal superblock and marking journal empty". Signed-off-by:
Eric Sandeen <sandeen@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Laura Abbott authored
Secure buffers are never passed to userspace and are controlled by the secure world, so there is no real need to zero them. Pass the dma attribute to skip zeroing. Change-Id: Iad870d0d7732d3dea09443418b9294cb9e05b5e0 Signed-off-by:
Laura Abbott <lauraa@codeaurora.org>
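A sketch of how a caller might request this behaviour, assuming the msm-era DMA attribute name DMA_ATTR_SKIP_ZEROING and the 3.x dma_attrs API; the attribute name is an assumption, not verified against this exact tree:

```c
/* Sketch: allocate a secure buffer without the CPU zeroing pass, since the
 * memory is owned by the secure world and never mapped to userspace. */
static void *alloc_secure_buffer(struct device *dev, size_t size,
				 dma_addr_t *handle)
{
	DEFINE_DMA_ATTRS(attrs);

	dma_set_attr(DMA_ATTR_SKIP_ZEROING, &attrs);	/* assumed attribute name */
	return dma_alloc_attrs(dev, size, handle, GFP_KERNEL, &attrs);
}
```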
-
Abhijeet Dharmapurikar authored
The PWM_CL register is secured, i.e. the driver has to write 0xA5 to the 0xD0 register to allow updates to the PWM_CL register. Fix it. Change-Id: I3f273627bdc137d8c10768c7d5824abe96ee8707 Signed-off-by:
Abhijeet Dharmapurikar <adharmap@codeaurora.org>
-
Hariram Purushothaman authored
V4L2 only allows 32 buffers. Check num_buf to make sure that the value passed from user space is not out of bounds. CRs-Fixed: 514698 Change-Id: I662ec1eb998ed8bfb2a7f188e645410aa78c83b0 Signed-off-by:
Hariram Purushothaman <hpurus@codeaurora.org> Signed-off-by:
Ankit Premrajka <ankitp@codeaurora.org> Signed-off-by:
Raghu DP <dp.raghu@codeaurora.org>
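The bound in question is V4L2's VIDEO_MAX_FRAME (32); a minimal sketch of the kind of check the fix adds, with hypothetical surrounding names:

```c
#include <linux/videodev2.h>

/* Sketch: reject userspace buffer counts beyond what V4L2 allows before
 * num_buf is used to size or index any kernel-side array. */
static int validate_num_buf(unsigned int num_buf)
{
	if (num_buf > VIDEO_MAX_FRAME)	/* VIDEO_MAX_FRAME == 32 */
		return -EINVAL;
	return 0;
}
```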
-
Ravi Kiran Vonteddu authored
A maximum width capability check is required to take care of undefined behavior when a clip with a width greater than the Q6 capability is played. Also, make the capability check generic for Q6 and Venus. CRs-fixed: 626642 Change-Id: Ic10be0ad4434019fea45e7a090b21ba5cf54d9a6 Signed-off-by:
Ravi Kiran Vonteddu <rvontedd@codeaurora.org>
-
Sarada Prasanna Garnayak authored
Upgrade the firmware on the touch controller when the new firmware version is greater than the current firmware version. Update the version id after a successful firmware update. Skip the firmware update process when the device is in the suspend state. CRs-Fixed: 623803 Change-Id: Ic462f6483887a3654665852e58ae9891de9f5eff Signed-off-by:
Sarada Prasanna Garnayak <c_sgarna@codeaurora.org>
-
Saravana Kannan authored
devfreq already provides a sysfs interface for changing polling/sampling period. So, there is no need for the governor to separately expose sampling_ms. Also, sampling_ms had to be explicitly updated whenever polling_ms was updated for the governor to function correctly. Make the interface simpler by combining sample_ms with polling_interval control provided by devfreq. The rounding of freq to multiples of bw_step is unnecessary since the devfreq device already does the rounding to the next valid level. Rounding of AB to multiples of bw_step is still necessary since it's a vote that's summed up and doesn't have a direct 1-to-1 mapping to frequencies. Also update default bw_step to 190 MB/s instead of 200 MB/s to account for the fact that MB/s to MHz conversion needs to take into account the difference in the meaning of M (2^20 vs 10^6) between MB and MHz. Similarly also update io_percent to 16. Change-Id: I5fea989c647955103de3813be8eb9ec612f131bc Signed-off-by:
Saravana Kannan <skannan@codeaurora.org>
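The 190 value falls out of the MiB-versus-mega mismatch the message mentions; a quick userspace check of the arithmetic (illustration only, not the devfreq code):

```c
#include <stdio.h>

int main(void)
{
	/* "MB/s" here means MiB/s (2^20 bytes), while MHz counts in 10^6 */
	double step_mibps = 190.0;
	double equivalent_mega = step_mibps * (1 << 20) / 1e6;

	/* ~199.2: a 190 MiB/s step lines up with a ~200 * 10^6 step,
	 * which is why bw_step drops from 200 to 190 */
	printf("%.1f\n", equivalent_mega);
	return 0;
}
```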
-
Eric Dumazet authored
[ Upstream commit cd6b423afd3c08b27e1fed52db828ade0addbc6b ] While investigating a strange increase of retransmit rates on hosts ~24 days after boot, Van found hystart was disabled if ca->epoch_start was 0, as the following condition is true when the tcp_time_stamp high order bit is set: (s32)(tcp_time_stamp - ca->epoch_start) < HZ. Quoting Van: At initialization & after every loss ca->epoch_start is set to zero so I believe that the above line will turn off hystart as soon as the 2^31 bit is set in tcp_time_stamp & hystart will stay off for 24 days. I think we've observed that cubic's restart is too aggressive without hystart so this might account for the higher drop rate we observe. Diagnosed-by:
Van Jacobson <vanj@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
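A self-contained userspace check of the condition quoted above: with epoch_start stuck at 0, the signed 32-bit difference goes negative once tcp_time_stamp's high bit is set (about 24.8 days of jiffies at HZ=1000), so the test stays true and hystart stays off.

```c
#include <stdint.h>
#include <stdio.h>

#define HZ 1000

int main(void)
{
	uint32_t epoch_start = 0;		/* cleared at init and after every loss */
	uint32_t tcp_time_stamp = 0x80000001;	/* high bit set: ~24.8 days at HZ=1000 */

	/* the condition from the commit message */
	if ((int32_t)(tcp_time_stamp - epoch_start) < HZ)
		printf("condition true: hystart stays disabled\n");
	else
		printf("condition false: hystart behaves normally\n");
	return 0;
}
```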
-
Eric Dumazet authored
[ Upstream commit 2ed0edf9090bf4afa2c6fc4f38575a85a80d4b20 ] commit 17a6e9f1 ("tcp_cubic: fix clock dependency") added an overflow error in bictcp_update() in the following code:

	/* change the unit from HZ to bictcp_HZ */
	t = ((tcp_time_stamp + msecs_to_jiffies(ca->delay_min>>3)
	      - ca->epoch_start) << BICTCP_HZ) / HZ;

Because msecs_to_jiffies() is unsigned long, the compiler does implicit type promotion. We really want to constrain (tcp_time_stamp - ca->epoch_start) to a signed 32bit value, or else 't' has unexpected high values. This bug triggers an increase of retransmit rates ~24 days after boot [1], as the high order bit of tcp_time_stamp flips. [1] for hosts with HZ=1000. Big thanks to Van Jacobson for spotting this problem. Diagnosed-by:
Van Jacobson <vanj@google.com> Signed-off-by:
Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Acked-by:
Neal Cardwell <ncardwell@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
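A self-contained illustration of the promotion issue on a 64-bit build (made-up values; delay_jiffies stands in for msecs_to_jiffies(ca->delay_min>>3)): subtracting the u32 timestamps after promotion to unsigned long loses the 32-bit wraparound, so 't' explodes, while constraining the difference to s32 first keeps it sane.

```c
#include <stdint.h>
#include <stdio.h>

#define HZ 1000
#define BICTCP_HZ 10

int main(void)
{
	uint32_t epoch_start    = 0xfffffff0;	/* epoch began just before the jiffies wrap */
	uint32_t tcp_time_stamp = 0x00000010;	/* timestamp has since wrapped past zero */
	unsigned long delay_jiffies = 5;	/* stands in for msecs_to_jiffies(delay_min>>3) */

	/* pre-fix shape: u32 values promoted to unsigned long before subtracting */
	uint64_t t_bad = ((tcp_time_stamp + delay_jiffies - epoch_start)
			  << BICTCP_HZ) / HZ;

	/* post-fix shape: constrain the timestamp difference to s32 first */
	int64_t t_good = (((int64_t)(int32_t)(tcp_time_stamp - epoch_start)
			   + (int64_t)delay_jiffies) << BICTCP_HZ) / HZ;

	printf("promoted:        %llu\n", (unsigned long long)t_bad);	/* absurdly large */
	printf("s32-constrained: %lld\n", (long long)t_good);		/* small, sane value */
	return 0;
}
```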
-
Lars-Peter Clausen authored
commit 8abac3ba51b5525354e9b2ec0eed1c9e95c905d9 upstream. The last register block, which falls into the specified range, is not handled correctly. The formula which calculates the number of registers that should be synced is inverted (and off by one). E.g. if all registers in that block should be synced only one is synced, and if only one should be synced all (but one) are synced. To calculate the number of registers that need to be synced we need to subtract the number of the first register in the block from the max register number and add one. This patch updates the code accordingly. The issue was introduced in commit ac8d91c8 ("regmap: Supply ranges to the sync operations"). Signed-off-by:
Lars-Peter Clausen <lars@metafoo.de> Signed-off-by:
Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
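The corrected count described above is simply max_register - block_base + 1; a tiny userspace check with illustrative numbers (not regmap code):

```c
#include <stdio.h>

int main(void)
{
	/* last cached block that overlaps the requested sync range */
	unsigned int block_base = 10;	/* first register in the block */
	unsigned int max_reg    = 17;	/* last register of the sync range */

	/* per the fix: subtract the block's first register from the max
	 * register number and add one */
	unsigned int count = max_reg - block_base + 1;

	printf("registers to sync: %u\n", count);	/* 8 */
	return 0;
}
```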
-
Balakumaran Kannan authored
[ Upstream commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f ] The IPv6 routing table becomes broken once we do ifdown, ifup of the loopback (lo) interface. After down-up, routes of other interfaces' IPv6 addresses through 'lo' are lost. IPv6 addresses assigned to all interfaces are routed through 'lo' for internal communication. Once 'lo' is down, those routing entries are removed from the routing table. But those removed entries are not re-created properly when 'lo' is brought up. So IPv6 addresses of other interfaces become unreachable from the same machine. Also this breaks communication with other machines because of NDISC packet processing failure. This patch fixes this issue by reading all interfaces' IPv6 addresses and adding them to the IPv6 routing table while bringing up 'lo'.

==Testing==

Before applying the patch:

$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop  Flag  Met  Ref  Use  If
2000::20/128                   ::        U     256  0    0    eth0
fe80::/64                      ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
::1/128                        ::        Un    0    1    0    lo
2000::20/128                   ::        Un    0    1    0    lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::        Un    0    1    0    lo
ff00::/8                       ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop  Flag  Met  Ref  Use  If
2000::20/128                   ::        U     256  0    0    eth0
fe80::/64                      ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
::1/128                        ::        Un    0    1    0    lo
ff00::/8                       ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
$

After applying the patch:

$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop  Flag  Met  Ref  Use  If
2000::20/128                   ::        U     256  0    0    eth0
fe80::/64                      ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
::1/128                        ::        Un    0    1    0    lo
2000::20/128                   ::        Un    0    1    0    lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::        Un    0    1    0    lo
ff00::/8                       ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
$ sudo ifdown lo
$ sudo ifup lo
$ route -A inet6
Kernel IPv6 routing table
Destination                    Next Hop  Flag  Met  Ref  Use  If
2000::20/128                   ::        U     256  0    0    eth0
fe80::/64                      ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
::1/128                        ::        Un    0    1    0    lo
2000::20/128                   ::        Un    0    1    0    lo
fe80::xxxx:xxxx:xxxx:xxxx/128  ::        Un    0    1    0    lo
ff00::/8                       ::        U     256  0    0    eth0
::/0                           ::        !n    -1   1    1    lo
$

Signed-off-by:
Balakumaran Kannan <Balakumaran.Kannan@ap.sony.com> Signed-off-by:
Maruthi Thotad <Maruthi.Thotad@ap.sony.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jiri Slaby authored
commit 37b7f3c76595e23257f61bd80b223de8658617ee upstream. In commit b0de59b5733d ("TTY: do not update atime/mtime on read/write") we removed timestamps from tty inodes to fix a security issue and waited if something breaks. Well, 'w', the utility to find out logged users and their inactivity time broke. It shows that users are inactive since the time they logged in. To revert to the old behaviour while still preventing attackers to guess the password length, we update the timestamps in one-minute intervals by this patch. Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
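A hedged sketch of the one-minute granularity update, in the spirit of the tty_update_time() helper this change introduced (not checked against the exact upstream diff):

```c
/* Sketch: round the candidate timestamp down to the minute and only move
 * the inode time forward, so reads/writes leak at most minute-level timing. */
static void tty_update_time_sketch(struct timespec *time)
{
	unsigned long sec = get_seconds();

	sec -= sec % 60;			/* one-minute granularity */
	if ((long)(sec - time->tv_sec) > 0)	/* never move time backwards */
		time->tv_sec = sec;
}
```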
-
Jiri Slaby authored
commit b0de59b5733d18b0d1974a060860a8b5c1b36a2e upstream. On http://vladz.devzero.fr/013_ptmx-timing.php , we can see how to find out the length of a password using timestamps of /dev/ptmx. It is documented in "Timing Analysis of Keystrokes and Timing Attacks on SSH". To avoid that problem, do not update the time when reading from/writing to a TTY. I am afraid of regressions as this is a behavior we have had since 0.97 and apps may expect the time to be current, e.g. for monitoring whether there was a change on the TTY. Now, there is no change. So this had better have a lot of testing before it goes upstream. References: CVE-2013-0160 Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Russell King authored
commit b6c7aabd923a17af993c5a5d5d7995f0b27c000a upstream. Let's do the changes properly and fix the same problem everywhere, not just for one case. Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Lukas Czerner authored
commit 810da240f221d64bf90020f25941b05b378186fe upstream. We're using macro EXT4_B2C() to convert number of blocks to number of clusters for bigalloc file systems. However, we should be using EXT4_NUM_B2C(). Signed-off-by:
Lukas Czerner <lczerner@redhat.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Signed-off-by:
CAI Qian <caiqian@redhat.com> Signed-off-by:
Lingzhu Xiang <lxiang@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Riley Andrews authored
Fix build broken by commit b0bd81a67ae5ced88b ("android: drivers: workaround debugfs race in binder"). Change-Id: I10c5c0211144a4a1c270dc03f92cf6a1a829e8f8
-
Russell King authored
commit 43659222e7a0113912ed02f6b2231550b3e471ac upstream. It's no good setting vga_base after the VGA console has been initialised, because if we do that we get this:

Unable to handle kernel paging request at virtual address 000b8000
pgd = c0004000
[000b8000] *pgd=07ffc831, *pte=00000000, *ppte=00000000
Internal error: Oops: 5017 [#1] ARM
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 3.12.0+ #49
task: c03e2974 ti: c03d8000 task.ti: c03d8000
PC is at vgacon_startup+0x258/0x39c
LR is at request_resource+0x10/0x1c
pc : [<c01725d0>]    lr : [<c0022b50>]    psr: 60000053
sp : c03d9f68  ip : 000b8000  fp : c03d9f8c
r10: 000055aa  r9 : 4401a103  r8 : ffffaa55
r7 : c03e357c  r6 : c051b460  r5 : 000000ff  r4 : 000c0000
r3 : 000b8000  r2 : c03e0514  r1 : 00000000  r0 : c0304971
Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel

which is an access to 0xb8000 without the PCI offset required to make it work. Fixes: cc22b4c1 ("ARM: set vga memory base at run-time") Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> [bwh: Backported to 3.2: adjust context] Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Arnd Bergmann authored
commit 92bdd3f5eba299b33c2f4407977d6fa2e2a6a0da upstream. The cpu_topology symbol is required by any driver using the topology interfaces, which leads to a couple of build errors:

ERROR: "cpu_topology" [drivers/net/ethernet/sfc/sfc.ko] undefined!
ERROR: "cpu_topology" [drivers/cpufreq/arm_big_little.ko] undefined!
ERROR: "cpu_topology" [drivers/block/mtip32xx/mtip32xx.ko] undefined!

The obvious solution is to export this symbol. Signed-off-by:
Arnd Bergmann <arnd@arndb.de> Acked-by:
Will Deacon <will.deacon@arm.com> Cc: Nicolas Pitre <nico@linaro.org> Cc: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Russell King authored
commit 29c350bf28da333e41e30497b649fe335712a2ab upstream. The array was missing the final entry for the undefined instruction exception handler; this commit adds it. Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sujit Reddy Thumma authored
eMMC and SD card specifications restrict the usage of a class of commands while commands in another class are in progress. For example, during erase operations the SD/eMMC spec allows only CMD35, CMD36, CMD38. If clock scaling is enabled and decides to scale up the clocks, it may be possible that CMD19/CMD21 tuning commands are sent in between erase commands, which is illegal as per the specification. Fix such illegal transactions to the card and also make clock scaling statistics account only for read/write commands instead of time-consuming commands, like CMD38 erase, where transactions are independent of bus frequency. Change-Id: Iffba175787837e7f95bde8970f19d0f0f9d7d67d Signed-off-by:
Sujit Reddy Thumma <sthumma@codeaurora.org>
-
Greg Thelen authored
commit 5f00110f7273f9ff04ac69a5f85bb535a4fd0987 upstream. The tmpfs remount logic preserves filesystem mempolicy if the mpol=M option is not specified in the remount request. A new policy can be specified if mpol=M is given. Before this patch remounting an mpol bound tmpfs without specifying mpol= mount option in the remount request would set the filesystem's mempolicy object to a freed mempolicy object. To reproduce the problem boot a DEBUG_PAGEALLOC kernel and run:

# mkdir /tmp/x
# mount -t tmpfs -o size=100M,mpol=interleave nodev /tmp/x
# grep /tmp/x /proc/mounts
nodev /tmp/x tmpfs rw,relatime,size=102400k,mpol=interleave:0-3 0 0
# mount -o remount,size=200M nodev /tmp/x
# grep /tmp/x /proc/mounts
nodev /tmp/x tmpfs rw,relatime,size=204800k,mpol=??? 0 0
# note ? garbage in mpol=... output above
# dd if=/dev/zero of=/tmp/x/f count=1
# panic here

Panic:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
[...]
Oops: 0010 [#1] SMP DEBUG_PAGEALLOC
Call Trace:
  mpol_shared_policy_init+0xa5/0x160
  shmem_get_inode+0x209/0x270
  shmem_mknod+0x3e/0xf0
  shmem_create+0x18/0x20
  vfs_create+0xb5/0x130
  do_last+0x9a1/0xea0
  path_openat+0xb3/0x4d0
  do_filp_open+0x42/0xa0
  do_sys_open+0xfe/0x1e0
  compat_sys_open+0x1b/0x20
  cstar_dispatch+0x7/0x1f

Non-debug kernels will not crash immediately because referencing the dangling mpol will not cause a fault. Instead the filesystem will reference a freed mempolicy object, which will cause unpredictable behavior. The problem boils down to a dropped mpol reference below if shmem_parse_options() does not allocate a new mpol:

config = *sbinfo
shmem_parse_options(data, &config, true)
mpol_put(sbinfo->mpol)
sbinfo->mpol = config.mpol	/* BUG: saves unreferenced mpol */

This patch avoids the crash by not releasing the mempolicy if shmem_parse_options() doesn't create a new mpol. How far back does this issue go? I see it in both 2.6.36 and 3.3. I did not look back further. Signed-off-by:
Greg Thelen <gthelen@google.com> Acked-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
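A sketch of the remount path after the fix, as described above (only swap in a new mempolicy when shmem_parse_options() actually produced one); types are elided and this is not a verbatim copy of the upstream patch:

```c
/* Sketch: the point is the final guard, which keeps the old mempolicy
 * unless an mpol= option was given on remount. */
static int shmem_remount_mpol_sketch(struct shmem_sb_info *sbinfo, char *data)
{
	struct shmem_sb_info config = *sbinfo;
	int error;

	error = shmem_parse_options(data, &config, true);
	if (error)
		return error;

	/* ... size/inode limit checks ... */

	if (config.mpol) {			/* only if mpol= was specified */
		mpol_put(sbinfo->mpol);
		sbinfo->mpol = config.mpol;
	}
	return 0;
}
```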
-
Linus Torvalds authored
commit 7c45512df987c5619db041b5c9b80d281e26d3db upstream. Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx calculation") fixed our calculation of the index into the pageblock bitmap when a !SPARSEMEM zone was not aligned to pageblock_nr_pages. However, the _allocation_ of that bitmap had never taken this alignment requirement into account, so depending on the exact size and alignment of the zone, the use of that index could then access past the allocation, resulting in some very subtle memory corruption. This was reported (and bisected) by Ingo Molnar: one of his random config builds would hang with certain very specific kernel command line options. In the meantime, commit c060f943d092 has been marked for stable, so this fix needs to be back-ported to the stable kernels that backported the commit to use the right alignment. Bisected-and-tested-by:
Ingo Molnar <mingo@kernel.org> Acked-by:
Mel Gorman <mgorman@suse.de> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
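A self-contained illustration of the mismatch with made-up numbers: if the bitmap index is computed from the pageblock-aligned zone start but the bitmap was sized from the unaligned zone span, the last pageblock's index can land past the allocation.

```c
#include <stdio.h>

#define PAGEBLOCK_PAGES 1024

int main(void)
{
	unsigned long zone_start_pfn = 512;	/* not pageblock aligned */
	unsigned long zone_pages     = 4096;
	unsigned long last_pfn       = zone_start_pfn + zone_pages - 1;

	/* old allocation: sized from the span alone -> 4 pageblocks of bits */
	unsigned long blocks_allocated =
		(zone_pages + PAGEBLOCK_PAGES - 1) / PAGEBLOCK_PAGES;

	/* index calculation is relative to the aligned zone start -> block 4 */
	unsigned long aligned_start = zone_start_pfn & ~(unsigned long)(PAGEBLOCK_PAGES - 1);
	unsigned long block_index   = (last_pfn - aligned_start) / PAGEBLOCK_PAGES;

	printf("blocks allocated: %lu, last block index: %lu%s\n",
	       blocks_allocated, block_index,
	       block_index >= blocks_allocated ? " (out of bounds)" : "");
	return 0;
}
```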
-
Mel Gorman authored
commit 18a2f371f5edf41810f6469cb9be39931ef9deb9 upstream. This fixes a regression in 3.7-rc, which has since gone into stable. Commit 00442ad04a5e ("mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went on expecting alloc_page_vma() to drop the refcount it had acquired. This deserves a rework: but for now fix the leak in shmem_alloc_page(). Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use the same refcounting there as in shmem_alloc_page(), delete its onstack mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() - those were invented to let swapin_readahead() make an unknown number of calls to alloc_pages_vma() with one mempolicy; but since 00442ad04a5e, alloc_pages_vma() has kept refcount in balance, so now no problem. Reported-and-tested-by:
Tommi Rantala <tt.rantala@gmail.com> Signed-off-by:
Mel Gorman <mgorman@suse.de> Signed-off-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Hugh Dickins authored
commit f2a07f40dbc603c15f8b06e6ec7f768af67b424f upstream. Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA mempolicy testing. Very nasty. Reading /proc/mounts, /proc/pid/mounts or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often in a page table (causing "Bad swap" or "Bad page map" warning or "Bad pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere worse. "mpol=prefer" and "mpol=prefer:Node" are equally toxic. Recent NUMA enhancements are not to blame: this dates back to 2.6.35, when commit e17f74af "mempolicy: don't call mpol_set_nodemask() when no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(), which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags. With slab poisoning, you can then rely on mpol_to_str() to set the bit for node 0x6b6b, probably in the next page above the caller's stack. mpol_parse_str() is only called from shmem_parse_options(): no_context is always true, so call it unused for now, and remove !no_context code. Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might expect. Then mpol_to_str() can ignore its no_context argument also, the mpol being appropriately initialized whether contextualized or not. Rename its no_context unused too, and let subsequent patch remove them (that's not needed for stable backporting, which would involve rejects). I don't understand why MPOL_LOCAL is described as a pseudo-policy: it's a reasonable policy which suffers from a confusing implementation in terms of MPOL_PREFERRED with MPOL_F_LOCAL. I believe this would be much more robust if MPOL_LOCAL were recognized in switch statements throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly empty) nodes mask like everyone else, instead of its preferred_node variant (I presume an optimization from the days before MPOL_LOCAL). But that would take me too long to get right and fully tested. Signed-off-by:
Hugh Dickins <hughd@google.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Michal Hocko authored
commit 9a5a8f19b43430752067ecaee62fc59e11e88fa6 upstream. oom_badness() takes a totalpages argument which says how many pages are available and it uses it as a base for the score calculation. The value is calculated by mem_cgroup_get_limit which considers both limit and total_swap_pages (resp. memsw portion of it). This is usually correct but since fe35004fbf9e ("mm: avoid swapping out with swappiness==0") we do not swap when swappiness is 0 which means that we cannot really use up all the totalpages pages. This in turn confuses oom score calculation if the memcg limit is much smaller than the available swap because the used memory (capped by the limit) is negligible comparing to totalpages so the resulting score is too small if adj!=0 (typically task with CAP_SYS_ADMIN or non zero oom_score_adj). A wrong process might be selected as result. The problem can be worked around by checking mem_cgroup_swappiness==0 and not considering swap at all in such a case. Signed-off-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
David Rientjes <rientjes@google.com> Acked-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Acked-by:
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
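A hedged sketch of the workaround, in the shape of the memcg limit helper that feeds oom_badness()'s totalpages; written from memory, so treat the field names and exact layout as approximate:

```c
/* Sketch: when the group's swappiness is 0 it will never swap, so do not
 * inflate the badness base with swap space it cannot use. */
u64 mem_cgroup_get_limit(struct mem_cgroup *memcg)
{
	u64 limit = res_counter_read_u64(&memcg->res, RES_LIMIT);

	if (mem_cgroup_swappiness(memcg)) {
		u64 memsw = res_counter_read_u64(&memcg->memsw, RES_LIMIT);

		limit += total_swap_pages << PAGE_SHIFT;
		limit = min(limit, memsw);	/* memsw caps memory + swap */
	}
	return limit;
}
```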
-
Mel Gorman authored
commit b22d127a39ddd10d93deee3d96e643657ad53a49 upstream. shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot be dereferenced if sp->lock is not held and 2) another thread can modify sp_node between spin_unlock for allocating a new sp node and the next spin_lock. The bug was introduced before 2.6.12-rc2. Kosaki's original patch for this problem was to allocate an sp node and policy within shared_policy_replace and initialise it when the lock is reacquired. I was not keen on this approach because it partially duplicates sp_alloc(). As the paths where sp->lock is taken are not that performance critical, this patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc(). [kosaki.motohiro@jp.fujitsu.com: Original patch] Signed-off-by:
Mel Gorman <mgorman@suse.de> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Reviewed-by:
Christoph Lameter <cl@linux.com> Cc: Josh Boyer <jwboyer@gmail.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Laura Abbott authored
There are many drivers in the kernel which can hold on to lots of memory. It can be useful to dump out all those drivers at key points in the kernel. Introduce a notifier framework for dumping this information. When the notifiers are called, drivers can dump out the state of any memory they may be using. Change-Id: I514ef1d01510a50970a661c8e9bedc8b78683eab Signed-off-by:
Laura Abbott <lauraa@codeaurora.org>
-
Vinayak Menon authored
Currently, vmpressure is tied to memcg and its events are available only to userspace clients. This patch removes the dependency on CONFIG_MEMCG and adds a mechanism for in-kernel clients to subscribe for vmpressure events (in fact raw vmpressure values are delivered instead of vmpressure levels, to provide clients more flexibility to take actions on custom pressure levels which are not currently defined by vmpressure module). Change-Id: I1500c098cde11010e463d67955e8a03feb193a67 Signed-off-by:
Vinayak Menon <vinmenon@codeaurora.org>
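For the in-kernel subscription mentioned above, a sketch of what a client might look like; the registration function name vmpressure_notifier_register() and the convention that the notifier's action argument carries the raw pressure value are assumptions, not verified against this tree:

```c
#include <linux/notifier.h>
#include <linux/module.h>

/* Sketch with assumed API: the callback receives the raw vmpressure value. */
static int my_vmpressure_cb(struct notifier_block *nb,
			    unsigned long pressure, void *data)
{
	if (pressure >= 90)	/* client-defined threshold, not a vmpressure level */
		pr_info("vmpressure high: %lu\n", pressure);
	return NOTIFY_OK;
}

static struct notifier_block my_vmpressure_nb = {
	.notifier_call = my_vmpressure_cb,
};

static int __init my_client_init(void)
{
	/* assumed helper exported by this patch for in-kernel clients */
	return vmpressure_notifier_register(&my_vmpressure_nb);
}
module_init(my_client_init);
```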
-
Liam Mark authored
Allow other functions to dump the list of tasks. Useful when debugging memory leaks. Bug: 17871993 Change-Id: I0d9e812d242cbd9e152d561be9a16c00bad3c032 Signed-off-by:
Liam Mark <lmark@codeaurora.org> Signed-off-by:
Naveen Ramaraj <nramaraj@codeaurora.org>
-