System vulnerabilities
Showing 5401 - 5450 of 8.8K CVEs
- CVE-2024-49885 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: mm, slub: avoid zeroing kmalloc redzone Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested"), setting orig_size treats the wasted space (object_size - orig_size) as a redzone. However with init_on_free=1 we clear the full object->size, including the redzone. Additionally we clear the object metadata, including the stored orig_size, making it zero, which makes check_object() treat the whole object as a redzone. These issues lead to the following BUG report with "slub_debug=FUZ init_on_free=1": [ 0.000000] ============================================================================= [ 0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten [ 0.000000] ----------------------------------------------------------------------------- [ 0.000000] [ 0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc [ 0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc [ 0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff) [ 0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8 [ 0.000000] [ 0.000000] Redzone ffff000010032850: cc cc cc cc cc cc cc cc ........ [ 0.000000] Object ffff000010032858: cc cc cc cc cc cc cc cc ........ [ 0.000000] Redzone ffff000010032860: cc cc cc cc cc cc cc cc ........ [ 0.000000] Padding ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00 ............ [ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144 [ 0.000000] Hardware name: NXP i.MX95 19X19 board (DT) [ 0.000000] Call trace: [ 0.000000] dump_backtrace+0x90/0xe8 [ 0.000000] show_stack+0x18/0x24 [ 0.000000] dump_stack_lvl+0x74/0x8c [ 0.000000] dump_stack+0x18/0x24 [ 0.000000] print_trailer+0x150/0x218 [ 0.000000] check_object+0xe4/0x454 [ 0.000000] free_to_partial_list+0x2f8/0x5ec To address the issue, use orig_size to clear the used area. And restore the value of orig_size after clear the remaining area. When CONFIG_SLUB_DEBUG not defined, (get_orig_size()' directly returns s->object_size. So when using memset to init the area, the size can simply be orig_size, as orig_size returns object_size when CONFIG_SLUB_DEBUG not enabled. And orig_size can never be bigger than object_size.
- CVE-2024-49884 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ext4: fix slab-use-after-free in ext4_split_extent_at() We hit the following use-after-free: ================================================================== BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0 Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40 CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724 Call Trace: <TASK> kasan_report+0x93/0xc0 ext4_split_extent_at+0xba8/0xcc0 ext4_split_extent.isra.0+0x18f/0x500 ext4_split_convert_extents+0x275/0x750 ext4_ext_handle_unwritten_extents+0x73e/0x1580 ext4_ext_map_blocks+0xe20/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] Allocated by task 40: __kmalloc_noprof+0x1ac/0x480 ext4_find_extent+0xf3b/0x1e70 ext4_ext_map_blocks+0x188/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] Freed by task 40: kfree+0xf1/0x2b0 ext4_find_extent+0xa71/0x1e70 ext4_ext_insert_extent+0xa22/0x3260 ext4_split_extent_at+0x3ef/0xcc0 ext4_split_extent.isra.0+0x18f/0x500 ext4_split_convert_extents+0x275/0x750 ext4_ext_handle_unwritten_extents+0x73e/0x1580 ext4_ext_map_blocks+0xe20/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] ================================================================== The flow of issue triggering is as follows: ext4_split_extent_at path = *ppath ext4_ext_insert_extent(ppath) ext4_ext_create_new_leaf(ppath) ext4_find_extent(orig_path) path = *orig_path read_extent_tree_block // return -ENOMEM or -EIO ext4_free_ext_path(path) kfree(path) *orig_path = NULL a. If err is -ENOMEM: ext4_ext_dirty(path + path->p_depth) // path use-after-free !!! b. If err is -EIO and we have EXT_DEBUG defined: ext4_ext_show_leaf(path) eh = path[depth].p_hdr // path also use-after-free !!! So when trying to zeroout or fix the extent length, call ext4_find_extent() to update the path. In addition we use *ppath directly as an ext4_ext_show_leaf() input to avoid possible use-after-free when EXT_DEBUG is defined, and to avoid unnecessary path updates.
- CVE-2024-49883 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ext4: aovid use-after-free in ext4_ext_insert_extent() As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and cause UAF. Below is a sample trace with dummy values: ext4_ext_insert_extent path = *ppath = 2000 ext4_ext_create_new_leaf(ppath) ext4_find_extent(ppath) path = *ppath = 2000 if (depth > path[0].p_maxdepth) kfree(path = 2000); *ppath = path = NULL; path = kcalloc() = 3000 *ppath = 3000; return path; /* here path is still 2000, UAF! */ eh = path[depth].p_hdr ================================================================== BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330 Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179 CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866 Call Trace: <TASK> ext4_ext_insert_extent+0x26d4/0x3330 ext4_ext_map_blocks+0xe22/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 [...] Allocated by task 179: ext4_find_extent+0x81c/0x1f70 ext4_ext_map_blocks+0x146/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 ext4_writepages+0x26d/0x4e0 do_writepages+0x175/0x700 [...] Freed by task 179: kfree+0xcb/0x240 ext4_find_extent+0x7c0/0x1f70 ext4_ext_insert_extent+0xa26/0x3330 ext4_ext_map_blocks+0xe22/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 ext4_writepages+0x26d/0x4e0 do_writepages+0x175/0x700 [...] ================================================================== So use *ppath to update the path to avoid the above problem.
- CVE-2024-49882 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ext4: fix double brelse() the buffer of the extents path In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been released, otherwise it may be released twice. An example of what triggers this is as follows: split2 map split1 |--------|-------|--------| ext4_ext_map_blocks ext4_ext_handle_unwritten_extents ext4_split_convert_extents // path->p_depth == 0 ext4_split_extent // 1. do split1 ext4_split_extent_at |ext4_ext_insert_extent | ext4_ext_create_new_leaf | ext4_ext_grow_indepth | le16_add_cpu(&neh->eh_depth, 1) | ext4_find_extent | // return -ENOMEM |// get error and try zeroout |path = ext4_find_extent | path->p_depth = 1 |ext4_ext_try_to_merge | ext4_ext_try_to_merge_up | path->p_depth = 0 | brelse(path[1].p_bh) ---> not set to NULL here |// zeroout success // 2. update path ext4_find_extent // 3. do split2 ext4_split_extent_at ext4_ext_insert_extent ext4_ext_create_new_leaf ext4_ext_grow_indepth le16_add_cpu(&neh->eh_depth, 1) ext4_find_extent path[0].p_bh = NULL; path->p_depth = 1 read_extent_tree_block ---> return err // path[1].p_bh is still the old value ext4_free_ext_path ext4_ext_drop_refs // path->p_depth == 1 brelse(path[1].p_bh) ---> brelse a buffer twice Finally got the following WARRNING when removing the buffer from lru: ============================================ VFS: brelse: Trying to free free buffer WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90 CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716 RIP: 0010:__brelse+0x58/0x90 Call Trace: <TASK> __find_get_block+0x6e7/0x810 bdev_getblk+0x2b/0x480 __ext4_get_inode_loc+0x48a/0x1240 ext4_get_inode_loc+0xb2/0x150 ext4_reserve_inode_write+0xb7/0x230 __ext4_mark_inode_dirty+0x144/0x6a0 ext4_ext_insert_extent+0x9c8/0x3230 ext4_ext_map_blocks+0xf45/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] ============================================
- CVE-2024-49881 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ext4: update orig_path in ext4_find_extent() In ext4_find_extent(), if the path is not big enough, we free it and set *orig_path to NULL. But after reallocating and successfully initializing the path, we don't update *orig_path, in which case the caller gets a valid path but a NULL ppath, and this may cause a NULL pointer dereference or a path memory leak. For example: ext4_split_extent path = *ppath = 2000 ext4_find_extent if (depth > path[0].p_maxdepth) kfree(path = 2000); *orig_path = path = NULL; path = kcalloc() = 3000 ext4_split_extent_at(*ppath = NULL) path = *ppath; ex = path[depth].p_ext; // NULL pointer dereference! ================================================================== BUG: kernel NULL pointer dereference, address: 0000000000000010 CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847 RIP: 0010:ext4_split_extent_at+0x6d/0x560 Call Trace: <TASK> ext4_split_extent.isra.0+0xcb/0x1b0 ext4_ext_convert_to_initialized+0x168/0x6c0 ext4_ext_handle_unwritten_extents+0x325/0x4d0 ext4_ext_map_blocks+0x520/0xdb0 ext4_map_blocks+0x2b0/0x690 ext4_iomap_begin+0x20e/0x2c0 [...] ================================================================== Therefore, *orig_path is updated when the extent lookup succeeds, so that the caller can safely use path or *ppath.
- CVE-2024-49880 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ext4: fix off by one issue in alloc_flex_gd() Wesley reported an issue: ================================================================== EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks ------------[ cut here ]------------ kernel BUG at fs/ext4/resize.c:324! CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27 RIP: 0010:ext4_resize_fs+0x1212/0x12d0 Call Trace: __ext4_ioctl+0x4e0/0x1800 ext4_ioctl+0x12/0x20 __x64_sys_ioctl+0x99/0xd0 x64_sys_call+0x1206/0x20d0 do_syscall_64+0x72/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e ================================================================== While reviewing the patch, Honza found that when adjusting resize_bg in alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than flexbg_size. The reproduction of the problem requires the following: o_group = flexbg_size * 2 * n; o_size = (o_group + 1) * group_size; n_group: [o_group + flexbg_size, o_group + flexbg_size * 2) o_size = (n_group + 1) * group_size; Take n=0,flexbg_size=16 as an example: last:15 |o---------------|--------------n-| o_group:0 resize to n_group:30 The corresponding reproducer is: img=test.img rm -f $img truncate -s 600M $img mkfs.ext4 -F $img -b 1024 -G 16 8M dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 248M Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE() to prevent the issue from happening again. [ Note: another reproucer which this commit fixes is: img=test.img rm -f $img truncate -s 25MiB $img mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img truncate -s 3GiB $img dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 3G umount $dev losetup -d $dev -- TYT ]
- CVE-2024-49879 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: drm: omapdrm: Add missing check for alloc_ordered_workqueue As it may return NULL pointer and cause NULL pointer dereference. Add check for the return value of alloc_ordered_workqueue.
- CVE-2024-49878 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: resource: fix region_intersects() vs add_memory_driver_managed() On a system with CXL memory, the resource tree (/proc/iomem) related to CXL memory may look like something as follows. 490000000-50fffffff : CXL Window 0 490000000-50fffffff : region0 490000000-50fffffff : dax0.0 490000000-50fffffff : System RAM (kmem) Because drivers/dax/kmem.c calls add_memory_driver_managed() during onlining CXL memory, which makes "System RAM (kmem)" a descendant of "CXL Window X". This confuses region_intersects(), which expects all "System RAM" resources to be at the top level of iomem_resource. This can lead to bugs. For example, when the following command line is executed to write some memory in CXL memory range via /dev/mem, $ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1 dd: error writing '/dev/mem': Bad address 1+0 records in 0+0 records out 0 bytes copied, 0.0283507 s, 0.0 kB/s the command fails as expected. However, the error code is wrong. It should be "Operation not permitted" instead of "Bad address". More seriously, the /dev/mem permission checking in devmem_is_allowed() passes incorrectly. Although the accessing is prevented later because ioremap() isn't allowed to map system RAM, it is a potential security issue. During command executing, the following warning is reported in the kernel log for calling ioremap() on system RAM. ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d Call Trace: memremap+0xcb/0x184 xlate_dev_mem_ptr+0x25/0x2f write_mem+0x94/0xfb vfs_write+0x128/0x26d ksys_write+0xac/0xfe do_syscall_64+0x9a/0xfd entry_SYSCALL_64_after_hwframe+0x4b/0x53 The details of command execution process are as follows. In the above resource tree, "System RAM" is a descendant of "CXL Window 0" instead of a top level resource. So, region_intersects() will report no System RAM resources in the CXL memory region incorrectly, because it only checks the top level resources. Consequently, devmem_is_allowed() will return 1 (allow access via /dev/mem) for CXL memory region incorrectly. Fortunately, ioremap() doesn't allow to map System RAM and reject the access. So, region_intersects() needs to be fixed to work correctly with the resource tree with "System RAM" not at top level as above. To fix it, if we found a unmatched resource in the top level, we will continue to search matched resources in its descendant resources. So, we will not miss any matched resources in resource tree anymore. In the new implementation, an example resource tree |------------- "CXL Window 0" ------------| |-- "System RAM" --| will behave similar as the following fake resource tree for region_intersects(, IORESOURCE_SYSTEM_RAM, ), |-- "System RAM" --||-- "CXL Window 0a" --| Where "CXL Window 0a" is part of the original "CXL Window 0" that isn't covered by "System RAM".
- CVE-2024-49877 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate When doing cleanup, if flags without OCFS2_BH_READAHEAD, it may trigger NULL pointer dereference in the following ocfs2_set_buffer_uptodate() if bh is NULL.
- CVE-2024-49876 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: drm/xe: fix UAF around queue destruction We currently do stuff like queuing the final destruction step on a random system wq, which will outlive the driver instance. With bad timing we can teardown the driver with one or more work workqueue still being alive leading to various UAF splats. Add a fini step to ensure user queues are properly torn down. At this point GuC should already be nuked so queue itself should no longer be referenced from hw pov. v2 (Matt B) - Looks much safer to use a waitqueue and then just wait for the xa_array to become empty before triggering the drain. (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
- CVE-2024-49875 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: nfsd: map the EBADMSG to nfserr_io to avoid warning Ext4 will throw -EBADMSG through ext4_readdir when a checksum error occurs, resulting in the following WARNING. Fix it by mapping EBADMSG to nfserr_io. nfsd_buffered_readdir iterate_dir // -EBADMSG -74 ext4_readdir // .iterate_shared ext4_dx_readdir ext4_htree_fill_tree htree_dirblock_to_tree ext4_read_dirblock __ext4_read_dirblock ext4_dirblock_csum_verify warn_no_space_for_csum __warn_no_space_for_csum return ERR_PTR(-EFSBADCRC) // -EBADMSG -74 nfserrno // WARNING [ 161.115610] ------------[ cut here ]------------ [ 161.116465] nfsd: non-standard errno: -74 [ 161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0 [ 161.118596] Modules linked in: [ 161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138 [ 161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qe mu.org 04/01/2014 [ 161.123601] RIP: 0010:nfserrno+0x9d/0xd0 [ 161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33 [ 161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286 [ 161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a [ 161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827 [ 161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021 [ 161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8 [ 161.135244] FS: 0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000 [ 161.136695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0 [ 161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 161.141519] PKRU: 55555554 [ 161.142076] Call Trace: [ 161.142575] ? __warn+0x9b/0x140 [ 161.143229] ? nfserrno+0x9d/0xd0 [ 161.143872] ? report_bug+0x125/0x150 [ 161.144595] ? handle_bug+0x41/0x90 [ 161.145284] ? exc_invalid_op+0x14/0x70 [ 161.146009] ? asm_exc_invalid_op+0x12/0x20 [ 161.146816] ? nfserrno+0x9d/0xd0 [ 161.147487] nfsd_buffered_readdir+0x28b/0x2b0 [ 161.148333] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.149258] ? nfsd_buffered_filldir+0xf0/0xf0 [ 161.150093] ? wait_for_concurrent_writes+0x170/0x170 [ 161.151004] ? generic_file_llseek_size+0x48/0x160 [ 161.151895] nfsd_readdir+0x132/0x190 [ 161.152606] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.153516] ? nfsd_unlink+0x380/0x380 [ 161.154256] ? override_creds+0x45/0x60 [ 161.155006] nfsd4_encode_readdir+0x21a/0x3d0 [ 161.155850] ? nfsd4_encode_readlink+0x210/0x210 [ 161.156731] ? write_bytes_to_xdr_buf+0x97/0xe0 [ 161.157598] ? __write_bytes_to_xdr_buf+0xd0/0xd0 [ 161.158494] ? lock_downgrade+0x90/0x90 [ 161.159232] ? nfs4svc_decode_voidarg+0x10/0x10 [ 161.160092] nfsd4_encode_operation+0x15a/0x440 [ 161.160959] nfsd4_proc_compound+0x718/0xe90 [ 161.161818] nfsd_dispatch+0x18e/0x2c0 [ 161.162586] svc_process_common+0x786/0xc50 [ 161.163403] ? nfsd_svc+0x380/0x380 [ 161.164137] ? svc_printk+0x160/0x160 [ 161.164846] ? svc_xprt_do_enqueue.part.0+0x365/0x380 [ 161.165808] ? nfsd_svc+0x380/0x380 [ 161.166523] ? rcu_is_watching+0x23/0x40 [ 161.167309] svc_process+0x1a5/0x200 [ 161.168019] nfsd+0x1f5/0x380 [ 161.168663] ? nfsd_shutdown_threads+0x260/0x260 [ 161.169554] kthread+0x1c4/0x210 [ 161.170224] ? kthread_insert_work_sanity_check+0x80/0x80 [ 161.171246] ret_from_fork+0x1f/0x30
- CVE-2024-49874 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition In the svc_i3c_master_probe function, &master->hj_work is bound with svc_i3c_master_hj_work, &master->ibi_work is bound with svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work can start the hj_work, svc_i3c_master_irq_handler can start the ibi_work. If we remove the module which will call svc_i3c_master_remove to make cleanup, it will free master->base through i3c_master_unregister while the work mentioned above will be used. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | svc_i3c_master_hj_work svc_i3c_master_remove | i3c_master_unregister(&master->base)| device_unregister(&master->dev) | device_release | //free master->base | | i3c_master_do_daa(&master->base) | //use master->base Fix it by ensuring that the work is canceled before proceeding with the cleanup in svc_i3c_master_remove.
- CVE-2024-49873 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: mm/filemap: fix filemap_get_folios_contig THP panic Patch series "memfd-pin huge page fixes". Fix multiple bugs that occur when using memfd_pin_folios with hugetlb pages and THP. The hugetlb bugs only bite when the page is not yet faulted in when memfd_pin_folios is called. The THP bug bites when the starting offset passed to memfd_pin_folios is not huge page aligned. See the commit messages for details. This patch (of 5): memfd_pin_folios on memory backed by THP panics if the requested start offset is not huge page aligned: BUG: kernel NULL pointer dereference, address: 0000000000000036 RIP: 0010:filemap_get_folios_contig+0xdf/0x290 RSP: 0018:ffffc9002092fbe8 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002 The fault occurs here, because xas_load returns a folio with value 2: filemap_get_folios_contig() for (folio = xas_load(&xas); folio && xas.xa_index <= end; folio = xas_next(&xas)) { ... if (!folio_try_get(folio)) <-- BOOM "2" is an xarray sibling entry. We get it because memfd_pin_folios does not round the indices passed to filemap_get_folios_contig to huge page boundaries for THP, so we load from the middle of a huge page range see a sibling. (It does round for hugetlbfs, at the is_file_hugepages test). To fix, if the folio is a sibling, then return the next index as the starting point for the next call to filemap_get_folios_contig.
- CVE-2024-49872 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: mm/gup: fix memfd_pin_folios alloc race panic If memfd_pin_folios tries to create a hugetlb page, but someone else already did, then folio gets the value -EEXIST here: folio = memfd_alloc_folio(memfd, start_idx); if (IS_ERR(folio)) { ret = PTR_ERR(folio); if (ret != -EEXIST) goto err; then on the next trip through the "while start_idx" loop we panic here: if (folio) { folio_put(folio); To fix, set the folio to NULL on error.
- CVE-2024-49871 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: Input: adp5589-keys - fix NULL pointer dereference We register a devm action to call adp5589_clear_config() and then pass the i2c client as argument so that we can call i2c_get_clientdata() in order to get our device object. However, i2c_set_clientdata() is only being set at the end of the probe function which means that we'll get a NULL pointer dereference in case the probe function fails early.
- CVE-2024-49870 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: cachefiles: fix dentry leak in cachefiles_open_file() A dentry leak may be caused when a lookup cookie and a cull are concurrent: P1 | P2 ----------------------------------------------------------- cachefiles_lookup_cookie cachefiles_look_up_object lookup_one_positive_unlocked // get dentry cachefiles_cull inode->i_flags |= S_KERNEL_FILE; cachefiles_open_file cachefiles_mark_inode_in_use __cachefiles_mark_inode_in_use can_use = false if (!(inode->i_flags & S_KERNEL_FILE)) can_use = true return false return false // Returns an error but doesn't put dentry After that the following WARNING will be triggered when the backend folder is umounted: ================================================================== BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img} still in use (1) [unmount of ext4 sda] WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70 CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25 RIP: 0010:umount_check+0x5d/0x70 Call Trace: <TASK> d_walk+0xda/0x2b0 do_one_tree+0x20/0x40 shrink_dcache_for_umount+0x2c/0x90 generic_shutdown_super+0x20/0x160 kill_block_super+0x1a/0x40 ext4_kill_sb+0x22/0x40 deactivate_locked_super+0x35/0x80 cleanup_mnt+0x104/0x160 ================================================================== Whether cachefiles_open_file() returns true or false, the reference count obtained by lookup_positive_unlocked() in cachefiles_look_up_object() should be released. Therefore release that reference count in cachefiles_look_up_object() to fix the above issue and simplify the code.
- CVE-2024-49869 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: btrfs: send: fix buffer overflow detection when copying path to cache entry Starting with commit c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()") we annotated the variable length array "name" from the name_cache_entry structure with __counted_by() to improve overflow detection. However that alone was not correct, because the length of that array does not match the "name_len" field - it matches that plus 1 to include the NUL string terminator, so that makes a fortified kernel think there's an overflow and report a splat like this: strcpy: detected buffer overflow: 20 byte write of buffer size 19 WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50 CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1 Hardware name: CompuLab Ltd. sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018 RIP: 0010:__fortify_report+0x45/0x50 Code: 48 8b 34 (...) RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246 RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027 RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8 RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400 R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8 FS: 00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0 Call Trace: <TASK> ? __warn+0x12a/0x1d0 ? __fortify_report+0x45/0x50 ? report_bug+0x154/0x1c0 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x1a/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? __fortify_report+0x45/0x50 __fortify_panic+0x9/0x10 __get_cur_name_and_parent+0x3bc/0x3c0 get_cur_path+0x207/0x3b0 send_extent_data+0x709/0x10d0 ? find_parent_nodes+0x22df/0x25d0 ? mas_nomem+0x13/0x90 ? mtree_insert_range+0xa5/0x110 ? btrfs_lru_cache_store+0x5f/0x1e0 ? iterate_extent_inodes+0x52d/0x5a0 process_extent+0xa96/0x11a0 ? __pfx_lookup_backref_cache+0x10/0x10 ? __pfx_store_backref_cache+0x10/0x10 ? __pfx_iterate_backrefs+0x10/0x10 ? __pfx_check_extent_item+0x10/0x10 changed_cb+0x6fa/0x930 ? tree_advance+0x362/0x390 ? memcmp_extent_buffer+0xd7/0x160 send_subvol+0xf0a/0x1520 btrfs_ioctl_send+0x106b/0x11d0 ? __pfx___clone_root_cmp_sort+0x10/0x10 _btrfs_ioctl_send+0x1ac/0x240 btrfs_ioctl+0x75b/0x850 __se_sys_ioctl+0xca/0x150 do_syscall_64+0x85/0x160 ? __count_memcg_events+0x69/0x100 ? handle_mm_fault+0x1327/0x15c0 ? __se_sys_rt_sigprocmask+0xf1/0x180 ? syscall_exit_to_user_mode+0x75/0xa0 ? do_syscall_64+0x91/0x160 ? do_user_addr_fault+0x21d/0x630 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fae145eeb4f Code: 00 48 89 (...) RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004 RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927 R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8 R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004 </TASK> Fix this by not storing the NUL string terminator since we don't actually need it for name cache entries, this way "name_len" corresponds to the actual size of the "name" array. This requires marking the "name" array field with __nonstring and using memcpy() instead of strcpy() as recommended by the guidelines at: https://github.com/KSPP/linux/issues/90
- CVE-2024-49868 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [BUG] Syzbot reported a NULL pointer dereference with the following crash: FAULT_INJECTION: forcing a failure. start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676 prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642 relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678 ... BTRFS info (device loop0): balance: ended with status: -12 Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667] RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926 Call Trace: <TASK> commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496 btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430 del_balance_item fs/btrfs/volumes.c:3678 [inline] reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742 btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f [CAUSE] The allocation failure happens at the start_transaction() inside prepare_to_relocate(), and during the error handling we call unset_reloc_control(), which makes fs_info->balance_ctl to be NULL. Then we continue the error path cleanup in btrfs_balance() by calling reset_balance_state() which will call del_balance_item() to fully delete the balance item in the root tree. However during the small window between set_reloc_contrl() and unset_reloc_control(), we can have a subvolume tree update and created a reloc_root for that subvolume. Then we go into the final btrfs_commit_transaction() of del_balance_item(), and into btrfs_update_reloc_root() inside commit_fs_roots(). That function checks if fs_info->reloc_ctl is in the merge_reloc_tree stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer dereference. [FIX] Just add extra check on fs_info->reloc_ctl inside btrfs_update_reloc_root(), before checking fs_info->reloc_ctl->merge_reloc_tree. That DEAD_RELOC_TREE handling is to prevent further modification to the reloc tree during merge stage, but since there is no reloc_ctl at all, we do not need to bother that.
- CVE-2024-49867 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: btrfs: wait for fixup workers before stopping cleaner kthread during umount During unmount, at close_ctree(), we have the following steps in this order: 1) Park the cleaner kthread - this doesn't destroy the kthread, it basically halts its execution (wake ups against it work but do nothing); 2) We stop the cleaner kthread - this results in freeing the respective struct task_struct; 3) We call btrfs_stop_all_workers() which waits for any jobs running in all the work queues and then free the work queues. Syzbot reported a case where a fixup worker resulted in a crash when doing a delayed iput on its inode while attempting to wake up the cleaner at btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread was already freed. This can happen during unmount because we don't wait for any fixup workers still running before we call kthread_stop() against the cleaner kthread, which stops and free all its resources. Fix this by waiting for any fixup workers at close_ctree() before we call kthread_stop() against the cleaner and run pending delayed iputs. The stack traces reported by syzbot were the following: BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52 CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: btrfs-fixup btrfs_work_helper Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154 btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842 btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 2: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:247 [inline] slab_post_alloc_hook mm/slub.c:4086 [inline] slab_alloc_node mm/slub.c:4135 [inline] kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1107 copy_process+0x5d1/0x3d50 kernel/fork.c:2206 kernel_clone+0x223/0x880 kernel/fork.c:2787 kernel_thread+0x1bc/0x240 kernel/fork.c:2849 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:765 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 61: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_h ---truncated---
- CVE-2024-49866 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: tracing/timerlat: Fix a race during cpuhp processing There is another found exception that the "timerlat/1" thread was scheduled on CPU0, and lead to timer corruption finally: ``` ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220 WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0 Modules linked in: CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: <TASK> ? __warn+0x7c/0x110 ? debug_print_object+0x7d/0xb0 ? report_bug+0xf1/0x1d0 ? prb_read_valid+0x17/0x20 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? debug_print_object+0x7d/0xb0 ? debug_print_object+0x7d/0xb0 ? __pfx_timerlat_irq+0x10/0x10 __debug_object_init+0x110/0x150 hrtimer_init+0x1d/0x60 timerlat_main+0xab/0x2d0 ? __pfx_timerlat_main+0x10/0x10 kthread+0xb7/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ``` After tracing the scheduling event, it was discovered that the migration of the "timerlat/1" thread was performed during thread creation. Further analysis confirmed that it is because the CPU online processing for osnoise is implemented through workers, which is asynchronous with the offline processing. When the worker was scheduled to create a thread, the CPU may has already been removed from the cpu_online_mask during the offline process, resulting in the inability to select the right CPU: T1 | T2 [CPUHP_ONLINE] | cpu_device_down() osnoise_hotplug_workfn() | | cpus_write_lock() | takedown_cpu(1) | cpus_write_unlock() [CPUHP_OFFLINE] | cpus_read_lock() | start_kthread(1) | cpus_read_unlock() | To fix this, skip online processing if the CPU is already offline.
- CVE-2024-49865 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: drm/xe/vm: move xa_alloc to prevent UAF Evil user can guess the next id of the vm before the ioctl completes and then call vm destroy ioctl to trigger UAF since create ioctl is still referencing the same vm. Move the xa_alloc all the way to the end to prevent this. v2: - Rebase (cherry picked from commit dcfd3971327f3ee92765154baebbaece833d3ca9)
- CVE-2024-49864 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: rxrpc: Fix a race between socket set up and I/O thread creation In rxrpc_open_socket(), it sets up the socket and then sets up the I/O thread that will handle it. This is a problem, however, as there's a gap between the two phases in which a packet may come into rxrpc_encap_rcv() from the UDP packet but we oops when trying to wake the not-yet created I/O thread. As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's no I/O thread yet. A better, but more intrusive fix would perhaps be to rearrange things such that the socket creation is done by the I/O thread.
- CVE-2024-49863 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: vhost/scsi: null-ptr-dereference in vhost_scsi_get_req() Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from control queue handler") a null pointer dereference bug can be triggered when guest sends an SCSI AN request. In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with `&v_req.tmf.lun[1]` within a switch-case block and is then passed to vhost_scsi_get_req() which extracts `vc->req` and `tpg`. However, for a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is set to NULL in this branch. Later, in vhost_scsi_get_req(), `vc->target` is dereferenced without being checked, leading to a null pointer dereference bug. This bug can be triggered from guest. When this bug occurs, the vhost_worker process is killed while holding `vq->mutex` and the corresponding tpg will remain occupied indefinitely. Below is the KASAN report: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 840 Comm: poc Not tainted 6.10.0+ #1 Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:vhost_scsi_get_req+0x165/0x3a0 Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2b 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 65 30 4c 89 e2 48 c1 ea 03 <0f> b6 04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 be 01 00 00 RSP: 0018:ffff888017affb50 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88801b000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888017affcb8 RBP: ffff888017affb80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff888017affc88 R14: ffff888017affd1c R15: ffff888017993000 FS: 000055556e076500(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200027c0 CR3: 0000000010ed0004 CR4: 0000000000370ef0 Call Trace: <TASK> ? show_regs+0x86/0xa0 ? die_addr+0x4b/0xd0 ? exc_general_protection+0x163/0x260 ? asm_exc_general_protection+0x27/0x30 ? vhost_scsi_get_req+0x165/0x3a0 vhost_scsi_ctl_handle_vq+0x2a4/0xca0 ? __pfx_vhost_scsi_ctl_handle_vq+0x10/0x10 ? __switch_to+0x721/0xeb0 ? __schedule+0xda5/0x5710 ? __kasan_check_write+0x14/0x30 ? _raw_spin_lock+0x82/0xf0 vhost_scsi_ctl_handle_kick+0x52/0x90 vhost_run_work_list+0x134/0x1b0 vhost_task_fn+0x121/0x350 ... </TASK> ---[ end trace 0000000000000000 ]--- Let's add a check in vhost_scsi_get_req. [whitespace fixes]
- CVE-2024-49862 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: powercap: intel_rapl: Fix off by one in get_rpi() The rp->priv->rpi array is either rpi_msr or rpi_tpmi which have NR_RAPL_PRIMITIVES number of elements. Thus the > needs to be >= to prevent an off by one access.
- CVE-2024-49861 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix helper writes to read-only maps Lonial found an issue that despite user- and BPF-side frozen BPF map (like in case of .rodata), it was still possible to write into it from a BPF program side through specific helpers having ARG_PTR_TO_{LONG,INT} as arguments. In check_func_arg() when the argument is as mentioned, the meta->raw_mode is never set. Later, check_helper_mem_access(), under the case of PTR_TO_MAP_VALUE as register base type, it assumes BPF_READ for the subsequent call to check_map_access_type() and given the BPF map is read-only it succeeds. The helpers really need to be annotated as ARG_PTR_TO_{LONG,INT} | MEM_UNINIT when results are written into them as opposed to read out of them. The latter indicates that it's okay to pass a pointer to uninitialized memory as the memory is written to anyway. However, ARG_PTR_TO_{LONG,INT} is a special case of ARG_PTR_TO_FIXED_SIZE_MEM just with additional alignment requirement. So it is better to just get rid of the ARG_PTR_TO_{LONG,INT} special cases altogether and reuse the fixed size memory types. For this, add MEM_ALIGNED to additionally ensure alignment given these helpers write directly into the args via *<ptr> = val. The .arg*_size has been initialized reflecting the actual sizeof(*<ptr>). MEM_ALIGNED can only be used in combination with MEM_FIXED_SIZE annotated argument types, since in !MEM_FIXED_SIZE cases the verifier does not know the buffer size a priori and therefore cannot blindly write *<ptr> = val.
- CVE-2024-49860 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: ACPI: sysfs: validate return type of _STR method Only buffer objects are valid return values of _STR. If something else is returned description_show() will access invalid memory.
- CVE-2024-49859 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: f2fs: fix to check atomic_file in f2fs ioctl interfaces Some f2fs ioctl interfaces like f2fs_ioc_set_pin_file(), f2fs_move_file_range(), and f2fs_defragment_range() missed to check atomic_write status, which may cause potential race issue, fix it.
- CVE-2024-49858 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption The TPM event log table is a Linux specific construct, where the data produced by the GetEventLog() boot service is cached in memory, and passed on to the OS using an EFI configuration table. The use of EFI_LOADER_DATA here results in the region being left unreserved in the E820 memory map constructed by the EFI stub, and this is the memory description that is passed on to the incoming kernel by kexec, which is therefore unaware that the region should be reserved. Even though the utility of the TPM2 event log after a kexec is questionable, any corruption might send the parsing code off into the weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY instead, which is always treated as reserved by the E820 conversion logic.
- CVE-2024-49857 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: wifi: iwlwifi: mvm: set the cipher for secured NDP ranging The cipher pointer is not set, but is derefereced trying to set its content, which leads to a NULL pointer dereference. Fix it by pointing to the cipher parameter before dereferencing.
- CVE-2024-49856 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: x86/sgx: Fix deadlock in SGX NUMA node search When the current node doesn't have an EPC section configured by firmware and all other EPC sections are used up, CPU can get stuck inside the while loop that looks for an available EPC page from remote nodes indefinitely, leading to a soft lockup. Note how nid_of_current will never be equal to nid in that while loop because nid_of_current is not set in sgx_numa_mask. Also worth mentioning is that it's perfectly fine for the firmware not to setup an EPC section on a node. While setting up an EPC section on each node can enhance performance, it is not a requirement for functionality. Rework the loop to start and end on *a* node that has SGX memory. This avoids the deadlock looking for the current SGX-lacking node to show up in the loop when it never will.
- CVE-2024-49855 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: nbd: fix race between timeout and normal completion If request timetout is handled by nbd_requeue_cmd(), normal completion has to be stopped for avoiding to complete this requeued request, other use-after-free can be triggered. Fix the race by clearing NBD_CMD_INFLIGHT in nbd_requeue_cmd(), meantime make sure that cmd->lock is grabbed for clearing the flag and the requeue.
- CVE-2024-49854 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: block, bfq: fix uaf for accessing waker_bfqq after splitting After commit 42c306ed7233 ("block, bfq: don't break merge chain in bfq_split_bfqq()"), if the current procress is the last holder of bfqq, the bfqq can be freed after bfq_split_bfqq(). Hence recored the bfqq and then access bfqq->waker_bfqq may trigger UAF. What's more, the waker_bfqq may in the merge chain of bfqq, hence just recored waker_bfqq is still not safe. Fix the problem by adding a helper bfq_waker_bfqq() to check if bfqq->waker_bfqq is in the merge chain, and current procress is the only holder.
- CVE-2024-49853 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: firmware: arm_scmi: Fix double free in OPTEE transport Channels can be shared between protocols, avoid freeing the same channel descriptors twice when unloading the stack.
- CVE-2024-49852 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: scsi: elx: libefc: Fix potential use after free in efc_nport_vport_del() The kref_put() function will call nport->release if the refcount drops to zero. The nport->release release function is _efc_nport_free() which frees "nport". But then we dereference "nport" on the next line which is a use after free. Re-order these lines to avoid the use after free.
- CVE-2024-49851 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: tpm: Clean up TPM space after command failure tpm_dev_transmit prepares the TPM space before attempting command transmission. However if the command fails no rollback of this preparation is done. This can result in transient handles being leaked if the device is subsequently closed with no further commands performed. Fix this by flushing the space in the event of command transmission failure.
- CVE-2024-49850 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: bpf: correctly handle malformed BPF_CORE_TYPE_ID_LOCAL relos In case of malformed relocation record of kind BPF_CORE_TYPE_ID_LOCAL referencing a non-existing BTF type, function bpf_core_calc_relo_insn would cause a null pointer deference. Fix this by adding a proper check upper in call stack, as malformed relocation records could be passed from user space. Simplest reproducer is a program: r0 = 0 exit With a single relocation record: .insn_off = 0, /* patch first instruction */ .type_id = 100500, /* this type id does not exist */ .access_str_off = 6, /* offset of string "0" */ .kind = BPF_CORE_TYPE_ID_LOCAL, See the link for original reproducer or next commit for a test case.
- CVE-2024-47757 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: nilfs2: fix potential oob read in nilfs_btree_check_delete() The function nilfs_btree_check_delete(), which checks whether degeneration to direct mapping occurs before deleting a b-tree entry, causes memory access outside the block buffer when retrieving the maximum key if the root node has no entries. This does not usually happen because b-tree mappings with 0 child nodes are never created by mkfs.nilfs2 or nilfs2 itself. However, it can happen if the b-tree root node read from a device is configured that way, so fix this potential issue by adding a check for that case.
- CVE-2024-47756 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: PCI: keystone: Fix if-statement expression in ks_pcie_quirk() This code accidentally uses && where || was intended. It potentially results in a NULL dereference. Thus, fix the if-statement expression to use the correct condition. [kwilczynski: commit log]
- CVE-2024-47755 Published Oct 21, 2024
Rejected reason: This CVE ID has been rejected or withdrawn by its CVE Numbering Authority.
- CVE-2024-47754 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix H264 multi stateless decoder smatch warning Fix a smatch static checker warning on vdec_h264_req_multi_if.c. Which leads to a kernel crash when fb is NULL.
- CVE-2024-47753 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix VP8 stateless decoder smatch warning Fix a smatch static checker warning on vdec_vp8_req_if.c. Which leads to a kernel crash when fb is NULL.
- CVE-2024-47752 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix H264 stateless decoder smatch warning Fix a smatch static checker warning on vdec_h264_req_if.c. Which leads to a kernel crash when fb is NULL.
- CVE-2024-47751 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: PCI: kirin: Fix buffer overflow in kirin_pcie_parse_port() Within kirin_pcie_parse_port(), the pcie->num_slots is compared to pcie->gpio_id_reset size (MAX_PCI_SLOTS) which is correct and would lead to an overflow. Thus, fix condition to pcie->num_slots + 1 >= MAX_PCI_SLOTS and move pcie->num_slots increment below the if-statement to avoid out-of-bounds array access. Found by Linux Verification Center (linuxtesting.org) with SVACE. [kwilczynski: commit log]
- CVE-2024-47750 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: RDMA/hns: Fix Use-After-Free of rsv_qp on HIP08 Currently rsv_qp is freed before ib_unregister_device() is called on HIP08. During the time interval, users can still dereg MR and rsv_qp will be used in this process, leading to a UAF. Move the release of rsv_qp after calling ib_unregister_device() to fix it.
- CVE-2024-47746 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: fuse: use exclusive lock when FUSE_I_CACHE_IO_MODE is set This may be a typo. The comment has said shared locks are not allowed when this bit is set. If using shared lock, the wait in `fuse_file_cached_io_open` may be forever.
- CVE-2024-47749 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: RDMA/cxgb4: Added NULL check for lookup_atid The lookup_atid() function can return NULL if the ATID is invalid or does not exist in the identifier table, which could lead to dereferencing a null pointer without a check in the `act_establish()` and `act_open_rpl()` functions. Add a NULL check to prevent null pointer dereferencing. Found by Linux Verification Center (linuxtesting.org) with SVACE.
- CVE-2024-47748 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: vhost_vdpa: assign irq bypass producer token correctly We used to call irq_bypass_unregister_producer() in vhost_vdpa_setup_vq_irq() which is problematic as we don't know if the token pointer is still valid or not. Actually, we use the eventfd_ctx as the token so the life cycle of the token should be bound to the VHOST_SET_VRING_CALL instead of vhost_vdpa_setup_vq_irq() which could be called by set_status(). Fixing this by setting up irq bypass producer's token when handling VHOST_SET_VRING_CALL and un-registering the producer before calling vhost_vring_ioctl() to prevent a possible use after free as eventfd could have been released in vhost_vring_ioctl(). And such registering and unregistering will only be done if DRIVER_OK is set.
- CVE-2024-47747 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition In the ether3_probe function, a timer is initialized with a callback function ether3_ledoff, bound to &prev(dev)->timer. Once the timer is started, there is a risk of a race condition if the module or device is removed, triggering the ether3_remove function to perform cleanup. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | ether3_ledoff ether3_remove | free_netdev(dev); | put_devic | kfree(dev); | | ether3_outw(priv(dev)->regs.config2 |= CFG2_CTRLO, REG_CONFIG2); | // use dev Fix it by ensuring that the timer is canceled before proceeding with the cleanup in ether3_remove.
- CVE-2024-47745 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: mm: call the security_mmap_file() LSM hook in remap_file_pages() The remap_file_pages syscall handler calls do_mmap() directly, which doesn't contain the LSM security check. And if the process has called personality(READ_IMPLIES_EXEC) before and remap_file_pages() is called for RW pages, this will actually result in remapping the pages to RWX, bypassing a W^X policy enforced by SELinux. So we should check prot by security_mmap_file LSM hook in the remap_file_pages syscall handler before do_mmap() is called. Otherwise, it potentially permits an attacker to bypass a W^X policy enforced by SELinux. The bypass is similar to CVE-2016-10044, which bypass the same thing via AIO and can be found in [1]. The PoC: $ cat > test.c int main(void) { size_t pagesz = sysconf(_SC_PAGE_SIZE); int mfd = syscall(SYS_memfd_create, "test", 0); const char *buf = mmap(NULL, 4 * pagesz, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0); unsigned int old = syscall(SYS_personality, 0xffffffff); syscall(SYS_personality, READ_IMPLIES_EXEC | old); syscall(SYS_remap_file_pages, buf, pagesz, 0, 2, 0); syscall(SYS_personality, old); // show the RWX page exists even if W^X policy is enforced int fd = open("/proc/self/maps", O_RDONLY); unsigned char buf2[1024]; while (1) { int ret = read(fd, buf2, 1024); if (ret <= 0) break; write(1, buf2, ret); } close(fd); } $ gcc test.c -o test $ ./test | grep rwx 7f1836c34000-7f1836c35000 rwxs 00002000 00:01 2050 /memfd:test (deleted) [PM: subject line tweaks]
- CVE-2024-47744 Published Oct 21, 2024
In the Linux kernel, the following vulnerability has been resolved: KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock on x86 due to a chain of locks and SRCU synchronizations. Translating the below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the fairness of r/w semaphores). CPU0 CPU1 CPU2 1 lock(&kvm->slots_lock); 2 lock(&vcpu->mutex); 3 lock(&kvm->srcu); 4 lock(cpu_hotplug_lock); 5 lock(kvm_lock); 6 lock(&kvm->slots_lock); 7 lock(cpu_hotplug_lock); 8 sync(&kvm->srcu); Note, there are likely more potential deadlocks in KVM x86, e.g. the same pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with __kvmclock_cpufreq_notifier(): cpuhp_cpufreq_online() | -> cpufreq_online() | -> cpufreq_gov_performance_limits() | -> __cpufreq_driver_target() | -> __target_index() | -> cpufreq_freq_transition_begin() | -> cpufreq_notify_transition() | -> ... __kvmclock_cpufreq_notifier() But, actually triggering such deadlocks is beyond rare due to the combination of dependencies and timings involved. E.g. the cpufreq notifier is only used on older CPUs without a constant TSC, mucking with the NX hugepage mitigation while VMs are running is very uncommon, and doing so while also onlining/offlining a CPU (necessary to generate contention on cpu_hotplug_lock) would be even more unusual. The most robust solution to the general cpu_hotplug_lock issue is likely to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq notifier doesn't to take kvm_lock. For now, settle for fixing the most blatant deadlock, as switching to an RCU-protected list is a much more involved change, but add a comment in locking.rst to call out that care needs to be taken when walking holding kvm_lock and walking vm_list. ====================================================== WARNING: possible circular locking dependency detected 6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S O ------------------------------------------------------ tee/35048 is trying to acquire lock: ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm] but task is already holding lock: ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (kvm_lock){+.+.}-{3:3}: __mutex_lock+0x6a/0xb40 mutex_lock_nested+0x1f/0x30 kvm_dev_ioctl+0x4fb/0xe50 [kvm] __se_sys_ioctl+0x7b/0xd0 __x64_sys_ioctl+0x21/0x30 x64_sys_call+0x15d0/0x2e60 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #2 (cpu_hotplug_lock){++++}-{0:0}: cpus_read_lock+0x2e/0xb0 static_key_slow_inc+0x16/0x30 kvm_lapic_set_base+0x6a/0x1c0 [kvm] kvm_set_apic_base+0x8f/0xe0 [kvm] kvm_set_msr_common+0x9ae/0xf80 [kvm] vmx_set_msr+0xa54/0xbe0 [kvm_intel] __kvm_set_msr+0xb6/0x1a0 [kvm] kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm] kvm_vcpu_ioctl+0x485/0x5b0 [kvm] __se_sys_ioctl+0x7b/0xd0 __x64_sys_ioctl+0x21/0x30 x64_sys_call+0x15d0/0x2e60 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&kvm->srcu){.+.+}-{0:0}: __synchronize_srcu+0x44/0x1a0 ---truncated---
In the Linux kernel, the following vulnerability has been resolved: mm, slub: avoid zeroing kmalloc redzone Since commit 946fa0dbf2d8 ("mm/slub: extend redzone check to extra allocated kmalloc space than requested"), setting orig_size treats the wasted space (object_size - orig_size) as a redzone. However with init_on_free=1 we clear the full object->size, including the redzone. Additionally we clear the object metadata, including the stored orig_size, making it zero, which makes check_object() treat the whole object as a redzone. These issues lead to the following BUG report with "slub_debug=FUZ init_on_free=1": [ 0.000000] ============================================================================= [ 0.000000] BUG kmalloc-8 (Not tainted): kmalloc Redzone overwritten [ 0.000000] ----------------------------------------------------------------------------- [ 0.000000] [ 0.000000] 0xffff000010032858-0xffff00001003285f @offset=2136. First byte 0x0 instead of 0xcc [ 0.000000] FIX kmalloc-8: Restoring kmalloc Redzone 0xffff000010032858-0xffff00001003285f=0xcc [ 0.000000] Slab 0xfffffdffc0400c80 objects=36 used=23 fp=0xffff000010032a18 flags=0x3fffe0000000200(workingset|node=0|zone=0|lastcpupid=0x1ffff) [ 0.000000] Object 0xffff000010032858 @offset=2136 fp=0xffff0000100328c8 [ 0.000000] [ 0.000000] Redzone ffff000010032850: cc cc cc cc cc cc cc cc ........ [ 0.000000] Object ffff000010032858: cc cc cc cc cc cc cc cc ........ [ 0.000000] Redzone ffff000010032860: cc cc cc cc cc cc cc cc ........ [ 0.000000] Padding ffff0000100328b4: 00 00 00 00 00 00 00 00 00 00 00 00 ............ [ 0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814-00004-g61844c55c3f4 #144 [ 0.000000] Hardware name: NXP i.MX95 19X19 board (DT) [ 0.000000] Call trace: [ 0.000000] dump_backtrace+0x90/0xe8 [ 0.000000] show_stack+0x18/0x24 [ 0.000000] dump_stack_lvl+0x74/0x8c [ 0.000000] dump_stack+0x18/0x24 [ 0.000000] print_trailer+0x150/0x218 [ 0.000000] check_object+0xe4/0x454 [ 0.000000] free_to_partial_list+0x2f8/0x5ec To address the issue, use orig_size to clear the used area. And restore the value of orig_size after clear the remaining area. When CONFIG_SLUB_DEBUG not defined, (get_orig_size()' directly returns s->object_size. So when using memset to init the area, the size can simply be orig_size, as orig_size returns object_size when CONFIG_SLUB_DEBUG not enabled. And orig_size can never be bigger than object_size.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: ext4: fix slab-use-after-free in ext4_split_extent_at() We hit the following use-after-free: ================================================================== BUG: KASAN: slab-use-after-free in ext4_split_extent_at+0xba8/0xcc0 Read of size 2 at addr ffff88810548ed08 by task kworker/u20:0/40 CPU: 0 PID: 40 Comm: kworker/u20:0 Not tainted 6.9.0-dirty #724 Call Trace: <TASK> kasan_report+0x93/0xc0 ext4_split_extent_at+0xba8/0xcc0 ext4_split_extent.isra.0+0x18f/0x500 ext4_split_convert_extents+0x275/0x750 ext4_ext_handle_unwritten_extents+0x73e/0x1580 ext4_ext_map_blocks+0xe20/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] Allocated by task 40: __kmalloc_noprof+0x1ac/0x480 ext4_find_extent+0xf3b/0x1e70 ext4_ext_map_blocks+0x188/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] Freed by task 40: kfree+0xf1/0x2b0 ext4_find_extent+0xa71/0x1e70 ext4_ext_insert_extent+0xa22/0x3260 ext4_split_extent_at+0x3ef/0xcc0 ext4_split_extent.isra.0+0x18f/0x500 ext4_split_convert_extents+0x275/0x750 ext4_ext_handle_unwritten_extents+0x73e/0x1580 ext4_ext_map_blocks+0xe20/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] ================================================================== The flow of issue triggering is as follows: ext4_split_extent_at path = *ppath ext4_ext_insert_extent(ppath) ext4_ext_create_new_leaf(ppath) ext4_find_extent(orig_path) path = *orig_path read_extent_tree_block // return -ENOMEM or -EIO ext4_free_ext_path(path) kfree(path) *orig_path = NULL a. If err is -ENOMEM: ext4_ext_dirty(path + path->p_depth) // path use-after-free !!! b. If err is -EIO and we have EXT_DEBUG defined: ext4_ext_show_leaf(path) eh = path[depth].p_hdr // path also use-after-free !!! So when trying to zeroout or fix the extent length, call ext4_find_extent() to update the path. In addition we use *ppath directly as an ext4_ext_show_leaf() input to avoid possible use-after-free when EXT_DEBUG is defined, and to avoid unnecessary path updates.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: ext4: aovid use-after-free in ext4_ext_insert_extent() As Ojaswin mentioned in Link, in ext4_ext_insert_extent(), if the path is reallocated in ext4_ext_create_new_leaf(), we'll use the stale path and cause UAF. Below is a sample trace with dummy values: ext4_ext_insert_extent path = *ppath = 2000 ext4_ext_create_new_leaf(ppath) ext4_find_extent(ppath) path = *ppath = 2000 if (depth > path[0].p_maxdepth) kfree(path = 2000); *ppath = path = NULL; path = kcalloc() = 3000 *ppath = 3000; return path; /* here path is still 2000, UAF! */ eh = path[depth].p_hdr ================================================================== BUG: KASAN: slab-use-after-free in ext4_ext_insert_extent+0x26d4/0x3330 Read of size 8 at addr ffff8881027bf7d0 by task kworker/u36:1/179 CPU: 3 UID: 0 PID: 179 Comm: kworker/u6:1 Not tainted 6.11.0-rc2-dirty #866 Call Trace: <TASK> ext4_ext_insert_extent+0x26d4/0x3330 ext4_ext_map_blocks+0xe22/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 [...] Allocated by task 179: ext4_find_extent+0x81c/0x1f70 ext4_ext_map_blocks+0x146/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 ext4_writepages+0x26d/0x4e0 do_writepages+0x175/0x700 [...] Freed by task 179: kfree+0xcb/0x240 ext4_find_extent+0x7c0/0x1f70 ext4_ext_insert_extent+0xa26/0x3330 ext4_ext_map_blocks+0xe22/0x2d40 ext4_map_blocks+0x71e/0x1700 ext4_do_writepages+0x1290/0x2800 ext4_writepages+0x26d/0x4e0 do_writepages+0x175/0x700 [...] ================================================================== So use *ppath to update the path to avoid the above problem.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: ext4: fix double brelse() the buffer of the extents path In ext4_ext_try_to_merge_up(), set path[1].p_bh to NULL after it has been released, otherwise it may be released twice. An example of what triggers this is as follows: split2 map split1 |--------|-------|--------| ext4_ext_map_blocks ext4_ext_handle_unwritten_extents ext4_split_convert_extents // path->p_depth == 0 ext4_split_extent // 1. do split1 ext4_split_extent_at |ext4_ext_insert_extent | ext4_ext_create_new_leaf | ext4_ext_grow_indepth | le16_add_cpu(&neh->eh_depth, 1) | ext4_find_extent | // return -ENOMEM |// get error and try zeroout |path = ext4_find_extent | path->p_depth = 1 |ext4_ext_try_to_merge | ext4_ext_try_to_merge_up | path->p_depth = 0 | brelse(path[1].p_bh) ---> not set to NULL here |// zeroout success // 2. update path ext4_find_extent // 3. do split2 ext4_split_extent_at ext4_ext_insert_extent ext4_ext_create_new_leaf ext4_ext_grow_indepth le16_add_cpu(&neh->eh_depth, 1) ext4_find_extent path[0].p_bh = NULL; path->p_depth = 1 read_extent_tree_block ---> return err // path[1].p_bh is still the old value ext4_free_ext_path ext4_ext_drop_refs // path->p_depth == 1 brelse(path[1].p_bh) ---> brelse a buffer twice Finally got the following WARRNING when removing the buffer from lru: ============================================ VFS: brelse: Trying to free free buffer WARNING: CPU: 2 PID: 72 at fs/buffer.c:1241 __brelse+0x58/0x90 CPU: 2 PID: 72 Comm: kworker/u19:1 Not tainted 6.9.0-dirty #716 RIP: 0010:__brelse+0x58/0x90 Call Trace: <TASK> __find_get_block+0x6e7/0x810 bdev_getblk+0x2b/0x480 __ext4_get_inode_loc+0x48a/0x1240 ext4_get_inode_loc+0xb2/0x150 ext4_reserve_inode_write+0xb7/0x230 __ext4_mark_inode_dirty+0x144/0x6a0 ext4_ext_insert_extent+0x9c8/0x3230 ext4_ext_map_blocks+0xf45/0x2dc0 ext4_map_blocks+0x724/0x1700 ext4_do_writepages+0x12d6/0x2a70 [...] ============================================
high 7.8
In the Linux kernel, the following vulnerability has been resolved: ext4: update orig_path in ext4_find_extent() In ext4_find_extent(), if the path is not big enough, we free it and set *orig_path to NULL. But after reallocating and successfully initializing the path, we don't update *orig_path, in which case the caller gets a valid path but a NULL ppath, and this may cause a NULL pointer dereference or a path memory leak. For example: ext4_split_extent path = *ppath = 2000 ext4_find_extent if (depth > path[0].p_maxdepth) kfree(path = 2000); *orig_path = path = NULL; path = kcalloc() = 3000 ext4_split_extent_at(*ppath = NULL) path = *ppath; ex = path[depth].p_ext; // NULL pointer dereference! ================================================================== BUG: kernel NULL pointer dereference, address: 0000000000000010 CPU: 6 UID: 0 PID: 576 Comm: fsstress Not tainted 6.11.0-rc2-dirty #847 RIP: 0010:ext4_split_extent_at+0x6d/0x560 Call Trace: <TASK> ext4_split_extent.isra.0+0xcb/0x1b0 ext4_ext_convert_to_initialized+0x168/0x6c0 ext4_ext_handle_unwritten_extents+0x325/0x4d0 ext4_ext_map_blocks+0x520/0xdb0 ext4_map_blocks+0x2b0/0x690 ext4_iomap_begin+0x20e/0x2c0 [...] ================================================================== Therefore, *orig_path is updated when the extent lookup succeeds, so that the caller can safely use path or *ppath.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: ext4: fix off by one issue in alloc_flex_gd() Wesley reported an issue: ================================================================== EXT4-fs (dm-5): resizing filesystem from 7168 to 786432 blocks ------------[ cut here ]------------ kernel BUG at fs/ext4/resize.c:324! CPU: 9 UID: 0 PID: 3576 Comm: resize2fs Not tainted 6.11.0+ #27 RIP: 0010:ext4_resize_fs+0x1212/0x12d0 Call Trace: __ext4_ioctl+0x4e0/0x1800 ext4_ioctl+0x12/0x20 __x64_sys_ioctl+0x99/0xd0 x64_sys_call+0x1206/0x20d0 do_syscall_64+0x72/0x110 entry_SYSCALL_64_after_hwframe+0x76/0x7e ================================================================== While reviewing the patch, Honza found that when adjusting resize_bg in alloc_flex_gd(), it was possible for flex_gd->resize_bg to be bigger than flexbg_size. The reproduction of the problem requires the following: o_group = flexbg_size * 2 * n; o_size = (o_group + 1) * group_size; n_group: [o_group + flexbg_size, o_group + flexbg_size * 2) o_size = (n_group + 1) * group_size; Take n=0,flexbg_size=16 as an example: last:15 |o---------------|--------------n-| o_group:0 resize to n_group:30 The corresponding reproducer is: img=test.img rm -f $img truncate -s 600M $img mkfs.ext4 -F $img -b 1024 -G 16 8M dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 248M Delete the problematic plus 1 to fix the issue, and add a WARN_ON_ONCE() to prevent the issue from happening again. [ Note: another reproucer which this commit fixes is: img=test.img rm -f $img truncate -s 25MiB $img mkfs.ext4 -b 4096 -E nodiscard,lazy_itable_init=0,lazy_journal_init=0 $img truncate -s 3GiB $img dev=`losetup -f --show $img` mkdir -p /tmp/test mount $dev /tmp/test resize2fs $dev 3G umount $dev losetup -d $dev -- TYT ]
high 7.8
In the Linux kernel, the following vulnerability has been resolved: drm: omapdrm: Add missing check for alloc_ordered_workqueue As it may return NULL pointer and cause NULL pointer dereference. Add check for the return value of alloc_ordered_workqueue.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: resource: fix region_intersects() vs add_memory_driver_managed() On a system with CXL memory, the resource tree (/proc/iomem) related to CXL memory may look like something as follows. 490000000-50fffffff : CXL Window 0 490000000-50fffffff : region0 490000000-50fffffff : dax0.0 490000000-50fffffff : System RAM (kmem) Because drivers/dax/kmem.c calls add_memory_driver_managed() during onlining CXL memory, which makes "System RAM (kmem)" a descendant of "CXL Window X". This confuses region_intersects(), which expects all "System RAM" resources to be at the top level of iomem_resource. This can lead to bugs. For example, when the following command line is executed to write some memory in CXL memory range via /dev/mem, $ dd if=data of=/dev/mem bs=$((1 << 10)) seek=$((0x490000000 >> 10)) count=1 dd: error writing '/dev/mem': Bad address 1+0 records in 0+0 records out 0 bytes copied, 0.0283507 s, 0.0 kB/s the command fails as expected. However, the error code is wrong. It should be "Operation not permitted" instead of "Bad address". More seriously, the /dev/mem permission checking in devmem_is_allowed() passes incorrectly. Although the accessing is prevented later because ioremap() isn't allowed to map system RAM, it is a potential security issue. During command executing, the following warning is reported in the kernel log for calling ioremap() on system RAM. ioremap on RAM at 0x0000000490000000 - 0x0000000490000fff WARNING: CPU: 2 PID: 416 at arch/x86/mm/ioremap.c:216 __ioremap_caller.constprop.0+0x131/0x35d Call Trace: memremap+0xcb/0x184 xlate_dev_mem_ptr+0x25/0x2f write_mem+0x94/0xfb vfs_write+0x128/0x26d ksys_write+0xac/0xfe do_syscall_64+0x9a/0xfd entry_SYSCALL_64_after_hwframe+0x4b/0x53 The details of command execution process are as follows. In the above resource tree, "System RAM" is a descendant of "CXL Window 0" instead of a top level resource. So, region_intersects() will report no System RAM resources in the CXL memory region incorrectly, because it only checks the top level resources. Consequently, devmem_is_allowed() will return 1 (allow access via /dev/mem) for CXL memory region incorrectly. Fortunately, ioremap() doesn't allow to map System RAM and reject the access. So, region_intersects() needs to be fixed to work correctly with the resource tree with "System RAM" not at top level as above. To fix it, if we found a unmatched resource in the top level, we will continue to search matched resources in its descendant resources. So, we will not miss any matched resources in resource tree anymore. In the new implementation, an example resource tree |------------- "CXL Window 0" ------------| |-- "System RAM" --| will behave similar as the following fake resource tree for region_intersects(, IORESOURCE_SYSTEM_RAM, ), |-- "System RAM" --||-- "CXL Window 0a" --| Where "CXL Window 0a" is part of the original "CXL Window 0" that isn't covered by "System RAM".
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: ocfs2: fix possible null-ptr-deref in ocfs2_set_buffer_uptodate When doing cleanup, if flags without OCFS2_BH_READAHEAD, it may trigger NULL pointer dereference in the following ocfs2_set_buffer_uptodate() if bh is NULL.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: drm/xe: fix UAF around queue destruction We currently do stuff like queuing the final destruction step on a random system wq, which will outlive the driver instance. With bad timing we can teardown the driver with one or more work workqueue still being alive leading to various UAF splats. Add a fini step to ensure user queues are properly torn down. At this point GuC should already be nuked so queue itself should no longer be referenced from hw pov. v2 (Matt B) - Looks much safer to use a waitqueue and then just wait for the xa_array to become empty before triggering the drain. (cherry picked from commit 861108666cc0e999cffeab6aff17b662e68774e3)
high 7.8
In the Linux kernel, the following vulnerability has been resolved: nfsd: map the EBADMSG to nfserr_io to avoid warning Ext4 will throw -EBADMSG through ext4_readdir when a checksum error occurs, resulting in the following WARNING. Fix it by mapping EBADMSG to nfserr_io. nfsd_buffered_readdir iterate_dir // -EBADMSG -74 ext4_readdir // .iterate_shared ext4_dx_readdir ext4_htree_fill_tree htree_dirblock_to_tree ext4_read_dirblock __ext4_read_dirblock ext4_dirblock_csum_verify warn_no_space_for_csum __warn_no_space_for_csum return ERR_PTR(-EFSBADCRC) // -EBADMSG -74 nfserrno // WARNING [ 161.115610] ------------[ cut here ]------------ [ 161.116465] nfsd: non-standard errno: -74 [ 161.117315] WARNING: CPU: 1 PID: 780 at fs/nfsd/nfsproc.c:878 nfserrno+0x9d/0xd0 [ 161.118596] Modules linked in: [ 161.119243] CPU: 1 PID: 780 Comm: nfsd Not tainted 5.10.0-00014-g79679361fd5d #138 [ 161.120684] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qe mu.org 04/01/2014 [ 161.123601] RIP: 0010:nfserrno+0x9d/0xd0 [ 161.124676] Code: 0f 87 da 30 dd 00 83 e3 01 b8 00 00 00 05 75 d7 44 89 ee 48 c7 c7 c0 57 24 98 89 44 24 04 c6 05 ce 2b 61 03 01 e8 99 20 d8 00 <0f> 0b 8b 44 24 04 eb b5 4c 89 e6 48 c7 c7 a0 6d a4 99 e8 cc 15 33 [ 161.127797] RSP: 0018:ffffc90000e2f9c0 EFLAGS: 00010286 [ 161.128794] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 161.130089] RDX: 1ffff1103ee16f6d RSI: 0000000000000008 RDI: fffff520001c5f2a [ 161.131379] RBP: 0000000000000022 R08: 0000000000000001 R09: ffff8881f70c1827 [ 161.132664] R10: ffffed103ee18304 R11: 0000000000000001 R12: 0000000000000021 [ 161.133949] R13: 00000000ffffffb6 R14: ffff8881317c0000 R15: ffffc90000e2fbd8 [ 161.135244] FS: 0000000000000000(0000) GS:ffff8881f7080000(0000) knlGS:0000000000000000 [ 161.136695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 161.137761] CR2: 00007fcaad70b348 CR3: 0000000144256006 CR4: 0000000000770ee0 [ 161.139041] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 161.140291] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 161.141519] PKRU: 55555554 [ 161.142076] Call Trace: [ 161.142575] ? __warn+0x9b/0x140 [ 161.143229] ? nfserrno+0x9d/0xd0 [ 161.143872] ? report_bug+0x125/0x150 [ 161.144595] ? handle_bug+0x41/0x90 [ 161.145284] ? exc_invalid_op+0x14/0x70 [ 161.146009] ? asm_exc_invalid_op+0x12/0x20 [ 161.146816] ? nfserrno+0x9d/0xd0 [ 161.147487] nfsd_buffered_readdir+0x28b/0x2b0 [ 161.148333] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.149258] ? nfsd_buffered_filldir+0xf0/0xf0 [ 161.150093] ? wait_for_concurrent_writes+0x170/0x170 [ 161.151004] ? generic_file_llseek_size+0x48/0x160 [ 161.151895] nfsd_readdir+0x132/0x190 [ 161.152606] ? nfsd4_encode_dirent_fattr+0x380/0x380 [ 161.153516] ? nfsd_unlink+0x380/0x380 [ 161.154256] ? override_creds+0x45/0x60 [ 161.155006] nfsd4_encode_readdir+0x21a/0x3d0 [ 161.155850] ? nfsd4_encode_readlink+0x210/0x210 [ 161.156731] ? write_bytes_to_xdr_buf+0x97/0xe0 [ 161.157598] ? __write_bytes_to_xdr_buf+0xd0/0xd0 [ 161.158494] ? lock_downgrade+0x90/0x90 [ 161.159232] ? nfs4svc_decode_voidarg+0x10/0x10 [ 161.160092] nfsd4_encode_operation+0x15a/0x440 [ 161.160959] nfsd4_proc_compound+0x718/0xe90 [ 161.161818] nfsd_dispatch+0x18e/0x2c0 [ 161.162586] svc_process_common+0x786/0xc50 [ 161.163403] ? nfsd_svc+0x380/0x380 [ 161.164137] ? svc_printk+0x160/0x160 [ 161.164846] ? svc_xprt_do_enqueue.part.0+0x365/0x380 [ 161.165808] ? nfsd_svc+0x380/0x380 [ 161.166523] ? rcu_is_watching+0x23/0x40 [ 161.167309] svc_process+0x1a5/0x200 [ 161.168019] nfsd+0x1f5/0x380 [ 161.168663] ? nfsd_shutdown_threads+0x260/0x260 [ 161.169554] kthread+0x1c4/0x210 [ 161.170224] ? kthread_insert_work_sanity_check+0x80/0x80 [ 161.171246] ret_from_fork+0x1f/0x30
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: i3c: master: svc: Fix use after free vulnerability in svc_i3c_master Driver Due to Race Condition In the svc_i3c_master_probe function, &master->hj_work is bound with svc_i3c_master_hj_work, &master->ibi_work is bound with svc_i3c_master_ibi_work. And svc_i3c_master_ibi_work can start the hj_work, svc_i3c_master_irq_handler can start the ibi_work. If we remove the module which will call svc_i3c_master_remove to make cleanup, it will free master->base through i3c_master_unregister while the work mentioned above will be used. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | svc_i3c_master_hj_work svc_i3c_master_remove | i3c_master_unregister(&master->base)| device_unregister(&master->dev) | device_release | //free master->base | | i3c_master_do_daa(&master->base) | //use master->base Fix it by ensuring that the work is canceled before proceeding with the cleanup in svc_i3c_master_remove.
high 7.0
In the Linux kernel, the following vulnerability has been resolved: mm/filemap: fix filemap_get_folios_contig THP panic Patch series "memfd-pin huge page fixes". Fix multiple bugs that occur when using memfd_pin_folios with hugetlb pages and THP. The hugetlb bugs only bite when the page is not yet faulted in when memfd_pin_folios is called. The THP bug bites when the starting offset passed to memfd_pin_folios is not huge page aligned. See the commit messages for details. This patch (of 5): memfd_pin_folios on memory backed by THP panics if the requested start offset is not huge page aligned: BUG: kernel NULL pointer dereference, address: 0000000000000036 RIP: 0010:filemap_get_folios_contig+0xdf/0x290 RSP: 0018:ffffc9002092fbe8 EFLAGS: 00010202 RAX: 0000000000000002 RBX: 0000000000000002 RCX: 0000000000000002 The fault occurs here, because xas_load returns a folio with value 2: filemap_get_folios_contig() for (folio = xas_load(&xas); folio && xas.xa_index <= end; folio = xas_next(&xas)) { ... if (!folio_try_get(folio)) <-- BOOM "2" is an xarray sibling entry. We get it because memfd_pin_folios does not round the indices passed to filemap_get_folios_contig to huge page boundaries for THP, so we load from the middle of a huge page range see a sibling. (It does round for hugetlbfs, at the is_file_hugepages test). To fix, if the folio is a sibling, then return the next index as the starting point for the next call to filemap_get_folios_contig.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: mm/gup: fix memfd_pin_folios alloc race panic If memfd_pin_folios tries to create a hugetlb page, but someone else already did, then folio gets the value -EEXIST here: folio = memfd_alloc_folio(memfd, start_idx); if (IS_ERR(folio)) { ret = PTR_ERR(folio); if (ret != -EEXIST) goto err; then on the next trip through the "while start_idx" loop we panic here: if (folio) { folio_put(folio); To fix, set the folio to NULL on error.
medium 4.7
In the Linux kernel, the following vulnerability has been resolved: Input: adp5589-keys - fix NULL pointer dereference We register a devm action to call adp5589_clear_config() and then pass the i2c client as argument so that we can call i2c_get_clientdata() in order to get our device object. However, i2c_set_clientdata() is only being set at the end of the probe function which means that we'll get a NULL pointer dereference in case the probe function fails early.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: cachefiles: fix dentry leak in cachefiles_open_file() A dentry leak may be caused when a lookup cookie and a cull are concurrent: P1 | P2 ----------------------------------------------------------- cachefiles_lookup_cookie cachefiles_look_up_object lookup_one_positive_unlocked // get dentry cachefiles_cull inode->i_flags |= S_KERNEL_FILE; cachefiles_open_file cachefiles_mark_inode_in_use __cachefiles_mark_inode_in_use can_use = false if (!(inode->i_flags & S_KERNEL_FILE)) can_use = true return false return false // Returns an error but doesn't put dentry After that the following WARNING will be triggered when the backend folder is umounted: ================================================================== BUG: Dentry 000000008ad87947{i=7a,n=Dx_1_1.img} still in use (1) [unmount of ext4 sda] WARNING: CPU: 4 PID: 359261 at fs/dcache.c:1767 umount_check+0x5d/0x70 CPU: 4 PID: 359261 Comm: umount Not tainted 6.6.0-dirty #25 RIP: 0010:umount_check+0x5d/0x70 Call Trace: <TASK> d_walk+0xda/0x2b0 do_one_tree+0x20/0x40 shrink_dcache_for_umount+0x2c/0x90 generic_shutdown_super+0x20/0x160 kill_block_super+0x1a/0x40 ext4_kill_sb+0x22/0x40 deactivate_locked_super+0x35/0x80 cleanup_mnt+0x104/0x160 ================================================================== Whether cachefiles_open_file() returns true or false, the reference count obtained by lookup_positive_unlocked() in cachefiles_look_up_object() should be released. Therefore release that reference count in cachefiles_look_up_object() to fix the above issue and simplify the code.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: btrfs: send: fix buffer overflow detection when copying path to cache entry Starting with commit c0247d289e73 ("btrfs: send: annotate struct name_cache_entry with __counted_by()") we annotated the variable length array "name" from the name_cache_entry structure with __counted_by() to improve overflow detection. However that alone was not correct, because the length of that array does not match the "name_len" field - it matches that plus 1 to include the NUL string terminator, so that makes a fortified kernel think there's an overflow and report a splat like this: strcpy: detected buffer overflow: 20 byte write of buffer size 19 WARNING: CPU: 3 PID: 3310 at __fortify_report+0x45/0x50 CPU: 3 UID: 0 PID: 3310 Comm: btrfs Not tainted 6.11.0-prnet #1 Hardware name: CompuLab Ltd. sbc-ihsw/Intense-PC2 (IPC2), BIOS IPC2_3.330.7 X64 03/15/2018 RIP: 0010:__fortify_report+0x45/0x50 Code: 48 8b 34 (...) RSP: 0018:ffff97ebc0d6f650 EFLAGS: 00010246 RAX: 7749924ef60fa600 RBX: ffff8bf5446a521a RCX: 0000000000000027 RDX: 00000000ffffdfff RSI: ffff97ebc0d6f548 RDI: ffff8bf84e7a1cc8 RBP: ffff8bf548574080 R08: ffffffffa8c40e10 R09: 0000000000005ffd R10: 0000000000000004 R11: ffffffffa8c70e10 R12: ffff8bf551eef400 R13: 0000000000000000 R14: 0000000000000013 R15: 00000000000003a8 FS: 00007fae144de8c0(0000) GS:ffff8bf84e780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fae14691690 CR3: 00000001027a2003 CR4: 00000000001706f0 Call Trace: <TASK> ? __warn+0x12a/0x1d0 ? __fortify_report+0x45/0x50 ? report_bug+0x154/0x1c0 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x1a/0x50 ? asm_exc_invalid_op+0x1a/0x20 ? __fortify_report+0x45/0x50 __fortify_panic+0x9/0x10 __get_cur_name_and_parent+0x3bc/0x3c0 get_cur_path+0x207/0x3b0 send_extent_data+0x709/0x10d0 ? find_parent_nodes+0x22df/0x25d0 ? mas_nomem+0x13/0x90 ? mtree_insert_range+0xa5/0x110 ? btrfs_lru_cache_store+0x5f/0x1e0 ? iterate_extent_inodes+0x52d/0x5a0 process_extent+0xa96/0x11a0 ? __pfx_lookup_backref_cache+0x10/0x10 ? __pfx_store_backref_cache+0x10/0x10 ? __pfx_iterate_backrefs+0x10/0x10 ? __pfx_check_extent_item+0x10/0x10 changed_cb+0x6fa/0x930 ? tree_advance+0x362/0x390 ? memcmp_extent_buffer+0xd7/0x160 send_subvol+0xf0a/0x1520 btrfs_ioctl_send+0x106b/0x11d0 ? __pfx___clone_root_cmp_sort+0x10/0x10 _btrfs_ioctl_send+0x1ac/0x240 btrfs_ioctl+0x75b/0x850 __se_sys_ioctl+0xca/0x150 do_syscall_64+0x85/0x160 ? __count_memcg_events+0x69/0x100 ? handle_mm_fault+0x1327/0x15c0 ? __se_sys_rt_sigprocmask+0xf1/0x180 ? syscall_exit_to_user_mode+0x75/0xa0 ? do_syscall_64+0x91/0x160 ? do_user_addr_fault+0x21d/0x630 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7fae145eeb4f Code: 00 48 89 (...) RSP: 002b:00007ffdf1cb09b0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fae145eeb4f RDX: 00007ffdf1cb0ad0 RSI: 0000000040489426 RDI: 0000000000000004 RBP: 00000000000078fe R08: 00007fae144006c0 R09: 00007ffdf1cb0927 R10: 0000000000000008 R11: 0000000000000246 R12: 00007ffdf1cb1ce8 R13: 0000000000000003 R14: 000055c499fab2e0 R15: 0000000000000004 </TASK> Fix this by not storing the NUL string terminator since we don't actually need it for name cache entries, this way "name_len" corresponds to the actual size of the "name" array. This requires marking the "name" array field with __nonstring and using memcpy() instead of strcpy() as recommended by the guidelines at: https://github.com/KSPP/linux/issues/90
high 7.8
In the Linux kernel, the following vulnerability has been resolved: btrfs: fix a NULL pointer dereference when failed to start a new trasacntion [BUG] Syzbot reported a NULL pointer dereference with the following crash: FAULT_INJECTION: forcing a failure. start_transaction+0x830/0x1670 fs/btrfs/transaction.c:676 prepare_to_relocate+0x31f/0x4c0 fs/btrfs/relocation.c:3642 relocate_block_group+0x169/0xd20 fs/btrfs/relocation.c:3678 ... BTRFS info (device loop0): balance: ended with status: -12 Oops: general protection fault, probably for non-canonical address 0xdffffc00000000cc: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000660-0x0000000000000667] RIP: 0010:btrfs_update_reloc_root+0x362/0xa80 fs/btrfs/relocation.c:926 Call Trace: <TASK> commit_fs_roots+0x2ee/0x720 fs/btrfs/transaction.c:1496 btrfs_commit_transaction+0xfaf/0x3740 fs/btrfs/transaction.c:2430 del_balance_item fs/btrfs/volumes.c:3678 [inline] reset_balance_state+0x25e/0x3c0 fs/btrfs/volumes.c:3742 btrfs_balance+0xead/0x10c0 fs/btrfs/volumes.c:4574 btrfs_ioctl_balance+0x493/0x7c0 fs/btrfs/ioctl.c:3673 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:907 [inline] __se_sys_ioctl+0xf9/0x170 fs/ioctl.c:893 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f [CAUSE] The allocation failure happens at the start_transaction() inside prepare_to_relocate(), and during the error handling we call unset_reloc_control(), which makes fs_info->balance_ctl to be NULL. Then we continue the error path cleanup in btrfs_balance() by calling reset_balance_state() which will call del_balance_item() to fully delete the balance item in the root tree. However during the small window between set_reloc_contrl() and unset_reloc_control(), we can have a subvolume tree update and created a reloc_root for that subvolume. Then we go into the final btrfs_commit_transaction() of del_balance_item(), and into btrfs_update_reloc_root() inside commit_fs_roots(). That function checks if fs_info->reloc_ctl is in the merge_reloc_tree stage, but since fs_info->reloc_ctl is NULL, it results a NULL pointer dereference. [FIX] Just add extra check on fs_info->reloc_ctl inside btrfs_update_reloc_root(), before checking fs_info->reloc_ctl->merge_reloc_tree. That DEAD_RELOC_TREE handling is to prevent further modification to the reloc tree during merge stage, but since there is no reloc_ctl at all, we do not need to bother that.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: btrfs: wait for fixup workers before stopping cleaner kthread during umount During unmount, at close_ctree(), we have the following steps in this order: 1) Park the cleaner kthread - this doesn't destroy the kthread, it basically halts its execution (wake ups against it work but do nothing); 2) We stop the cleaner kthread - this results in freeing the respective struct task_struct; 3) We call btrfs_stop_all_workers() which waits for any jobs running in all the work queues and then free the work queues. Syzbot reported a case where a fixup worker resulted in a crash when doing a delayed iput on its inode while attempting to wake up the cleaner at btrfs_add_delayed_iput(), because the task_struct of the cleaner kthread was already freed. This can happen during unmount because we don't wait for any fixup workers still running before we call kthread_stop() against the cleaner kthread, which stops and free all its resources. Fix this by waiting for any fixup workers at close_ctree() before we call kthread_stop() against the cleaner and run pending delayed iputs. The stack traces reported by syzbot were the following: BUG: KASAN: slab-use-after-free in __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 Read of size 8 at addr ffff8880272a8a18 by task kworker/u8:3/52 CPU: 1 UID: 0 PID: 52 Comm: kworker/u8:3 Not tainted 6.12.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: btrfs-fixup btrfs_work_helper Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 __lock_acquire+0x77/0x2050 kernel/locking/lockdep.c:5065 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5825 __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0xd5/0x120 kernel/locking/spinlock.c:162 class_raw_spinlock_irqsave_constructor include/linux/spinlock.h:551 [inline] try_to_wake_up+0xb0/0x1480 kernel/sched/core.c:4154 btrfs_writepage_fixup_worker+0xc16/0xdf0 fs/btrfs/inode.c:2842 btrfs_work_helper+0x390/0xc50 fs/btrfs/async-thread.c:314 process_one_work kernel/workqueue.c:3229 [inline] process_scheduled_works+0xa63/0x1850 kernel/workqueue.c:3310 worker_thread+0x870/0xd30 kernel/workqueue.c:3391 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Allocated by task 2: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 unpoison_slab_object mm/kasan/common.c:319 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:345 kasan_slab_alloc include/linux/kasan.h:247 [inline] slab_post_alloc_hook mm/slub.c:4086 [inline] slab_alloc_node mm/slub.c:4135 [inline] kmem_cache_alloc_node_noprof+0x16b/0x320 mm/slub.c:4187 alloc_task_struct_node kernel/fork.c:180 [inline] dup_task_struct+0x57/0x8c0 kernel/fork.c:1107 copy_process+0x5d1/0x3d50 kernel/fork.c:2206 kernel_clone+0x223/0x880 kernel/fork.c:2787 kernel_thread+0x1bc/0x240 kernel/fork.c:2849 create_kthread kernel/kthread.c:412 [inline] kthreadd+0x60d/0x810 kernel/kthread.c:765 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Freed by task 61: kasan_save_stack mm/kasan/common.c:47 [inline] kasan_save_track+0x3f/0x80 mm/kasan/common.c:68 kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579 poison_slab_object mm/kasan/common.c:247 [inline] __kasan_slab_free+0x59/0x70 mm/kasan/common.c:264 kasan_slab_free include/linux/kasan.h:230 [inline] slab_free_h ---truncated---
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: tracing/timerlat: Fix a race during cpuhp processing There is another found exception that the "timerlat/1" thread was scheduled on CPU0, and lead to timer corruption finally: ``` ODEBUG: init active (active state 0) object: ffff888237c2e108 object type: hrtimer hint: timerlat_irq+0x0/0x220 WARNING: CPU: 0 PID: 426 at lib/debugobjects.c:518 debug_print_object+0x7d/0xb0 Modules linked in: CPU: 0 UID: 0 PID: 426 Comm: timerlat/1 Not tainted 6.11.0-rc7+ #45 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 RIP: 0010:debug_print_object+0x7d/0xb0 ... Call Trace: <TASK> ? __warn+0x7c/0x110 ? debug_print_object+0x7d/0xb0 ? report_bug+0xf1/0x1d0 ? prb_read_valid+0x17/0x20 ? handle_bug+0x3f/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? debug_print_object+0x7d/0xb0 ? debug_print_object+0x7d/0xb0 ? __pfx_timerlat_irq+0x10/0x10 __debug_object_init+0x110/0x150 hrtimer_init+0x1d/0x60 timerlat_main+0xab/0x2d0 ? __pfx_timerlat_main+0x10/0x10 kthread+0xb7/0xe0 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> ``` After tracing the scheduling event, it was discovered that the migration of the "timerlat/1" thread was performed during thread creation. Further analysis confirmed that it is because the CPU online processing for osnoise is implemented through workers, which is asynchronous with the offline processing. When the worker was scheduled to create a thread, the CPU may has already been removed from the cpu_online_mask during the offline process, resulting in the inability to select the right CPU: T1 | T2 [CPUHP_ONLINE] | cpu_device_down() osnoise_hotplug_workfn() | | cpus_write_lock() | takedown_cpu(1) | cpus_write_unlock() [CPUHP_OFFLINE] | cpus_read_lock() | start_kthread(1) | cpus_read_unlock() | To fix this, skip online processing if the CPU is already offline.
medium 4.7
In the Linux kernel, the following vulnerability has been resolved: drm/xe/vm: move xa_alloc to prevent UAF Evil user can guess the next id of the vm before the ioctl completes and then call vm destroy ioctl to trigger UAF since create ioctl is still referencing the same vm. Move the xa_alloc all the way to the end to prevent this. v2: - Rebase (cherry picked from commit dcfd3971327f3ee92765154baebbaece833d3ca9)
high 7.8
In the Linux kernel, the following vulnerability has been resolved: rxrpc: Fix a race between socket set up and I/O thread creation In rxrpc_open_socket(), it sets up the socket and then sets up the I/O thread that will handle it. This is a problem, however, as there's a gap between the two phases in which a packet may come into rxrpc_encap_rcv() from the UDP packet but we oops when trying to wake the not-yet created I/O thread. As a quick fix, just make rxrpc_encap_rcv() discard the packet if there's no I/O thread yet. A better, but more intrusive fix would perhaps be to rearrange things such that the socket creation is done by the I/O thread.
medium 4.7
In the Linux kernel, the following vulnerability has been resolved: vhost/scsi: null-ptr-dereference in vhost_scsi_get_req() Since commit 3f8ca2e115e5 ("vhost/scsi: Extract common handling code from control queue handler") a null pointer dereference bug can be triggered when guest sends an SCSI AN request. In vhost_scsi_ctl_handle_vq(), `vc.target` is assigned with `&v_req.tmf.lun[1]` within a switch-case block and is then passed to vhost_scsi_get_req() which extracts `vc->req` and `tpg`. However, for a `VIRTIO_SCSI_T_AN_*` request, tpg is not required, so `vc.target` is set to NULL in this branch. Later, in vhost_scsi_get_req(), `vc->target` is dereferenced without being checked, leading to a null pointer dereference bug. This bug can be triggered from guest. When this bug occurs, the vhost_worker process is killed while holding `vq->mutex` and the corresponding tpg will remain occupied indefinitely. Below is the KASAN report: Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 1 PID: 840 Comm: poc Not tainted 6.10.0+ #1 Hardware name: QEMU Ubuntu 24.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:vhost_scsi_get_req+0x165/0x3a0 Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 2b 02 00 00 48 b8 00 00 00 00 00 fc ff df 4d 8b 65 30 4c 89 e2 48 c1 ea 03 <0f> b6 04 02 4c 89 e2 83 e2 07 38 d0 7f 08 84 c0 0f 85 be 01 00 00 RSP: 0018:ffff888017affb50 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88801b000000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff888017affcb8 RBP: ffff888017affb80 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff888017affc88 R14: ffff888017affd1c R15: ffff888017993000 FS: 000055556e076500(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200027c0 CR3: 0000000010ed0004 CR4: 0000000000370ef0 Call Trace: <TASK> ? show_regs+0x86/0xa0 ? die_addr+0x4b/0xd0 ? exc_general_protection+0x163/0x260 ? asm_exc_general_protection+0x27/0x30 ? vhost_scsi_get_req+0x165/0x3a0 vhost_scsi_ctl_handle_vq+0x2a4/0xca0 ? __pfx_vhost_scsi_ctl_handle_vq+0x10/0x10 ? __switch_to+0x721/0xeb0 ? __schedule+0xda5/0x5710 ? __kasan_check_write+0x14/0x30 ? _raw_spin_lock+0x82/0xf0 vhost_scsi_ctl_handle_kick+0x52/0x90 vhost_run_work_list+0x134/0x1b0 vhost_task_fn+0x121/0x350 ... </TASK> ---[ end trace 0000000000000000 ]--- Let's add a check in vhost_scsi_get_req. [whitespace fixes]
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: powercap: intel_rapl: Fix off by one in get_rpi() The rp->priv->rpi array is either rpi_msr or rpi_tpmi which have NR_RAPL_PRIMITIVES number of elements. Thus the > needs to be >= to prevent an off by one access.
high 7.1
In the Linux kernel, the following vulnerability has been resolved: bpf: Fix helper writes to read-only maps Lonial found an issue that despite user- and BPF-side frozen BPF map (like in case of .rodata), it was still possible to write into it from a BPF program side through specific helpers having ARG_PTR_TO_{LONG,INT} as arguments. In check_func_arg() when the argument is as mentioned, the meta->raw_mode is never set. Later, check_helper_mem_access(), under the case of PTR_TO_MAP_VALUE as register base type, it assumes BPF_READ for the subsequent call to check_map_access_type() and given the BPF map is read-only it succeeds. The helpers really need to be annotated as ARG_PTR_TO_{LONG,INT} | MEM_UNINIT when results are written into them as opposed to read out of them. The latter indicates that it's okay to pass a pointer to uninitialized memory as the memory is written to anyway. However, ARG_PTR_TO_{LONG,INT} is a special case of ARG_PTR_TO_FIXED_SIZE_MEM just with additional alignment requirement. So it is better to just get rid of the ARG_PTR_TO_{LONG,INT} special cases altogether and reuse the fixed size memory types. For this, add MEM_ALIGNED to additionally ensure alignment given these helpers write directly into the args via *<ptr> = val. The .arg*_size has been initialized reflecting the actual sizeof(*<ptr>). MEM_ALIGNED can only be used in combination with MEM_FIXED_SIZE annotated argument types, since in !MEM_FIXED_SIZE cases the verifier does not know the buffer size a priori and therefore cannot blindly write *<ptr> = val.
high 7.1
In the Linux kernel, the following vulnerability has been resolved: ACPI: sysfs: validate return type of _STR method Only buffer objects are valid return values of _STR. If something else is returned description_show() will access invalid memory.
high 7.1
In the Linux kernel, the following vulnerability has been resolved: f2fs: fix to check atomic_file in f2fs ioctl interfaces Some f2fs ioctl interfaces like f2fs_ioc_set_pin_file(), f2fs_move_file_range(), and f2fs_defragment_range() missed to check atomic_write status, which may cause potential race issue, fix it.
medium 4.7
In the Linux kernel, the following vulnerability has been resolved: efistub/tpm: Use ACPI reclaim memory for event log to avoid corruption The TPM event log table is a Linux specific construct, where the data produced by the GetEventLog() boot service is cached in memory, and passed on to the OS using an EFI configuration table. The use of EFI_LOADER_DATA here results in the region being left unreserved in the E820 memory map constructed by the EFI stub, and this is the memory description that is passed on to the incoming kernel by kexec, which is therefore unaware that the region should be reserved. Even though the utility of the TPM2 event log after a kexec is questionable, any corruption might send the parsing code off into the weeds and crash the kernel. So let's use EFI_ACPI_RECLAIM_MEMORY instead, which is always treated as reserved by the E820 conversion logic.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: wifi: iwlwifi: mvm: set the cipher for secured NDP ranging The cipher pointer is not set, but is derefereced trying to set its content, which leads to a NULL pointer dereference. Fix it by pointing to the cipher parameter before dereferencing.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: x86/sgx: Fix deadlock in SGX NUMA node search When the current node doesn't have an EPC section configured by firmware and all other EPC sections are used up, CPU can get stuck inside the while loop that looks for an available EPC page from remote nodes indefinitely, leading to a soft lockup. Note how nid_of_current will never be equal to nid in that while loop because nid_of_current is not set in sgx_numa_mask. Also worth mentioning is that it's perfectly fine for the firmware not to setup an EPC section on a node. While setting up an EPC section on each node can enhance performance, it is not a requirement for functionality. Rework the loop to start and end on *a* node that has SGX memory. This avoids the deadlock looking for the current SGX-lacking node to show up in the loop when it never will.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: nbd: fix race between timeout and normal completion If request timetout is handled by nbd_requeue_cmd(), normal completion has to be stopped for avoiding to complete this requeued request, other use-after-free can be triggered. Fix the race by clearing NBD_CMD_INFLIGHT in nbd_requeue_cmd(), meantime make sure that cmd->lock is grabbed for clearing the flag and the requeue.
high 7.0
In the Linux kernel, the following vulnerability has been resolved: block, bfq: fix uaf for accessing waker_bfqq after splitting After commit 42c306ed7233 ("block, bfq: don't break merge chain in bfq_split_bfqq()"), if the current procress is the last holder of bfqq, the bfqq can be freed after bfq_split_bfqq(). Hence recored the bfqq and then access bfqq->waker_bfqq may trigger UAF. What's more, the waker_bfqq may in the merge chain of bfqq, hence just recored waker_bfqq is still not safe. Fix the problem by adding a helper bfq_waker_bfqq() to check if bfqq->waker_bfqq is in the merge chain, and current procress is the only holder.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: firmware: arm_scmi: Fix double free in OPTEE transport Channels can be shared between protocols, avoid freeing the same channel descriptors twice when unloading the stack.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: scsi: elx: libefc: Fix potential use after free in efc_nport_vport_del() The kref_put() function will call nport->release if the refcount drops to zero. The nport->release release function is _efc_nport_free() which frees "nport". But then we dereference "nport" on the next line which is a use after free. Re-order these lines to avoid the use after free.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: tpm: Clean up TPM space after command failure tpm_dev_transmit prepares the TPM space before attempting command transmission. However if the command fails no rollback of this preparation is done. This can result in transient handles being leaked if the device is subsequently closed with no further commands performed. Fix this by flushing the space in the event of command transmission failure.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: bpf: correctly handle malformed BPF_CORE_TYPE_ID_LOCAL relos In case of malformed relocation record of kind BPF_CORE_TYPE_ID_LOCAL referencing a non-existing BTF type, function bpf_core_calc_relo_insn would cause a null pointer deference. Fix this by adding a proper check upper in call stack, as malformed relocation records could be passed from user space. Simplest reproducer is a program: r0 = 0 exit With a single relocation record: .insn_off = 0, /* patch first instruction */ .type_id = 100500, /* this type id does not exist */ .access_str_off = 6, /* offset of string "0" */ .kind = BPF_CORE_TYPE_ID_LOCAL, See the link for original reproducer or next commit for a test case.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: nilfs2: fix potential oob read in nilfs_btree_check_delete() The function nilfs_btree_check_delete(), which checks whether degeneration to direct mapping occurs before deleting a b-tree entry, causes memory access outside the block buffer when retrieving the maximum key if the root node has no entries. This does not usually happen because b-tree mappings with 0 child nodes are never created by mkfs.nilfs2 or nilfs2 itself. However, it can happen if the b-tree root node read from a device is configured that way, so fix this potential issue by adding a check for that case.
high 7.1
In the Linux kernel, the following vulnerability has been resolved: PCI: keystone: Fix if-statement expression in ks_pcie_quirk() This code accidentally uses && where || was intended. It potentially results in a NULL dereference. Thus, fix the if-statement expression to use the correct condition. [kwilczynski: commit log]
medium 5.5
Rejected reason: This CVE ID has been rejected or withdrawn by its CVE Numbering Authority.
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix H264 multi stateless decoder smatch warning Fix a smatch static checker warning on vdec_h264_req_multi_if.c. Which leads to a kernel crash when fb is NULL.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix VP8 stateless decoder smatch warning Fix a smatch static checker warning on vdec_vp8_req_if.c. Which leads to a kernel crash when fb is NULL.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: media: mediatek: vcodec: Fix H264 stateless decoder smatch warning Fix a smatch static checker warning on vdec_h264_req_if.c. Which leads to a kernel crash when fb is NULL.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: PCI: kirin: Fix buffer overflow in kirin_pcie_parse_port() Within kirin_pcie_parse_port(), the pcie->num_slots is compared to pcie->gpio_id_reset size (MAX_PCI_SLOTS) which is correct and would lead to an overflow. Thus, fix condition to pcie->num_slots + 1 >= MAX_PCI_SLOTS and move pcie->num_slots increment below the if-statement to avoid out-of-bounds array access. Found by Linux Verification Center (linuxtesting.org) with SVACE. [kwilczynski: commit log]
high 7.8
In the Linux kernel, the following vulnerability has been resolved: RDMA/hns: Fix Use-After-Free of rsv_qp on HIP08 Currently rsv_qp is freed before ib_unregister_device() is called on HIP08. During the time interval, users can still dereg MR and rsv_qp will be used in this process, leading to a UAF. Move the release of rsv_qp after calling ib_unregister_device() to fix it.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: fuse: use exclusive lock when FUSE_I_CACHE_IO_MODE is set This may be a typo. The comment has said shared locks are not allowed when this bit is set. If using shared lock, the wait in `fuse_file_cached_io_open` may be forever.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: RDMA/cxgb4: Added NULL check for lookup_atid The lookup_atid() function can return NULL if the ATID is invalid or does not exist in the identifier table, which could lead to dereferencing a null pointer without a check in the `act_establish()` and `act_open_rpl()` functions. Add a NULL check to prevent null pointer dereferencing. Found by Linux Verification Center (linuxtesting.org) with SVACE.
medium 5.5
In the Linux kernel, the following vulnerability has been resolved: vhost_vdpa: assign irq bypass producer token correctly We used to call irq_bypass_unregister_producer() in vhost_vdpa_setup_vq_irq() which is problematic as we don't know if the token pointer is still valid or not. Actually, we use the eventfd_ctx as the token so the life cycle of the token should be bound to the VHOST_SET_VRING_CALL instead of vhost_vdpa_setup_vq_irq() which could be called by set_status(). Fixing this by setting up irq bypass producer's token when handling VHOST_SET_VRING_CALL and un-registering the producer before calling vhost_vring_ioctl() to prevent a possible use after free as eventfd could have been released in vhost_vring_ioctl(). And such registering and unregistering will only be done if DRIVER_OK is set.
high 7.8
In the Linux kernel, the following vulnerability has been resolved: net: seeq: Fix use after free vulnerability in ether3 Driver Due to Race Condition In the ether3_probe function, a timer is initialized with a callback function ether3_ledoff, bound to &prev(dev)->timer. Once the timer is started, there is a risk of a race condition if the module or device is removed, triggering the ether3_remove function to perform cleanup. The sequence of operations that may lead to a UAF bug is as follows: CPU0 CPU1 | ether3_ledoff ether3_remove | free_netdev(dev); | put_devic | kfree(dev); | | ether3_outw(priv(dev)->regs.config2 |= CFG2_CTRLO, REG_CONFIG2); | // use dev Fix it by ensuring that the timer is canceled before proceeding with the cleanup in ether3_remove.
high 7.0
In the Linux kernel, the following vulnerability has been resolved: mm: call the security_mmap_file() LSM hook in remap_file_pages() The remap_file_pages syscall handler calls do_mmap() directly, which doesn't contain the LSM security check. And if the process has called personality(READ_IMPLIES_EXEC) before and remap_file_pages() is called for RW pages, this will actually result in remapping the pages to RWX, bypassing a W^X policy enforced by SELinux. So we should check prot by security_mmap_file LSM hook in the remap_file_pages syscall handler before do_mmap() is called. Otherwise, it potentially permits an attacker to bypass a W^X policy enforced by SELinux. The bypass is similar to CVE-2016-10044, which bypass the same thing via AIO and can be found in [1]. The PoC: $ cat > test.c int main(void) { size_t pagesz = sysconf(_SC_PAGE_SIZE); int mfd = syscall(SYS_memfd_create, "test", 0); const char *buf = mmap(NULL, 4 * pagesz, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0); unsigned int old = syscall(SYS_personality, 0xffffffff); syscall(SYS_personality, READ_IMPLIES_EXEC | old); syscall(SYS_remap_file_pages, buf, pagesz, 0, 2, 0); syscall(SYS_personality, old); // show the RWX page exists even if W^X policy is enforced int fd = open("/proc/self/maps", O_RDONLY); unsigned char buf2[1024]; while (1) { int ret = read(fd, buf2, 1024); if (ret <= 0) break; write(1, buf2, ret); } close(fd); } $ gcc test.c -o test $ ./test | grep rwx 7f1836c34000-7f1836c35000 rwxs 00002000 00:01 2050 /memfd:test (deleted) [PM: subject line tweaks]
high 7.8
In the Linux kernel, the following vulnerability has been resolved: KVM: Use dedicated mutex to protect kvm_usage_count to avoid deadlock Use a dedicated mutex to guard kvm_usage_count to fix a potential deadlock on x86 due to a chain of locks and SRCU synchronizations. Translating the below lockdep splat, CPU1 #6 will wait on CPU0 #1, CPU0 #8 will wait on CPU2 #3, and CPU2 #7 will wait on CPU1 #4 (if there's a writer, due to the fairness of r/w semaphores). CPU0 CPU1 CPU2 1 lock(&kvm->slots_lock); 2 lock(&vcpu->mutex); 3 lock(&kvm->srcu); 4 lock(cpu_hotplug_lock); 5 lock(kvm_lock); 6 lock(&kvm->slots_lock); 7 lock(cpu_hotplug_lock); 8 sync(&kvm->srcu); Note, there are likely more potential deadlocks in KVM x86, e.g. the same pattern of taking cpu_hotplug_lock outside of kvm_lock likely exists with __kvmclock_cpufreq_notifier(): cpuhp_cpufreq_online() | -> cpufreq_online() | -> cpufreq_gov_performance_limits() | -> __cpufreq_driver_target() | -> __target_index() | -> cpufreq_freq_transition_begin() | -> cpufreq_notify_transition() | -> ... __kvmclock_cpufreq_notifier() But, actually triggering such deadlocks is beyond rare due to the combination of dependencies and timings involved. E.g. the cpufreq notifier is only used on older CPUs without a constant TSC, mucking with the NX hugepage mitigation while VMs are running is very uncommon, and doing so while also onlining/offlining a CPU (necessary to generate contention on cpu_hotplug_lock) would be even more unusual. The most robust solution to the general cpu_hotplug_lock issue is likely to switch vm_list to be an RCU-protected list, e.g. so that x86's cpufreq notifier doesn't to take kvm_lock. For now, settle for fixing the most blatant deadlock, as switching to an RCU-protected list is a much more involved change, but add a comment in locking.rst to call out that care needs to be taken when walking holding kvm_lock and walking vm_list. ====================================================== WARNING: possible circular locking dependency detected 6.10.0-smp--c257535a0c9d-pip #330 Tainted: G S O ------------------------------------------------------ tee/35048 is trying to acquire lock: ff6a80eced71e0a8 (&kvm->slots_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x179/0x1e0 [kvm] but task is already holding lock: ffffffffc07abb08 (kvm_lock){+.+.}-{3:3}, at: set_nx_huge_pages+0x14a/0x1e0 [kvm] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (kvm_lock){+.+.}-{3:3}: __mutex_lock+0x6a/0xb40 mutex_lock_nested+0x1f/0x30 kvm_dev_ioctl+0x4fb/0xe50 [kvm] __se_sys_ioctl+0x7b/0xd0 __x64_sys_ioctl+0x21/0x30 x64_sys_call+0x15d0/0x2e60 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #2 (cpu_hotplug_lock){++++}-{0:0}: cpus_read_lock+0x2e/0xb0 static_key_slow_inc+0x16/0x30 kvm_lapic_set_base+0x6a/0x1c0 [kvm] kvm_set_apic_base+0x8f/0xe0 [kvm] kvm_set_msr_common+0x9ae/0xf80 [kvm] vmx_set_msr+0xa54/0xbe0 [kvm_intel] __kvm_set_msr+0xb6/0x1a0 [kvm] kvm_arch_vcpu_ioctl+0xeca/0x10c0 [kvm] kvm_vcpu_ioctl+0x485/0x5b0 [kvm] __se_sys_ioctl+0x7b/0xd0 __x64_sys_ioctl+0x21/0x30 x64_sys_call+0x15d0/0x2e60 do_syscall_64+0x83/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e -> #1 (&kvm->srcu){.+.+}-{0:0}: __synchronize_srcu+0x44/0x1a0 ---truncated---
medium 5.5