CVE-2025-39791

Published Sep 11, 2025

Last updated 7 months ago

Overview

Description
In the Linux kernel, the following vulnerability has been resolved: dm: dm-crypt: Do not partially accept write BIOs with zoned targets Read and write operations issued to a dm-crypt target may be split according to the dm-crypt internal limits defined by the max_read_size and max_write_size module parameters (default is 128 KB). The intent is to improve processing time of large BIOs by splitting them into smaller operations that can be parallelized on different CPUs. For zoned dm-crypt targets, this BIO splitting is still done but without the parallel execution to ensure that the issuing order of write operations to the underlying devices remains sequential. However, the splitting itself causes other problems: 1) Since dm-crypt relies on the block layer zone write plugging to handle zone append emulation using regular write operations, the reminder of a split write BIO will always be plugged into the target zone write plugged. Once the on-going write BIO finishes, this reminder BIO is unplugged and issued from the zone write plug work. If this reminder BIO itself needs to be split, the reminder will be re-issued and plugged again, but that causes a call to a blk_queue_enter(), which may block if a queue freeze operation was initiated. This results in a deadlock as DM submission still holds BIOs that the queue freeze side is waiting for. 2) dm-crypt relies on the emulation done by the block layer using regular write operations for processing zone append operations. This still requires to properly return the written sector as the BIO sector of the original BIO. However, this can be done correctly only and only if there is a single clone BIO used for processing the original zone append operation issued by the user. If the size of a zone append operation is larger than dm-crypt max_write_size, then the orginal BIO will be split and processed as a chain of regular write operations. Such chaining result in an incorrect written sector being returned to the zone append issuer using the original BIO sector. This in turn results in file system data corruptions using xfs or btrfs. Fix this by modifying get_max_request_size() to always return the size of the BIO to avoid it being split with dm_accpet_partial_bio() in crypt_map(). get_max_request_size() is renamed to get_max_request_sectors() to clarify the unit of the value returned and its interface is changed to take a struct dm_target pointer and a pointer to the struct bio being processed. In addition to this change, to ensure that crypt_alloc_buffer() works correctly, set the dm-crypt device max_hw_sectors limit to be at most BIO_MAX_VECS << PAGE_SECTORS_SHIFT (1 MB with a 4KB page architecture). This forces DM core to split write BIOs before passing them to crypt_map(), and thus guaranteeing that dm-crypt can always accept an entire write BIO without needing to split it. This change does not have any effect on the read path of dm-crypt. Read operations can still be split and the BIO fragments processed in parallel. There is also no impact on the performance of the write path given that all zone write BIOs were already processed inline instead of in parallel. This change also does not affect in any way regular dm-crypt block devices.
Source
416baaa9-dc9f-4396-8d5f-8c081fb06d67
NVD status
Analyzed
Products
linux_kernel

Risk scores

CVSS 3.1

Type
Primary
Base score
5.5
Impact score
3.6
Exploitability score
1.8
Vector string
CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Severity
MEDIUM

Weaknesses

nvd@nist.gov
CWE-667

Social media

Hype score
Not currently trending

Configurations

  1. In the Linux kernel, the following vulnerability has been resolved: coresight: tmc-etr: Fix race condition between sysfs and perf mode When trying to run perf and sysfs mode simultaneously, the WARN_ON() in tmc_etr_enable_hw() is triggered sometimes: WARNING: CPU: 42 PID: 3911571 at drivers/hwtracing/coresight/coresight-tmc-etr.c:1060 tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] [..snip..] Call trace: tmc_etr_enable_hw+0xc0/0xd8 [coresight_tmc] (P) tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] (L) tmc_enable_etr_sink+0x11c/0x250 [coresight_tmc] coresight_enable_path+0x1c8/0x218 [coresight] coresight_enable_sysfs+0xa4/0x228 [coresight] enable_source_store+0x58/0xa8 [coresight] dev_attr_store+0x20/0x40 sysfs_kf_write+0x4c/0x68 kernfs_fop_write_iter+0x120/0x1b8 vfs_write+0x2c8/0x388 ksys_write+0x74/0x108 __arm64_sys_write+0x24/0x38 el0_svc_common.constprop.0+0x64/0x148 do_el0_svc+0x24/0x38 el0_svc+0x3c/0x130 el0t_64_sync_handler+0xc8/0xd0 el0t_64_sync+0x1ac/0x1b0 ---[ end trace 0000000000000000 ]--- Since the enablement of sysfs mode is separeted into two critical regions, one for sysfs buffer allocation and another for hardware enablement, it's possible to race with the perf mode. Fix this by double check whether the perf mode's been used before enabling the hardware in sysfs mode. mode: [sysfs mode] [perf mode] tmc_etr_get_sysfs_buffer() spin_lock(&drvdata->spinlock) [sysfs buffer allocation] spin_unlock(&drvdata->spinlock) spin_lock(&drvdata->spinlock) tmc_etr_enable_hw() drvdata->etr_buf = etr_perf->etr_buf spin_unlock(&drvdata->spinlock) spin_lock(&drvdata->spinlock) tmc_etr_enable_hw() WARN_ON(drvdata->etr_buf) // WARN sicne etr_buf initialized at the perf side spin_unlock(&drvdata->spinlock) With this fix, we retain the check for CS_MODE_PERF in get_etr_sysfs_buf. This ensures we verify whether the perf mode's already running before we actually allocate the buffer. Then we can save the time of allocating/freeing the sysfs buffer if race with the perf mode.CVE-2026-46272