- Description
- In the Linux kernel, the following vulnerability has been resolved:
perf/x86: Move event pointer setup earlier in x86_pmu_enable()
A production AMD EPYC system crashed with a NULL pointer dereference
in the PMU NMI handler:
BUG: kernel NULL pointer dereference, address: 0000000000000198
RIP: x86_perf_event_update+0xc/0xa0
Call Trace:
<NMI>
amd_pmu_v2_handle_irq+0x1a6/0x390
perf_event_nmi_handler+0x24/0x40
The faulting instruction is `cmpq $0x0, 0x198(%rdi)` with RDI=0,
corresponding to the `if (unlikely(!hwc->event_base))` check in
x86_perf_event_update() where hwc = &event->hw and event is NULL.
drgn inspection of the vmcore on CPU 106 showed a mismatch between
cpuc->active_mask and cpuc->events[]:
active_mask: 0x1e (bits 1, 2, 3, 4)
events[1]: 0xff1100136cbd4f38 (valid)
events[2]: 0x0 (NULL, but active_mask bit 2 set)
events[3]: 0xff1100076fd2cf38 (valid)
events[4]: 0xff1100079e990a90 (valid)
The event that should occupy events[2] was found in event_list[2]
with hw.idx=2 and hw.state=0x0, confirming x86_pmu_start() had run
(which clears hw.state and sets active_mask) but events[2] was
never populated.
Another event (event_list[0]) had hw.state=0x7 (STOPPED|UPTODATE|ARCH),
showing it was stopped when the PMU rescheduled events, confirming the
throttle-then-reschedule sequence occurred.
The root cause is commit 7e772a93eb61 ("perf/x86: Fix NULL event access
and potential PEBS record loss") which moved the cpuc->events[idx]
assignment out of x86_pmu_start() and into step 2 of x86_pmu_enable(),
after the PERF_HES_ARCH check. This broke any path that calls
pmu->start() without going through x86_pmu_enable() -- specifically
the unthrottle path:
perf_adjust_freq_unthr_events()
-> perf_event_unthrottle_group()
-> perf_event_unthrottle()
-> event->pmu->start(event, 0)
-> x86_pmu_start() // sets active_mask but not events[]
The race sequence is:
1. A group of perf events overflows, triggering group throttle via
perf_event_throttle_group(). All events are stopped: active_mask
bits cleared, events[] preserved (x86_pmu_stop no longer clears
events[] after commit 7e772a93eb61).
2. While still throttled (PERF_HES_STOPPED), x86_pmu_enable() runs
due to other scheduling activity. Stopped events that need to
move counters get PERF_HES_ARCH set and events[old_idx] cleared.
In step 2 of x86_pmu_enable(), PERF_HES_ARCH causes these events
to be skipped -- events[new_idx] is never set.
3. The timer tick unthrottles the group via pmu->start(). Since
commit 7e772a93eb61 removed the events[] assignment from
x86_pmu_start(), active_mask[new_idx] is set but events[new_idx]
remains NULL.
4. A PMC overflow NMI fires. The handler iterates active counters,
finds active_mask[2] set, reads events[2] which is NULL, and
crashes dereferencing it.
Move the cpuc->events[hwc->idx] assignment in x86_pmu_enable() to
before the PERF_HES_ARCH check, so that events[] is populated even
for events that are not immediately started. This ensures the
unthrottle path via pmu->start() always finds a valid event pointer.
- Source
- 416baaa9-dc9f-4396-8d5f-8c081fb06d67
- NVD status
- Analyzed
- Products
- linux_kernel
[
{
"nodes": [
{
"cpeMatch": [
{
"criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
"matchCriteriaId": "7C298528-1754-41BD-B4E9-84A37AB7BA32",
"versionEndExcluding": "6.18",
"versionStartIncluding": "6.17.13",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
"matchCriteriaId": "29896448-5F31-40A5-A909-2BC59D21590D",
"versionEndExcluding": "6.18.20",
"versionStartIncluding": "6.18.2",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:*",
"matchCriteriaId": "D70DEFFE-AC47-4F3A-A2B2-2D67AB4CF3C8",
"versionEndExcluding": "6.19.10",
"versionStartIncluding": "6.19.1",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:6.19:-:*:*:*:*:*:*",
"matchCriteriaId": "35C8A871-4971-433E-A046-FC9F7B7D190A",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc1:*:*:*:*:*:*",
"matchCriteriaId": "F253B622-8837-4245-BCE5-A7BF8FC76A16",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc2:*:*:*:*:*:*",
"matchCriteriaId": "4AE85AD8-4641-4E7C-A2F4-305E2CD9EE64",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc3:*:*:*:*:*:*",
"matchCriteriaId": "F666C8D8-6538-46D4-B318-87610DE64C34",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc4:*:*:*:*:*:*",
"matchCriteriaId": "02259FDA-961B-47BC-AE7F-93D7EC6E90C2",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc5:*:*:*:*:*:*",
"matchCriteriaId": "58A9FEFF-C040-420D-8F0A-BFDAAA1DF258",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc6:*:*:*:*:*:*",
"matchCriteriaId": "1D2315C0-D46F-4F85-9754-F9E5E11374A6",
"vulnerable": true
},
{
"criteria": "cpe:2.3:o:linux:linux_kernel:7.0:rc7:*:*:*:*:*:*",
"matchCriteriaId": "512EE3A8-A590-4501-9A94-5D4B268D6138",
"vulnerable": true
}
],
"negate": false,
"operator": "OR"
}
]
}
]