Two Incomplete Fixes for a Path Traversal Vulnerability in ONNX (CVE-2026-27489)

01/04/2026

Some vulnerabilities are patched once and forgotten. Others keep coming back because each fix only addresses the symptom rather than the root cause. The path traversal vulnerability in onnx is a textbook example of the latter. It has now been patched three times across four years: CVE-2022-25882, CVE-2024-27318, and most recently CVE-2026-27489, which I discovered and reported. Each patch closed one door while leaving another open.

This post walks through all three patches, explains exactly where each fell short, and uses them to illustrate the secure coding lessons every developer should learn when handling file paths.

The vulnerability

Open Neural Network Exchange (ONNX) provides an open source format for AI models. ONNX is widely supported and can be found in many frameworks, tools, and hardware. CVE-2026-27489 has been assigned a CVSS v4 score of 8.7 (High) and is fixed in version 1.21.0.

ONNX models can store tensor bytes in separate files and reference them via external_data.location. If an adversary can control that location value, the library strips a few dangerous characters and then calls os.path.join(base_dir, location). This is an insecure pattern because os.path.join only concatenates path components; it does not normalize the result or check that it stays inside base_dir. Given the inputs:

base_dir="/model_dir"
location="junk/../../sensitive_file"

os.path.join returns /model_dir/junk/../../sensitive_file, which the OS resolves to /sensitive_file. This allows an adversary to read arbitrary files on the filesystem, including API keys, SSH keys, and secrets.
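
The behaviour is easy to confirm in a few lines. The sketch below uses posixpath (the POSIX flavour of os.path) so that the result does not depend on the host platform; the directory and file names are the illustrative ones from above, not real ONNX paths.

```python
import posixpath  # POSIX flavour of os.path, independent of the host OS

base_dir = "/model_dir"
location = "junk/../../sensitive_file"

# join() only concatenates; the ".." segments are kept verbatim.
joined = posixpath.join(base_dir, location)
print(joined)  # /model_dir/junk/../../sensitive_file

# Lexical normalization shows where the path actually points.
normalized = posixpath.normpath(joined)
print(normalized)  # /sensitive_file

# An absolute location is worse: join() discards base_dir altogether.
absolute = posixpath.join(base_dir, "/etc/passwd")
print(absolute)  # /etc/passwd
```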

What was wrong with the first patch

The patch for CVE-2022-25882 only checked whether a path started with ../ or ..\ and missed semantically equivalent combinations. For example, junk/../../sensitive_file does not begin with ../ as written, yet it still resolves to a location outside the model directory and bypasses this check.

// onnx/checker.cc
// Check that normalized relative path starts with "../" or "..\" on windows.
if (relative_path.rfind(".." + k_preferred_path_separator, 0) == 0) {
  fail_check(
      "Data of TensorProto ( tensor name: ",
      tensor.name(),
      ") should be file inside the ",
      ctx.get_model_dir(),
      ", but the '",
      entry.value(),
      "' points outside the directory");
}

This patch rests on a flawed assumption that rejecting unsafe characters automatically makes input safe. I have previously written about why this approach to patching is one of the primary reasons path traversal vulnerabilities keep reappearing, even decades after the first documented case.
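
To make the weakness of deny-list checks concrete, here is a minimal sketch. It is a simplified stand-in written for this post, not the ONNX C++ code: looks_safe and escapes_base are hypothetical helper names, and /model_dir is an assumed base directory. The first function rejects only an obvious prefix; the second resolves the path lexically and tests where it actually lands.

```python
import posixpath  # POSIX path rules regardless of the host OS


def looks_safe(location: str) -> bool:
    """Hypothetical deny-list check: reject only a leading parent-directory prefix."""
    return not location.startswith(("../", "..\\"))


def escapes_base(base_dir: str, location: str) -> bool:
    """Lexically resolve the joined path and report whether it left base_dir."""
    base = posixpath.normpath(base_dir)
    resolved = posixpath.normpath(posixpath.join(base, location))
    return posixpath.commonpath([base, resolved]) != base


for candidate in ("weights.bin", "../secret", "junk/../../secret", "/etc/passwd"):
    print(candidate, looks_safe(candidate), escapes_base("/model_dir", candidate))
```

Only ../secret is caught by the prefix check. The last two candidates pass it while still pointing outside /model_dir, which is the gap that a deny-list leaves open.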

What was wrong with the second patch

The patch for CVE-2024-27318 removed the ad-hoc Python helper and replaced it with a C++ function, resolve_external_data_location(base_dir, location, tensor_name).

# onnx/external_data_helper.py
def load_external_data_for_tensor(tensor: TensorProto, base_dir: str) -> None:
    info = ExternalDataInfo(tensor)

    # The following lines were removed:
    # file_location = _sanitize_path(info.location)
    # external_data_file_path = os.path.join(base_dir, file_location)

    # Replaced with:
    external_data_file_path = c_checker._resolve_external_data_location(
        base_dir, info.location, tensor.name
    )

This approach was an improvement because it:

Removed ad-hoc path validation in Python.
Uniformly used a single function to resolve untrusted paths.
Made the security check mandatory. Security checks must not be optional, or developers will forget them.
Encapsulated validation in a dedicated function, cleanly separating business logic from security concerns.

However, it had a shortcoming in the C++ implementation that I found while reviewing the patch.

Looking at resolve_external_data_location in checker.cc:

// onnx/checker.cc
// Do not allow symlinks or directories.
if (data_path.empty() || (data_path_str[0] != '#' && !std::filesystem::is_regular_file(data_path))) {
  fail_check(
      "Data of TensorProto ( tensor name: ",
      tensor_name,
      ") should be stored in ",
      data_path_str,
      ", but it is not regular file.");
}
return data_path_str;

Although the comment says “Do not allow symlinks or directories,” the check does not actually block symlinks. std::filesystem::is_regular_file(p) performs a status(p) call, which follows symlinks to determine the file type. If a symlink points to a regular file, the function returns true. An adversary can therefore create a symlink pointing anywhere on the filesystem and read any file the loading process has access to.
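
The same follow versus no-follow distinction exists in Python, which makes it easy to observe. This is an analogy rather than the ONNX code: os.stat() follows symlinks in the way status() does, while os.lstat() inspects the link itself in the way symlink_status() does. The file names are made up for the demonstration, and the sketch assumes a platform where unprivileged symlink creation is allowed.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "secret.txt")
    link = os.path.join(tmp, "model.data")
    with open(target, "w") as f:
        f.write("secret")
    os.symlink(target, link)  # model.data -> secret.txt

    # os.stat() follows the link, so it reports the type of the target.
    follows = stat.S_ISREG(os.stat(link).st_mode)

    # os.lstat() does not follow the link, so it reports the link itself.
    link_status = os.lstat(link)
    no_follow = stat.S_ISREG(link_status.st_mode)
    is_link = stat.S_ISLNK(link_status.st_mode)

print(follows, no_follow, is_link)  # True False True
```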

Proof of concept

# Proof of concept for CVE-2026-27489
# Discovered by Pedram (pi3ch) Hayati
#
# 1. Generate a sample ONNX model (model.onnx) with external data (model.data).
# 2. Remove model.data
# 3. Run: ln -s /etc/passwd model.data
# 4. Load the model using the code below.
# 5. Observe that the symlink check is bypassed and the model loads successfully.

import onnx
from onnx.external_data_helper import load_external_data_for_model

def load_onnx_model_basic(model_path="model.onnx"):
    model = onnx.load(model_path)
    return model

def load_onnx_model_explicit(model_path="model.onnx"):
    model = onnx.load(model_path, load_external_data=False)
    load_external_data_for_model(model, ".")
    return model

if __name__ == "__main__":
    model = load_onnx_model_basic("model.onnx")

For a real exploitation scenario, an adversary provides a victim with a compressed archive containing poc.onnx and poc.data, where poc.data is a symlink. Once the victim decompresses the archive and loads the model, the library follows the symlink and silently reads an attacker-chosen file from the host filesystem. Files at risk include /etc/passwd, SSH private keys, cloud credentials, and environment variables (e.g. /proc/1/environ). The issue is not limited to UNIX.

The final patch (fixed in v1.21.0)

The fix is a change to resolve_external_data_location in checker.cc. The old code called std::filesystem::is_regular_file(data_path), which internally calls status() and therefore follows symlinks. The fix first checks that the path exists, then obtains std::filesystem::symlink_status(data_path), which behaves like lstat() and does not follow symlinks, and passes that status to is_symlink and is_regular_file:

// Check whether the file exists
if (data_path.empty() || (data_path_str[0] != '#' && !std::filesystem::exists(data_path))) {
  fail_check(
      "Data of TensorProto ( tensor name: ",
      tensor_name,
      ") should be stored in ",
      data_path_str,
      ", but it doesn't exist or is not accessible.");
}
// Do not allow symlinks or directories.
auto path_status = std::filesystem::symlink_status(data_path);
if (data_path_str[0] != '#' &&
    (std::filesystem::is_symlink(path_status) || !std::filesystem::is_regular_file(path_status))) {
  fail_check(
      "Data of TensorProto ( tensor name: ",
      tensor_name,
      ") should be stored in ",
      data_path_str,
      ", but it is not a regular file.");
}
return data_path_str;

std::filesystem::symlink_status(p) does not follow symlinks and returns the status of the symlink itself, not its target.

When you call is_regular_file(symlink_status(p)) on a symlink, the symlink itself is not a regular file, so the function returns false and the check catches it. This is the opposite of the old behaviour where is_regular_file(p) silently followed the symlink to its target and returned true if the target was a regular file.
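
For readers more at home in Python, the shape of the fixed check can be sketched as follows. This is an illustrative analogue under my own naming (require_regular_file is hypothetical), not a translation of the ONNX source. Like lstat() itself, the sketch only inspects the final path component.

```python
import os
import stat


def require_regular_file(path: str) -> str:
    """Return path if it is a regular file and not a symlink, else raise ValueError."""
    try:
        status = os.lstat(path)  # status of the entry itself; symlinks are not followed
    except OSError as exc:
        raise ValueError(f"{path!r} doesn't exist or is not accessible") from exc
    # A symlink is rejected even when its target is a regular file.
    if stat.S_ISLNK(status.st_mode) or not stat.S_ISREG(status.st_mode):
        raise ValueError(f"{path!r} is not a regular file")
    return path
```

A plain file passes through unchanged, while a symlink, a directory, or a missing path raises ValueError.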

Root cause of path traversal

In my article Input Validation: Necessary but Not Sufficient, I explain why input validation alone cannot solve certain classes of vulnerability. Path traversal is one of them.

The root cause of path traversal is the lack of path canonicalization. In other words, we wrongly compare a relative path to an absolute path. We are comparing apples with oranges.

A common flawed patch involves checking the filename for unsafe sequences like ../. As we have seen across three CVEs, this kind of check is repeatedly bypassed. Instead, the path must be converted to an absolute, canonical form before any validation is applied:

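# A correct approach
import os
from django.http import HttpResponse, HttpResponseBadRequest

BASE_DIR = os.path.realpath("/app/resources")

def read_file_content(filename):
    try:
        # Canonicalize first: realpath resolves "..", "." and symlinks.
        path = os.path.realpath(os.path.join(BASE_DIR, filename))
        # Compare whole path components, so /app/resources_evil is rejected.
        if os.path.commonpath([BASE_DIR, path]) != BASE_DIR:
            return HttpResponseBadRequest()
        with open(path, "rb") as f:
            content = f.read()
        return HttpResponse(content, status=200)
    except Exception as ex:
        return HttpResponse(str(ex), status=404)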

For low-level C/C++ code, a kernel-level approach such as opening the file with the O_NOFOLLOW flag serves the same purpose: the open call itself fails if the final path component is a symlink.
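
The O_NOFOLLOW idea can be sketched in Python as well, since os.open exposes the same flag on POSIX systems. The function name read_without_following_symlink is hypothetical, and this is a sketch under two stated assumptions: the platform defines os.O_NOFOLLOW (Windows does not), and only the final path component needs protecting, because the flag does not cover symlinked parent directories.

```python
import os
import stat


def read_without_following_symlink(path: str) -> bytes:
    """Read a regular file, refusing to open it if the final component is a symlink."""
    # With O_NOFOLLOW the kernel rejects the open() when path is a symlink,
    # which surfaces in Python as an OSError.
    fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    try:
        # Check the type on the open descriptor, so the check and the read
        # are guaranteed to refer to the same file.
        if not stat.S_ISREG(os.fstat(fd).st_mode):
            raise ValueError(f"{path!r} is not a regular file")
        chunks = []
        while True:
            chunk = os.read(fd, 65536)
            if not chunk:
                break
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Because the type check runs on the already opened descriptor, there is no window between checking the path and reading it in which the file could be swapped for a symlink.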

Final notes

When the first patch was submitted in 2022, jnovikov correctly identified the optimal fix in a GitHub comment:

Please note that the ‘optimal’ solution would be to use boost::filesystem or C++17 filesystem to get the absolute path file and resolve possible symlink problems, but AFAIK the project target is C++11 and boost is not used in the project.

This shows how hard it can be to address the root cause of a vulnerability once software has been released. The right fix was known from day one, but constraints forced a compromise.

The lesson for developers is to drill down to the root cause of the vulnerability and design the software for robustness. Build secure software from the ground up. Following proven defensive programming design principles from the start reduces the long-term cost of vulnerabilities.

