The maturity gap in ML pipeline infrastructure

Patrick Smyth, Principal Developer Relations Engineer

When a technical field is mature, practitioners can assume that the most common pitfalls are addressed at the level of tooling and infrastructure. Programmers working with SQL in the 1990s had to guard against injection by hand; today’s programmers can walk the happy path laid out by language runtimes, libraries, and parameterized queries.

Unfortunately, practitioners setting up ML pipelines in 2026 cannot rely on this happy path. While infrastructure for training and deploying models in production is improving, creating an end-to-end ML pipeline in 2026 is fraught with footguns for the unwary ML Ops engineer or technical manager.

While model capabilities have leapt ahead in the last year, security best practices, infrastructure, and tool maturity have not kept pace. The reality is that, in 2026, default tooling around ML pipeline security leaves organizations open to a wide variety of attacks. Some of these attacks, such as data poisoning and model laundering (“white-label”) attacks, are relatively exotic, and mitigations are emerging. However, many attacks exploit longstanding threat vectors that we know how to mitigate in other contexts. This is the maturity gap in ML Ops infrastructure. By “maturity gap,” I mean that the security affordances we take for granted in software engineering haven’t been embedded into ML tooling defaults, creating a gap between where we are and where we should be.

In this post, we’ll cover some of these gaps: the state of play in 2026, what ML Ops engineers and managers can do in the short term to mitigate them, and where the industry needs to go to make infrastructure secure by default.

Pickle deserialization

It’s early 2026 and, yes, pickle files are still the default serialization format in the dominant ML framework (PyTorch). Pickles are files containing opcodes that allow Python objects to be stored and restored. PyTorch uses them to store model weights, essentially large n-dimensional arrays known as tensors. However, when these files are deserialized (loaded from the pickle file back into a running program), they can execute arbitrary Python code.
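To make the risk concrete, here’s a minimal sketch (the class name is invented for illustration): any object whose class defines __reduce__ can smuggle a callable into a pickle, and that callable runs the moment the file is loaded.

```
import os
import pickle


class NotAModel:
    """A stand-in for a 'model' that abuses pickle's __reduce__ hook."""

    def __reduce__(self):
        # Whatever this returns is invoked during unpickling. A real attacker
        # would exfiltrate credentials or open a reverse shell; this sketch
        # just runs a harmless shell command.
        return (os.system, ("echo arbitrary code ran during load",))


payload = pickle.dumps(NotAModel())

# The victim side: merely loading the bytes executes the payload.
pickle.loads(payload)
```

Loading an untrusted .pt or .pkl file should therefore be treated as running untrusted code.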

The frustrating thing about pickle files is that arbitrary code execution isn’t technically necessary: safe model storage formats now exist, most notably safetensors. However, until frameworks like PyTorch and TensorFlow adopt safetensors by default, ML Ops engineers will need to take the additional step of storing models in a secure format, and in many cases, existing pickle-based models will need to be deserialized in isolation and converted to a safe format before use in production systems.
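As a rough sketch of that conversion step, assuming the checkpoint is a flat dict of tensors and using placeholder file names, an isolated job can load the legacy pickle-based checkpoint once and re-save only the raw weights with the safetensors library:

```
import torch
from safetensors.torch import load_file, save_file

# Run this in an isolated, throwaway environment: loading a pickle-based
# checkpoint can execute arbitrary code.
state_dict = torch.load("legacy_checkpoint.pt", map_location="cpu")

# Re-save just the tensors in a format that cannot carry executable code.
# (Assumes a flat mapping of names to tensors, not a full training checkpoint.)
save_file(state_dict, "model.safetensors")

# Production systems then load only the safe artifact.
tensors = load_file("model.safetensors")
```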

Model signing and provenance

Zero Trust (ZT) is a security philosophy that assumes attackers are already present within a network. Within ZT, prior steps in a build or distribution chain should be regarded as potentially compromised. One powerful tool in the ZT arsenal is robust signing. As software artifacts such as models move through an ML pipeline, they should be signed with useful metadata about the build process and environment. Steps that ingest artifacts should verify their provenance and reject any artifacts that may have been tampered with.
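As a minimal illustration of that verify-before-ingest posture (digest pinning only, not full provenance; the expected digest and file name are placeholders), a pipeline stage can refuse to load any artifact whose hash doesn’t match what the producing stage recorded:

```
import hashlib
from pathlib import Path


def sha256_digest(path: Path) -> str:
    """Stream the file through SHA-256 so large model files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest recorded by the stage that produced the artifact (placeholder value).
EXPECTED = "<sha256 recorded at build time>"

artifact = Path("model.safetensors")
if sha256_digest(artifact) != EXPECTED:
    raise RuntimeError(f"refusing to ingest {artifact}: digest mismatch, possible tampering")
```

A digest check only proves the artifact is the one that was recorded; signatures and attestations additionally prove who produced it and how.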

Since they handle moving software artifacts from one stage to the next, one might expect AutoML frameworks such as Google Cloud AutoML and H2O AutoML to support, encourage, or even require robust signing. Further, one might assume that models on popular repositories such as Hugging Face would come with cryptographically verifiable metadata in the form of attestations. Unfortunately, in early 2026, frameworks do not encourage signing within pipelines, and signed models are only available on a few platforms, such as NVIDIA NGC.

ML Ops practitioners concerned about security in 2026 should closely follow SLSA guidelines on signing and verifying signatures. Projects such as Sigstore’s Cosign make it straightforward to sign software artifacts and verify signatures within CI/CD. However, these best practices should not be the sole responsibility of individual teams. To close the maturity gap in this space, end-to-end ML solutions such as Vertex AI and lifecycle tools like MLflow should bake in verification as software artifacts move through the system, and services that host models should discourage the upload and distribution of unsigned artifacts.
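For example, a pipeline step might wrap Cosign’s keyless blob signing and verification roughly like the sketch below; the artifact name, identity, and issuer are placeholders, and the exact flags should be checked against the Cosign version you run.

```
import subprocess

ARTIFACT = "model.safetensors"  # placeholder artifact name

# Keyless signing in CI: the signing identity comes from the job's OIDC token.
subprocess.run(
    ["cosign", "sign-blob", "--yes",
     "--output-signature", "model.sig",
     "--output-certificate", "model.pem",
     ARTIFACT],
    check=True,
)

# A downstream stage verifies the signature and pins the expected identity,
# so artifacts signed by anyone else are rejected.
subprocess.run(
    ["cosign", "verify-blob",
     "--signature", "model.sig",
     "--certificate", "model.pem",
     "--certificate-identity",
     "https://github.com/example-org/model-pipeline/.github/workflows/release.yml@refs/heads/main",
     "--certificate-oidc-issuer", "https://token.actions.githubusercontent.com",
     ARTIFACT],
    check=True,
)
```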

Ecosystems and the supply chain

The last two years have seen sustained attacks on software repositories such as PyPI, Maven Central, and npm. While efforts such as PyPI’s Trusted Publishing and Maven Central’s publishing requirements have been much-needed mitigations, major attacks such as the Shai-Hulud supply chain worm show how much work remains in securing language ecosystems.

This problem is particularly acute for those assembling secure ML pipelines, which typically include many more components than just model code: training frameworks, data processing libraries, orchestration and experiment-tracking tools, and their many transitive dependencies. This heavy dependence on ecosystem packages means the average ML pipeline is unusually exposed to software supply chain risk.

A few techniques can mitigate software supply chain risk from language ecosystems. Pinning and hashing dependencies — and enforcing a cooling-off period before newly published packages are allowed into production — can significantly reduce exposure by preventing fresh, potentially malicious releases from automatically entering the pipeline. In addition, periodically reviewing requirements files and removing unnecessary dependencies shrinks the attack surface further.
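Hash-pinning is well supported by existing tools (for example, pip’s --require-hashes mode), but the cooling-off period is usually something teams script themselves. One hedged way to do it, using PyPI’s public JSON API with an assumed 14-day policy and placeholder pins, is sketched below:

```
import datetime
import json
import urllib.request

COOLING_OFF_DAYS = 14  # assumed policy threshold; tune to your risk tolerance


def release_age_days(package: str, version: str) -> float:
    """Return the age in days of a specific release, via PyPI's JSON API."""
    url = f"https://pypi.org/pypi/{package}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    # Each published file carries an upload timestamp; take the earliest one.
    uploaded = min(
        datetime.datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in data["urls"]
    )
    return (datetime.datetime.now(datetime.timezone.utc) - uploaded).total_seconds() / 86400


# Pinned requirements, e.g. parsed from a requirements.txt of name==version lines.
pins = [("torch", "2.5.1"), ("numpy", "2.1.3")]  # placeholder pins

for name, version in pins:
    age = release_age_days(name, version)
    if age < COOLING_OFF_DAYS:
        print(f"HOLD {name}=={version}: published only {age:.1f} days ago")
```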

For the gold standard in mitigating software supply chain security risk, Chainguard Libraries rebuilds packages across language ecosystems (Java, Python, JavaScript) in our SLSA Level 2-compliant Chainguard Factory, dramatically reducing supply chain security risk. Organizations using Chainguard Libraries for AI/ML would have remained protected in the face of major attacks such as the late-2022 PyTorch dependency confusion attack, 2024’s aiocpa token exfiltration attack, and 2024’s Ultralytics YOLO attack.

Container security

Containers in the AI/ML space have an attack surface problem. AI/ML images tend to be large, complex, and built on top of general-purpose base images that ship with thousands of packages—many of which are not needed for training or inference. This creates a broad attack surface, including extraneous libraries, unused system tools, shells, and language runtimes that attackers can leverage.

This manifests in both upstream measures of attack surface (packages and executables) and the final CVE count. For example, the upstream nightly build of the PyTorch runtime image has 269 packages, 1,035 executables, and 161 unique vulnerabilities as measured by grype on January 7, 2026. By contrast, our PyTorch Chainguard Container has 132 packages, 390 executables, and 3 vulnerabilities.
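These numbers are straightforward to reproduce for your own images. The sketch below shells out to grype and counts distinct vulnerability IDs in its JSON output; the image reference is a placeholder, and the JSON field names should be confirmed against the grype version you have installed.

```
import json
import subprocess

IMAGE = "pytorch/pytorch:latest"  # placeholder: substitute the image you actually ship

# Ask grype for machine-readable output, then count distinct vulnerability IDs.
result = subprocess.run(
    ["grype", IMAGE, "-o", "json"],
    check=True, capture_output=True, text=True,
)
report = json.loads(result.stdout)
unique_vulns = {match["vulnerability"]["id"] for match in report["matches"]}
print(f"{IMAGE}: {len(unique_vulns)} unique vulnerabilities")
```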

Machine learning has roots in academic and data science projects and, for years, has operated in relatively high-trust environments. However, it’s 2026, and ML pipelines are in production in sensitive sectors including manufacturing, defense, and medicine. Securing ML pipelines shouldn’t be a matter of heroics; it should be a matter of infrastructure. As engineers, technical managers, open source contributors, and downstream users of ML Ops infrastructure, let’s do our part to close the machine learning maturity gap in 2026.
