100% Open Source (Apache 2.0)

Secure Your Entire AI Stack
Models • Data • RAG

The Anti-Virus for AI. Detect malware in Models, poisoning and PII in Datasets, secrets in Notebooks, and prompt injections in RAG documents.

$ pip install veritensor
~ veritensor scan ./project --recursive --jobs 4
Scanning 4 files...
FAIL train.parquet: HIGH: Data Poisoning (Injection) detected in row 14,502.
FAIL resume.pdf: CRITICAL: Prompt Injection (Stealth/CSS hiding) found.
FAIL experiment.ipynb: CRITICAL: Leaked AWS Access Key in cell output.
PASS model.safetensors: Verified against Hugging Face Hub.
❌ BLOCKING DEPLOYMENT

Under the Hood

We don't just grep strings. Veritensor combines static analysis, metadata parsing, and cryptographic verification (hashes and signatures).

Step 01

Deep Static Analysis

Veritensor implements custom engines for every AI artifact type. It builds ASTs for code and parses binary structures for models.

  • Stealth Detection: Finds text hidden via CSS (font-size:0, color:white) in PDF/HTML documents.
  • Entropy Analysis: Detects high-entropy secrets (API keys) in Notebook outputs, filtering out UUIDs and noise.
  • Pickle VM: Emulates stack execution to find RCE payloads in .pkl and PyTorch files without running them.
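To illustrate the idea behind the Pickle VM bullet, here is a minimal sketch that walks pickle opcodes with the standard library's pickletools and reports risky imports without ever loading the file. It is a simpler technique than full stack emulation and is not Veritensor's actual engine; the denylist SUSPICIOUS_MODULES and the function name find_suspicious_imports are assumptions made for this example.

```python
import pickletools

# Hypothetical denylist for illustration; a real scanner needs a far richer policy.
SUSPICIOUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys"}


def find_suspicious_imports(data: bytes) -> list:
    """Walk pickle opcodes without executing them and report risky imports."""
    findings = []
    recent_strings = []  # string constants seen so far, used for STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL" and isinstance(arg, str):
            # GLOBAL carries "module name" as one space-separated argument
            module, _, name = arg.partition(" ")
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: STACK_GLOBAL pops the two most recently pushed strings
            module, name = recent_strings[-2], recent_strings[-1]
            if module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append(f"{module}.{name}")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return findings
```

Because the payload is only parsed, never unpickled, a call such as find_suspicious_imports(open("model.pkl", "rb").read()) is safe to run on untrusted files. The "last two strings" heuristic can be fooled by memo tricks, which is one reason a real engine would emulate the stack.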
injection.py
def scan_document(file):
    # 1. Check Magic Numbers
    if is_executable(file):
        return "Malware disguised as PDF"
    # 2. Scan Raw Bytes for CSS Hacks
    if has_stealth_css(file):
        return "Hidden Text Detected"
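As a rough idea of what a check like has_stealth_css could look for in HTML, the sketch below collects text nested inside elements styled with font-size:0 or white text. The names HIDING_STYLE, HiddenTextFinder, and find_hidden_text are hypothetical, the two patterns are only the ones named above, and the parser assumes reasonably well-formed markup. White text is of course only invisible on a white background, so treat a hit as a signal rather than proof.

```python
import re
from html.parser import HTMLParser

# Hypothetical patterns covering only font-size:0 and white text.
HIDING_STYLE = re.compile(
    r"(?<![-\w])font-size\s*:\s*0(?:px|pt|em|rem|%)?\s*(?:;|$)"
    r"|(?<![-\w])color\s*:\s*(?:white|#fff(?:fff)?)\s*(?:;|$)",
    re.IGNORECASE,
)
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class HiddenTextFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []        # one bool per open element: is its text hidden?
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        inherited = bool(self.stack) and self.stack[-1]
        self.stack.append(inherited or bool(HIDING_STYLE.search(style)))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        if self.stack and self.stack[-1] and data.strip():
            self.hidden_text.append(data.strip())


def find_hidden_text(html: str) -> list:
    """Return text fragments that sit inside visually hidden elements."""
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text
```

The negative lookbehind keeps background-color: white from being mistaken for white text, and font-size:0.5em is not flagged because the pattern requires the value to end right after the zero and its optional unit.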
Hash Verification: MATCH
Local File: pytorch_model.bin
Local SHA256: a1b2...9f8e
Remote Repo: meta-llama/Llama-2-7b
Remote SHA256: a1b2...9f8e
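The comparison shown in this panel can be sketched with the standard library alone. The example below streams a file through SHA256 and, when the file is a Git LFS pointer, reads the digest from its oid line instead of hashing the pointer text. The function names sha256_of and matches are hypothetical, and the expected digest is assumed to have been fetched from the model hub by some other step that is not shown here.

```python
import hashlib
import hmac
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
OID_PREFIX = "oid sha256:"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA256 of the real content, resolving Git LFS pointers."""
    with path.open("rb") as fh:
        head = fh.read(1024)
        if head.startswith(LFS_HEADER):
            # An LFS pointer stores the digest of the real file in its oid line
            for line in head.decode("utf-8", "replace").splitlines():
                if line.startswith(OID_PREFIX):
                    return line[len(OID_PREFIX):].strip().lower()
            raise ValueError("LFS pointer without a sha256 oid")
        digest = hashlib.sha256(head)
        # Stream the rest so large model files never sit fully in memory
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: Path, expected_sha256: str) -> bool:
    """Compare the local digest with an expected digest in constant time."""
    return hmac.compare_digest(sha256_of(path), expected_sha256.lower())
```

A call such as matches(Path("pytorch_model.bin"), remote_digest) then yields the MATCH or mismatch verdict.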
Step 02

Integrity & Supply Chain

Veritensor secures your dependencies and model sources.

  • Dependency Scanning: Checks requirements.txt and poetry.lock for typosquatting (e.g., tourch instead of torch) and known malicious packages.
  • Hash-to-API Verification: Calculates SHA256 (handling LFS pointers) and verifies it against the immutable Hugging Face registry.
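One simple way to approximate the typosquatting check is to compare each requirement name against a list of well-known packages and flag near misses. The sketch below uses difflib from the standard library; the short POPULAR_PACKAGES list, the 0.8 cutoff, and the function name find_typosquats are assumptions for illustration, not Veritensor's actual rules.

```python
import difflib
import re

# Hypothetical allowlist; a real check would use a much larger package index.
POPULAR_PACKAGES = ["torch", "numpy", "pandas", "requests", "transformers", "tensorflow"]


def find_typosquats(requirements_text: str, cutoff: float = 0.8) -> dict:
    """Map each suspicious requirement name to the popular package it resembles."""
    suspects = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()      # drop comments
        if not line or line.startswith("-"):      # skip blanks and pip options
            continue
        # The package name ends at the first version specifier, extra, or marker
        name = re.split(r"[\s=<>!~;\[@]", line, maxsplit=1)[0].lower()
        if name in POPULAR_PACKAGES:
            continue
        close = difflib.get_close_matches(name, POPULAR_PACKAGES, n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects
```

Similarity matching of this kind produces false positives for legitimately similar names, so a hit is best treated as a prompt for review.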
Step 03

Data & RAG Security

Veritensor extends protection beyond models. It uses Streaming Analysis for massive datasets and DOM extraction for documents.

  • Dataset Poisoning: Stream-scans Parquet, CSV, and Excel files (100GB+) for malicious URLs and "Ignore previous instructions" patterns.
  • Archive Scanning: Recursively scans .zip and .tar.gz files in-memory, protecting against Zip Bombs.
  • PII Detection: Uses Microsoft Presidio (NLP) to detect and mask personal data in RAG documents and datasets.
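The streaming idea can be shown on a CSV file with nothing but the standard library: rows are read and checked one at a time, so memory use stays flat regardless of file size. This is a sketch under assumptions, not Veritensor's engine. The two patterns and the name scan_csv_stream are hypothetical, and Parquet or Excel input would need a different reader.

```python
import csv
import io
import re
from typing import Iterator, TextIO

# Hypothetical patterns for illustration only.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(?:all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"https?://\S+\.(?:exe|scr|bat)\b", re.IGNORECASE),
]


def scan_csv_stream(handle: TextIO) -> Iterator[tuple]:
    """Yield (row_number, column, matched_text) while reading one row at a time."""
    reader = csv.DictReader(handle)
    for row_number, row in enumerate(reader, start=1):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # missing or overflow fields are not plain strings
            for pattern in INJECTION_PATTERNS:
                match = pattern.search(value)
                if match:
                    yield row_number, column, match.group(0)


if __name__ == "__main__":
    sample = io.StringIO("id,text\n1,hello\n2,Ignore previous instructions\n")
    for hit in scan_csv_stream(sample):
        print(hit)
```

Because the function is a generator, a caller can stop at the first finding or keep going for a full report.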
dataset_engine.py
# 1. Scan 50GB Parquet Dataset
veritensor scan ./train.parquet --full-scan
# Output:
HIGH: Data Poisoning detected
Pattern: "Ignore previous instructions"
Row: 14,502
# 2. Scan Excel for RAG
veritensor scan ./finance.xlsx
# Output:
HIGH: Formula Injection found
Cell A1: =CMD|'/C calc'!A0
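# A finding like the one above can be approximated with a prefix check on each cell's text. The sketch below flags values that a spreadsheet would evaluate as a formula (leading =, +, -, or @) while letting plain signed numbers through. The name is_formula_injection is hypothetical, and the rule is a common heuristic rather than Veritensor's actual logic.

```python
import re

FORMULA_PREFIXES = ("=", "+", "-", "@")
# Plain signed numbers such as -42.5 or +1e10 are not formulas
NUMERIC = re.compile(r"[+-]?\d+(?:[.,]\d+)*(?:[eE][+-]?\d+)?%?")


def is_formula_injection(cell: str) -> bool:
    """Return True when the cell text would be interpreted as a formula."""
    text = cell.strip()
    if not text.startswith(FORMULA_PREFIXES):
        return False
    return NUMERIC.fullmatch(text) is None
```

# The heuristic is deliberately strict, so values such as phone numbers written with a leading + will also be flagged and need an allowlist in practice.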
# Generating Data Provenance
$ veritensor manifest ./data --output provenance.json
Manifest saved to provenance.json
{
  "timestamp": "2026-02-17T12:00:00Z",
  "artifacts": [
    {"path": "train.parquet", "hash": "sha256...", "status": "PASS"}
  ]
}
Step 04

Governance & Provenance

Veritensor generates Manifests to track the lineage of your AI data and signs containers for production.

  • Data Manifest: Create a JSON snapshot of your dataset's security state for compliance (EU AI Act).
  • Container Signing: Sign your Docker images with Sigstore Cosign, embedding scan results as attestation.
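A manifest of this shape can be produced with a few lines of standard-library code. The sketch below mirrors the field names from the sample output (timestamp, artifacts, path, hash, status); the "sha256:" prefix on the hash, the "UNKNOWN" fallback status, and the function name build_manifest are assumptions, since the real format is not specified here.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_manifest(root: Path, statuses: dict) -> dict:
    """Hash every file under root and record its scan status."""
    artifacts = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        digest = hashlib.sha256()
        with path.open("rb") as fh:
            # Stream in 1 MiB chunks so large datasets fit in memory
            for chunk in iter(lambda: fh.read(1 << 20), b""):
                digest.update(chunk)
        artifacts.append({
            "path": rel,
            "hash": "sha256:" + digest.hexdigest(),    # assumed hash format
            "status": statuses.get(rel, "UNKNOWN"),    # assumed fallback value
        })
    return {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "artifacts": artifacts,
    }
```

Writing the result with json.dumps(manifest, indent=2) gives a file that can be diffed between runs to show which artifacts changed.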

Installation options

Keep your environment lean. Install only what you need.

Core Scanner

~50 MB

Lightweight. Perfect for CI/CD pipelines. Scans Models (Pickle/Keras), Notebooks, and Dependencies.

pip install veritensor
Recommended

Full Platform

~700 MB

The complete toolkit. Adds support for Datasets (Parquet/CSV), RAG Docs (PDF/Excel), and PII Detection.

pip install "veritensor[all]"

Modular

Custom

Install specific extras: [data] for Parquet/Excel, [rag] for PDFs, [pii] for Presidio, or [aws] for S3.

pip install "veritensor[data]"

Why Standard Security Tools Fail

General-purpose scanners treat AI models as opaque "binary blobs" and ignore privacy leaks in data. Veritensor understands internal model structures, audits PII in datasets, and secures your AI supply chain.

| Capability | Veritensor | SCA Tools (Snyk, Trivy) | Endpoint AV (ClamAV, CrowdStrike) |
|---|---|---|---|
| Model Security | | | |
| Pickle Bytecode Analysis | Deep AST / VM | No (Text only) | No (Signatures only) |
| Keras Lambda Injection | Config Parsing | No | No |
| Data & RAG Security | | | |
| Dataset Poisoning (Parquet/Excel) | Streaming Regex | No | No (File too big) |
| RAG Prompt Injection | Stealth/CSS Detection | No | Partial (Strings) |
| Jupyter Notebooks | Code + Outputs + PII | Code only | No |
| Supply Chain | | | |
| Dependency Security | Typosquatting + CVE | Yes (CVE only) | No |
| Container Signing | Sigstore Cosign | No | No |

* Veritensor is designed to complement, not replace, your existing SCA tools. We secure the AI Assets (Models, Data, & RAG).

Extended version

Scale Security across your Organization

Move from local scanning to global enforcement. Gain visibility, control, and compliance over every AI asset, from data to deployment.

  • Deep Analysis Engine: OCR for images in notebooks. Semantic Analysis (BERT) for complex prompt injections. Deep Steganography detection.
  • Governance Dashboard: Centralized Asset Inventory. Track every model, dataset, and scan result across your organization. Detect Shadow AI usage.
  • Active Sanitization & Quarantine: Automatically redact PII from datasets and clean secrets from notebooks before they reach the repository.
  • On-Premise & Air-Gapped: Deploy Veritensor inside your VPC (AWS/Azure). Full integration with S3, Artifactory, and MLflow.
  • Automated Compliance: One-click audit reports for EU AI Act, SOC2, and NIST. Immutable Data Provenance Ledgers.

Join the Waitlist

Get early access to extended version features and a free security audit consultation.

We will only contact you regarding the extended version release. No marketing spam.