Vector database security: Pinecone, Qdrant, Weaviate in 2026

Vector database security: multi-tenancy, embedding inversion, namespace access control, encryption at rest and auditing for enterprise RAG.

Secra · June 8, 2026 · 14 min read

Vector databases are the persistent backbone that every RAG pipeline and most modern LLM applications rely on. Each internal copilot, each corporate semantic search engine and each agent with long term memory ends up leaning on a vector store. Securing them, however, is still a young discipline in 2025 and 2026, and a significant share of teams still treat them with the assumptions of a traditional relational database. Those assumptions do not hold: content is stored as potentially invertible vectors, default authentication is often permissive, multi tenancy is delegated to fragile metadata filters and engines evolve faster than the controls applied on top of them.

This guide explains what a vector database is, who the leading players are in 2026, the risks that set them apart from a relational store and a realistic engine by engine hardening plan. It targets CISOs, AI architects and security teams that are deploying or auditing vector databases in production.

Key takeaways on vector database security

  • A vector database is not a traditional database: it stores potentially invertible vector representations, not opaque fields.
  • Embedding inversion turns vectors into partially reconstructable data, with unresolved GDPR implications.
  • Multi tenancy almost always relies on metadata filters: a filter bypass turns isolation into fiction.
  • Default installations of several engines expose ports without authentication and end up reachable from the internet.
  • A proper vector database audit goes beyond the store itself: it covers the engine, the embedding model, the retriever, ingestion and downstream consumers.

What a vector database is and where they appear

A vector database is a system designed to index and query dense high dimensional vectors. Instead of searching by equality or by range, it runs similarity searches: given a query vector, it returns the top-k nearest vectors according to a metric such as cosine distance, dot product or Euclidean distance.
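
As a minimal sketch of what a similarity query computes, the snippet below runs an exact top-k search by cosine similarity over a tiny in-memory store. The store contents, the document ids and the function names are hypothetical, and a real engine would replace the full scan with an approximate index.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch when a vector is all zeros
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Exact (brute force) search: score every stored vector, keep the k best."""
    scored = ((cosine_similarity(query, vec), vec_id) for vec_id, vec in vectors.items())
    return [(vec_id, score) for score, vec_id in heapq.nlargest(k, scored)]


# Hypothetical three-dimensional toy store; real embeddings have hundreds of dimensions.
store = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], store, k=2)
```

Here `results` ranks `doc-a` first and `doc-b` second, while `doc-c`, which is orthogonal to the query, is left out.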

To do this at scale, it does not scan every vector. It builds an approximate index that trades recall for latency. The most common index families are HNSW (Hierarchical Navigable Small World), IVF (Inverted File Index), PQ (Product Quantization) and combinations such as IVF_PQ or HNSW_PQ. Each introduces its own tuning parameters and trade offs.
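
The recall versus latency trade can be illustrated with a toy inverted file index. This is a sketch under strong assumptions: the centroids are hand picked instead of learned with k-means, the data is two dimensional, and `nprobe` is used here simply as the number of inverted lists scanned per query.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid."""
    lists = {i: [] for i in range(len(centroids))}
    for vec_id, vec in vectors.items():
        nearest = min(lists, key=lambda i: sq_dist(vec, centroids[i]))
        lists[nearest].append(vec_id)
    return lists


def ivf_search(query, vectors, centroids, lists, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    probed = sorted(lists, key=lambda i: sq_dist(query, centroids[i]))[:nprobe]
    candidates = [vec_id for i in probed for vec_id in lists[i]]
    return sorted(candidates, key=lambda v: sq_dist(query, vectors[v]))[:k]


def exact_search(query, vectors, k):
    """Ground truth: scan every vector."""
    return sorted(vectors, key=lambda v: sq_dist(query, vectors[v]))[:k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate search returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


centroids = [[0.0, 0.0], [10.0, 0.0]]  # hand picked for the sketch
vectors = {"a": [4.9, 0.0], "b": [5.1, 0.0], "c": [0.0, 0.0], "d": [10.0, 0.0]}
lists = build_ivf(vectors, centroids)
query = [4.8, 0.0]
exact = exact_search(query, vectors, k=2)
fast = ivf_search(query, vectors, centroids, lists, k=2, nprobe=1)
thorough = ivf_search(query, vectors, centroids, lists, k=2, nprobe=2)
```

With `nprobe=1` the search misses vector `b`, which sits just across the partition boundary, so recall drops to 0.5. Probing both lists restores full recall at the cost of scanning every vector.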

Vectors come from an embedding model that turns text, image, audio or any other signal into a dense numerical representation. That representation captures semantic meaning, not characters: two sentences with different words can sit very close if they say the same thing, and two sentences with nearly identical words can be far apart if their meaning diverges.

Vector databases show up today in four dominant use cases:

  • RAG (Retrieval Augmented Generation), where corporate documents are indexed to ground LLM answers.
  • Semantic search, where an internal or catalog engine returns results by meaning instead of keyword.
  • Modern recommender systems, where embeddings of users and items enable refined matching.
  • Anomaly detection, where the distance of a new vector from a known cluster reveals atypical behavior.

In all four, the vector store stops being an optional building block and becomes the persistent layer the entire application depends on.

Main players in 2026

The market has consolidated around a handful of engines, each with its own deployment profile and native security model.

  • Pinecone. Fully managed SaaS, opaque, multi region. Offers namespaces, API keys, project level access control and encryption at rest. It is the most popular choice among startups and small teams thanks to its serverless model.
  • Qdrant. Open source engine written in Rust with its own cloud offering. Supports rich payload filters, RBAC in its enterprise tier and either self hosted or managed deployment.
  • Weaviate. Open source with managed cloud. Adds a strong schema over the data, integrated embedding modules and native multi tenancy support.
  • Milvus. Open source with distributed deployment for very large volumes. The managed offering is Zilliz Cloud. It is preferred when handling billions of vectors.
  • Chroma. Embedded design oriented to development and prototyping. Appears inside many early LangChain or LlamaIndex applications and is marketed as a simple local store.
  • pgvector. PostgreSQL extension that adds a vector type and similarity operators. It is the logical pick when a PostgreSQL database with relational data is already in place and a separate system is undesirable.
  • Redis Stack with its vector search module, OpenSearch k-NN and Elasticsearch with its vector layer. They allow reusing existing infrastructure without introducing a new engine.

The engine decision should not be based only on QPS benchmarks or integrations, but also on native guarantees for isolation, authentication, encryption, audit logging and multi tenant support. Those guarantees vary considerably across engines.

Specific risks of vector databases

A vector store introduces attack vectors that do not appear in a traditional relational database. They must be evaluated explicitly because inherited defensive tooling rarely covers them.

  • Multi tenancy weakness. Most multi tenant implementations rely on filters over namespace, metadata or payload. A filter bypass, an injection into query parameters or a malformed query can return vectors that belong to another tenant. The result is cross tenant exposure of confidential content.
  • Embedding inversion. Recent academic research has shown that embeddings from generalist models are partially invertible. From a vector and some knowledge of the source model, an approximation of the original text can be reconstructed. This turns embeddings of PII into potentially personal data under GDPR.
  • Membership inference. Through targeted queries and observation of responses, an attacker can infer whether a specific document is indexed. In sensitive corpora, that knowledge alone is valuable information.
  • Data poisoning. An attacker with access to ingestion inserts vectors designed to dominate the results of certain queries. By crafting content whose embedding sits artificially close to a family of queries, the attacker makes the malicious fragment show up as context in the answers to those queries.
  • Indirect prompt injection through indexed documents. The vector database is the natural gateway through which disguised instructions enter and get executed by the LLM at runtime. A seemingly legitimate document with hidden instructions in metadata, HTML comments or alt text becomes a persistent prompt injection vector.
  • API key leakage. Pinecone, Qdrant Cloud or Zilliz Cloud keys frequently show up hardcoded in repositories or notebooks uploaded to public environments. A leaked key grants full access to the index and its contents.
  • Weak default authentication. A local Qdrant instance and Weaviate in dev mode start without authentication. More than one team assumes the default binding is local only and ends up exposing the port to the corporate network or, worse, to the internet.
  • Network exposure. Ports 6333 (Qdrant), 19530 (Milvus), 8080 (Weaviate) and 5432 (PostgreSQL with pgvector) show up accidentally exposed when a container is published without restrictions. Services like Shodan reveal accessible instances without authentication with surprising frequency.
  • Encryption at rest not by default. Several engines require explicit configuration to encrypt index and payload storage. A wrong assumption of default encryption is discovered too late.
  • Logging side channels. Query logs may contain the plaintext of the queries and, in some verbose modes, fragments of retrieved content. Those logs flow to observability systems rarely treated with the sensitivity of the original data.
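
For the logging side channel in particular, one option is to log a keyed digest of each query instead of its plaintext. The sketch below assumes a hypothetical record layout and a hypothetical log key; in a real deployment the key would come from a secret manager.

```python
import hashlib
import hmac
import json


def redacted_query_record(tenant_id, query_text, top_k, secret_key):
    """Build a log record that identifies a query without storing its plaintext."""
    digest = hmac.new(secret_key, query_text.encode("utf-8"), hashlib.sha256).hexdigest()
    return {
        "tenant_id": tenant_id,
        # Keyed digest: repeated queries correlate, the text itself is not logged.
        "query_hmac": digest,
        "query_chars": len(query_text),
        "top_k": top_k,
    }


# Hypothetical values for illustration only.
record = redacted_query_record("tenant-a", "salary of Jane Doe", 5, b"hypothetical-log-key")
line = json.dumps(record, sort_keys=True)
```

The resulting line still supports rate analysis and correlation of repeated queries, while the observability stack never receives the query text.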

Multi tenancy patterns and their risks

When a single deployment serves multiple customers, departments or products, the isolation pattern dictates the threat model.

  • Shared collection with metadata filter. All vectors live in the same collection and are segregated through a tenant_id field applied as a filter on every query. It is the cheapest pattern and offers the weakest isolation. A validation flaw in the filter parameter, a client library that builds it incorrectly or an SDK quirk can enable cross tenant access.
  • Namespace or collection per tenant. Each tenant has its own namespace or collection inside the same cluster. Isolation is stronger because the API forces explicit namespace selection and filters carry less weight. Management scales worse: creating, migrating and monitoring hundreds or thousands of namespaces requires custom automation.
  • Cluster per tenant. Each customer has its own cluster. Isolation is maximum, equivalent to dedicated relational databases, and it is the pattern that allows residency and sovereignty requirements at the lowest risk. Infrastructure and operational cost rises sharply and demands tight capacity control.
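
When the shared collection pattern cannot be avoided, the tenant filter should at least be derived server side from the authenticated identity. The following is a minimal sketch of that idea. `Principal`, `TenantScopedRetriever` and the `fake_search` backend are hypothetical names, and the backend is a stand-in for a real engine client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    """Authenticated caller; tenant_id comes from the verified session or token."""
    user_id: str
    tenant_id: str


class TenantScopedRetriever:
    """Wraps a search backend and forces the tenant filter on every query."""

    def __init__(self, search_fn):
        self._search_fn = search_fn  # hypothetical callable: (vector, filters, k) -> hits

    def search(self, principal, vector, k, client_filters=None):
        filters = dict(client_filters or {})
        if "tenant_id" in filters:
            # Reject instead of silently overwriting, so probing attempts are visible.
            raise PermissionError("tenant_id cannot be set by the client")
        filters["tenant_id"] = principal.tenant_id
        hits = self._search_fn(vector, filters, k)
        # Defense in depth: drop anything the backend returned for another tenant.
        return [h for h in hits if h["tenant_id"] == principal.tenant_id]


ROWS = [
    {"id": "1", "tenant_id": "acme", "text": "acme pricing"},
    {"id": "2", "tenant_id": "globex", "text": "globex payroll"},
]


def fake_search(vector, filters, k):
    """Stand-in backend: applies equality filters and ignores the vector."""
    return [r for r in ROWS if all(r.get(f) == v for f, v in filters.items())][:k]


retriever = TenantScopedRetriever(fake_search)
hits = retriever.search(Principal("u1", "acme"), [0.1, 0.2], k=10)
```

The design choice is that the client never supplies `tenant_id`: the wrapper sets it from the principal and also checks the results on the way out.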

The practical rule is to pick the most restrictive pattern that the product economics allow, not the loosest one the technology tolerates. A provider that offers shared multi tenant RAG without a clear isolation contract is accumulating security debt that the first serious pentest will expose.

Embedding inversion attacks

The intuitive hypothesis for years was that embeddings were opaque and irreversible. Academic research between 2023 and 2024, such as the Vec2Text work by Morris et al. (2023), dismantled that hypothesis for several model families. Through optimization over the input space and knowledge of the embedding model, it is possible to reconstruct approximations of the original text, sometimes with high fidelity.

The practical implication is direct: if a vector database stores embeddings derived from personal data and an attacker exfiltrates the vectors, claiming pseudonymization is not enough. The process is partially reversible and the data remains personal in the sense of the GDPR.

Defense relies on treating vectors as sensitive content equivalent to the original. That means encryption at rest, strict access control, audit logging of mass exports and, when feasible, embedding models that add controlled noise to make inversion harder, accepting the cost on retrieval quality.
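
The noise idea can be sketched as follows: perturb each coordinate with Gaussian noise, normalize again, and measure how far the vector drifted. This is an illustration, not a formal privacy mechanism; the `sigma` value is arbitrary, and how much a given noise level actually hinders inversion has to be measured against the embedding model in use.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def add_gaussian_noise(vec, sigma, rng):
    """Perturb each coordinate with N(0, sigma^2) noise, then normalize again."""
    noisy = [x + rng.gauss(0.0, sigma) for x in vec]
    return l2_normalize(noisy)


def cosine(a, b):
    """Cosine similarity computed on normalized copies of both vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
original = l2_normalize([0.2, -0.5, 0.1, 0.7, 0.3, -0.4, 0.6, 0.05])
protected = add_gaussian_noise(original, sigma=0.05, rng=rng)
drift = 1.0 - cosine(original, protected)  # rough proxy for the retrieval quality cost
```

A larger `sigma` increases `drift`, which is the retrieval cost the paragraph above refers to.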

Hardening basics for Pinecone, Qdrant and Weaviate

A realistic hardening plan shares most controls, with engine specific details.

  • Mandatory authentication. No engine should start without auth, not even in development. Rotate API keys on a documented cadence and issue one key per consumer service, never a shared global key.
  • TLS in transit and encryption at rest. Verify that both are active and, in cloud, require customer managed keys for encryption when the provider supports it.
  • Namespace isolation for multi tenant. Prefer namespace or collection per tenant over the shared filter approach. If the shared filter is the only option, harden the query layer with strict validation and dedicated tests.
  • Metadata based access control and row level security. In pgvector, leverage native PostgreSQL RLS to enforce per user policies. In engines without RLS, propagate user identity to the retriever and apply the filter inside the vector query itself, not as post processing of the results.
  • Restrictive network policy. Use a private endpoint, VPC peering or PrivateLink, and never expose the port to the internet. Maintain firewall rules that list origin and destination explicitly, and review them periodically with attack surface management tooling.
  • Rate limiting per key. Limit queries per second and top-k volume. Rate limiting does not eliminate membership inference or reconnaissance, but it raises their cost considerably by capping how many probes an attacker can run.
  • Audit logging integrated with the SIEM. Logs for queries, mutations, key management and configuration changes flowing to Splunk, Sentinel, Elastic or the corporate SIEM. Retention has to allow forensic investigation.
  • Backup strategy. Encrypted snapshots, separated from the production cluster, with a tested restoration plan. Losing an embedding index without a backup forces re embedding the entire corpus, an expensive and slow operation.
  • Embedding model integrity. Pin the exact model version, sign the artifact when served self hosted and validate the hash before loading. The supply chain of open source models is a real attack vector.
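
The hash validation step from the last item can be sketched with the standard library alone. The artifact name and contents below are hypothetical, and the pinned digest would normally come from a signed manifest or the deployment configuration, not be computed next to the file.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file so large model artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_artifact(path, expected_sha256):
    """Raise ValueError unless the file matches the pinned SHA-256 digest."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"model artifact hash mismatch for {path}")
    return Path(path)


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.bin"  # hypothetical artifact name
    artifact.write_bytes(b"fake model weights")
    pinned = hashlib.sha256(b"fake model weights").hexdigest()
    verify_model_artifact(artifact, pinned)  # matches, returns without raising
    artifact.write_bytes(b"tampered weights")
    try:
        verify_model_artifact(artifact, pinned)
        tamper_detected = False
    except ValueError:
        tamper_detected = True
```

Calling `verify_model_artifact` before the model is loaded turns a silent swap of the weights into a hard failure at startup.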

At engine level, Pinecone requires good hygiene around API keys and project access. Qdrant requires enabling API key authentication so that unauthenticated (anonymous) requests are rejected, plus reviewing the gRPC port (6334 by default) alongside REST. Weaviate requires configuring AUTHENTICATION_APIKEY_ENABLED and the authorization modules when running multi tenant.

Security audit of a vector database

An audit focused on the vector database covers four blocks of tests, with repeatable methodology.

  • Recon and inventory. Identify engine, version, deployment mode, exposed ports, TLS configuration, authentication mechanisms, key types and associated permissions, indices present, corpus size and embedding model in use.
  • Multi tenant isolation testing. Create identities in two or more tenants and design queries aimed at crossing the boundary: manipulated filters, boundary values, parameter injection, SDK abuse and races on namespace creation. Document every technique tested and its result.
  • Limited embedding inversion experiment. Over a controlled and consented sample, run an academic inversion attack to assess the real reversibility level with the embedding model in use. The goal is not to break the system, but to give the CISO actionable evidence about residual risk.
  • Supply chain and configuration. Review engine versions, dependencies, embedding model, client libraries and container image origin. Compare actual configuration with a hardened baseline.
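
Comparing the actual configuration with a hardened baseline can be reduced to a small diff routine. The keys below are hypothetical, normalized names chosen for the sketch and are not real settings of any engine; each engine's configuration would first be mapped onto them during recon.

```python
# Hypothetical normalized settings, not real configuration keys of any engine.
HARDENED_BASELINE = {
    "auth_enabled": True,
    "tls_enabled": True,
    "encryption_at_rest": True,
    "public_endpoint": False,
}


def config_drift(actual, baseline):
    """Return {setting: (expected, observed)} for every deviation from the baseline.

    A setting missing from the observed configuration is reported with None.
    """
    drift = {}
    for key, expected in baseline.items():
        observed_value = actual.get(key)
        if observed_value != expected:
            drift[key] = (expected, observed_value)
    return drift


# Hypothetical output of the recon phase for one deployment.
observed = {"auth_enabled": True, "tls_enabled": False, "public_endpoint": True}
findings = config_drift(observed, HARDENED_BASELINE)
```

In this example `findings` flags three settings: TLS disabled, encryption at rest not reported at all, and a public endpoint.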

Deliverables must prioritize findings by exploitability and regulatory impact, and propose concrete remediation per engine.

Regulatory framing

The regulatory framework that applies to a vector database is not AI specific and depends on the indexed content.

GDPR kicks in whenever embeddings derive from personal data. The open discussion is whether the embedding itself is personal data. The prudent position, given the embedding inversion literature, is to treat it as such when the source corpus was personal. This entails documented legal basis, retention periods, right to erasure reaching the vectors and their backups, and an impact assessment when risk is high.

The EU AI Act affects the vector database when it feeds a system classified as high risk. A RAG that supports decisions in HR, credit, educational evaluation or access to essential services drags the vector store into the governance, human oversight, activity logging and risk management obligations of the system.

NIS2 applies to essential and important entities and, within the obligations of Article 21, requires supply chain risk management. The vector database engine and the embedding model are pieces of that chain, and their selection, monitoring and incident response plan must be formalized.

Frequently asked questions

Is an embedding personal data under GDPR?

The prudent interpretation is yes when the source text was personal, given that embeddings from generalist models are partially invertible. The AEPD (the Spanish data protection authority) has not published specific guidance, but the precautionary principle and case law on pseudonymization advise treating them as personal data and applying the same controls.

Is a shared multi tenant Pinecone deployment safe?

It can be safe if namespaces are used properly, if each tenant identity is propagated to the retriever and if API keys are segregated per service. It is not safe if every tenant coexists in the same collection with a metadata filter enforced by the client application. For regulated workloads, prefer namespace per tenant or a dedicated index.

Is pgvector equivalent to Pinecone?

For mid sized volumes and where PostgreSQL is already in place, pgvector is perfectly valid and brings a clear advantage: it benefits from RLS, roles and policies already operated by the team. It loses against Pinecone in massive horizontal scalability and large scale index tuning. The decision should respond to workload profile, not fashion.

Can an embedding be deleted and meet the right to erasure?

Yes, all mature engines support deletion by ID or by filter. The operational challenge is ensuring deletion reaches backups, snapshots, secondary indices and any derived copy used in evaluations. Without a formal process, deletion can be partial and leave compliance exposed.
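
A formal erasure process can be modeled as a deletion that fans out to every location holding a copy, followed by a verification pass. The sketch below uses plain dictionaries as stand-ins; the location names and the `subject_id` field are hypothetical, and in practice immutable snapshots are usually handled by expiry or rebuild instead of in place deletion.

```python
def erase_subject(subject_id, locations):
    """Delete every vector tagged with subject_id from each location.

    locations maps a location name to a dict of vector_id -> record.
    Returns the number of records removed per location.
    """
    removed = {}
    for name, records in locations.items():
        doomed = [vid for vid, rec in records.items() if rec["subject_id"] == subject_id]
        for vid in doomed:
            del records[vid]
        removed[name] = len(doomed)
    return removed


def residual_copies(subject_id, locations):
    """List any vector ids still tied to the subject, per location."""
    return {
        name: [vid for vid, rec in records.items() if rec["subject_id"] == subject_id]
        for name, records in locations.items()
    }


# Hypothetical copies of the same corpus.
locations = {
    "live_index": {"v1": {"subject_id": "user-17"}, "v2": {"subject_id": "user-42"}},
    "snapshot": {"v1": {"subject_id": "user-17"}},
    "eval_copy": {"v9": {"subject_id": "user-17"}, "v3": {"subject_id": "user-42"}},
}
removed = erase_subject("user-17", locations)
leftovers = residual_copies("user-17", locations)
```

The `removed` report and the empty `leftovers` lists are the kind of evidence a compliance file needs for each erasure request.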

How is isolation tested in a vector database?

With two identities in different tenants, a battery of queries designed to cross the boundary and a matrix of techniques: filter manipulation, boundary parameters, payload injection, SDK abuse and race conditions. Every attempt should be traced with evidence. A serious test does not fit in an hour; a reasonable budget is two to five days per engine and isolation pattern.
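
A technique matrix of this kind can be driven by a small harness that runs each probe as one tenant and records any hit owned by another. The sketch below is illustrative only: `vulnerable_search` is a deliberately flawed toy backend, and the tenant names and probes are hypothetical.

```python
def run_isolation_matrix(search, attacker_tenant, techniques):
    """Run each named probe and record any hit owned by another tenant."""
    report = []
    for name, build_filters in techniques.items():
        try:
            hits = search(attacker_tenant, build_filters())
            leaked = [h["id"] for h in hits if h["tenant_id"] != attacker_tenant]
            outcome = "LEAK" if leaked else "isolated"
        except Exception as exc:  # a rejection by the backend is also evidence
            leaked, outcome = [], f"rejected ({type(exc).__name__})"
        report.append({"technique": name, "outcome": outcome, "leaked_ids": leaked})
    return report


ROWS = [
    {"id": "1", "tenant_id": "acme"},
    {"id": "2", "tenant_id": "globex"},
]


def vulnerable_search(tenant_id, client_filters):
    """Toy backend with a planted bug: client filters override the tenant filter."""
    filters = {"tenant_id": tenant_id, **client_filters}
    return [r for r in ROWS if all(r.get(k) == v for k, v in filters.items())]


techniques = {
    "baseline": lambda: {},
    "tenant override": lambda: {"tenant_id": "globex"},
    "null tenant": lambda: {"tenant_id": None},
}
report = run_isolation_matrix(vulnerable_search, "acme", techniques)
```

Against this toy backend the harness reports a leak only for the tenant override probe, and the `report` list doubles as the evidence trail for the deliverable.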

Can a SaaS vector database be audited without backend access?

Yes, but scope changes. The audit focuses on tenant configuration, key management, network policy, RBAC, audit logs and observable API behavior. It is complemented with a documentary review of provider commitments (SOC 2, ISO 27001, GDPR addendum) and, when available, with their technical isolation documentation. The internal engine layer remains out of scope, which has to be reflected in the report.

Vector database audit with Secra

At Secra we audit vector database deployments in production with an offensive focus and regulatory traceability. The service includes engine specific review (Pinecone, Qdrant, Weaviate, Milvus, Chroma or pgvector), multi tenant isolation testing with two or more identities, a limited embedding inversion experiment over a consented sample, supply chain validation of the embedding model and hardening recommendations prioritized by exploitability and impact. If your organization has a corporate RAG, a semantic search engine or an agent with long term memory backed by a vector store, we can audit it. Reach out from /en/contact/ and we will coordinate an initial assessment with no commitment.

About the author

Secra Solutions team

Ethical hackers with OSCP, OSEP, OSWE, CRTO, CRTL and CARTE certifications, 7+ years of experience in offensive cybersecurity, and authors of CVE-2025-40652 and CVE-2023-3512.
