PostgreSQL

pgvector: Semantic Search and AI Embeddings in PostgreSQL

Add vector similarity search to PostgreSQL with pgvector. Covers HNSW and IVFFlat indexes, cosine similarity queries, OpenAI embedding integration, and hybrid search.

JusDB Team
October 3, 2025
5 min read
164 views

pgvector adds vector similarity search to PostgreSQL, making it a viable vector database for AI embeddings, semantic search, and recommendation systems — without adding a separate service.

Install pgvector

bash
# Ubuntu/Debian
apt-get install postgresql-15-pgvector

# Enable in database
CREATE EXTENSION vector;

Store Embeddings

sql
CREATE TABLE documents (
  id        BIGSERIAL PRIMARY KEY,
  content   TEXT,
  embedding VECTOR(1536)  -- OpenAI text-embedding-3-small dimension
);

-- Insert with embedding
INSERT INTO documents (content, embedding)
VALUES ('PostgreSQL is a powerful database', '[0.1, 0.2, ...]');
sql
-- Cosine similarity (best for normalized embeddings)
SELECT id, content,
       1 - (embedding <=> '[0.1, 0.2, ...]'::vector) AS similarity
FROM documents
ORDER BY embedding <=> '[0.1, 0.2, ...]'::vector
LIMIT 10;

-- Operators:
-- <=>  cosine distance
-- <->  L2 (Euclidean) distance
-- <#>  negative inner product

HNSW Index (pgvector 0.5+)

sql
-- HNSW: faster queries, higher memory, best for most use cases
CREATE INDEX idx_docs_embedding_hnsw ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- IVFFlat: lower memory, good for large datasets
CREATE INDEX idx_docs_embedding_ivf ON documents
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

Python Integration

python
import openai, psycopg2
from pgvector.psycopg2 import register_vector

conn = psycopg2.connect('postgresql://localhost/mydb')
register_vector(conn)

# Generate embedding
resp = openai.embeddings.create(model='text-embedding-3-small', input='search query')
query_vec = resp.data[0].embedding

# Semantic search
cur = conn.cursor()
cur.execute(
    'SELECT id, content FROM documents ORDER BY embedding <=> %s LIMIT 5',
    (query_vec,)
)
results = cur.fetchall()

Hybrid Search: Vector + Full-Text

sql
-- Combine vector similarity with keyword filtering
SELECT id, content,
       embedding <=> $1 AS vec_dist
FROM documents
WHERE search_vector @@ to_tsquery('english', 'postgres & performance')
ORDER BY embedding <=> $1
LIMIT 10;

Key Takeaways

  • HNSW index is faster at query time; IVFFlat uses less memory — use HNSW for most cases
  • Index must be created after inserting data — empty-table indexes are inaccurate for IVFFlat
  • Use vector_cosine_ops for normalized embeddings (OpenAI, Cohere); vector_l2_ops for unnormalized
  • Hybrid search combining pgvector with pg_tsvector gives better results than either alone

JusDB Can Help

pgvector lets you add AI-powered search to your existing PostgreSQL database. JusDB can design your vector search schema and tune index parameters.

Share this article

JusDB Team

Official JusDB content team