## What is a Vector Database?
A vector database is a specialized database designed to store and search high-dimensional vector data efficiently. Text, images, audio, and other data are converted into numerical vectors (embeddings) by an embedding model, and the database then retrieves entries by semantic similarity rather than exact matching.
```mermaid
flowchart LR
    subgraph Input["Input Data"]
        Text["Text"]
        Image["Image"]
        Audio["Audio"]
    end
    subgraph Embedding["Embedding Model"]
        Model["AI Model<br/>(OpenAI, Cohere, etc.)"]
    end
    subgraph VectorDB["Vector DB"]
        Index["Index"]
        Storage["Vector Storage"]
    end
    subgraph Output["Search Results"]
        Similar["Similar Data"]
    end
    Text --> Model
    Image --> Model
    Audio --> Model
    Model --> Index
    Index --> Storage
    Storage --> Similar
```
## Why Vector Databases?
### Traditional Search vs Semantic Search
| Feature | Keyword Search | Vector Search |
|---|---|---|
| Search Method | Exact/Partial Match | Semantic Similarity |
| Search “dog” | Documents containing “dog” | Also finds “puppy”, “canine” |
| Multilingual | Separate processing per language | Language-agnostic if meaning matches |
| Synonyms | Dictionary registration required | Automatically understood |
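To make the contrast concrete, here is a toy sketch. The 3-dimensional embeddings are hand-made for illustration (real models produce hundreds or thousands of dimensions), and a plain dot product stands in for the cosine similarity defined in the next section:

```typescript
const docs = [
  { text: 'Caring for your new puppy', embedding: [0.9, 0.1, 0.2] },
  { text: 'Dog training basics', embedding: [0.8, 0.2, 0.1] },
  { text: 'Baking sourdough bread', embedding: [0.1, 0.9, 0.3] },
];
const queryText = 'dog';
const queryEmbedding = [0.85, 0.15, 0.15];

// Keyword search: only substring matches survive
const keywordHits = docs.filter((d) => d.text.toLowerCase().includes(queryText));
// => ["Dog training basics"] (the puppy article is missed)

// Vector search: rank every document by similarity to the query vector
const dot = (a: number[], b: number[]) => a.reduce((s, v, i) => s + v * b[i], 0);
const vectorHits = [...docs].sort(
  (a, b) => dot(queryEmbedding, b.embedding) - dot(queryEmbedding, a.embedding),
);
// => the puppy article ranks first despite never containing "dog"
```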
## How Embedding Vectors Work
### Vectorization Example
```typescript
import { OpenAI } from 'openai';

const openai = new OpenAI();

// Convert text to a vector
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding; // 1536-dimensional vector
}

// Cosine similarity: (a · b) / (|a| |b|), in the range [-1, 1]
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Usage example
const vec1 = await getEmbedding('learning programming');
const vec2 = await getEmbedding('mastering coding');
const vec3 = await getEmbedding('cooking recipes');
console.log(cosineSimilarity(vec1, vec2)); // ~0.92 (high similarity)
console.log(cosineSimilarity(vec1, vec3)); // ~0.21 (low similarity)
```
## Key Similarity Search Algorithms
### 1. HNSW (Hierarchical Navigable Small World)
One of the most widely used approximate nearest neighbor algorithms, offering a strong balance of speed and accuracy. Vectors are linked into a hierarchy of proximity graphs: a search enters at the sparse top layer, greedily walks toward the query, drops down a layer whenever it can get no closer, and finishes with a fine-grained walk on the dense bottom layer (see the code sketch after the diagram).
```mermaid
flowchart TB
    subgraph Layer2["Layer 2 (Sparse)"]
        A2((A)) --- B2((B))
    end
    subgraph Layer1["Layer 1 (Medium)"]
        A1((A)) --- B1((B))
        B1 --- C1((C))
        A1 --- D1((D))
    end
    subgraph Layer0["Layer 0 (Dense)"]
        A0((A)) --- B0((B))
        B0 --- C0((C))
        C0 --- E0((E))
        A0 --- D0((D))
        D0 --- F0((F))
    end
    A2 -.-> A1
    B2 -.-> B1
    A1 -.-> A0
    B1 -.-> B0
    C1 -.-> C0
    D1 -.-> D0
```
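A minimal sketch of building and querying an HNSW index from Node. It assumes the hnswlib-node package and its `HierarchicalNSW` API; treat the exact method names and parameters as illustrative rather than authoritative:

```typescript
import { HierarchicalNSW } from 'hnswlib-node';

const dimensions = 1536;
const maxElements = 10_000;

// Random vectors stand in for real embeddings in this sketch
const randomVector = () => Array.from({ length: dimensions }, () => Math.random());
const embeddings = Array.from({ length: maxElements }, randomVector);

// Build an HNSW index over cosine distance
const index = new HierarchicalNSW('cosine', dimensions);
index.initIndex(maxElements);
embeddings.forEach((vector, label) => index.addPoint(vector, label));

// ef controls the search beam width: higher means better recall, slower queries
index.setEf(64);

// Retrieve the 10 approximate nearest neighbors
const { neighbors, distances } = index.searchKnn(randomVector(), 10);
console.log(neighbors, distances);
```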
### 2. IVF (Inverted File Index)
Partitions the vectors into clusters (learned with k-means during training) and, at query time, searches only the few clusters closest to the query.
```python
# IVF usage with Faiss
import faiss
import numpy as np

# 1 million 1536-dimensional vectors
vectors = np.random.random((1_000_000, 1536)).astype('float32')

# Create IVF index (nlist=1000 clusters)
quantizer = faiss.IndexFlatL2(1536)
index = faiss.IndexIVFFlat(quantizer, 1536, 1000)

# Train (learns the cluster centroids), then add the data
index.train(vectors)
index.add(vectors)

# Search, probing the 10 nearest clusters (higher nprobe = better recall, slower)
index.nprobe = 10
query = np.random.random((1, 1536)).astype('float32')
distances, indices = index.search(query, k=10)
```
### Algorithm Comparison
| Algorithm | Search Speed | Memory Usage | Accuracy | Use Case |
|---|---|---|---|---|
| Flat (Brute Force) | Slow | Low | 100% | Small datasets |
| HNSW | Fast | High | 95%+ | General purpose |
| IVF | Medium | Medium | 90%+ | Large datasets |
| PQ (Product Quantization) | Very Fast | Very Low | 80%+ | Massive scale |
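The memory column is driven by compression. As a conceptual sketch of how PQ works (hand-rolled for illustration only; real systems use a library such as Faiss, and the codebooks are learned with k-means rather than drawn at random):

```typescript
// Product quantization: split each vector into m subvectors and replace each
// subvector with the ID of its nearest centroid, compressing 1536 floats
// into m bytes.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// codebooks[j] holds 256 centroids for subspace j
function encode(vector: number[], codebooks: number[][][]): number[] {
  const subDim = vector.length / codebooks.length;
  return codebooks.map((centroids, j) => {
    const sub = vector.slice(j * subDim, (j + 1) * subDim);
    let best = 0;
    let bestDist = Infinity;
    centroids.forEach((c, idx) => {
      const d = squaredDistance(sub, c);
      if (d < bestDist) { bestDist = d; best = idx; }
    });
    return best; // one byte per subspace
  });
}

// Approximate (asymmetric) distance: compare each raw query subvector
// against the centroid that the stored code points to.
function approxDistance(query: number[], codes: number[], codebooks: number[][][]): number {
  const subDim = query.length / codebooks.length;
  return codes.reduce((sum, code, j) => {
    const sub = query.slice(j * subDim, (j + 1) * subDim);
    return sum + squaredDistance(sub, codebooks[j][code]);
  }, 0);
}

// Illustrative setup: m=8 subspaces with 256 random "centroids" each
const m = 8, dim = 1536, subDim = dim / m;
const rand = (n: number) => Array.from({ length: n }, () => Math.random());
const codebooks = Array.from({ length: m }, () =>
  Array.from({ length: 256 }, () => rand(subDim)),
);
const codes = encode(rand(dim), codebooks); // 8 bytes instead of ~6 KB of float32s
const dist = approxDistance(rand(dim), codes, codebooks);
```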
## Popular Vector Databases
### 1. Pinecone
```typescript
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pinecone.Index('my-index');

// Add data
await index.upsert([
  {
    id: 'doc1',
    values: embedding, // number[] from your embedding model
    metadata: { title: 'Article Title', category: 'tech' },
  },
]);

// Search
const results = await index.query({
  vector: queryEmbedding,
  topK: 10,
  filter: { category: { $eq: 'tech' } },
  includeMetadata: true,
});
```
### 2. Weaviate
```typescript
import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'https',
  host: 'your-instance.weaviate.network',
});

// Schema definition
await client.schema.classCreator().withClass({
  class: 'Article',
  vectorizer: 'text2vec-openai',
  properties: [
    { name: 'title', dataType: ['text'] },
    { name: 'content', dataType: ['text'] },
  ],
}).do();

// Semantic search
const result = await client.graphql
  .get()
  .withClassName('Article')
  .withFields('title content')
  .withNearText({ concepts: ['machine learning basics'] })
  .withLimit(5)
  .do();
```
### 3. pgvector (PostgreSQL Extension)
```sql
-- Enable extension
CREATE EXTENSION vector;

-- Create table
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536)
);

-- Create index (HNSW)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);

-- Similarity search (query_embedding is a placeholder for your query vector;
-- <=> is pgvector's cosine-distance operator)
SELECT content, 1 - (embedding <=> query_embedding) AS similarity
FROM documents
ORDER BY embedding <=> query_embedding
LIMIT 10;
```
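From application code, the query vector is passed in as a parameter. A minimal sketch assuming the node-postgres (`pg`) client and the `documents` table above (the connection string and helper name are illustrative):

```typescript
import { Client } from 'pg';

// Hypothetical helper: cosine similarity search against the documents table
async function searchSimilar(queryEmbedding: number[], limit = 10) {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  // pgvector accepts vectors as '[x,y,...]' text literals
  const vectorLiteral = `[${queryEmbedding.join(',')}]`;
  const { rows } = await client.query(
    `SELECT content, 1 - (embedding <=> $1::vector) AS similarity
       FROM documents
      ORDER BY embedding <=> $1::vector
      LIMIT $2`,
    [vectorLiteral, limit],
  );
  await client.end();
  return rows;
}
```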
## RAG Implementation
RAG (Retrieval-Augmented Generation) combines a vector database with an LLM: documents relevant to the user's question are retrieved by similarity search and handed to the model as context for generating the answer.
```mermaid
flowchart LR
    Query["User Question"] --> Embed["Embedding"]
    Embed --> Search["Vector Search"]
    Search --> Context["Retrieve Docs"]
    Context --> LLM["LLM"]
    Query --> LLM
    LLM --> Answer["Generate Answer"]
```
### Implementation Example
```typescript
import { OpenAI } from 'openai';
import { Pinecone } from '@pinecone-database/pinecone';

async function ragQuery(question: string): Promise<string> {
  const openai = new OpenAI();
  const pinecone = new Pinecone();
  const index = pinecone.Index('knowledge-base');

  // 1. Vectorize question
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: question,
  });
  const queryVector = embeddingResponse.data[0].embedding;

  // 2. Search for relevant documents
  const searchResults = await index.query({
    vector: queryVector,
    topK: 5,
    includeMetadata: true,
  });

  // 3. Build context
  const context = searchResults.matches
    .map((match) => match.metadata?.content)
    .join('\n\n');

  // 4. Generate answer with LLM
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [
      {
        role: 'system',
        content: `Answer the question based on the following context.\n\n${context}`,
      },
      { role: 'user', content: question },
    ],
  });
  return completion.choices[0].message.content ?? '';
}
```
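Calling it is then a single `await` (the question text is illustrative):

```typescript
const answer = await ragQuery('Which index type should I use for 10 million vectors?');
console.log(answer);
```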
## Best Practices
### 1. Chunking Strategy
```typescript
// Split text into appropriately sized chunks that overlap their neighbors
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // otherwise the final chunk repeats forever
    start = end - overlap;
  }
  return chunks;
}
```
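A quick check of the overlap behavior:

```typescript
const chunks = chunkText('x'.repeat(1200), 500, 50);
console.log(chunks.map((c) => c.length)); // [500, 500, 300], 50 chars shared between neighbors
```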
### 2. Leverage Metadata
```typescript
// Improve precision by filtering on metadata alongside the vector match.
// Note: Pinecone's range operators ($gte, $lte) apply to numeric metadata,
// so dates are stored here as numeric YYYYMMDD values.
const results = await index.query({
  vector: queryVector,
  topK: 10,
  filter: {
    $and: [
      { category: { $eq: 'documentation' } },
      { date: { $gte: 20240101 } },
      { language: { $eq: 'en' } },
    ],
  },
});
```
### 3. Hybrid Search
```typescript
// Combine keyword (BM25) and vector search in a single Weaviate query,
// reusing the client created in the Weaviate section above
const hybridResults = await client.graphql
  .get()
  .withClassName('Article')
  .withFields('title content')
  .withHybrid({
    query: 'TypeScript type safety',
    alpha: 0.5, // 0 = pure keyword, 1 = pure vector
  })
  .withLimit(10)
  .do();
```
## Related Articles
- RAG Trends - RAG technology advances
- SQL vs NoSQL - Database selection
- AI Coding - AI development tools
## Summary
Vector databases are essential infrastructure for search in the AI era.
- Embedding Vectors: Convert text and images to numerical representations
- Similarity Search: Fast approximate nearest neighbor search with HNSW and IVF
- RAG: Build knowledge bases by combining with LLMs
- Hybrid Search: Fusion of keyword and semantic approaches
Choose the right vector database and algorithm to build the foundation for your AI applications.