Vector Databases - The Data Search Foundation for the AI Era

18 min read | 2025.01.21

What is a Vector Database?

A vector database is a specialized database designed to efficiently store and search high-dimensional vector data. It converts text, images, audio, and other data into numerical vectors (embeddings) and enables searching based on semantic similarity.
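
Conceptually, a vector database is this idea made scalable: store vectors alongside ids, rank by similarity at query time. A minimal sketch (all names hypothetical; brute-force search stands in for a real index):

```typescript
// Hypothetical minimal in-memory vector store: brute-force
// nearest-neighbor search ranked by cosine similarity.
type Doc = { id: string; vector: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

class TinyVectorStore {
  private docs: Doc[] = [];

  add(id: string, vector: number[]): void {
    this.docs.push({ id, vector });
  }

  // Return the top-k ids ranked by cosine similarity to the query.
  search(query: number[], k: number): string[] {
    return [...this.docs]
      .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
      .slice(0, k)
      .map(d => d.id);
  }
}

// Usage: 2-D toy vectors standing in for real embeddings.
const store = new TinyVectorStore();
store.add('cat', [1, 0.1]);
store.add('kitten', [0.9, 0.2]);
store.add('car', [0.1, 1]);
console.log(store.search([1, 0], 2)); // ['cat', 'kitten']
```

Real vector databases replace the brute-force scan with an approximate index (covered below), but the store-and-rank contract is the same.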

flowchart LR
    subgraph Input["Input Data"]
        Text["Text"]
        Image["Image"]
        Audio["Audio"]
    end

    subgraph Embedding["Embedding Model"]
        Model["AI Model<br/>(OpenAI, Cohere, etc.)"]
    end

    subgraph VectorDB["Vector DB"]
        Index["Index"]
        Storage["Vector Storage"]
    end

    subgraph Output["Search Results"]
        Similar["Similar Data"]
    end

    Text --> Model
    Image --> Model
    Audio --> Model
    Model --> Index
    Index --> Storage
    Storage --> Similar

Why Vector Databases?

| Feature | Keyword Search | Vector Search |
|---|---|---|
| Search method | Exact/partial match | Semantic similarity |
| Searching "dog" | Documents containing "dog" | Also finds "puppy", "canine" |
| Multilingual | Separate processing per language | Language-agnostic if meaning matches |
| Synonyms | Dictionary registration required | Understood automatically |

How Embedding Vectors Work

Vectorization Example

import { OpenAI } from 'openai';

const openai = new OpenAI();

// Convert text to vector
async function getEmbedding(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text,
  });
  return response.data[0].embedding; // 1536-dimensional vector
}

// Example: Similarity calculation
function cosineSimilarity(a: number[], b: number[]): number {
  const dotProduct = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magnitudeA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magnitudeB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dotProduct / (magnitudeA * magnitudeB);
}

// Usage example
const vec1 = await getEmbedding('learning programming');
const vec2 = await getEmbedding('mastering coding');
const vec3 = await getEmbedding('cooking recipes');

console.log(cosineSimilarity(vec1, vec2)); // high (e.g. ~0.9): related meanings
console.log(cosineSimilarity(vec1, vec3)); // low (e.g. ~0.2): unrelated topics

Key Similarity Search Algorithms

1. HNSW (Hierarchical Navigable Small World)

The most widely used ANN algorithm, offering both speed and accuracy. It builds a multi-layer proximity graph: upper layers are sparse and support long-range hops, the bottom layer is dense, and search greedily descends from the top layer toward the query.

flowchart TB
    subgraph Layer2["Layer 2 (Sparse)"]
        A2((A)) --- B2((B))
    end

    subgraph Layer1["Layer 1 (Medium)"]
        A1((A)) --- B1((B))
        B1 --- C1((C))
        A1 --- D1((D))
    end

    subgraph Layer0["Layer 0 (Dense)"]
        A0((A)) --- B0((B))
        B0 --- C0((C))
        C0 --- E0((E))
        A0 --- D0((D))
        D0 --- F0((F))
    end

    A2 -.-> A1
    B2 -.-> B1
    A1 -.-> A0
    B1 -.-> B0
    C1 -.-> C0
    D1 -.-> D0
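
The descent in the diagram bottoms out in one primitive: greedy routing on a proximity graph. A hypothetical sketch of that routing step (Euclidean distance, toy 2-D coordinates):

```typescript
// Hypothetical sketch of the greedy routing HNSW performs on each
// layer: hop to whichever neighbor is closer to the query, stop at
// a local minimum, then drop to the next layer down.
type Graph = Record<string, string[]>; // node id -> neighbor ids

function greedySearch(
  graph: Graph,
  coords: Record<string, number[]>,
  entry: string,
  query: number[],
): string {
  const dist = (id: string) =>
    Math.hypot(...coords[id].map((v, i) => v - query[i]));

  let current = entry;
  while (true) {
    let best = current;
    for (const n of graph[current]) {
      if (dist(n) < dist(best)) best = n;
    }
    if (best === current) return current; // local minimum reached
    current = best;
  }
}

// Usage: a tiny chain graph; search starts at A and walks toward the query.
const coords = { A: [0, 0], B: [2, 0], C: [4, 0], D: [4, 2] };
const graph: Graph = { A: ['B'], B: ['A', 'C'], C: ['B', 'D'], D: ['C'] };
console.log(greedySearch(graph, coords, 'A', [4, 2])); // 'D'
```

The hierarchy exists so that the upper, sparse layers position this greedy walk near the target cheaply before the dense bottom layer refines it.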

2. IVF (Inverted File Index)

Partitions the vectors into clusters (typically via k-means) and searches only the clusters nearest to the query, trading a little recall for a large speedup.

# IVF usage with Faiss
import faiss
import numpy as np

# 1 million 1536-dimensional vectors
vectors = np.random.random((1_000_000, 1536)).astype('float32')

# Create IVF index (nlist=1000 clusters)
quantizer = faiss.IndexFlatL2(1536)
index = faiss.IndexIVFFlat(quantizer, 1536, 1000)

# Train and add data
index.train(vectors)
index.add(vectors)

# Search (probe 10 clusters)
index.nprobe = 10
query = np.random.random((1, 1536)).astype('float32')
distances, indices = index.search(query, k=10)
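
The same two-phase idea, bucket by nearest centroid at index time and scan only the closest buckets at query time, can be sketched from scratch (hypothetical `TinyIVF` class, toy 2-D data; centroids are given rather than trained):

```typescript
// Hypothetical from-scratch sketch of IVF: vectors are bucketed by
// nearest centroid; a query scans only the nprobe nearest buckets.
function l2(a: number[], b: number[]): number {
  return Math.hypot(...a.map((v, i) => v - b[i]));
}

class TinyIVF {
  private lists: number[][][]; // one inverted list of vectors per centroid
  private ids: number[][];     // matching original indices

  constructor(private centroids: number[][]) {
    this.lists = centroids.map(() => []);
    this.ids = centroids.map(() => []);
  }

  add(vectors: number[][]): void {
    vectors.forEach((v, i) => {
      // Assign each vector to its nearest centroid's inverted list.
      let best = 0;
      this.centroids.forEach((c, ci) => {
        if (l2(v, c) < l2(v, this.centroids[best])) best = ci;
      });
      this.lists[best].push(v);
      this.ids[best].push(i);
    });
  }

  search(query: number[], k: number, nprobe: number): number[] {
    // Rank centroids by distance; scan only the top nprobe lists.
    const probed = this.centroids
      .map((c, ci) => [l2(query, c), ci] as const)
      .sort((a, b) => a[0] - b[0])
      .slice(0, nprobe)
      .map(([, ci]) => ci);

    const candidates: [number, number][] = [];
    for (const ci of probed) {
      this.lists[ci].forEach((v, j) =>
        candidates.push([l2(query, v), this.ids[ci][j]]),
      );
    }
    return candidates.sort((a, b) => a[0] - b[0]).slice(0, k).map(([, id]) => id);
  }
}

// Usage: two clusters around (0,0) and (10,10); nprobe=1 scans only one.
const ivf = new TinyIVF([[0, 0], [10, 10]]);
ivf.add([[0, 1], [1, 0], [10, 9], [9, 10]]);
console.log(ivf.search([9, 9], 2, 1)); // ids from the (10,10) cluster
```

Vectors near a cluster boundary can be missed when their cluster is not probed, which is why IVF accuracy depends on `nprobe`.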

Algorithm Comparison

| Algorithm | Search Speed | Memory Usage | Accuracy | Use Case |
|---|---|---|---|---|
| Flat (brute force) | Slow | Low | 100% | Small datasets |
| HNSW | Fast | High | 95%+ | General purpose |
| IVF | Medium | Medium | 90%+ | Large datasets |
| PQ (quantization) | Very fast | Very low | 80%+ | Massive scale |

Major Vector Databases

1. Pinecone

import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });
const index = pinecone.Index('my-index');

// Add data
await index.upsert([
  {
    id: 'doc1',
    values: embedding,
    metadata: { title: 'Article Title', category: 'tech' },
  },
]);

// Search
const results = await index.query({
  vector: queryEmbedding,
  topK: 10,
  filter: { category: { $eq: 'tech' } },
  includeMetadata: true,
});

2. Weaviate

import weaviate from 'weaviate-ts-client';

const client = weaviate.client({
  scheme: 'https',
  host: 'your-instance.weaviate.network',
});

// Schema definition
await client.schema.classCreator().withClass({
  class: 'Article',
  vectorizer: 'text2vec-openai',
  properties: [
    { name: 'title', dataType: ['text'] },
    { name: 'content', dataType: ['text'] },
  ],
}).do();

// Semantic search
const result = await client.graphql
  .get()
  .withClassName('Article')
  .withFields('title content')
  .withNearText({ concepts: ['machine learning basics'] })
  .withLimit(5)
  .do();

3. pgvector (PostgreSQL Extension)

-- Enable extension
CREATE EXTENSION vector;

-- Create table
CREATE TABLE documents (
  id SERIAL PRIMARY KEY,
  content TEXT,
  embedding vector(1536)
);

-- Create index (HNSW)
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops);

-- Similarity search
SELECT content, 1 - (embedding <=> query_embedding) AS similarity
FROM documents
ORDER BY embedding <=> query_embedding
LIMIT 10;
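
From application code, the query embedding has to reach PostgreSQL as a vector literal. A hypothetical formatter plus the parameterized form of the query above (the `pg` client call appears only as a comment; the `pgvector` npm package ships a similar helper):

```typescript
// Hypothetical helper: format a JS number[] as a pgvector literal
// like '[0.1,0.2,0.3]', suitable as a query parameter.
function toVectorLiteral(embedding: number[]): string {
  return `[${embedding.join(',')}]`;
}

// The SQL from above, parameterized; $1 receives the literal and is
// cast back to the vector type server-side.
const sql = `
  SELECT content, 1 - (embedding <=> $1::vector) AS similarity
  FROM documents
  ORDER BY embedding <=> $1::vector
  LIMIT 10`;

console.log(toVectorLiteral([0.1, 0.2, 0.3])); // "[0.1,0.2,0.3]"
// e.g. with node-postgres:
//   await client.query(sql, [toVectorLiteral(queryEmbedding)]);
```

Passing the vector as a parameter rather than interpolating it into the SQL string keeps the query plan cacheable and avoids injection issues.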

RAG Implementation

RAG (Retrieval-Augmented Generation) combines vector databases with LLMs.

flowchart LR
    Query["User Question"] --> Embed["Embedding"]
    Embed --> Search["Vector Search"]
    Search --> Context["Retrieve Docs"]
    Context --> LLM["LLM"]
    Query --> LLM
    LLM --> Answer["Generate Answer"]

Implementation Example

import { OpenAI } from 'openai';
import { Pinecone } from '@pinecone-database/pinecone';

async function ragQuery(question: string): Promise<string> {
  const openai = new OpenAI();
  const pinecone = new Pinecone();
  const index = pinecone.Index('knowledge-base');

  // 1. Vectorize question
  const embeddingResponse = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: question,
  });
  const queryVector = embeddingResponse.data[0].embedding;

  // 2. Search for relevant documents
  const searchResults = await index.query({
    vector: queryVector,
    topK: 5,
    includeMetadata: true,
  });

  // 3. Build context
  const context = searchResults.matches
    .map(match => match.metadata?.content)
    .join('\n\n');

  // 4. Generate answer with LLM
  const completion = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [
      {
        role: 'system',
        content: `Answer the question based on the following context.\n\n${context}`,
      },
      { role: 'user', content: question },
    ],
  });

  return completion.choices[0].message.content ?? '';
}

Best Practices

1. Chunking Strategy

// Split into appropriate chunk sizes
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  const chunks: string[] = [];
  let start = 0;

  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break; // last chunk reached; avoid re-reading the tail forever
    start = end - overlap;
  }

  return chunks;
}
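
Fixed-size windows can split sentences mid-thought. A common refinement is to pack whole sentences greedily up to the size limit; a hypothetical sketch:

```typescript
// Hypothetical sentence-aware variant: split on sentence boundaries,
// then greedily pack sentences into chunks of up to chunkSize characters.
function chunkBySentence(text: string, chunkSize = 500): string[] {
  const sentences = text.match(/[^.!?]+[.!?]+(\s+|$)|[^.!?]+$/g) ?? [text];
  const chunks: string[] = [];
  let current = '';

  for (const sentence of sentences) {
    // Flush the current chunk before it would overflow.
    if (current && current.length + sentence.length > chunkSize) {
      chunks.push(current.trim());
      current = '';
    }
    current += sentence;
  }
  if (current.trim()) chunks.push(current.trim());
  return chunks;
}

// Usage: with a small limit, each sentence lands in its own chunk.
console.log(chunkBySentence('One fish. Two fish. Red fish.', 12));
// ['One fish.', 'Two fish.', 'Red fish.']
```

Sentence-aware chunks tend to embed more cleanly because each chunk is a coherent unit of meaning rather than an arbitrary character window.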

2. Leverage Metadata

// Improve search accuracy with filtering
const results = await index.query({
  vector: queryVector,
  topK: 10,
  filter: {
    $and: [
      { category: { $eq: 'documentation' } },
      { date: { $gte: '2024-01-01' } },
      { language: { $eq: 'en' } },
    ],
  },
});

3. Hybrid Search

// Combine keyword and vector search (Weaviate hybrid API)
const hybridResults = await client.graphql
  .get()
  .withClassName('Article')
  .withHybrid({
    query: 'TypeScript type safety',
    alpha: 0.5, // 0=keyword, 1=vector
  })
  .withLimit(10)
  .do();
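
Under the hood, hybrid search fuses two rankings. A hypothetical sketch of the alpha-weighted blend, assuming both score sets are already normalized to [0, 1]:

```typescript
// Hypothetical fusion step: blend per-document keyword and vector
// scores with alpha (0 = pure keyword, 1 = pure vector).
function hybridScores(
  keyword: Record<string, number>,
  vector: Record<string, number>,
  alpha: number,
): [string, number][] {
  const ids = new Set([...Object.keys(keyword), ...Object.keys(vector)]);
  return [...ids]
    .map(
      id =>
        [id, alpha * (vector[id] ?? 0) + (1 - alpha) * (keyword[id] ?? 0)] as [
          string,
          number,
        ],
    )
    .sort((a, b) => b[1] - a[1]); // highest fused score first
}

// Usage: doc1 wins the keyword ranking, doc2 the semantic one;
// alpha = 0.5 weighs them equally.
const ranked = hybridScores(
  { doc1: 0.9, doc2: 0.1 },
  { doc1: 0.2, doc2: 0.8 },
  0.5,
);
console.log(ranked[0][0]); // 'doc1' (0.55 vs 0.45)
```

Production systems typically use rank-based fusion (e.g. reciprocal rank fusion) rather than raw score blending, since keyword and vector scores live on different scales, but the alpha trade-off is the same.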

Summary

Vector databases are essential infrastructure for search in the AI era.

  • Embedding Vectors: Convert text and images to numerical representations
  • Similarity Search: Fast approximate nearest neighbor search with HNSW and IVF
  • RAG: Build knowledge bases by combining with LLMs
  • Hybrid Search: Fusion of keyword and semantic approaches

Choose the right vector database and algorithm to build the foundation for your AI applications.
