Building Vector Search in .NET with PgVector: A Developer's Guide

If you're building .NET applications that deal with large volumes of data (documents, product catalogs, customer records, or scanned files), search is always a challenge. Keyword search misses context. Full-text search has its limits. Vector search addresses both problems by matching on meaning rather than on exact wording.

We recently came across an excellent guide by Milan Jovanović, one of the most respected voices in the .NET community and a Microsoft MVP, that walks through exactly how to implement vector search in .NET using PgVector, a PostgreSQL extension that brings semantic search capabilities directly into your existing database.

What is vector search and how does it work?

Traditional search matches exact words or phrases. If a user searches for "invoice overdue," a keyword search will only find documents containing those exact words. It won't surface a document that says "payment pending" or "balance due," even though those phrases describe much the same situation.

Vector search works differently. Instead of matching words, it matches meaning.

Here is how the pipeline works in practice:

First, text is converted into a numerical representation called an embedding, a high-dimensional array of floating point numbers that captures the semantic meaning of the content. For example, the sentences "overdue invoice" and "payment pending" would produce embeddings that are mathematically close to each other in vector space, even though they share no common words.
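To make "mathematically close" concrete, here is a minimal sketch that compares embeddings with cosine similarity. The four-dimensional vectors are made-up values for illustration only; a real embedding model produces hundreds or thousands of dimensions, and the numbers below do not come from any actual model.

```csharp
using System;

// Toy "embeddings" with invented values (assumption: illustration only).
float[] overdueInvoice = { 0.9f, 0.1f, 0.8f, 0.2f };
float[] paymentPending = { 0.8f, 0.2f, 0.9f, 0.1f };
float[] holidayPhotos  = { 0.1f, 0.9f, 0.0f, 0.7f };

double related   = CosineSimilarity(overdueInvoice, paymentPending);
double unrelated = CosineSimilarity(overdueInvoice, holidayPhotos);

Console.WriteLine($"overdue invoice vs payment pending: {related:F3}");
Console.WriteLine($"overdue invoice vs holiday photos:  {unrelated:F3}");

// Cosine similarity = dot(a, b) / (|a| * |b|).
// Values near 1 mean the vectors point in almost the same direction.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```

With these toy values, the two invoice-related vectors score far higher than the unrelated pair, which is the property vector search relies on.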

These embeddings are generated by a machine learning model, typically via an API such as OpenAI's text-embedding models or a locally hosted model (the setup shown later in this article uses Ollama), and are stored alongside your data in the database.
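As one possible illustration of this step, the sketch below requests an embedding from a locally running Ollama server over its HTTP API. It assumes Ollama is listening on its default address (http://localhost:11434) and that the qwen3-embedding:0.6b model has already been pulled; under Aspire the endpoint is normally assigned for you, so treat the hard-coded URL as a placeholder.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;

// Assumption: a local Ollama server on its default port.
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

// POST /api/embed takes a model name and the input text.
var response = await http.PostAsJsonAsync("/api/embed", new
{
    model = "qwen3-embedding:0.6b",
    input = "overdue invoice"
});
response.EnsureSuccessStatusCode();

// The response carries an "embeddings" array with one vector per input.
using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
float[] embedding = json.RootElement
    .GetProperty("embeddings")[0]
    .EnumerateArray()
    .Select(e => e.GetSingle())
    .ToArray();

Console.WriteLine($"Received a vector with {embedding.Length} dimensions.");
```

Whichever model you choose, documents and queries need to be embedded with the same one, since vectors from different models are not comparable.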

When a user runs a search query, that query is also converted into an embedding using the same model. The database then calculates the distance between the query embedding and the stored embeddings (every one of them in an exact search, or a subset of candidates when a vector index is used) and returns the results that are closest in vector space: the most semantically similar, not just keyword-matched.
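The ranking step can be sketched in memory as a brute-force nearest-neighbor search. This is only an illustration of the idea, not how PgVector is implemented; the document titles and vector values are invented.

```csharp
using System;
using System.Linq;

// Stored documents with toy embeddings (assumption: invented values).
var documents = new (string Title, float[] Embedding)[]
{
    ("Payment pending notice", new[] { 0.8f, 0.2f, 0.9f, 0.1f }),
    ("Holiday photo album",    new[] { 0.1f, 0.9f, 0.0f, 0.7f }),
    ("Balance due reminder",   new[] { 0.7f, 0.1f, 0.9f, 0.3f }),
};

// Embedding of the user's query, e.g. "invoice overdue".
float[] query = { 0.9f, 0.1f, 0.8f, 0.2f };

// Exact search: measure the distance to every document, keep the closest two.
var top = documents
    .Select(d => (d.Title, Distance: CosineDistance(query, d.Embedding)))
    .OrderBy(r => r.Distance)
    .Take(2)
    .ToList();

foreach (var (title, distance) in top)
    Console.WriteLine($"{title}: {distance:F4}");

// Cosine distance = 1 - cosine similarity; smaller means more similar.
static double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```

Scanning every row like this costs time proportional to the table size, which is why vector indexes become important as the data grows.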

PgVector enables this directly inside PostgreSQL, supporting efficient similarity searches right next to your relational data, without requiring a dedicated vector database.
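For a sense of what such a query can look like from C#, here is a rough sketch using the Npgsql driver together with the Pgvector NuGet package and pgvector's `<=>` cosine-distance operator. The connection string, the `articles` table with `title` and `embedding` columns, and the 1024-dimension placeholder vector are all assumptions made for illustration; they are not taken from Milan's article.

```csharp
using System;
using System.Linq;
using Npgsql;
using Pgvector;

// Assumption: placeholder credentials; adjust to your environment.
var connectionString =
    "Host=localhost;Port=6432;Username=postgres;Password=postgres;Database=articles";

// UseVector() registers the pgvector type mapping with Npgsql.
var dataSourceBuilder = new NpgsqlDataSourceBuilder(connectionString);
dataSourceBuilder.UseVector();
await using var dataSource = dataSourceBuilder.Build();

// Placeholder query embedding; in practice this comes from the embedding model.
var queryEmbedding = new Vector(Enumerable.Repeat(0.1f, 1024).ToArray());

// "<=>" is cosine distance; "<->" (L2) and "<#>" (negative inner product) also exist.
await using var cmd = dataSource.CreateCommand(
    "SELECT title, embedding <=> $1 AS distance " +
    "FROM articles ORDER BY embedding <=> $1 LIMIT 5");
cmd.Parameters.AddWithValue(queryEmbedding);

await using var reader = await cmd.ExecuteReaderAsync();
while (await reader.ReadAsync())
    Console.WriteLine($"{reader.GetString(0)}  (distance {reader.GetDouble(1):F4})");
```

Because the vector column sits in an ordinary table, the same query can also filter or join on relational columns.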

Initializing the Database

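Before storing vectors, you need a PostgreSQL instance with the PgVector extension available and a table to hold the embeddings. The .NET Aspire AppHost below starts PostgreSQL from the pgvector/pgvector image, alongside an Ollama container that serves the embedding model.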


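// Aspire AppHost: declares the containers and the project that depends on them.
var builder = DistributedApplication.CreateBuilder(args);

// Ollama container that serves the embedding model locally.
var ollama = builder.AddOllama("ollama")
    .WithLifetime(ContainerLifetime.Persistent)
    .WithDataVolume()
    .WithGPUSupport();

// Embedding model hosted by the Ollama container.
var embeddingModel = ollama.AddModel("qwen3-embedding:0.6b");

// PostgreSQL running the pgvector image, so the vector extension is available.
var postgres = builder.AddPostgres("postgres", port: 6432)
    .WithLifetime(ContainerLifetime.Persistent)
    .WithDataVolume()
    .WithImage("pgvector/pgvector", "pg17")
    .AddDatabase("articles");

// The application project starts only after the model and the database are ready.
builder.AddProject<Projects.PgVector_Articles>("pgvector-articles")
    .WithReference(embeddingModel)
    .WithReference(postgres)
    .WaitFor(embeddingModel)
    .WaitFor(postgres);

builder.Build().Run();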

If you're not using Aspire, you can run the same pgvector/pgvector:pg17 image via docker-compose and point to it with a regular connection string.
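The AppHost above makes the extension available in the image but does not create it in the database or define a table. One way to finish the setup is sketched below using plain Npgsql. The connection string, table name, column names, and index name are hypothetical, and `vector(1024)` is an assumption about the embedding model's output size; check the dimension your model actually returns.

```csharp
using System;
using System.Threading.Tasks;
using Npgsql;

// Assumption: placeholder credentials matching a local pgvector container.
var connectionString =
    "Host=localhost;Port=6432;Username=postgres;Password=postgres;Database=articles";

await using var dataSource = NpgsqlDataSource.Create(connectionString);
await using var conn = await dataSource.OpenConnectionAsync();

// Enable pgvector in this database (the extension is named "vector").
await Exec("CREATE EXTENSION IF NOT EXISTS vector");

// Hypothetical table: relational columns plus one vector column.
await Exec(@"
    CREATE TABLE IF NOT EXISTS articles (
        id        bigserial PRIMARY KEY,
        title     text NOT NULL,
        content   text NOT NULL,
        embedding vector(1024) NOT NULL
    )");

// HNSW index for approximate nearest-neighbor search by cosine distance.
await Exec(@"
    CREATE INDEX IF NOT EXISTS articles_embedding_idx
    ON articles USING hnsw (embedding vector_cosine_ops)");

Console.WriteLine("pgvector extension, table, and index are ready.");

// Small helper that runs one SQL statement on the open connection.
async Task Exec(string sql)
{
    await using var cmd = new NpgsqlCommand(sql, conn);
    await cmd.ExecuteNonQueryAsync();
}
```

The operator class on the index (`vector_cosine_ops` here) should match the distance operator used in queries; otherwise PostgreSQL cannot use the index for them.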

This section is based on Milan Jovanović's original article. Full code examples and implementation details are available there.

Why this matters for Iron Software customers

Many of our customers use IronPDF, IronOCR, and IronBarcode to process high volumes of documents: invoices, reports, scanned records, and shipping labels.

A practical workflow combining Iron Software libraries with PgVector could look like this:

  1. Extract – Use IronOCR to extract text from scanned PDFs or images
  2. Embed – Send extracted text to an embedding model to generate vector representations
  3. Store – Save embeddings alongside document metadata in PostgreSQL using PgVector
  4. Search – Query by meaning, returning the most semantically relevant documents rather than exact keyword matches

The result is a smarter document search system built within your existing .NET and PostgreSQL stack, with no separate vector database to deploy or operate.
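The four steps can be wired together roughly as follows. Everything here is a stand-in so the sketch runs on its own: `extractText` is a placeholder for an OCR call such as IronOCR, `embed` is a toy word-counting function rather than a real embedding model, and an in-memory list takes the place of the PostgreSQL table. The file names and text are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Canned "OCR output" keyed by file name (assumption: invented sample data).
var scannedPages = new Dictionary<string, string>
{
    ["invoice-0042.pdf"] = "Invoice 0042 is overdue. Payment required.",
    ["label-0007.png"]   = "Shipping label. Deliver to warehouse dock 3.",
};

// 1. Extract: placeholder for an OCR library call.
Func<string, string> extractText = path => scannedPages[path];

// 2. Embed: toy stand-in that counts topic words per dimension. It matches
//    vocabulary, not meaning; a real model is what captures semantics.
string[][] topics =
{
    new[] { "invoice", "payment", "overdue", "balance" },
    new[] { "shipping", "deliver", "warehouse", "label" },
};
Func<string, float[]> embed = text =>
{
    var words = text.ToLowerInvariant()
        .Split(new[] { ' ', '.', ',' }, StringSplitOptions.RemoveEmptyEntries);
    return topics.Select(t => (float)words.Count(w => t.Contains(w))).ToArray();
};

// 3. Store: in-memory list standing in for a table with a vector column.
var store = scannedPages.Keys
    .Select(path => (Path: path, Embedding: embed(extractText(path))))
    .ToList();

Console.WriteLine(Search("unpaid balance"));

// 4. Search: embed the query and return the most similar stored document.
string Search(string query)
{
    var q = embed(query);
    return store.OrderByDescending(d => Cosine(q, d.Embedding)).First().Path;
}

static double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot   += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    // Guard against all-zero vectors, which have no direction.
    return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```

Keeping each stage behind a simple delegate makes it straightforward to swap the placeholders for real OCR, a real embedding model, and a PgVector-backed table without changing the overall flow.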

What Milan's guide covers

Milan's article walks through the full C# implementation: setting up the PgVector extension in PostgreSQL, configuring Entity Framework Core with Npgsql, generating embeddings, creating vector indexes for performance, and running similarity queries. It is hands-on, production-oriented, and immediately applicable for any .NET developer.

The Iron Software dev team regularly shares .NET resources, tutorials, and engineering insights with our community.

Building document workflows in .NET? Iron Suite has everything you need.