Overview

I have accumulated the usual digital junk drawer over the years:

  • manuals for appliances I never remember how to reset
  • notes from talks and courses
  • markdown snippets from side projects
  • receipts and invoices I swear I will organize properly one day
  • little technical writeups saved with filenames like final-v3-real-final.md

What I wanted was not full-blown enterprise search. I wanted something much smaller and much more useful: semantic search over my own documents, with a cost profile that does not feel absurd for a personal project.

That is where Amazon S3 Vectors became interesting.

With general availability in December 2025, AWS positioned S3 Vectors as a low-cost vector store built on S3-native concepts: vector buckets and vector indexes. For personal knowledge search, that is exactly the right level of abstraction.


Why I Like This Service

The big win is not “yet another vector database”. The big win is that AWS finally made a vector store that feels structurally closer to object storage than to a cluster you now have to babysit.

A few details from the GA announcement matter:

  • up to 2 billion vectors per index
  • up to 10,000 indexes per vector bucket
  • infrequent queries still under one second, warm queries around 100 ms
  • up to 90% lower cost than more traditional alternatives in the right scenarios

For a small personal search engine, the important part is simply this: no nodes, no OCUs to keep warm, no always-on domain just because I want to search my notes.


The Architecture

The design is simple enough that I would actually keep it:

  1. documents land in a normal S3 bucket
  2. an ingestion Lambda extracts text and chunks it
  3. Bedrock generates embeddings
  4. the chunks are written into an S3 vector index
  5. a small query Lambda embeds the user query and runs query_vectors

That is it.

No OpenSearch cluster. No pgvector instance. No separate managed vendor just for the vector layer.
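The glue between steps 1 and 2 is a plain S3 event notification on the documents bucket. A minimal sketch of that wiring, assuming a hypothetical bucket name and Lambda ARN (substitute your own), and assuming S3 has already been granted lambda:InvokeFunction on the target:

```python
# Hypothetical names; replace with your own bucket and ingestion Lambda ARN.
DOCS_BUCKET = "personal-search-docs"
INGEST_LAMBDA_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:ingest-docs"

# Fire the ingestion Lambda whenever a markdown file lands in the bucket.
notification_config = {
    "LambdaFunctionConfigurations": [
        {
            "LambdaFunctionArn": INGEST_LAMBDA_ARN,
            "Events": ["s3:ObjectCreated:*"],
            "Filter": {
                "Key": {
                    "FilterRules": [{"Name": "suffix", "Value": ".md"}]
                }
            },
        }
    ]
}


def attach_notification(s3_client) -> None:
    # s3_client is a boto3 S3 client, e.g. boto3.client("s3").
    s3_client.put_bucket_notification_configuration(
        Bucket=DOCS_BUCKET,
        NotificationConfiguration=notification_config,
    )
```

Nothing exotic: this is the same ObjectCreated trigger pattern people use for thumbnailing, just pointed at an embedding pipeline instead.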


Step 1: Create a Vector Bucket and Index

This is the first place where I had to correct my own assumptions.

With S3 Vectors you do not just create an index by name. You first create a vector bucket, then create a vector index inside that bucket. The CLI also expects a data type.

aws s3vectors create-vector-bucket \
  --vector-bucket-name personal-search-vectors

aws s3vectors create-index \
  --vector-bucket-name personal-search-vectors \
  --index-name personal-docs-index \
  --data-type float32 \
  --dimension 1024 \
  --distance-metric cosine

I used 1024 dimensions because that is the default output size of Titan Text Embeddings v2 (it also supports 256 and 512).
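If you would rather do this from code than the CLI, the same setup translates to boto3. This is a sketch with the camelCase parameter names I believe the s3vectors client uses; double-check against your SDK version:

```python
def create_search_infra(s3vectors_client, bucket_name: str, index_name: str,
                        dimension: int = 1024) -> None:
    # The vector bucket must exist before any index can be created inside it.
    s3vectors_client.create_vector_bucket(vectorBucketName=bucket_name)
    # The dimension must match what your embedding model emits.
    s3vectors_client.create_index(
        vectorBucketName=bucket_name,
        indexName=index_name,
        dataType="float32",
        dimension=dimension,
        distanceMetric="cosine",
    )
```

Call it with a real client, e.g. `create_search_infra(boto3.client("s3vectors"), "personal-search-vectors", "personal-docs-index")`.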


Step 2: Index the Documents

For the ingestion side, the core loop is:

  • read a file from S3
  • extract text
  • split into chunks
  • generate embeddings
  • call put_vectors

The important correction here is the S3 Vectors object shape. The API expects key and data.float32, not a generic id and vector.

import json
import hashlib
import os
import boto3
from typing import Generator

s3 = boto3.client("s3")
bedrock_runtime = boto3.client("bedrock-runtime", region_name="eu-west-1")
s3vectors = boto3.client("s3vectors")

VECTOR_BUCKET = os.environ["VECTOR_BUCKET_NAME"]
VECTOR_INDEX = os.environ["VECTOR_INDEX_NAME"]


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100) -> Generator[str, None, None]:
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        yield text[start:end]
        start += chunk_size - overlap


def embed_text(text: str) -> list[float]:
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({
            "inputText": text,
            "dimensions": 1024,
            "normalize": True,
        }),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # S3 event notifications URL-encode the object key.
        from urllib.parse import unquote_plus
        key = unquote_plus(record["s3"]["object"]["key"])

        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8", errors="replace")

        vectors = []
        for idx, chunk in enumerate(chunk_text(body)):
            if not chunk.strip():
                continue

            vector_key = hashlib.sha256(f"{key}:{idx}".encode()).hexdigest()[:32]

            vectors.append({
                "key": vector_key,
                "data": {
                    "float32": embed_text(chunk),
                },
                "metadata": {
                    "source_key": key,
                    "chunk_index": idx,
                    "preview": chunk[:240],
                },
            })

        if vectors:
            s3vectors.put_vectors(
                vectorBucketName=VECTOR_BUCKET,
                indexName=VECTOR_INDEX,
                vectors=vectors,
            )
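As a sanity check on the chunking parameters, here is the same sliding-window logic restated so it runs standalone. With chunk_size 800 and overlap 100, each window starts 700 characters after the previous one:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 100):
    # Same sliding-window chunker as in the Lambda above.
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        yield text[start:end]
        start += chunk_size - overlap


# A 2000-character document becomes three chunks: 800, 800, and a 600-char tail.
lengths = [len(c) for c in chunk_text("x" * 2000)]
print(lengths)
```

The 100-character overlap means the end of each chunk is repeated at the start of the next, which helps when a sentence straddles a boundary.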

Two practical notes:

  • S3 Vectors stores float32, so if you start getting fancy with numeric types, cast deliberately.
  • Deterministic keys make re-indexing much less annoying.
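One more practical note: put_vectors accepts batches, but there is a per-request cap (500 vectors, as I understand the current quota; worth verifying). A tiny helper keeps call sites clean for larger documents:

```python
from typing import Iterator


def batched(items: list, size: int = 500) -> Iterator[list]:
    # Yield consecutive slices of at most `size` items.
    for i in range(0, len(items), size):
        yield items[i : i + size]


# In the Lambda, instead of one big call:
#   for batch in batched(vectors):
#       s3vectors.put_vectors(vectorBucketName=..., indexName=..., vectors=batch)
```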


Step 3: Query It

On the query path, the API returns distance, not a magically named semantic score. If you want a more UX-friendly field, compute it yourself in the response layer.

import json
import os
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="eu-west-1")
s3vectors = boto3.client("s3vectors")

VECTOR_BUCKET = os.environ["VECTOR_BUCKET_NAME"]
VECTOR_INDEX = os.environ["VECTOR_INDEX_NAME"]


def embed_text(text: str) -> list[float]:
    response = bedrock_runtime.invoke_model(
        modelId="amazon.titan-embed-text-v2:0",
        body=json.dumps({
            "inputText": text,
            "dimensions": 1024,
            "normalize": True,
        }),
        contentType="application/json",
        accept="application/json",
    )
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def handler(event, context):
    body = json.loads(event.get("body", "{}"))
    query = body.get("query", "").strip()

    if not query:
        return {"statusCode": 400, "body": json.dumps({"error": "query is required"})}

    results = s3vectors.query_vectors(
        vectorBucketName=VECTOR_BUCKET,
        indexName=VECTOR_INDEX,
        topK=5,
        queryVector={"float32": embed_text(query)},
        returnMetadata=True,
        returnDistance=True,
    )

    matches = []
    for match in results["vectors"]:
        metadata = match.get("metadata", {})
        matches.append({
            "distance": match["distance"],
            "document": metadata.get("source_key"),
            "chunk_index": metadata.get("chunk_index"),
            "preview": metadata.get("preview", ""),
        })

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"results": matches}),
    }

From there, turning distance into something user-friendly is straightforward.
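Since the embeddings are normalized and the index uses cosine distance (0 for identical direction, up to 2 for opposite), a serviceable display score is just 1 minus the distance, clamped to [0, 1]. A sketch:

```python
def to_score(distance: float) -> float:
    # Cosine distance: 0 = identical direction, 2 = opposite.
    # Map small distances to high scores and clamp for display.
    return max(0.0, min(1.0, 1.0 - distance))


# A close match (distance 0.12) scores high; a far one scores near zero.
print(to_score(0.12), to_score(1.7))
```

It is not a calibrated probability, just a number that reads naturally in a UI.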


Where It Fits Really Well

I’d use S3 Vectors without hesitation for personal document search, note and markdown search, lightweight RAG over internal docs, or really anything that needs a cheap retrieval layer and doesn’t justify standing up a dedicated vector database. If you’ve been putting off a hobby agent because the storage costs felt absurd, this is probably the unlock.

Where I’d be more careful: if you need advanced hybrid ranking, highly specialized retrieval workflows, or if your application already lives inside a larger search platform with native vector support. In those cases S3 Vectors ends up as just another layer to wrangle on top of something that’s already doing similar work.

The point isn’t that it replaces every vector store. The point is that it removes a lot of accidental over-engineering for the vast majority of semantic retrieval projects.


Cost Reality

The exact number depends on how often you query, how much you embed, and how much metadata you store.

But the general shape is the appealing one:

  • storage is cheap
  • queries are cheap
  • embeddings are usually the part you notice first
  • there is no always-on search cluster quietly billing you in the background

That last part is why this service feels so appropriate for side projects.

I do not want a semantic search layer that makes me feel guilty every month. I want one that sits there quietly and helps me find the note I forgot I wrote.


Final Thoughts

S3 Vectors is one of those launches that instantly unlocks a bunch of “I should build this someday” ideas because it fixes the economics.

Before this, a personal semantic search engine usually meant one of two bad trade-offs:

  • spend too much money
  • accept too much operational complexity

S3 Vectors gives a third option: keep the architecture small, keep the bill boring, and still get a real semantic retrieval layer.

That is enough to make it a very practical addition to the AWS toolbox.


See also: