Overview

A few months ago, I started a fun project: an AI-powered platform that generates unique bedtime stories for children. The idea was simple: use AWS Bedrock to create personalized stories with illustrations and audio narration, then deliver them via email subscriptions based on age groups.

What started as a fully serverless application on AWS eventually evolved into a hybrid architecture, with the frontend running on my home Kubernetes cluster while keeping all the AI processing on AWS. This article tells the story of that journey.

---

🎯 The Original Vision

The goal was to create a platform where parents could subscribe to receive AI-generated bedtime stories tailored to their children’s age group. Each story would include:

  • AI-generated text using AWS Bedrock (Claude/Nova models)
  • Magical illustrations created with Bedrock Titan Image Generator
  • Audio narration with Amazon Polly in Italian and English
  • Email notifications when new stories are ready

The system would automatically generate stories at bedtime (around 7 PM) for three age groups: 3-5 years, 6-9 years, and 10-13 years.


☁️ Phase 1: The Fully Serverless Architecture

The first version was a classic serverless architecture on AWS, built entirely with CDK.

Architecture Components

┌─────────────────────────────────────────────────────────────────┐
│                        AWS Cloud                                │
│                                                                 │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │  EventBridge │───▶│    Lambda    │───▶│   DynamoDB   │       │
│  │  Scheduler   │    │ Create Story │    │   Stories    │       │
│  └──────────────┘    └──────────────┘    └──────┬───────┘       │
│                             │                   │               │
│                             ▼                   ▼               │
│                      ┌──────────────┐    ┌──────────────┐       │
│                      │   Bedrock    │    │  DDB Stream  │       │
│                      │ Claude/Nova  │    │              │       │
│                      └──────────────┘    └──────┬───────┘       │
│                                                 │               │
│                                          ┌──────▼───────┐       │
│                                          │  EventBridge │       │
│                                          │    Pipes     │       │
│                                          └──────┬───────┘       │
│                                                 │               │
│         ┌───────────────────┬───────────────────┼────────┐      │
│         │                   │                   │        │      │
│         ▼                   ▼                   ▼        ▼      │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │    Lambda    │    │    Lambda    │    │    Lambda    │       │
│  │ Gen. Images  │    │  Gen. Audio  │    │ Notification │       │
│  └──────┬───────┘    └──────┬───────┘    └──────┬───────┘       │
│         │                   │                   │               │
│         ▼                   ▼                   ▼               │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │   Bedrock    │    │    Polly     │    │     SNS      │       │
│  │ Titan Image  │    │              │    │   Topics     │       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                      App Runner                          │   │
│  │                   Nuxt.js Frontend                       │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

Event-Driven Story Generation

The magic happens through an event-driven pipeline:

  1. EventBridge Scheduler triggers story generation at configured times for each age group
  2. Create Story Lambda uses Bedrock to generate the story text with a multi-model fallback system
  3. DynamoDB Streams capture the new story event
  4. EventBridge Pipes transform and route the event to multiple consumers
  5. Parallel Lambda functions generate images, audio, and send notifications
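
As a rough illustration of steps 3 and 4, the sketch below shows how a DynamoDB Streams record for a newly inserted story could be filtered and flattened into a smaller event for the downstream Lambdas. The attribute names (`storyId`, `ageGroup`, `language`) and the `StoryCreatedEvent` shape are assumptions made for this example, not the project's actual schema. In the real pipeline this filtering and transformation would sit in the EventBridge Pipes configuration rather than in application code.

```typescript
// Minimal shape of a DynamoDB Streams record (only the fields used here).
interface StreamRecord {
  eventName: 'INSERT' | 'MODIFY' | 'REMOVE';
  dynamodb: {
    NewImage?: Record<string, { S?: string; N?: string }>;
  };
}

// Hypothetical event handed to the image, audio, and notification consumers.
interface StoryCreatedEvent {
  storyId: string;
  ageGroup: string;
  language: string;
}

// Returns an event for newly inserted stories, or null for anything else.
function toStoryCreatedEvent(record: StreamRecord): StoryCreatedEvent | null {
  if (record.eventName !== 'INSERT') return null; // ignore updates and deletes
  const image = record.dynamodb.NewImage;
  const storyId = image?.storyId?.S;
  const ageGroup = image?.ageGroup?.S;
  if (!storyId || !ageGroup) return null; // skip malformed items
  return { storyId, ageGroup, language: image?.language?.S ?? 'en' };
}

// Sample record using the assumed attribute names.
const sample: StreamRecord = {
  eventName: 'INSERT',
  dynamodb: {
    NewImage: {
      storyId: { S: 'story-123' },
      ageGroup: { S: '3-5' },
      language: { S: 'it' },
    },
  },
};

console.log(toStoryCreatedEvent(sample));
```

Dropping everything except `INSERT` events keeps later updates to the same item (for example, when image or audio URLs are written back) from re-triggering the consumers.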

The Multi-Model AI System

To ensure reliability, I implemented a cascading fallback system for story generation:

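// Story generation with multi-model fallback.
// Models are tried in order; the first story that is long enough wins.
const models = [
  'eu.amazon.nova-pro-v1:0',                    // Primary: high quality
  'eu.anthropic.claude-3-haiku-20240307-v1:0',  // Fallback: reliable
  'eu.amazon.nova-lite-v1:0',                   // Tertiary: fast
];

// The wrapper name `generateStory` is illustrative.
async function generateStory(prompt, ageGroup) {
  for (const model of models) {
    try {
      const story = await generateWithModel(model, prompt);
      if (story.length >= config.story.minCharacters) {
        return story;
      }
      console.log(`Model ${model} returned a story that was too short, trying next...`);
    } catch (error) {
      console.log(`Model ${model} failed (${error.message}), trying next...`);
    }
  }

  // Emergency fallback: built-in template
  return generateBuiltInStory(ageGroup);
}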

The Frontend on App Runner

The frontend was a Nuxt.js (see https://nuxt.com/) application deployed on AWS App Runner. It provided:

  • Multi-language support (Italian/English)
  • Cognito authentication
  • Story browsing with images and audio playback
  • Subscription management for age groups

App Runner was convenient: push a Docker image, and it handles scaling, HTTPS, and health checks automatically.


🏠 Phase 2: The Migration to Home Kubernetes

After running the platform for a while, I decided to migrate the frontend to my home Kubernetes cluster. Why? Several reasons:

  • Cost optimization: App Runner charges per vCPU-hour, while my K3s cluster was already running
  • Learning opportunity: I wanted to explore IRSA (IAM Roles for Service Accounts) on a self-hosted cluster
  • Unified management: All my home applications in one place
  • Lower latency: Direct access without going through AWS. Better still, I keep using Cognito for user access while retaining full control over my setup, without incurring extra cost.

The Challenge: AWS Authentication

The biggest challenge was authentication. The frontend needs to access:

  • DynamoDB for stories and user preferences
  • S3 for images and audio files
  • SNS for subscription management
  • Cognito for user authentication

On App Runner, this was easy: the instance role had all the necessary permissions. On a self-hosted Kubernetes cluster, I needed a different approach.


πŸ” Implementing Pod Identity (IRSA) on K3s

IRSA (IAM Roles for Service Accounts) allows Kubernetes pods to assume AWS IAM roles without storing long-lived credentials. Here’s how I set it up.

Step 1: Create the OIDC Provider

First, I needed to let AWS verify the service account tokens issued by my cluster. This involves:

  1. Extracting the cluster’s public key used to sign service account tokens
  2. Converting it to JWKS format (JSON Web Key Set)
  3. Hosting it on S3 as an OIDC discovery endpoint
  4. Registering it as an IAM OIDC Identity Provider

# Extract the public key from K3s
ssh k3s-master "sudo cat /var/lib/rancher/k3s/server/tls/service.key" | \
  openssl rsa -pubout > sa.pub

# Convert to JWKS format
./sa-key-to-jwks.js sa.pub > jwks.json
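
The contents of `sa-key-to-jwks.js` are not shown here, so the following is only a sketch of the document it has to produce. It assumes the RSA modulus and exponent have already been extracted from `sa.pub` and base64url-encoded; the `buildJwks` helper and all values are placeholders. The one detail worth stressing is that `kid` has to match the `kid` header of the tokens the API server issues, otherwise the key lookup fails.

```typescript
// One RSA signing key in JWK form (field names from RFC 7517 / RFC 7518).
interface RsaJwk {
  kty: 'RSA';
  alg: 'RS256';
  use: 'sig';
  kid: string; // key ID; must match the `kid` header of the issued tokens
  n: string;   // modulus, base64url-encoded
  e: string;   // public exponent, base64url-encoded
}

interface Jwks {
  keys: RsaJwk[];
}

// Wraps already-extracted key material in a JWKS document.
// Extracting `n`, `e`, and `kid` from the PEM file is out of scope here.
function buildJwks(kid: string, modulus: string, exponent: string): Jwks {
  return {
    keys: [{ kty: 'RSA', alg: 'RS256', use: 'sig', kid, n: modulus, e: exponent }],
  };
}

// Placeholder values, not a real key. "AQAB" encodes the common exponent 65537.
const jwks = buildJwks('example-key-id', 'placeholder-modulus', 'AQAB');
console.log(JSON.stringify(jwks, null, 2));
```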

The CDK stack creates the S3 bucket and IAM OIDC provider:

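// CDK v2 imports used by this snippet
import * as iam from 'aws-cdk-lib/aws-iam';
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as s3deploy from 'aws-cdk-lib/aws-s3-deployment';

// The rest runs inside the stack constructor; `region` is assumed to hold the stack's region.

// S3 bucket for OIDC discovery documents.
// AWS has to fetch the discovery document and keys over HTTPS, so the objects must be
// readable; BLOCK_ACLS blocks public ACLs but still permits a bucket policy.
const oidcBucket = new s3.Bucket(this, 'OidcBucket', {
  bucketName: `${props.clusterName}-oidc`,
  publicReadAccess: false,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ACLS,
});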

// Upload OIDC discovery document
new s3deploy.BucketDeployment(this, 'OidcDiscovery', {
  sources: [s3deploy.Source.jsonData('.well-known/openid-configuration', {
    issuer: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com`,
    jwks_uri: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com/keys.json`,
    response_types_supported: ['id_token'],
    subject_types_supported: ['public'],
    id_token_signing_alg_values_supported: ['RS256'],
  })],
  destinationBucket: oidcBucket,
});

// IAM OIDC Identity Provider
new iam.OpenIdConnectProvider(this, 'OidcProvider', {
  url: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com`,
  clientIds: ['sts.amazonaws.com'],
});

Step 2: Configure K3s API Server

The K3s API server needs to issue tokens with the correct issuer and audience:

# /etc/rancher/k3s/config.yaml
kube-apiserver-arg:
  - "service-account-issuer=https://my-oidc-bucket.s3.eu-west-1.amazonaws.com"
  - "service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key"
  - "api-audiences=sts.amazonaws.com"
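
The issuer string shows up in three places (the K3s flag, the discovery document, and the IAM provider URL), and they have to be identical. A small check like the one below can catch a mismatch before a pod ever tries to authenticate. The helper names are hypothetical, and the bucket name and region are the placeholder values from the config above.

```typescript
// Builds the issuer URL for an S3-hosted OIDC endpoint (virtual-hosted style).
function issuerUrl(bucketName: string, region: string): string {
  return `https://${bucketName}.s3.${region}.amazonaws.com`;
}

// Extracts the value of a `key=value` kube-apiserver argument, if present.
function apiServerArg(args: string[], key: string): string | undefined {
  const prefix = `${key}=`;
  for (const arg of args) {
    if (arg.indexOf(prefix) === 0) return arg.slice(prefix.length);
  }
  return undefined;
}

// Returns true when the K3s issuer flag matches the URL registered in IAM.
function issuerMatches(args: string[], bucketName: string, region: string): boolean {
  return apiServerArg(args, 'service-account-issuer') === issuerUrl(bucketName, region);
}

// The same arguments as in /etc/rancher/k3s/config.yaml, as plain strings.
const k3sArgs = [
  'service-account-issuer=https://my-oidc-bucket.s3.eu-west-1.amazonaws.com',
  'service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key',
  'api-audiences=sts.amazonaws.com',
];

console.log(issuerMatches(k3sArgs, 'my-oidc-bucket', 'eu-west-1'));
```

The comparison is deliberately an exact string match: a trailing slash or a different region in any of the three places is enough to make token validation fail.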

Step 3: Deploy the Pod Identity Webhook

The Pod Identity Webhook is an admission controller that mutates pods whose service account is annotated: it injects the role ARN and a projected service account token, which the AWS SDK then exchanges for temporary credentials:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-identity-webhook
  namespace: kube-system
spec:
  template:
    spec:
      containers:
      - name: pod-identity-webhook
        image: amazon/amazon-eks-pod-identity-webhook:v0.5.5
        command:
        - /webhook
        - --annotation-prefix=eks.amazonaws.com
        - --token-audience=sts.amazonaws.com
        - --aws-default-region=eu-west-1

Step 4: Create the IAM Role

The IAM role needs a trust policy that lets tokens federated through the OIDC provider assume it, restricted to a single service account:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/my-oidc-bucket.s3.eu-west-1.amazonaws.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "my-oidc-bucket.s3.eu-west-1.amazonaws.com:aud": "sts.amazonaws.com",
        "my-oidc-bucket.s3.eu-west-1.amazonaws.com:sub": "system:serviceaccount:made2591-stories:made2591-stories-frontend"
      }
    }
  }]
}
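
If several workloads need their own role, the same policy can be generated instead of hand-edited. The sketch below is one way to do that; the `buildTrustPolicy` helper and its input type are hypothetical, and the account ID is a placeholder.

```typescript
interface TrustPolicyInput {
  accountId: string;      // 12-digit AWS account ID
  issuerHost: string;     // issuer URL without the https:// prefix
  namespace: string;      // Kubernetes namespace of the service account
  serviceAccount: string; // service account name
}

// Builds an IRSA trust policy scoped to exactly one service account.
function buildTrustPolicy(input: TrustPolicyInput) {
  const { accountId, issuerHost, namespace, serviceAccount } = input;
  return {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { Federated: `arn:aws:iam::${accountId}:oidc-provider/${issuerHost}` },
        Action: 'sts:AssumeRoleWithWebIdentity',
        Condition: {
          StringEquals: {
            [`${issuerHost}:aud`]: 'sts.amazonaws.com',
            [`${issuerHost}:sub`]: `system:serviceaccount:${namespace}:${serviceAccount}`,
          },
        },
      },
    ],
  };
}

const policy = buildTrustPolicy({
  accountId: '123456789012', // placeholder account ID
  issuerHost: 'my-oidc-bucket.s3.eu-west-1.amazonaws.com', // placeholder issuer host
  namespace: 'made2591-stories',
  serviceAccount: 'made2591-stories-frontend',
});

console.log(JSON.stringify(policy, null, 2));
```

Pinning both `aud` and `sub` means that a token from any other service account in the cluster, even one signed by the same key, cannot assume the role.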

Step 5: Deploy the Frontend

Finally, the Kubernetes deployment with the annotated service account:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: made2591-stories-frontend
  namespace: made2591-stories
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/made2591-stories-frontend-k8s-role"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: made2591-stories-frontend
spec:
  replicas: 2
  template:
    spec:
      serviceAccountName: made2591-stories-frontend
      containers:
      - name: frontend
        image: ACCOUNT.dkr.ecr.eu-west-1.amazonaws.com/made2591-stories-frontend:latest
        env:
        - name: NUXT_PUBLIC_AWS_REGION
          value: "eu-west-1"
        - name: NUXT_PUBLIC_STORIES_TABLE_NAME
          value: "dev-aiStoriesTables-AiStory-Stories"

The webhook automatically injects:

  • AWS_ROLE_ARN environment variable
  • AWS_WEB_IDENTITY_TOKEN_FILE pointing to the mounted token
  • A projected volume with the service account token
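
Because the injection happens at admission time, a pod that was created before the webhook was running starts without these variables. A startup check along the following lines makes that failure obvious instead of surfacing later as an opaque credentials error. This is a sketch: the function name is hypothetical, and the token path in the example is the location the webhook typically uses, included as an assumption.

```typescript
// Environment variables the webhook is expected to inject into the container.
const REQUIRED_VARS = ['AWS_ROLE_ARN', 'AWS_WEB_IDENTITY_TOKEN_FILE'];

// Returns the names of the injected variables that are missing or empty.
function missingWebIdentityVars(env: Record<string, string | undefined>): string[] {
  const missing: string[] = [];
  for (const name of REQUIRED_VARS) {
    if (!env[name]) missing.push(name);
  }
  return missing;
}

// Hand-written environment for illustration; a Node.js app would pass process.env.
const exampleEnv: Record<string, string | undefined> = {
  AWS_ROLE_ARN: 'arn:aws:iam::ACCOUNT:role/made2591-stories-frontend-k8s-role',
  AWS_WEB_IDENTITY_TOKEN_FILE: '/var/run/secrets/eks.amazonaws.com/serviceaccount/token',
};

const stillMissing = missingWebIdentityVars(exampleEnv);
console.log(
  stillMissing.length === 0
    ? 'web identity variables present'
    : `missing: ${stillMissing.join(', ')}`,
);
```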

🔄 How It All Works Together

Here’s the complete flow when a user accesses a story:

┌─────────────────────────────────────────────────────────────────┐
│                    Home Kubernetes Cluster                      │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                    Frontend Pod                          │   │
│  │  ┌────────────────────────────────────────────────────┐  │   │
│  │  │  1. AWS SDK reads service account token            │  │   │
│  │  │  2. Calls sts:AssumeRoleWithWebIdentity            │  │   │
│  │  │  3. Gets temporary credentials (1 hour)            │  │   │
│  │  │  4. Uses credentials to access AWS services        │  │   │
│  │  └────────────────────────────────────────────────────┘  │   │
│  └──────────────────────────────────────────────────────────┘   │
│                              │                                  │
└──────────────────────────────┼──────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────────┐
│                        AWS Cloud                                │
│                                                                 │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐       │
│  │     STS      │    │   DynamoDB   │    │      S3      │       │
│  │  Validates   │    │   Stories    │    │ Images/Audio │       │
│  │    Token     │    │              │    │              │       │
│  └──────────────┘    └──────────────┘    └──────────────┘       │
│                                                                 │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │              Backend (Lambda + EventBridge)              │   │
│  │         Story Generation Pipeline (unchanged)            │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

This architecture has several benefits:

  • No long-lived credentials are stored in the cluster
  • The temporary credentials are short-lived (1 hour) and refreshed automatically by the AWS SDK
  • Access control is fine-grained, per service account
  • Every call leaves a full audit trail in CloudTrail

📊 Results and Lessons Learned

What Worked Well

  • Seamless AWS SDK integration: The SDK automatically detects and uses the injected credentials
  • Zero code changes: The frontend code didn’t need any modifications
  • High availability: Running 2 replicas across different nodes with pod anti-affinity
  • Automatic DNS: Using external-dns-enhanced for automatic Cloudflare DNS updates

Challenges Encountered

  1. OIDC thumbprint: AWS requires the S3 certificate thumbprint, which can change
  2. Token audience: Must match exactly between K3s config and IAM trust policy
  3. Webhook timing: Pods created before the webhook is ready won’t get credentials
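
For the audience problem in particular, it helps to compare the claims of an actual token with what the trust policy expects. The sketch below assumes the token payload has already been decoded into an object (for example with a JWT debugging tool); the type and function names are hypothetical, and the values mirror the placeholders used earlier in this article.

```typescript
// Claims from a decoded service account token that matter for IRSA.
interface TokenClaims {
  iss: string;
  aud: string[]; // Kubernetes emits the audience as an array
  sub: string;
}

interface ExpectedIdentity {
  issuer: string;
  audience: string;
  namespace: string;
  serviceAccount: string;
}

// Lists every mismatch between the token and what the trust policy expects.
function findClaimMismatches(claims: TokenClaims, expected: ExpectedIdentity): string[] {
  const problems: string[] = [];
  const expectedSub = `system:serviceaccount:${expected.namespace}:${expected.serviceAccount}`;
  if (claims.iss !== expected.issuer) problems.push(`iss is "${claims.iss}"`);
  if (claims.aud.indexOf(expected.audience) === -1) problems.push(`aud lacks "${expected.audience}"`);
  if (claims.sub !== expectedSub) problems.push(`sub is "${claims.sub}"`);
  return problems;
}

const tokenClaims: TokenClaims = {
  iss: 'https://my-oidc-bucket.s3.eu-west-1.amazonaws.com',
  aud: ['sts.amazonaws.com'],
  sub: 'system:serviceaccount:made2591-stories:made2591-stories-frontend',
};

const expectedIdentity: ExpectedIdentity = {
  issuer: 'https://my-oidc-bucket.s3.eu-west-1.amazonaws.com',
  audience: 'sts.amazonaws.com',
  namespace: 'made2591-stories',
  serviceAccount: 'made2591-stories-frontend',
};

// An empty list means the token should satisfy the trust policy conditions.
console.log(findClaimMismatches(tokenClaims, expectedIdentity));
```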

Cost Comparison

Component        App Runner     K3s
Compute          ~$15/month     $0 (existing cluster)
Data Transfer    Variable       Minimal
Management       Automatic      Manual

🚀 Future Improvements

The platform continues to evolve:

  • Analytics dashboard for subscription metrics
  • Custom story themes based on user preferences
  • Mobile app for better bedtime experience
  • Additional languages (Spanish, French)
  • AI-powered recommendations based on reading history

🌟 Conclusion

This project demonstrates how serverless and Kubernetes can coexist beautifully. The backend remains fully serverless on AWS, benefiting from automatic scaling and pay-per-use pricing. The frontend runs on my home cluster, giving me full control and reducing costs.

The key enabler was IRSA, which provides secure, credential-free authentication between Kubernetes pods and AWS services. While it requires some initial setup, the security benefits and operational simplicity make it worthwhile.

If you’re running a self-hosted Kubernetes cluster and need to access AWS services, I highly recommend exploring IRSA with the Pod Identity Webhook. It’s the same mechanism that powers IRSA on EKS, and it works just as well on K3s.

Happy storytelling! 📚✨