Overview
A few months ago, I started a fun project: an AI-powered platform that generates unique bedtime stories for children. The idea was simple: use AWS Bedrock to create personalized stories with illustrations and audio narration, then deliver them via email subscriptions based on age groups.
What started as a fully serverless application on AWS eventually evolved into a hybrid architecture, with the frontend running on my home Kubernetes cluster while keeping all the AI processing on AWS. This article tells the story of that journey.

The Original Vision
The goal was to create a platform where parents could subscribe to receive AI-generated bedtime stories tailored to their children’s age group. Each story would include:
- AI-generated text using AWS Bedrock (Claude/Nova models)
- Magical illustrations created with Bedrock Titan Image Generator
- Audio narration with Amazon Polly in Italian and English
- Email notifications when new stories are ready
The system would automatically generate stories at bedtime (around 7 PM) for three age groups: 3-5 years, 6-9 years, and 10-13 years.
Phase 1: The Fully Serverless Architecture
The first version was a classic serverless architecture on AWS, built entirely with CDK.
Architecture Components
```
+------------------------------------------------------------------+
|                            AWS Cloud                             |
|                                                                  |
|  +--------------+      +--------------+      +--------------+    |
|  | EventBridge  |----->|    Lambda    |----->|   DynamoDB   |    |
|  |  Scheduler   |      | Create Story |      |   Stories    |    |
|  +--------------+      +------+-------+      +------+-------+    |
|                               |                     |            |
|                               v                     v            |
|                        +--------------+      +--------------+    |
|                        |   Bedrock    |      |  DDB Stream  |    |
|                        | Claude/Nova  |      +------+-------+    |
|                        +--------------+             |            |
|                                                     v            |
|                                              +--------------+    |
|                                              | EventBridge  |    |
|                                              |    Pipes     |    |
|                                              +------+-------+    |
|                                                     |            |
|         +------------------+-------------------+----+            |
|         |                  |                   |                 |
|         v                  v                   v                 |
|  +--------------+   +--------------+   +--------------+          |
|  |    Lambda    |   |    Lambda    |   |    Lambda    |          |
|  |  Gen. Images |   |  Gen. Audio  |   | Notification |          |
|  +------+-------+   +------+-------+   +------+-------+          |
|         |                  |                  |                  |
|         v                  v                  v                  |
|  +--------------+   +--------------+   +--------------+          |
|  |   Bedrock    |   |    Polly     |   |     SNS      |          |
|  |  Titan Image |   |              |   |    Topics    |          |
|  +--------------+   +--------------+   +--------------+          |
|                                                                  |
|  +------------------------------------------------------------+  |
|  |                         App Runner                         |  |
|  |                      Nuxt.js Frontend                      |  |
|  +------------------------------------------------------------+  |
+------------------------------------------------------------------+
```
Event-Driven Story Generation
The magic happens through an event-driven pipeline:
- EventBridge Scheduler triggers story generation at configured times for each age group
- Create Story Lambda uses Bedrock to generate the story text with a multi-model fallback system
- DynamoDB Streams capture the new story event
- EventBridge Pipes transform and route the event to multiple consumers
- Parallel Lambda functions generate images, audio, and send notifications
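The consumers at the end of this pipeline all receive the new story as a DynamoDB stream record, whose attribute values are wrapped in the stream's type descriptors. A minimal sketch of unmarshalling the stream's NewImage into a plain object (the attribute names here are illustrative, not the exact production schema):

```typescript
// Minimal unmarshalling of a DynamoDB stream NewImage into a plain object.
// Only the S / N / BOOL types used here are handled; the attribute names
// (storyId, ageGroup, generatedAt) are illustrative placeholders.
type AttributeValue = { S?: string; N?: string; BOOL?: boolean };

function unmarshalImage(image: Record<string, AttributeValue>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(image)) {
    if (value.S !== undefined) result[key] = value.S;
    else if (value.N !== undefined) result[key] = Number(value.N);
    else if (value.BOOL !== undefined) result[key] = value.BOOL;
  }
  return result;
}

// Example stream record as the image/audio/notification Lambdas would see it
const newImage = {
  storyId: { S: 'story-123' },
  ageGroup: { S: '6-9' },
  generatedAt: { N: '1700000000' },
};

console.log(unmarshalImage(newImage));
```

In production you would typically reach for `unmarshall` from `@aws-sdk/util-dynamodb` instead of hand-rolling this, but the sketch shows what the Pipes consumers are actually working with.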
The Multi-Model AI System
To ensure reliability, I implemented a cascading fallback system for story generation:
```typescript
// Story generation with multi-model fallback
const models = [
  'eu.amazon.nova-pro-v1:0',                    // Primary: high quality
  'eu.anthropic.claude-3-haiku-20240307-v1:0',  // Fallback: reliable
  'eu.amazon.nova-lite-v1:0',                   // Tertiary: fast
];

for (const model of models) {
  try {
    const story = await generateWithModel(model, prompt);
    if (story.length >= config.story.minCharacters) {
      return story;
    }
  } catch (error) {
    console.log(`Model ${model} failed, trying next...`);
  }
}

// Emergency fallback: built-in template
return generateBuiltInStory(ageGroup);
```
The Frontend on App Runner
The frontend was a Nuxt.js (see https://nuxt.com/) application deployed on AWS App Runner. It provided:
- Multi-language support (Italian/English)
- Cognito authentication
- Story browsing with images and audio playback
- Subscription management for age groups
App Runner was convenient: push a Docker image, and it handles scaling, HTTPS, and health checks automatically.
Phase 2: The Migration to Home Kubernetes
After running the platform for a while, I decided to migrate the frontend to my home Kubernetes cluster. Why? Several reasons:
- Cost optimization: App Runner charges per vCPU-hour, while my K3s cluster was already running
- Learning opportunity: I wanted to explore IRSA (IAM Roles for Service Accounts) on a self-hosted cluster
- Unified management: All my home applications in one place
- Lower latency: direct access on my local network, while still leveraging Cognito for user access, keeping full control of my setup, and avoiding extra cost
The Challenge: AWS Authentication
The biggest challenge was authentication. The frontend needs to access:
- DynamoDB for stories and user preferences
- S3 for images and audio files
- SNS for subscription management
- Cognito for user authentication
On App Runner, this was easy: the instance role had all the necessary permissions. On a self-hosted Kubernetes cluster, I needed a different approach.
Implementing Pod Identity (IRSA) on K3s
IRSA (IAM Roles for Service Accounts) allows Kubernetes pods to assume AWS IAM roles without storing long-lived credentials. Here’s how I set it up.
Step 1: Create the OIDC Provider
First, I needed to expose my cluster’s service account tokens to AWS. This involves:
- Extracting the cluster’s public key used to sign service account tokens
- Converting it to JWKS format (JSON Web Key Set)
- Hosting it on S3 as an OIDC discovery endpoint
- Registering it as an IAM OIDC Identity Provider
```bash
# Extract the public key from K3s
ssh k3s-master "sudo cat /var/lib/rancher/k3s/server/tls/service.key" | \
  openssl rsa -pubout > sa.pub

# Convert to JWKS format
./sa-key-to-jwks.js sa.pub > jwks.json
```
The CDK stack creates the S3 bucket and IAM OIDC provider:
```typescript
// S3 bucket for OIDC discovery documents
const oidcBucket = new s3.Bucket(this, 'OidcBucket', {
  bucketName: `${props.clusterName}-oidc`,
  publicReadAccess: false,
  blockPublicAccess: s3.BlockPublicAccess.BLOCK_ACLS,
});

// Upload OIDC discovery document
new s3deploy.BucketDeployment(this, 'OidcDiscovery', {
  sources: [s3deploy.Source.jsonData('.well-known/openid-configuration', {
    issuer: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com`,
    jwks_uri: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com/keys.json`,
    response_types_supported: ['id_token'],
    subject_types_supported: ['public'],
    id_token_signing_alg_values_supported: ['RS256'],
  })],
  destinationBucket: oidcBucket,
});

// IAM OIDC Identity Provider
new iam.OpenIdConnectProvider(this, 'OidcProvider', {
  url: `https://${oidcBucket.bucketName}.s3.${region}.amazonaws.com`,
  clientIds: ['sts.amazonaws.com'],
});
```
Step 2: Configure K3s API Server
The K3s API server needs to issue tokens with the correct issuer and audience:
```yaml
# /etc/rancher/k3s/config.yaml
kube-apiserver-arg:
  - "service-account-issuer=https://my-oidc-bucket.s3.eu-west-1.amazonaws.com"
  - "service-account-signing-key-file=/var/lib/rancher/k3s/server/tls/service.key"
  - "api-audiences=sts.amazonaws.com"
```
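A quick way to confirm the API server is now issuing tokens with the right claims is to request one (for example with kubectl create token, passing --audience=sts.amazonaws.com) and decode its payload. The decode step is just base64url on the middle JWT segment, sketched here:

```typescript
// Decode a JWT payload for inspection only (no signature verification).
function decodeJwtPayload(token: string): Record<string, unknown> {
  const payload = token.split('.')[1];
  return JSON.parse(Buffer.from(payload, 'base64url').toString('utf8'));
}

// After applying the K3s config above, the decoded token should show:
//   iss: https://my-oidc-bucket.s3.eu-west-1.amazonaws.com
//   aud: ["sts.amazonaws.com"]
```

If iss or aud don't match the IAM trust policy exactly, AssumeRoleWithWebIdentity will fail, so this check is worth doing before touching IAM.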
Step 3: Deploy the Pod Identity Webhook
The Pod Identity Webhook is an admission controller that automatically injects AWS credentials into pods with annotated service accounts:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pod-identity-webhook
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: pod-identity-webhook
          image: amazon/amazon-eks-pod-identity-webhook:v0.5.5
          command:
            - /webhook
            - --annotation-prefix=eks.amazonaws.com
            - --token-audience=sts.amazonaws.com
            - --aws-default-region=eu-west-1
```
Step 4: Create the IAM Role
The IAM role needs a trust policy that allows the OIDC provider to assume it:
```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::ACCOUNT:oidc-provider/my-bucket.s3.eu-west-1.amazonaws.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": {
        "my-bucket.s3.eu-west-1.amazonaws.com:aud": "sts.amazonaws.com",
        "my-bucket.s3.eu-west-1.amazonaws.com:sub": "system:serviceaccount:made2591-stories:made2591-stories-frontend"
      }
    }
  }]
}
```
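Because the provider hostname appears three times and the sub string has a strict system:serviceaccount:namespace:name format, it is less error-prone to generate this policy than to hand-edit it. A sketch (the account ID and names passed in are placeholders):

```typescript
// Build the IRSA trust policy for a given service account.
// All parameters are placeholders supplied by the caller.
function buildTrustPolicy(
  accountId: string,
  issuerHost: string,
  namespace: string,
  serviceAccount: string,
) {
  return {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Allow',
      Principal: {
        Federated: `arn:aws:iam::${accountId}:oidc-provider/${issuerHost}`,
      },
      Action: 'sts:AssumeRoleWithWebIdentity',
      Condition: {
        StringEquals: {
          [`${issuerHost}:aud`]: 'sts.amazonaws.com',
          [`${issuerHost}:sub`]: `system:serviceaccount:${namespace}:${serviceAccount}`,
        },
      },
    }],
  };
}
```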
Step 5: Deploy the Frontend
Finally, the Kubernetes deployment with the annotated service account:
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: made2591-stories-frontend
  namespace: made2591-stories
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::ACCOUNT:role/made2591-stories-frontend-k8s-role"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: made2591-stories-frontend
spec:
  replicas: 2
  template:
    spec:
      serviceAccountName: made2591-stories-frontend
      containers:
        - name: frontend
          image: ACCOUNT.dkr.ecr.eu-west-1.amazonaws.com/made2591-stories-frontend:latest
          env:
            - name: NUXT_PUBLIC_AWS_REGION
              value: "eu-west-1"
            - name: NUXT_PUBLIC_STORIES_TABLE_NAME
              value: "dev-aiStoriesTables-AiStory-Stories"
```
The webhook automatically injects:
- The AWS_ROLE_ARN environment variable
- The AWS_WEB_IDENTITY_TOKEN_FILE environment variable pointing to the mounted token
- A projected volume with the service account token
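This is why no SDK configuration is needed in the frontend: the AWS SDK's default credential chain looks for exactly this pair of variables before attempting AssumeRoleWithWebIdentity. Conceptually, the precondition it checks looks like this (a simplified sketch, not the SDK's actual source):

```typescript
// Simplified sketch of the web-identity precondition in the SDK's
// default credential chain (not the actual SDK implementation).
function webIdentityConfigured(
  env: Record<string, string | undefined>,
): { roleArn: string; tokenFile: string } | null {
  const roleArn = env['AWS_ROLE_ARN'];
  const tokenFile = env['AWS_WEB_IDENTITY_TOKEN_FILE'];
  if (!roleArn || !tokenFile) return null; // fall through to other providers
  return { roleArn, tokenFile };
}
```

If the webhook hasn't injected the variables (for example, because the pod started before the webhook was ready), the chain silently falls through to the next provider, which is why missing credentials can be confusing to debug.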
How It All Works Together
Here’s the complete flow when a user accesses a story:
```
+------------------------------------------------------------------+
|                     Home Kubernetes Cluster                      |
|                                                                  |
|  +------------------------------------------------------------+  |
|  |                       Frontend Pod                         |  |
|  |  +------------------------------------------------------+  |  |
|  |  | 1. AWS SDK reads service account token               |  |  |
|  |  | 2. Calls sts:AssumeRoleWithWebIdentity               |  |  |
|  |  | 3. Gets temporary credentials (1 hour)               |  |  |
|  |  | 4. Uses credentials to access AWS services           |  |  |
|  |  +------------------------------------------------------+  |  |
|  +------------------------------------------------------------+  |
|                               |                                  |
+-------------------------------|----------------------------------+
                                |
                                v
+------------------------------------------------------------------+
|                            AWS Cloud                             |
|                                                                  |
|  +--------------+      +--------------+      +--------------+    |
|  |     STS      |      |   DynamoDB   |      |      S3      |    |
|  |  Validates   |      |   Stories    |      | Images/Audio |    |
|  |    Token     |      |              |      |              |    |
|  +--------------+      +--------------+      +--------------+    |
|                                                                  |
|  +------------------------------------------------------------+  |
|  |               Backend (Lambda + EventBridge)               |  |
|  |            Story Generation Pipeline (unchanged)           |  |
|  +------------------------------------------------------------+  |
+------------------------------------------------------------------+
```
The beauty of this architecture is that:
- No credentials are stored in the cluster
- Tokens are short-lived (1 hour) and automatically refreshed
- Fine-grained access control per service account
- Full audit trail via CloudTrail
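The automatic refresh works on a window: the SDK treats credentials as stale slightly before the one-hour expiry, so a fresh AssumeRoleWithWebIdentity call happens before anything fails. A conceptual sketch (the five-minute window is an illustrative value, not the SDK's exact one):

```typescript
// Treat credentials as stale slightly before their actual expiry,
// so a refresh call can complete in time. The window below is an
// illustrative value, not the SDK's exact setting.
const REFRESH_WINDOW_MS = 5 * 60 * 1000;

function needsRefresh(expiration: Date, now: Date = new Date()): boolean {
  return expiration.getTime() - now.getTime() < REFRESH_WINDOW_MS;
}
```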
Results and Lessons Learned
What Worked Well
- Seamless AWS SDK integration: The SDK automatically detects and uses the injected credentials
- Zero code changes: The frontend code didn’t need any modifications
- High availability: Running 2 replicas across different nodes with pod anti-affinity
- Automatic DNS: Using external-dns-enhanced for automatic Cloudflare DNS updates
Challenges Encountered
- OIDC thumbprint: AWS requires the S3 certificate thumbprint, which can change
- Token audience: Must match exactly between K3s config and IAM trust policy
- Webhook timing: Pods created before the webhook is ready won’t get credentials
Cost Comparison
| Component | App Runner | K3s |
|---|---|---|
| Compute | ~$15/month | $0 (existing cluster) |
| Data Transfer | Variable | Minimal |
| Management | Automatic | Manual |
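For context, the ~$15/month figure follows from App Runner's per-resource billing on an always-on instance. A rough back-of-the-envelope, where the rates are illustrative placeholders rather than official pricing (check the current App Runner price list for your region):

```typescript
// Back-of-the-envelope App Runner cost for a small always-active instance.
// Rates are illustrative placeholders, NOT official AWS pricing.
const VCPU_RATE = 0.064;     // $ per vCPU-hour (placeholder)
const MEM_RATE = 0.007;      // $ per GB-hour (placeholder)
const HOURS_PER_MONTH = 730; // average hours in a month

function monthlyCost(vcpu: number, memoryGb: number): number {
  return HOURS_PER_MONTH * (vcpu * VCPU_RATE + memoryGb * MEM_RATE);
}

// 0.25 vCPU / 0.5 GB, always active: lands in the ~$15/month ballpark
console.log(monthlyCost(0.25, 0.5).toFixed(2));
```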
Future Improvements
The platform continues to evolve:
- Analytics dashboard for subscription metrics
- Custom story themes based on user preferences
- Mobile app for better bedtime experience
- Additional languages (Spanish, French)
- AI-powered recommendations based on reading history
Conclusion
This project demonstrates how serverless and Kubernetes can coexist beautifully. The backend remains fully serverless on AWS, benefiting from automatic scaling and pay-per-use pricing. The frontend runs on my home cluster, giving me full control and reducing costs.
The key enabler was IRSA, which provides secure, credential-free authentication between Kubernetes pods and AWS services. While it requires some initial setup, the security benefits and operational simplicity make it worthwhile.
If you’re running a self-hosted Kubernetes cluster and need to access AWS services, I highly recommend exploring Pod Identity. It’s the same technology that powers EKS, and it works just as well on K3s.
Happy storytelling!