
Ollama with WebUI on AWS ECS with GPU Support

Serkan H March 30, 2025

This project deploys Ollama and Open WebUI on AWS ECS with GPU support, so you can run any large language model supported by Ollama and expose it through an API endpoint to the Open WebUI front end.

Architecture Overview

The infrastructure consists of:

ECS cluster running on EC2 instances with GPU support (g4dn.xlarge)
Ollama service for running LLMs
Open WebUI service for the user interface
Application Load Balancer for routing traffic
Service Connect for service discovery
CloudWatch for logging

Features

GPU Acceleration: Uses g4dn.xlarge instances with NVIDIA T4 GPUs
Secure Architecture: Services run in private subnets with public access through ALB
Service Discovery: Uses ECS Service Connect for internal communication
Logging: CloudWatch logs with 1-day retention

Prerequisites

AWS Account
VPC with public and private subnets
NAT Gateway for outbound internet access from private subnets
Terraform installed

Deployment

Clone this repository
Update terraform.tfvars with your configuration
Initialize Terraform: terraform init
Apply the configuration: terraform apply
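
The steps above can be sketched as one shell function. This is a minimal sketch, not a script from the project: the function name deploy_stack and the plan file name tfplan are illustrative, and it assumes terraform is on the PATH and terraform.tfvars has already been filled in.

```shell
# Hypothetical helper: run the Terraform workflow from the repository root.
# Assumes terraform.tfvars already holds your configuration.
deploy_stack() {
  terraform init || return 1              # download providers and modules
  terraform plan -out=tfplan || return 1  # save the plan so it can be reviewed
  terraform apply tfplan                  # apply exactly the saved plan
}

# Usage: deploy_stack
```

Saving the plan to a file means the apply step runs the same changes that were shown during planning, and it does not stop for an interactive confirmation.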

Accessing the WebUI

After deployment, you can access the WebUI using the URL provided in the Terraform outputs:

terraform output webui_url

Open the link to reach the WebUI, where you can register as an admin and start using the app.
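
Before opening a browser, it can help to confirm that the load balancer answers at all. The following is a minimal sketch under a few assumptions: the function name webui_status is illustrative, curl is installed, and the command runs from the Terraform working directory so that the webui_url output is available.

```shell
# Hypothetical helper: print the HTTP status code returned by the WebUI URL.
# Assumes the Terraform output is named webui_url, as in the command above.
webui_status() {
  local url
  url=$(terraform output -raw webui_url) || return 1
  # -s: no progress output, -o /dev/null: discard the body,
  # -w: print only the status code, --max-time: give up after 10 seconds
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url"
}

# Usage: webui_status
```

A 200 suggests the ALB is routing to a healthy task. A 502 or 503 typically means the targets are not yet registered or are failing health checks, which is common in the first minutes after deployment.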

Loading Models

To load a model, go to <alb-url>/admin/settings and click the Models tab on the left. On the Models page, click the download button and select the model you’d like to download.

Scaling

To adjust the number of instances:

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name ecs-gpu-asg \
  --min-size 2 --max-size 4 --desired-capacity 2
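
To confirm the new sizes took effect, the group can be queried afterwards. This is a minimal sketch: the function name asg_capacity is illustrative, and the default group name is taken from the command above.

```shell
# Hypothetical helper: print min, max and desired capacity for an Auto Scaling group.
# Defaults to ecs-gpu-asg, the group name used in the command above.
asg_capacity() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "${1:-ecs-gpu-asg}" \
    --query 'AutoScalingGroups[0].[MinSize,MaxSize,DesiredCapacity]' \
    --output text
}

# Usage: asg_capacity
```

With text output, the three values are printed on one line separated by tabs.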

Troubleshooting

Common Issues

GPU not detected: Check the NVIDIA driver installation by running nvidia-smi on the EC2 instance
Services not starting: Check ECS service events in AWS Console
Cannot connect to WebUI: Verify security groups and load balancer health checks
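
Service events can also be read from the command line instead of the console. The following is a minimal sketch: the function name ecs_recent_events is illustrative, and the cluster and service names are arguments because this post does not state them.

```shell
# Hypothetical helper: print the first five entries of an ECS service's event list.
# Cluster and service names are placeholders you must supply.
ecs_recent_events() {
  aws ecs describe-services \
    --cluster "$1" \
    --services "$2" \
    --query 'services[0].events[:5].[createdAt,message]' \
    --output text
}

# Usage: ecs_recent_events <cluster-name> <service-name>
```

Each line shows a timestamp and a message. Messages about being unable to place a task often point to missing GPU capacity on the instances.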

Viewing Logs

# View WebUI logs
aws logs get-log-events --log-group-name /ecs/webui-service --log-stream-name <LOG_STREAM>

# View Ollama logs
aws logs get-log-events --log-group-name /ecs/ollama-service --log-stream-name <LOG_STREAM>
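
If you do not know the log stream name, AWS CLI v2 can read a whole log group with its logs tail command. This is a minimal sketch assuming AWS CLI v2 is installed; the function name tail_service_logs and the one-hour default window are illustrative.

```shell
# Hypothetical helper: print recent events from every stream in a log group.
# Requires AWS CLI v2, which provides the `logs tail` command.
tail_service_logs() {
  aws logs tail "$1" --since "${2:-1h}" --format short
}

# Usage: tail_service_logs /ecs/ollama-service 30m
```

Adding --follow to the aws logs tail call keeps the command running and streams new events as they arrive.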

Cost Optimization

g4dn.xlarge instances cost approximately $0.526/hour On-Demand; pricing varies by region
CloudWatch logs are configured with 1-day retention to minimize storage costs
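
To turn the hourly rate into a budget figure, multiply it by the instance count and the hours of runtime. The following is a minimal sketch: the function name gpu_cost is illustrative, and the estimate covers only EC2 instance hours, not the ALB, NAT Gateway, storage, or data transfer.

```shell
# Hypothetical helper: estimate instance cost in USD.
# Arguments: instance count, hours of runtime, hourly rate (defaults to 0.526).
gpu_cost() {
  awk -v n="$1" -v h="$2" -v r="${3:-0.526}" 'BEGIN { printf "%.2f\n", n * h * r }'
}

# Usage: gpu_cost 1 730    # one instance for a 730-hour month, prints 383.98
```

At the rate above, one instance running around the clock comes to roughly $384 per month, so scaling down when the service is idle has a large effect on the bill.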
