
Ollama with WebUI on AWS ECS with GPU Support

Serkan H

This project deploys Ollama and Open WebUI on AWS ECS with GPU support. It lets you run any large language model supported by Ollama and serve it through an API endpoint to a full-featured web UI.

Architecture Overview

The infrastructure consists of:

  • ECS cluster running on EC2 instances with GPU support (g4dn.xlarge)
  • Ollama service for running LLMs
  • Open WebUI service for the user interface
  • Application Load Balancer for routing traffic
  • Service Connect for service discovery
  • CloudWatch for logging

Features

  • GPU Acceleration: Uses g4dn.xlarge instances with NVIDIA T4 GPUs
  • Secure Architecture: Services run in private subnets with public access through ALB
  • Service Discovery: Uses ECS Service Connect for internal communication
  • Logging: CloudWatch logs with 1-day retention

Prerequisites

  • AWS Account
  • VPC with public and private subnets
  • NAT Gateway for outbound internet access from private subnets
  • Terraform installed
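Before deploying, it can help to confirm that the command-line tools are on your PATH. The sketch below is one way to do that; the function name check_tools is hypothetical, and the AWS CLI is included only because the scaling and log commands later in this README use it.

```shell
# Report any required command-line tool that is not on PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_tools terraform aws
```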

Deployment

  1. Clone this repository
  2. Update terraform.tfvars with your configuration
  3. Initialize Terraform: terraform init
  4. Apply the configuration: terraform apply
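The steps above can be sketched as a small shell function. This is a minimal sketch, not the project's own script: the function name deploy and the plan file name tfplan are hypothetical, and the repository URL is left as a placeholder because it is not given here. Saving the plan to a file is an optional extra step so that apply runs exactly what was reviewed.

```shell
# Assumed to run from inside the cloned repository, for example:
#   git clone <repository-url> && cd <repository-directory>

# Run the Terraform steps in order and stop at the first failure.
deploy() {
  terraform init || return 1
  # Save the plan so that apply runs exactly what was reviewed.
  terraform plan -out=tfplan || return 1
  terraform apply tfplan
}

# Example: deploy
```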

Accessing the WebUI

After deployment, you can access the WebUI using the URL provided in the Terraform outputs:

terraform output webui_url

Open the link to reach the Open WebUI interface. The first account you register becomes the admin account, and you can then start using the app.
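If the page does not load, a quick check from the terminal can show whether the load balancer is answering at all. The sketch below assumes the webui_url output is a complete URL including the scheme; the function name webui_status is hypothetical.

```shell
# Print the HTTP status code returned by the WebUI URL
# taken from the Terraform outputs.
webui_status() {
  url=$(terraform output -raw webui_url) || return 1
  curl -s -o /dev/null -w '%{http_code}\n' "$url"
}

# Example: webui_status
```

A 5xx status usually means the targets behind the load balancer are not healthy yet, which can be the case for a few minutes after the first deployment.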

Loading Models

To load a model, go to <alb-url>/admin/settings and select the Models tab on the left. On the Models page, click the download button and choose the model you'd like to download.
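After a download finishes, you may want to confirm from the terminal that Ollama has the model. The sketch below assumes that the deployed Open WebUI version exposes its Ollama proxy under /ollama and that you have generated an API key in your Open WebUI account settings; both are assumptions about your setup, and the function name list_models is hypothetical.

```shell
# List the models Ollama currently holds, via Open WebUI's Ollama proxy.
# $1: base URL of the WebUI (the ALB URL, without a trailing slash)
# $2: an Open WebUI API key
list_models() {
  curl -fsS -H "Authorization: Bearer $2" "$1/ollama/api/tags"
}

# Example: list_models "<alb-url>" "<api-key>"
```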

Scaling

To adjust the number of instances:

aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name ecs-gpu-asg \
  --min-size 2 --max-size 4 --desired-capacity 2
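To confirm that the change took effect, you can read the group's settings back. This is a minimal sketch; the function name asg_status is hypothetical, and the group name in the example is the one used in the command above.

```shell
# Print min size, max size, desired capacity, and the number of
# in-service instances for an Auto Scaling group.
asg_status() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].[MinSize,MaxSize,DesiredCapacity,length(Instances[?LifecycleState==`InService`])]' \
    --output text
}

# Example: asg_status ecs-gpu-asg
```

The in-service count can lag behind the desired capacity for a few minutes while new instances boot.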

Troubleshooting

Common Issues

  1. GPU not detected: Run nvidia-smi on the EC2 instance to check that the NVIDIA driver is installed
  2. Services not starting: Check the ECS service events in the AWS Console
  3. Cannot connect to WebUI: Verify the security group rules and the load balancer health checks
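The checks in items 2 and 3 can also be done with the AWS CLI. The sketch below is one way to do that; the function names are hypothetical, and the cluster name, service name, and target group ARN are placeholders that depend on your Terraform configuration.

```shell
# Show the five most recent events for an ECS service,
# for example task placement failures.
# $1: cluster name, $2: service name
service_events() {
  aws ecs describe-services --cluster "$1" --services "$2" \
    --query 'services[0].events[:5].[createdAt,message]' --output text
}

# Show the health state of every target registered in an ALB target group.
# $1: target group ARN
target_health() {
  aws elbv2 describe-target-health --target-group-arn "$1" \
    --query 'TargetHealthDescriptions[].[Target.Id,TargetHealth.State,TargetHealth.Reason]' \
    --output text
}

# Example: service_events "<cluster-name>" "<service-name>"
# Example: target_health "<target-group-arn>"
```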

Viewing Logs

# View WebUI logs
aws logs get-log-events --log-group-name /ecs/webui-service --log-stream-name <LOG_STREAM>

# View Ollama logs
aws logs get-log-events --log-group-name /ecs/ollama-service --log-stream-name <LOG_STREAM>
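The commands above need a log stream name. A minimal sketch for finding the most recently active stream and printing its messages follows; the function names latest_stream and latest_logs are hypothetical, and the log group names in the examples are the ones used above.

```shell
# Print the name of the most recently active stream in a log group.
latest_stream() {
  aws logs describe-log-streams --log-group-name "$1" \
    --order-by LastEventTime --descending --limit 1 \
    --query 'logStreams[0].logStreamName' --output text
}

# Print the messages from the most recently active stream of a log group.
latest_logs() {
  stream=$(latest_stream "$1") || return 1
  aws logs get-log-events --log-group-name "$1" --log-stream-name "$stream" \
    --query 'events[].message' --output text
}

# Example: latest_logs /ecs/webui-service
# Example: latest_logs /ecs/ollama-service
```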

Cost Optimization

  • g4dn.xlarge instances cost approximately $0.526/hour
  • CloudWatch logs are configured with 1-day retention to minimize storage costs
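To turn the hourly rate into a rough budget figure, multiply it by the number of instances and the hours they run. The sketch below does that arithmetic; the function name estimate_cost is hypothetical, and the 730-hour month is an assumed average. It covers only the instance rate quoted above, not the NAT Gateway, load balancer, storage, or data transfer.

```shell
# Estimate cost in USD: hourly rate x instance count x hours.
# $1: hourly rate, $2: number of instances, $3: hours
estimate_cost() {
  awk -v rate="$1" -v count="$2" -v hours="$3" \
    'BEGIN { printf "%.2f\n", rate * count * hours }'
}

# Example: one g4dn.xlarge running for a 730-hour month
# estimate_cost 0.526 1 730
```

For one instance running all month this prints 383.98, so scaling the group down when the models are not in use is the main lever on cost.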