Hi, I'm Ayush Adarsh
Sr. DevOps Engineer
5+ years building AWS-native infrastructure, CI/CD platforms, and observability stacks for production healthcare systems.
AWS Solutions Architect - Associate certified, with a track record of cutting costs, accelerating deployments, and shipping reliable automation.
About Me
I'm a Sr. DevOps Engineer with 5+ years of experience building AWS-native infrastructure and automation for production healthcare systems.
I started in test automation and development at Publicis Sapient before moving deep into cloud, CI/CD, and observability.
Today I focus on cost-efficient AWS architecture, Jenkins & CodeDeploy pipelines, Docker, MongoDB operations, and Grafana/OpenTelemetry observability platforms.
Skills & Competencies
The full toolkit I work with across cloud, delivery, observability, data, and security
Cloud (AWS)
- EC2
- Lambda
- S3
- CloudWatch
- SES
- Systems Manager (Parameter Store + Documents)
- Auto Scaling
- Launch Templates
- CodeDeploy
- CodeCommit
- IAM
- VPC (dual-stack IPv4/IPv6)
- ALB (SNI + SSL termination)
- EBS
- Data Lifecycle Manager
- IMDS
CI/CD
- Jenkins (declarative pipelines, Groovy)
- AWS CodeDeploy
- Bitbucket
- Git-driven deploys
- Multi-env rollout (OneAtATime / HalfAtATime / AllAtOnce)
Containers & Orchestration
- Docker
- Docker Compose V2 (profiles, YAML anchors, multi-stage builds)
- uv
- PM2
- systemd
Observability
- OpenTelemetry (Collector, auto-instrumentation)
- Grafana
- Loki
- Tempo
- Prometheus
- CloudWatch
- Kibana
- Loguru
- Structured JSON logging
Databases & Caches
- MongoDB (4.x->6.x migration, profiler, mongodump/mongorestore, PyMongo)
- Redis (Streams, pooling, eviction tuning)
- SQL via SQLAlchemy 2.0 / Alembic
Backend / App Ops
- Ruby on Rails (Unicorn, Puma, Sidekiq, Rake)
- FastAPI
- Celery + Beat + Flower
- Node.js
Security & Compliance
- ISO 27001 controls
- ModSecurity WAF
- OWASP Top 10
- GeoIP blocking
- AES-encrypted audit artifacts (pyzipper)
- JWT/OAuth2
- SSM-backed secrets
- bcrypt
Languages
- Python 3.12
- Bash
- Groovy
- YAML
- Ruby
- JavaScript/Node.js
Integrations
- Google Sheets API (gspread/OAuth2)
- Google Chat webhooks
- AWS SES (MIME multipart)
- JIRA Cloud
- Ansible (dynamic inventory)
Experience
A track record of building production-grade cloud infrastructure, delivery platforms, and observability stacks across healthcare and enterprise systems
Sr. DevOps Engineer - Foss Health
March 2022 - Present
Remote
AWS Infrastructure & Cost Optimization
- Refactored non-prod server strategy, cutting infrastructure spend by ~$7,000/month without reducing dev/test throughput.
- Leveraged Auto Scaling Group instance weights and a custom Spot-instance interruption handler (Python/Boto3) that drains workloads and pages operators - capturing spot savings with no availability loss.
- Authored a Lambda-based Launch Template version manager that auto-publishes new LT versions on AMI changes.
- Built an EBS auto-attach service (Bash + Boto3) mounting environment-specific volumes on EC2 boot across prod/non-prod/Jenkins/Kibana fleets.
- Integrated AWS Data Lifecycle Manager for daily EBS snapshots with cross-region replication for DR compliance.
- Migrated hardcoded resource names to SSM Parameter Store and deployed a highly available dual-stack (IPv4 + IPv6) VPC for production healthcare workloads.
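
For illustration only, the decision step of a Spot interruption handler like the one described above might look like the following minimal Python sketch. It is not the production code: the function names and the `drain_budget_s` policy are assumptions, and the notice body is passed in as a string instead of being fetched from the instance metadata service, so the logic can run anywhere.

```python
import json
from datetime import datetime, timezone

# A real handler would poll IMDS at /latest/meta-data/spot/instance-action;
# here the notice body is supplied by the caller.


def seconds_until_action(notice_json: str, now: datetime) -> float:
    """Seconds remaining before the Spot action named in the notice."""
    notice = json.loads(notice_json)
    # The notice carries a UTC timestamp such as "2024-01-01T00:02:00Z".
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


def plan_response(notice_json: str, now: datetime,
                  drain_budget_s: float = 60.0) -> str:
    """Hypothetical policy: drain gracefully if time allows, else stop at once."""
    remaining = seconds_until_action(notice_json, now)
    return "drain" if remaining >= drain_budget_s else "stop_now"
```

In a full handler, the "drain" branch would deregister the instance and page operators; those steps are left out here because they depend on the environment.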
CI/CD & Release Engineering
- Built and maintained Jenkins declarative pipelines for Rails, FastAPI, and Rake-task deployments across 15+ environments - cutting manual deployment time by >70%.
- Owned an AWS CodeDeploy fleet of 9+ pipelines with tunable rollout strategies (OneAtATime / HalfAtATime / AllAtOnce) and full lifecycle-hook coverage for Unicorn, Puma, and 7 Sidekiq queues.
- Engineered Bitbucket -> AWS CodeCommit Git failover with 60-second health checks and automatic remote switching, so deploys survive primary-SCM outages.
- Implemented SSM-driven branch selection, enabling auditable non-engineer deploys.
- Built ChatOps feedback loops via Google Chat webhooks and SES MIME email, and configured AWS ALB with SNI + SSL termination for multiple backends on one load balancer.
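
As a rough sketch of the Git failover idea above (an illustration under stated assumptions, not the actual implementation), remote selection can be reduced to a health probe plus a choice function. The names `git_remote_healthy` and `choose_remote` are hypothetical, and the probe assumes the `git` CLI is installed on the host.

```python
import subprocess
from typing import Callable


def git_remote_healthy(url: str, timeout_s: float = 10.0) -> bool:
    """Probe a remote with `git ls-remote`; any error counts as unhealthy."""
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", url],
            capture_output=True,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def choose_remote(primary: str, fallback: str,
                  probe: Callable[[str], bool] = git_remote_healthy) -> str:
    """Prefer the primary remote; fall back only when its probe fails."""
    return primary if probe(primary) else fallback
```

A scheduler such as cron or a systemd timer could call `choose_remote` every 60 seconds and repoint the deploy remote at whichever URL it returns.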
Containerization & Platform Engineering
- Architected a multi-service Docker Compose V2 stack (FastAPI, 3 Celery queues, Redis, optional Grafana LGTM, optional Ollama/vLLM) with profile-gated opt-in services.
- Built pluggable model-server composition via COMPOSE_FILE merging - swapping Ollama (CPU dev) for vllm/vllm-openai with NVIDIA GPU runtime in prod with zero app-code changes.
- Authored a multi-stage Dockerfile on python:3.12-slim using uv with a cached dependency layer, so code-only rebuilds skip dependency resolution.
- Collapsed Celery worker definitions into a reusable YAML anchor and hardened startup with layered healthchecks and service_healthy conditions.
Observability Platform
- Built a self-hosted Grafana LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) fed by an OpenTelemetry Collector with independent trace/metric/log pipelines.
- Enabled zero-code auto-instrumentation for FastAPI, HTTPX, Redis, Celery, and stdlib logging.
- Wrote a custom Loguru -> OpenTelemetry bridge tagging every log line with trace_id/span_id, giving Grafana cross-links between traces, logs, and metrics.
- Developed Redis latency monitoring (Bash + Python + redis-cli) published to a Google Sheets dashboard via gspread/OAuth2.
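
To make the trace/log correlation above concrete, here is a minimal sketch of a Loguru-style patcher that stamps `trace_id` and `span_id` onto each record. It is an assumption-laden illustration, not the real bridge: `make_trace_patcher` and `get_ids` are hypothetical names, and the IDs come from an injected callable so the block runs with the standard library alone.

```python
from typing import Callable, Tuple


def make_trace_patcher(get_ids: Callable[[], Tuple[int, int]]):
    """Build a patcher that adds trace_id/span_id to record["extra"]."""
    def patcher(record: dict) -> None:
        trace_id, span_id = get_ids()
        # W3C trace context hex widths: 32 chars for trace IDs, 16 for span IDs.
        record["extra"]["trace_id"] = format(trace_id, "032x")
        record["extra"]["span_id"] = format(span_id, "016x")
    return patcher


# Assumed wiring (needs loguru and opentelemetry-api, not imported here):
#   def current_ids():
#       ctx = trace.get_current_span().get_span_context()
#       return ctx.trace_id, ctx.span_id
#   logger.configure(patcher=make_trace_patcher(current_ids))
```

With the IDs present in every log line, Grafana can be configured to link a Loki log entry to its Tempo trace.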
Backend & Async Processing
- Improved Sidekiq queue performance ~40% by isolating report-heavy jobs, tuning concurrency, and adding pre-stop/post-start drain/restore scripts.
- Designed 3 isolated Celery queues on Redis 8 with gzip-compressed JSON payloads, 1-day result TTL, and exponential publish retries.
- Built a Redis aggregation pattern using per-session hashes where the final chunk triggers downstream pipeline tasks - decoupling parallelism from ordering.
- Designed a Celery + Redis Streams integration in Python for delayed, reliable background execution.
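
The per-session aggregation pattern above can be sketched as follows. This is a simplified illustration, not the production design: a plain dict stands in for the Redis hash, and the class and parameter names are hypothetical. Chunks may arrive in any order; whichever chunk completes the set fires the downstream callback with the payloads in index order.

```python
from typing import Callable, Dict, List


class ChunkAggregator:
    """In-memory stand-in for a per-session Redis hash of chunk payloads."""

    def __init__(self, on_complete: Callable[[str, List[str]], None]):
        self._sessions: Dict[str, Dict[int, str]] = {}
        self._on_complete = on_complete

    def add_chunk(self, session_id: str, index: int, total: int,
                  payload: str) -> bool:
        """Store one chunk; return True if it completed the session."""
        chunks = self._sessions.setdefault(session_id, {})
        chunks[index] = payload            # analogous to HSET
        if len(chunks) < total:            # analogous to HLEN
            return False
        ordered = [chunks[i] for i in range(total)]
        del self._sessions[session_id]     # analogous to DEL
        self._on_complete(session_id, ordered)
        return True
```

Against a real Redis server, the write and the length check would need to be atomic (for example in a transaction or Lua script) so that two workers cannot both believe they stored the final chunk.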
Database Operations & Migrations
- Executed a MongoDB 4.x -> 6.x migration runbook across production with downtime planning, service orchestration, and zero data corruption.
- Built an end-to-end MongoDB data lifecycle platform (Node.js + Python) with tar.gz-on-S3 export/import, ThreadPoolExecutor (up to 20 workers), tiered S3 storage classes, and 3 conflict-resolution modes.
- Wrote an Excel + pandasql config layer so data and ops teams can onboard collections without touching Python - cutting tenant onboarding from hours to minutes.
- Automated MongoDB credential rotation across 8 production DBs and 100+ users with SSM-backed secrets and AES-encrypted ZIP audit reports over SES.
- Built a Jenkins + CodeDeploy pipeline for whitelisted MongoDB admin commands (~45 allowed ops) across 13 environments, and tuned ulimit/kernel limits under load.
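
The allow-list gate for admin commands mentioned above might be sketched like this. It is a hypothetical illustration: the four entries in `ALLOWED_COMMANDS` are an assumed sample, not the real list of about 45 operations, and `validate_command` is an invented name. The check relies on the MongoDB convention that the first key of a command document is the command name.

```python
# Assumed sample of read-only commands; the real allow-list is not shown here.
ALLOWED_COMMANDS = frozenset({"collStats", "dbStats", "serverStatus", "listIndexes"})


def validate_command(command: dict) -> str:
    """Return the command name if allowed; raise otherwise."""
    if not command:
        raise ValueError("empty command document")
    # Dicts preserve insertion order, so the first key is the command name.
    name = next(iter(command))
    if name not in ALLOWED_COMMANDS:
        raise PermissionError(f"command not on allow-list: {name}")
    return name
```

A pipeline stage could call `validate_command` before handing the document to the database driver, so rejected commands never reach MongoDB.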
Security, Compliance & ISO 27001
- Aligned AWS accounts with ISO 27001 controls: DLM backup/retention, sudoers + SSH key policy, centralized CloudWatch logging, and HTTPS/ALB encryption enforcement.
- Deployed ModSecurity WAF on Nginx for OWASP Top 10 coverage and GeoIP blocking, with separate rule sets per environment.
- Built an Nginx bad-bot-blocker test harness (Bash + curl) that fuzzes user-agent headers to validate blocking rules in CI.
- Moved all admin credentials, ZIP encryption keys, and deployment secrets to AWS SSM Parameter Store (SecureString), eliminating hardcoded secrets.
Internal Platforms & Developer Tooling
- Built HG Internal App - a FastAPI + Celery + Redis + SQLAlchemy/Alembic platform for server lifecycle management across 18 environments, with JIRA-gated provisioning, JWT/OAuth2, Redis-backed rate limiting, and 5-min ASG sync supervised by PM2.
- Built hg_cli, a Bash control-plane orchestrating two Rails apps with 5 isolated Sidekiq queues behind one CLI with RVM isolation and PID-safe process management.
- Wrote a dynamic Ansible inventory generator (Python/Boto3) producing environment-grouped inventories from EC2 tag filters.
- Shipped AWS SSM Documents for server introspection and CodeDeploy restarts, and a security-group ingress CLI for managing SSH allow-lists per environment.
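
As an illustration of the dynamic Ansible inventory idea above, the sketch below groups hosts by an EC2 tag and emits the JSON shape Ansible expects from an inventory script. It is not the actual generator: the `Environment` tag key, the `instance_id` host variable, and the function name are assumptions, and the instance dicts are passed in directly (in the shape Boto3 returns them) instead of being fetched from AWS.

```python
from collections import defaultdict


def build_inventory(instances: list, group_tag: str = "Environment") -> dict:
    """Group instances by a tag value into Ansible dynamic-inventory JSON."""
    groups = defaultdict(lambda: {"hosts": []})
    hostvars = {}
    for inst in instances:
        # EC2 tags arrive as a list of {"Key": ..., "Value": ...} pairs.
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        host = inst.get("PrivateIpAddress")
        if not host or group_tag not in tags:
            continue  # skip instances with no address or no grouping tag
        groups[tags[group_tag]]["hosts"].append(host)
        hostvars[host] = {"instance_id": inst["InstanceId"]}
    inventory = dict(groups)
    # "_meta.hostvars" lets Ansible read host variables in a single call.
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory
```

Serializing the returned dict with `json.dumps` and printing it from an executable script is enough for Ansible to consume it via `-i`.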
Process & Service Supervision
- Implemented a multi-Unicorn (Rails) process monitor that detects duplicate masters, restarts services, and pages operators via SES.
- Built a DAAS job watchdog that detects hung Rake jobs, terminates the EC2 host, and triggers ASG replacement - systemd-integrated for reliability.
Automation Engineer (L1 - L2) - Publicis Sapient
January 2019 - February 2022
Gurugram, India - Wellington Account
Test Automation & QA
- Authored Test Plans, Test Cases, Test Reports, and Requirement Traceability Matrices across Agile and Waterfall cycles spanning the full STLC/SDLC.
- Led development of production-grade smoke and regression suites for multiple web apps, including cross-browser automation across Chromium-based browsers.
- Built UFT automation for Java Swing and WPF/.NET desktop apps in latency-sensitive algorithmic-trading environments; replaced the Excel-based reporting dependency with a custom framework, cutting licensing cost and triage time.
- Co-developed internal tooling with cross-functional teams and owned delivery of critical production-management activities.
Development
- Developed a Spring MVC web application on AWS and upgraded its UI to a fully mobile-responsive layout.
Projects
Cloud-Automation Platform
A 33-module AWS-native automation platform driving CI/CD, spot-instance handling, Launch Template versioning, EBS auto-attach, and SSM-backed configuration across 15+ production and non-prod environments.
Stack
- AWS
- Python
- Boto3
- Jenkins
MongoDB Data Lifecycle Platform
An end-to-end MongoDB archival and hydration platform (Node.js + Python) with mongoexport/mongoimport, tar.gz on S3, ThreadPoolExecutor parallelism, tiered storage classes, and three conflict-resolution modes. Backed a zero-data-loss 4.x to 6.x migration.
Stack
- MongoDB
- Node.js
- Python
- AWS S3
HG Internal App
A FastAPI + Celery + Redis + SQLAlchemy/Alembic platform for server lifecycle management across 18 environments, with JIRA-gated provisioning, JWT/OAuth2 auth, Redis-backed rate limiting, and 5-minute ASG sync, supervised by PM2.
Stack
- FastAPI
- Celery
- Redis
- SQLAlchemy
Observability LGTM Stack
A self-hosted Grafana LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) fed by an OpenTelemetry Collector with zero-code auto-instrumentation and a custom Loguru-to-OpenTelemetry bridge linking traces, logs, and metrics.
Stack
- Grafana
- OpenTelemetry
- Loki
- Tempo
My Journey
A timeline of key milestones across my career
2019
Joined Publicis Sapient as an Automation Engineer
2019
Graduated with a B.Tech in Computer Science (A.K.T.U.)
2022
Became Sr. DevOps Engineer at HealthGraph
2023
Earned AWS Solutions Architect - Associate
2026
Drove ~$7K/month cloud cost savings & 70% faster deploys
Certifications
Industry credentials validating my expertise
AWS Certified Solutions Architect - Associate
Amazon Web Services
Certified in designing distributed, cost-optimized, and resilient systems on AWS.
Personal Accomplishments
$7K
Monthly Cloud Cost Savings
70%
Faster Deployments
18
Environments Managed
5+
Years of Experience
Education
Academic foundations behind my work
B.Tech, Computer Science & Engineering
Dr. A.P.J. Abdul Kalam Technical University (A.K.T.U.)
Graduated 2019.