Hi, I'm Ayush Adarsh
Sr. DevOps Engineer

5+ years building AWS-native infrastructure, CI/CD platforms, and observability stacks for production healthcare systems.
AWS Solutions Architect - Associate certified, with a track record of cutting costs, accelerating deployments, and shipping reliable automation.


About Me

I'm a Sr. DevOps Engineer with 5+ years of experience building AWS-native infrastructure and automation for production healthcare systems.
I started in test automation and development at Publicis Sapient before moving deep into cloud, CI/CD, and observability.
Today I focus on cost-efficient AWS architecture, Jenkins & CodeDeploy pipelines, Docker, MongoDB operations, and Grafana/OpenTelemetry observability platforms.





Skills & Competencies

The full toolkit I work with across cloud, delivery, observability, data, and security

Cloud (AWS)

  • EC2
  • Lambda
  • S3
  • CloudWatch
  • SES
  • Systems Manager (Parameter Store + Documents)
  • Auto Scaling
  • Launch Templates
  • CodeDeploy
  • CodeCommit
  • IAM
  • VPC (dual-stack IPv4/IPv6)
  • ALB (SNI + SSL termination)
  • EBS
  • Data Lifecycle Manager
  • IMDS

CI/CD

  • Jenkins (declarative pipelines, Groovy)
  • AWS CodeDeploy
  • Bitbucket
  • Git-driven deploys
  • Multi-env rollout (OneAtATime / HalfAtATime / AllAtOnce)

Containers & Orchestration

  • Docker
  • Docker Compose V2 (profiles, YAML anchors, multi-stage builds)
  • uv
  • PM2
  • systemd

Observability

  • OpenTelemetry (Collector, auto-instrumentation)
  • Grafana
  • Loki
  • Tempo
  • Prometheus
  • CloudWatch
  • Kibana
  • Loguru
  • Structured JSON logging

Databases & Caches

  • MongoDB (4.x -> 6.x migration, profiler, mongodump/mongorestore, PyMongo)
  • Redis (Streams, pooling, eviction tuning)
  • SQL via SQLAlchemy 2.0 / Alembic

Backend / App Ops

  • Ruby on Rails (Unicorn, Puma, Sidekiq, Rake)
  • FastAPI
  • Celery + Beat + Flower
  • Node.js

Security & Compliance

  • ISO 27001 controls
  • ModSecurity WAF
  • OWASP Top 10
  • GeoIP blocking
  • AES-encrypted audit artifacts (pyzipper)
  • JWT/OAuth2
  • SSM-backed secrets
  • bcrypt

Languages

  • Python 3.12
  • Bash
  • Groovy
  • YAML
  • Ruby
  • JavaScript/Node.js

Integrations

  • Google Sheets API (gspread/OAuth2)
  • Google Chat webhooks
  • AWS SES (MIME multipart)
  • JIRA Cloud
  • Ansible (dynamic inventory)


Experience

A track record of building production-grade cloud infrastructure, delivery platforms, and observability stacks across healthcare and enterprise systems

Sr. DevOps Engineer - Foss Health

March 2022 - Present
Remote

AWS Infrastructure & Cost Optimization

  • Refactored non-prod server strategy, cutting infrastructure spend by ~$7,000/month without reducing dev/test throughput.
  • Leveraged Auto Scaling Group instance weights and a custom Spot-instance interruption handler (Python/Boto3) that drains workloads and pages operators - capturing spot savings with no availability loss.
  • Authored a Lambda-based Launch Template version manager that auto-publishes new LT versions on AMI changes.
  • Built an EBS auto-attach service (Bash + Boto3) mounting environment-specific volumes on EC2 boot across prod/non-prod/Jenkins/Kibana fleets.
  • Integrated AWS Data Lifecycle Manager for daily EBS snapshots with cross-region replication for DR compliance.
  • Migrated hardcoded resource names to SSM Parameter Store.
  • Deployed a highly available dual-stack (IPv4 + IPv6) VPC for production healthcare workloads.
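
As a rough illustration of the Spot-interruption handler mentioned above (a sketch under stated assumptions, not the production code): EC2 publishes a two-minute interruption notice at the instance metadata path `/latest/meta-data/spot/instance-action`. The `drain` and `page` callbacks and all function names below are hypothetical placeholders, and the sketch uses only the standard library rather than Boto3.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Callable, Optional

IMDS_BASE = "http://169.254.169.254"
# IMDS path for Spot interruption notices; it returns 404 until a notice exists.
SPOT_ACTION_PATH = "/latest/meta-data/spot/instance-action"


def fetch_from_imds(path: str, timeout: float = 2.0) -> Optional[str]:
    """Read an IMDSv2 path; return None when the path answers 404."""
    token_req = urllib.request.Request(
        IMDS_BASE + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    try:
        with urllib.request.urlopen(token_req, timeout=timeout) as resp:
            token = resp.read().decode()
        req = urllib.request.Request(
            IMDS_BASE + path, headers={"X-aws-ec2-metadata-token": token}
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as exc:
        if exc.code == 404:
            return None
        raise


def parse_spot_notice(body: Optional[str]) -> Optional[dict]:
    """Turn the notice JSON into {'action', 'deadline'}, or None if absent."""
    if not body:
        return None
    doc = json.loads(body)
    deadline = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
    return {"action": doc["action"], "deadline": deadline.replace(tzinfo=timezone.utc)}


def handle_interruption(
    fetch: Callable[[str], Optional[str]],
    drain: Callable[[float], None],   # hypothetical: stop taking work, finish jobs
    page: Callable[[str], None],      # hypothetical: notify the on-call operator
    now: Optional[datetime] = None,
) -> bool:
    """Drain and page once if a notice is pending; return True when one was handled."""
    notice = parse_spot_notice(fetch(SPOT_ACTION_PATH))
    if notice is None:
        return False
    now = now or datetime.now(timezone.utc)
    seconds_left = max(0.0, (notice["deadline"] - now).total_seconds())
    drain(seconds_left)
    page(f"Spot {notice['action']} in {seconds_left:.0f}s")
    return True
```

On an instance, a small loop would call `handle_interruption(fetch_from_imds, drain, page)` every few seconds; passing `fetch` in as a parameter keeps the decision logic testable off EC2.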

CI/CD & Release Engineering

  • Built and maintained Jenkins declarative pipelines for Rails, FastAPI, and Rake-task deployments across 15+ environments - cutting manual deployment time by >70%.
  • Owned an AWS CodeDeploy fleet of 9+ pipelines with tunable rollout strategies (OneAtATime / HalfAtATime / AllAtOnce) and full lifecycle-hook coverage for Unicorn, Puma, and 7 Sidekiq queues.
  • Engineered Bitbucket -> AWS CodeCommit Git failover with 60-second health checks and automatic remote switching, so deploys survive primary-SCM outages.
  • Implemented SSM-driven branch selection, enabling auditable non-engineer deploys.
  • Built ChatOps feedback loops via Google Chat webhooks and SES MIME email.
  • Configured an AWS ALB with SNI + SSL termination to serve multiple backends from one load balancer.
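
The Git failover idea above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual implementation: the health check is modelled as a `git ls-remote` call, the remote URLs in the usage note are placeholders, and the function names are hypothetical.

```python
import subprocess
from typing import Callable, Optional, Sequence


def remote_is_healthy(url: str, timeout: int = 10) -> bool:
    """Health check: can refs be listed on the remote within the timeout?"""
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", url],
            capture_output=True,
            timeout=timeout,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def pick_remote(
    remotes: Sequence[str],
    is_healthy: Callable[[str], bool] = remote_is_healthy,
) -> Optional[str]:
    """Return the first healthy remote in priority order, or None if all fail."""
    for url in remotes:
        if is_healthy(url):
            return url
    return None


def switch_origin(url: str, repo_dir: str = ".") -> None:
    """Point the local 'origin' remote at the chosen URL."""
    subprocess.run(
        ["git", "remote", "set-url", "origin", url],
        cwd=repo_dir,
        check=True,
    )
```

A scheduler would call `pick_remote([...primary URL..., ...mirror URL...])` on the 60-second interval and pass the result to `switch_origin` whenever it differs from the current remote.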

Containerization & Platform Engineering

  • Architected a multi-service Docker Compose V2 stack (FastAPI, 3 Celery queues, Redis, optional Grafana LGTM, optional Ollama/vLLM) with profile-gated opt-in services.
  • Built pluggable model-server composition via COMPOSE_FILE merging - swapping Ollama (CPU dev) for vllm/vllm-openai with NVIDIA GPU runtime in prod with zero app-code changes.
  • Authored a multi-stage Dockerfile on python:3.12-slim using uv with a cached dependency layer, so code-only rebuilds skip dependency resolution.
  • Collapsed Celery worker definitions into a reusable YAML anchor and hardened startup with layered healthchecks and service_healthy conditions.

Observability Platform

  • Built a self-hosted Grafana LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) fed by an OpenTelemetry Collector with independent trace/metric/log pipelines.
  • Enabled zero-code auto-instrumentation for FastAPI, HTTPX, Redis, Celery, and stdlib logging.
  • Wrote a custom Loguru -> OpenTelemetry bridge tagging every log line with trace_id/span_id, giving Grafana cross-links between traces, logs, and metrics.
  • Developed Redis latency monitoring (Bash + Python + redis-cli) published to a Google Sheets dashboard via gspread/OAuth2.
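
To illustrate the log-to-trace correlation idea above, here is a minimal sketch that swaps in the standard library for both pieces: `logging` stands in for Loguru, and a `contextvars` variable stands in for the OpenTelemetry span context. It is not the bridge described in the bullet; it only shows how stamping each line with a 32-hex-character `trace_id` and a 16-hex-character `span_id` lets a log backend link back to a trace.

```python
import contextvars
import io
import json
import logging
from typing import Optional, Tuple

# Stand-in for the active span context: (trace_id, span_id) as integers.
# A real bridge would read these from the OpenTelemetry SDK instead.
current_span: contextvars.ContextVar[Optional[Tuple[int, int]]] = (
    contextvars.ContextVar("current_span", default=None)
)


class TraceContextFilter(logging.Filter):
    """Attach trace_id/span_id (hex strings) to every record."""

    def filter(self, record: logging.LogRecord) -> bool:
        span = current_span.get()
        record.trace_id = format(span[0], "032x") if span else ""
        record.span_id = format(span[1], "016x") if span else ""
        return True


class JsonFormatter(logging.Formatter):
    """Emit one structured JSON object per log line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "message": record.getMessage(),
                "trace_id": record.trace_id,
                "span_id": record.span_id,
            }
        )


def build_logger(stream) -> logging.Logger:
    """Create a logger whose handler adds trace context and writes JSON."""
    logger = logging.getLogger("trace_demo")
    logger.handlers.clear()
    logger.propagate = False
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(stream)
    handler.addFilter(TraceContextFilter())
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger
```

With `current_span.set((trace_id, span_id))` called when a span starts, every `logger.info(...)` inside that context carries the identifiers without the calling code passing them along.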

Backend & Async Processing

  • Improved Sidekiq queue performance ~40% by isolating report-heavy jobs, tuning concurrency, and adding pre-stop/post-start drain/restore scripts.
  • Designed 3 isolated Celery queues on Redis 8 with gzip-compressed JSON payloads, 1-day result TTL, and exponential publish retries.
  • Built a Redis aggregation pattern using per-session hashes where the final chunk triggers downstream pipeline tasks - decoupling parallelism from ordering.
  • Designed a Celery + Redis Streams integration in Python for delayed, reliable background execution.
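
The per-session aggregation pattern above might look roughly like this. The sketch uses an in-memory dictionary as a stand-in for the Redis hashes (the comments note the corresponding `HSET`/`HLEN` steps), assumes chunks are indexed `0..total-1`, and uses hypothetical names throughout; it is not the production design.

```python
import threading
from typing import Callable, Dict, List


class SessionAggregator:
    """Collect chunk results per session; the chunk that completes the set
    triggers the downstream step, whatever order the chunks arrived in."""

    def __init__(self, on_complete: Callable[[str, List[str]], None]) -> None:
        self._sessions: Dict[str, Dict[int, str]] = {}
        self._lock = threading.Lock()  # stands in for Redis-side atomicity
        self._on_complete = on_complete

    def add_chunk(self, session_id: str, index: int, total: int, payload: str) -> bool:
        """Store one chunk; return True if it was the one that completed the session."""
        ordered: List[str] = []
        with self._lock:
            chunks = self._sessions.setdefault(session_id, {})
            chunks[index] = payload          # HSET session:<id> <index> <payload>
            done = len(chunks) == total      # HLEN session:<id> == total
            if done:
                # Reassemble in index order, then clear the session state.
                ordered = [chunks[i] for i in range(total)]
                del self._sessions[session_id]
        if done:
            self._on_complete(session_id, ordered)
        return done
```

Because completion is decided by the count of stored chunks rather than by arrival order, workers can process chunks in parallel and only the final writer pays the cost of kicking off the next stage.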

Database Operations & Migrations

  • Executed a MongoDB 4.x -> 6.x migration runbook across production with downtime planning, service orchestration, and zero data corruption.
  • Built an end-to-end MongoDB data lifecycle platform (Node.js + Python) with tar.gz-on-S3 export/import, ThreadPoolExecutor (up to 20 workers), tiered S3 storage classes, and 3 conflict-resolution modes.
  • Wrote an Excel + PandasSQL config layer so data/ops can onboard collections without touching Python - cutting tenant onboarding from hours to minutes.
  • Automated MongoDB credential rotation across 8 production DBs and 100+ users with SSM-backed secrets and AES-encrypted ZIP audit reports over SES.
  • Built a Jenkins + CodeDeploy pipeline for whitelisted MongoDB admin commands (~45 allowed ops) across 13 environments.
  • Tuned ulimit/kernel limits under load.
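
As a hedged sketch of the parallel tar.gz export step described above: the snippet packs each collection into an in-memory `.tar.gz` using a `ThreadPoolExecutor` capped at 20 workers. Collections are modelled as plain lists of dictionaries and the function names are hypothetical; reading from MongoDB and uploading to S3 are deliberately left out.

```python
import io
import json
import tarfile
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def pack_collection(name: str, docs: List[dict]) -> bytes:
    """Serialize one collection as JSON Lines and wrap it in a tar.gz archive."""
    data = "\n".join(json.dumps(doc, sort_keys=True) for doc in docs).encode("utf-8")
    info = tarfile.TarInfo(name=f"{name}.jsonl")
    info.size = len(data)
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def pack_all(collections: Dict[str, List[dict]], max_workers: int = 20) -> Dict[str, bytes]:
    """Pack every collection in parallel; returns archive bytes keyed by name."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            name: pool.submit(pack_collection, name, docs)
            for name, docs in collections.items()
        }
        # result() re-raises any worker exception, so failures are not silent.
        return {name: future.result() for name, future in futures.items()}
```

Each returned blob could then be handed to whatever uploader is in use, with the storage class chosen per tier.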

Security, Compliance & ISO 27001

  • Aligned AWS accounts with ISO 27001 controls: DLM backup/retention, sudoers + SSH key policy, centralized CloudWatch logging, and HTTPS/ALB encryption enforcement.
  • Deployed ModSecurity WAF on Nginx for OWASP Top 10 coverage and GeoIP blocking, with separate rule sets per environment.
  • Built an Nginx bad-bot-blocker test harness (Bash + curl) that fuzzes user-agent headers to validate blocking rules in CI.
  • Moved all admin credentials, ZIP encryption keys, and deployment secrets to AWS SSM Parameter Store (SecureString), eliminating hardcoded secrets.

Internal Platforms & Developer Tooling

  • Built HG Internal App - a FastAPI + Celery + Redis + SQLAlchemy/Alembic platform for server lifecycle management across 18 environments, with JIRA-gated provisioning, JWT/OAuth2, Redis-backed rate limiting, and 5-min ASG sync supervised by PM2.
  • Built hg_cli, a Bash control-plane orchestrating two Rails apps with 5 isolated Sidekiq queues behind one CLI with RVM isolation and PID-safe process management.
  • Wrote a dynamic Ansible inventory generator (Python/Boto3) producing environment-grouped inventories from EC2 tag filters.
  • Shipped AWS SSM Documents for server introspection and CodeDeploy restarts, and a security-group ingress CLI for managing SSH allow-lists per environment.
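
For the dynamic Ansible inventory generator above, the core transformation can be sketched as a pure function. It takes data shaped like the `Reservations` list in an EC2 `describe_instances` response and returns the JSON structure Ansible expects from an inventory script (groups with `hosts`, plus `_meta.hostvars`). The `Environment` tag key, the `untagged` fallback group, and the function name are assumptions made for illustration.

```python
import json
from collections import defaultdict
from typing import Dict, List


def build_inventory(reservations: List[dict], group_tag: str = "Environment") -> Dict[str, dict]:
    """Group instances by a tag value into an Ansible dynamic-inventory dict."""
    groups: Dict[str, dict] = defaultdict(lambda: {"hosts": []})
    hostvars: Dict[str, dict] = {}
    for reservation in reservations:
        for instance in reservation.get("Instances", []):
            ip = instance.get("PrivateIpAddress")
            if not ip:
                continue  # skip instances without an address (e.g. terminated)
            tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
            group = tags.get(group_tag, "untagged")
            groups[group]["hosts"].append(ip)
            hostvars[ip] = {"instance_id": instance["InstanceId"], "ec2_tags": tags}
    inventory: Dict[str, dict] = dict(groups)
    inventory["_meta"] = {"hostvars": hostvars}
    return inventory


def render(inventory: Dict[str, dict]) -> str:
    """JSON text an inventory script would print for `--list`."""
    return json.dumps(inventory, indent=2, sort_keys=True)
```

In a real script, the `reservations` argument would come from a filtered EC2 query, and the output of `render` would be printed to stdout for Ansible to consume.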

Process & Service Supervision

  • Implemented a multi-Unicorn (Rails) process monitor that detects duplicate masters, restarts services, and pages operators via SES.
  • Built a DAAS job watchdog that detects hung Rake jobs, terminates the EC2 host, and triggers ASG replacement - systemd-integrated for reliability.

Automation Engineer (L1 - L2) - Publicis Sapient

January 2019 - February 2022
Gurugram, India - Wellington Account

Test Automation & QA

  • Authored Test Plans, Test Cases, Test Reports, and Requirement Traceability Matrices across Agile and Waterfall cycles spanning the full STLC/SDLC.
  • Led development of production-grade smoke and regression suites for multiple web apps, including cross-browser automation across Chromium-based browsers.
  • Built UFT automation for Java Swing and WPF/.NET desktop apps in latency-sensitive algorithmic-trading environments; replaced the Excel-based reporting dependency with a custom framework, cutting licensing cost and triage time.
  • Co-developed internal tooling with cross-functional teams and owned delivery of critical production-management activities.

Development

  • Developed a Spring MVC web application on AWS and upgraded its UI to a fully mobile-responsive layout.

Projects

Cloud-Automation Platform


A 33-module AWS-native automation platform driving CI/CD, spot-instance handling, Launch Template versioning, EBS auto-attach, and SSM-backed configuration across 15+ production and non-prod environments.

Stack
  • AWS
  • Python
  • Boto3
  • Jenkins

MongoDB Data Lifecycle Platform


An end-to-end MongoDB archival and hydration platform (Node.js + Python) with mongoexport/mongoimport, tar.gz on S3, ThreadPoolExecutor parallelism, tiered storage classes, and three conflict-resolution modes. Backed a zero-data-loss 4.x to 6.x migration.

Stack
  • MongoDB
  • Node.js
  • Python
  • AWS S3

HG Internal App


A FastAPI + Celery + Redis + SQLAlchemy/Alembic platform for server lifecycle management across 18 environments, with JIRA-gated provisioning, JWT/OAuth2 auth, Redis-backed rate limiting, and 5-minute ASG sync, supervised by PM2.

Stack
  • FastAPI
  • Celery
  • Redis
  • SQLAlchemy

Observability LGTM Stack


A self-hosted Grafana LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) fed by an OpenTelemetry Collector with zero-code auto-instrumentation and a custom Loguru-to-OpenTelemetry bridge linking traces, logs, and metrics.

Stack
  • Grafana
  • OpenTelemetry
  • Loki
  • Tempo




Certifications

Industry credentials validating my expertise

AWS Certified Solutions Architect - Associate

Amazon Web Services

Certified in designing distributed, cost-optimized, and resilient systems on AWS.



Personal Accomplishments

$7K

Monthly Cloud Cost Savings

70%

Faster Deployments

18

Environments Managed

5+

Years of Experience



Education

Academic foundations behind my work

B.Tech, Computer Science & Engineering

Dr. A.P.J. Abdul Kalam Technical University (A.K.T.U.)

Graduated 2019.

Sharpen your curiosity