Tech Stack Reference

The full engineering toolkit, layer by layer.

Every tool, framework, and platform in the production stack — from LLM serving infrastructure to governance, observability, and CI/CD. No hand-waving.

Cloud Platforms: 5 active
Orchestration: LangGraph · AutoGen
Vector DBs: pgvector · Milvus · Weaviate
Serving: vLLM · Ollama · TGI
MLOps: MLflow · KFP · Airflow
Observability: Prometheus · Langfuse
01

Stack Architecture — Layer by Layer

The complete production AI/ML stack from infrastructure up to the application layer — every layer battle-tested in enterprise deployments.

Layer 07
Application & UX
React / Next.js · FastAPI · Claude Desktop · Slack bots · ServiceNow UI · Streamlit
Production
Layer 06
Agent Orchestration
LangGraph · AutoGen · CrewAI · LangChain · LlamaIndex · MCP SDK
Production
Layer 05
LLM Gateway / Routing
LiteLLM · Portkey · Custom router · Anthropic SDK · OpenAI SDK
Production
Layer 04
Retrieval & Memory
pgvector · Milvus · Weaviate · Redis · Elasticsearch · Mem0
Production
Layer 03
Model Serving
vLLM · Ollama · TGI · Bedrock · Vertex AI · Azure OpenAI
Production
Layer 02
MLOps & Training
MLflow · KFP · HuggingFace · PEFT · SageMaker · Vertex Pipelines
Active
Layer 01
Infrastructure
Kubernetes / EKS / GKE / AKS · OpenShift AI · Terraform · Helm · GPU nodes
Production
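At its core, the gateway layer (Layer 05) is a routing decision: pick a model per request based on cost, context length, and task difficulty. A minimal sketch of a custom router — the model names, prices, and context windows below are entirely illustrative, not any provider's rate card:

```python
from dataclasses import dataclass

# Hypothetical model catalogue -- names, prices, and limits are
# illustrative placeholders, not real provider pricing.
@dataclass(frozen=True)
class ModelSpec:
    name: str
    cost_per_1k_tokens: float  # USD, blended input+output
    max_context: int

CATALOGUE = [
    ModelSpec("small-fast", 0.0005, 16_000),
    ModelSpec("mid-tier", 0.003, 128_000),
    ModelSpec("frontier", 0.015, 200_000),
]

def route(prompt_tokens: int, needs_reasoning: bool) -> ModelSpec:
    """Pick the cheapest model whose context window fits the request;
    escalate to the top tier when the caller flags a hard task."""
    if needs_reasoning:
        return CATALOGUE[-1]
    for spec in sorted(CATALOGUE, key=lambda s: s.cost_per_1k_tokens):
        if prompt_tokens <= spec.max_context:
            return spec
    raise ValueError("request exceeds every model's context window")
```

In production this logic lives behind LiteLLM or Portkey, which add retries, fallbacks, and key management on top of the same basic decision.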
02

Cloud-specific Reference Stacks

Standard reference architectures per cloud — mix and match based on data residency, cost, and existing enterprise agreements.

☁️
AWS
Bedrock · EKS · SageMaker
LLM: Claude via Bedrock, cross-region inference profiles
Orchestration: LangGraph on EKS, Lambda for event triggers
Vector DB: RDS PostgreSQL + pgvector, OpenSearch
MLOps: SageMaker Pipelines, Model Registry, Clarify
Auth: IAM roles, Cognito, Secrets Manager
Observability: CloudWatch, X-Ray, Cost Explorer
🌐
GCP
Vertex AI · GKE · BigQuery
LLM: Gemini via Vertex AI, Model Garden
Orchestration: Cloud Run agents, Pub/Sub triggers
Vector DB: AlloyDB pgvector, Vertex AI Vector Search
MLOps: Vertex AI Pipelines, Experiments, Model Registry
Auth: Workload Identity, IAM, Secret Manager
Observability: Cloud Monitoring, Trace, Looker
💎
Azure
AOAI · AKS · Fabric
LLM: Azure OpenAI, GPT-4o, Claude on Azure
Orchestration: AKS workloads, Azure Functions
Vector DB: Azure AI Search, Cosmos DB
MLOps: Azure ML, Fabric, Responsible AI dashboard
Auth: Entra ID, RBAC, Key Vault, Managed Identity
Observability: Azure Monitor, App Insights, Sentinel
03

LLM Observability & FinOps

You can't improve what you can't observe. Every production LLM deployment ships with full trace, cost, and quality instrumentation from day one.

🔍
Trace & Evaluation
End-to-end request tracing, prompt/response logging, latency breakdowns, tool call inspection, and automated quality evals on sampled production traffic.
Langfuse · LangSmith · Helicone · OpenTelemetry
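The tracing described above can be approximated with a decorator that emits one structured record per LLM call. The field names here are illustrative, not Langfuse's or OpenTelemetry's actual schema:

```python
import functools
import json
import time
import uuid

def traced(fn):
    """Wrap an LLM call so every invocation emits a structured trace
    record (request id, latency, truncated prompt/response). A toy
    stand-in for what a real trace backend captures per span."""
    @functools.wraps(fn)
    def wrapper(prompt: str, **kw):
        trace = {"id": str(uuid.uuid4()), "name": fn.__name__,
                 "prompt": prompt[:200]}
        start = time.perf_counter()
        try:
            out = fn(prompt, **kw)
            trace["response"] = str(out)[:200]
            return out
        finally:
            trace["latency_ms"] = round((time.perf_counter() - start) * 1000, 2)
            print(json.dumps(trace))  # ship to your trace backend instead

    return wrapper

@traced
def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call.
    return f"echo: {prompt}"
```

Sampling a fraction of these traces into automated evals is what closes the loop between observability and quality.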
💰
FinOps & Cost Attribution
Per-team, per-feature token consumption tracking. Budget alerts, cost anomaly detection, model substitution recommendations, and monthly cost forecasting.
LiteLLM · Prometheus · Grafana · AWS Cost Explorer
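Per-team cost attribution reduces to metering tokens against a rate card. A toy ledger sketch — the `CostLedger` shape and prices are hypothetical; in practice a LiteLLM proxy emits these numbers and Prometheus/Grafana aggregate them:

```python
from collections import defaultdict

# Illustrative prices per 1K tokens -- substitute your provider's rate card.
PRICES = {"small-fast": 0.0005, "frontier": 0.015}

class CostLedger:
    """Accumulate token spend per (team, feature) so monthly cost can
    be attributed and budget alerts fired."""
    def __init__(self):
        self.usd = defaultdict(float)

    def record(self, team: str, feature: str, model: str, tokens: int):
        self.usd[(team, feature)] += tokens / 1000 * PRICES[model]

    def over_budget(self, budget_usd: float):
        """Return every (team, feature) pair past its budget."""
        return [key for key, spend in self.usd.items() if spend > budget_usd]
```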
📊
Model Performance
Custom eval suites on domain benchmarks, A/B model comparison, regression detection on new model versions before rollout, drift monitoring on fine-tuned models.
MLflow · Weights & Biases · RAGAS · TruLens
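The pre-rollout regression gate can be expressed as a simple per-benchmark score comparison. Benchmark names and the tolerance below are placeholders; real gates run against your domain eval suites:

```python
def regressions(baseline: dict, candidate: dict,
                tolerance: float = 0.02) -> dict:
    """Compare a candidate model version's per-benchmark scores against
    the production baseline; return every benchmark whose score drops
    by more than `tolerance` (absolute). A toy eval gate -- block the
    rollout if this dict is non-empty."""
    return {name: (baseline[name], candidate.get(name, 0.0))
            for name in baseline
            if baseline[name] - candidate.get(name, 0.0) > tolerance}
```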
🛡️
Safety & Guardrails
Input/output filtering, PII detection and redaction, prompt injection detection, output policy compliance checks, and audit logs for regulated workloads.
Bedrock Guardrails · Presidio · Rebuff · Custom classifiers
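A minimal sketch of the detect-and-redact step: each matched entity is replaced with a typed placeholder so redacted text stays useful for logging and evals. The regex patterns are illustrative only; Presidio's recognizers cover far more entity types and locales:

```python
import re

# Illustrative patterns only -- production systems use dedicated
# recognizers (e.g. Presidio) plus context-aware classifiers.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each detected entity with a typed placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```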
04

AI Governance & Compliance

Enterprise AI without governance is just technical debt. Every deployment includes a governance framework from day one.

Model Risk Management
Formal model risk assessment for every production AI system — bias testing, performance envelope documentation, failure mode analysis, and tiered approval process based on business impact. Aligned with SR 11-7 for financial services clients.
Data Privacy & Residency
Data classification frameworks, PII handling policies, residency enforcement via cloud-native controls, and zero-data-egress architectures for regulated industries. GDPR, HIPAA, and SOC2 compliant deployments.
Access Control & Audit
Role-based access to LLM capabilities, per-user prompt/response audit logs, immutable audit trails in tamper-evident storage, and quarterly access reviews. Full SIEM integration for security operations teams.
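The tamper-evident audit trail above can be illustrated with a hash chain, where each record commits to the hash of its predecessor, so any in-place edit breaks verification. This is a sketch of the idea, not a substitute for WORM storage:

```python
import hashlib
import json

GENESIS = "0" * 64  # hash placeholder for the first record

class AuditChain:
    """Append-only audit log: each record stores the hash of the
    previous record, so editing any entry invalidates the chain."""
    def __init__(self):
        self.records = []
        self._last = GENESIS

    def append(self, entry: dict) -> dict:
        record = {"entry": entry, "prev": self._last}
        self._last = hashlib.sha256(
            json.dumps(record, sort_keys=True).encode()).hexdigest()
        record["hash"] = self._last
        self.records.append(record)
        return record

    def verify(self) -> bool:
        """Recompute every hash and check the chain links up."""
        prev = GENESIS
        for rec in self.records:
            if rec["prev"] != prev:
                return False
            body = {"entry": rec["entry"], "prev": rec["prev"]}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()).hexdigest()
            if digest != rec["hash"]:
                return False
            prev = rec["hash"]
        return True
```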
Responsible AI Standards
Fairness metrics tracked per model and use case, human review pipelines for high-risk decisions, model cards for all deployed models, and executive AI risk dashboard. Aligned with EU AI Act risk tiers.