AI / ML Engineer

Role Summary

Hands-on engineer building production AI and ML systems. Spans LLM application development, retrieval-augmented generation pipelines, classical ML model training and deployment, and the integration plumbing that connects models to enterprise systems of record.

Ships small, evaluable increments rather than ambitious demos that never reach production. Treats every prompt and tool definition as configuration that needs versioning and review. Maintains strong opinions on the boundary between application logic and model behavior, and on what should never be left to the model to decide.
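As a rough illustration of treating prompts as versioned configuration, the sketch below pins each prompt to a name and version and refuses to let the text change without a version bump. The names `PromptConfig` and `PromptRegistry` are hypothetical and exist only for this example; a real engagement would more likely keep prompts in source control and enforce the same check in CI.

```python
import hashlib
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class PromptConfig:
    """Hypothetical record for one reviewed prompt version."""
    name: str
    version: str
    template: str

    @property
    def fingerprint(self) -> str:
        # Short content hash so reviewers can spot silent text changes.
        return hashlib.sha256(self.template.encode("utf-8")).hexdigest()[:12]

    def render(self, **variables: str) -> str:
        # substitute() raises KeyError if a placeholder is left unfilled.
        return Template(self.template).substitute(**variables)


class PromptRegistry:
    """Hypothetical in-memory registry keyed by (name, version)."""

    def __init__(self) -> None:
        self._prompts = {}

    def register(self, prompt: PromptConfig) -> None:
        key = (prompt.name, prompt.version)
        existing = self._prompts.get(key)
        if existing is not None and existing.fingerprint != prompt.fingerprint:
            # Same version, different text: force an explicit version bump.
            raise ValueError(
                f"{prompt.name} {prompt.version} already registered with different text"
            )
        self._prompts[key] = prompt

    def get(self, name: str, version: str) -> PromptConfig:
        return self._prompts[(name, version)]


if __name__ == "__main__":
    registry = PromptRegistry()
    registry.register(
        PromptConfig("summarize", "1.0.0", "Summarize for $audience: $text")
    )
    prompt = registry.get("summarize", "1.0.0")
    print(prompt.version, prompt.fingerprint)
    print(prompt.render(audience="legal", text="Contract renewal terms."))
```

Application code asks for a prompt by exact name and version, so a prompt change shows up as a reviewable diff rather than an untracked edit.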

Skills

Python production engineering
LLM API integration (OpenAI, Anthropic, Bedrock, Vertex AI, Azure OpenAI)
Open-source model deployment (Llama, Mistral, Qwen, etc.)
Prompt engineering with versioning and evaluation discipline
Retrieval-augmented generation architecture
Vector databases (Pinecone, Weaviate, pgvector, Postgres-native)
Embedding pipeline construction and freshness management
Agent frameworks (LangGraph, custom orchestration)
Tool-calling design and privilege-boundary controls
Classical ML lifecycle (scikit-learn, XGBoost, PyTorch, TensorFlow)
Feature engineering and selection
Model serving frameworks (FastAPI, Triton, BentoML, Ray Serve)
Batch inference pipeline construction
Evaluation harnesses (offline benchmarks, online A/B testing)
LLMOps observability (Langfuse, Helicone, Arize, custom)
Content safety and output filtering
PII detection and redaction
CI/CD for AI artifacts (prompts, tools, model versions)
Cost monitoring and per-request economics analysis
Integration with enterprise systems of record

Capabilities & Focus Areas

LLM application development against OpenAI, Anthropic, Bedrock, Vertex AI, and open-source models
Retrieval-augmented generation across vector stores
Classical ML lifecycle from training to deployment
Agent and tool-calling architectures with explicit privilege boundaries
Model serving and batch-inference pipeline construction
Evaluation harnesses tied to CI/CD
Integration of model outputs with enterprise systems of record

Typical Engagement Patterns

Four- to twelve-week production AI use-case builds, from proof of value through go-live
Embedded ML engineering augmentation for client AI teams
RAG platform implementation engagements (eight to sixteen weeks)
Recovery engagements for stalled or under-performing production AI systems
Targeted feature-build engagements on existing AI applications

Outcomes Delivered

Production AI applications with documented evaluation results before launch
Reliable retrieval pipelines that survive content drift over time
Tool-calling agents with explicit authorization boundaries, not implicit ones
Model serving infrastructure that meets the same SLOs as conventional services
Engineering teams that can ship the next AI use case without outside consulting support

Need this role for an engagement?

Brief us on the scope and timeline and we'll match a senior practitioner.
