Staff Machine Learning Engineer – Software Engineering
Locations: Santa Barbara, CA; San Diego, CA; Remote - San Francisco, CA; Remote - Denver, CO.
Overview
We’re building an AI‑native platform for the real estate industry and are looking for a Staff Machine Learning Engineer to advance the ML platform that underpins all of AppFolio’s AI initiatives.
Your Impact
- ML Platform: Design and operate AppFolio’s ML infrastructure on AWS – ECS, SageMaker, GPU fleets, model serving, autoscaling, and cost controls.
- Drive AI Cost Discipline: Optimize cost across all AI applications – provider routing, caching, batch vs. real‑time, model-size selection, and inference economics.
- Multi‑Provider Reliability: Maintain reliable, multi‑provider LLM access across Google, OpenAI, and Anthropic with sensible fallbacks and abstractions.
- Training & Fine‑Tuning Stack: Build the training and fine‑tuning stack for small language models, including data pipelines, GPU orchestration, and evaluation.
- Productionize Research: Partner with Voice & Agents and Research ML engineers to harden prototypes into production systems with SLOs, on‑call rotations, and observability.
- AI Safety & Guardrails: Operate AppFolio’s AI safety and authorization layer – guardrails on AWS, scoped tool permissions, and human‑in‑the‑loop gates for autonomous agent actions.
Qualifications
- Systems thinker: Think in terms of platforms and long‑term leverage, not just features.
- Production builder: Built and scaled ML infrastructure in production with meaningful business impact.
- Ambiguity: Operate effectively in high ambiguity, turning unclear infra problems into clear direction.
- Owner‑operator: Take ownership with a founder/owner‑operator mindset, act with urgency, and focus on outcomes.
- Pace: Bring a strong desire to move fast and deliver impact while maintaining sound engineering judgment.
- Collaboration: Humble, collaborative, and low‑ego; you elevate those around you.
- Sustainability: Value work‑life balance as a foundation for sustained high performance.
- Reliability mindset: Treat ML infra like any other production system – SLOs, on‑call, observability, postmortems.
Must Have
- ML infra at scale: Built and operated production ML infrastructure on AWS – ECS, SageMaker, GPUs, autoscaling, and cost controls.
- Inference platforms: Production experience with model serving for both LLMs and custom models, including a working understanding of quantization, batching, and routing.
- Provider breadth: Direct experience integrating with Google (Vertex/Gemini), OpenAI, and Anthropic APIs in production.
- Training capability: Trained or fine‑tuned language models end‑to‑end; comfortable with deep learning, evaluation, and inference.
- Cloud‑native engineering: Strong Python, Docker, dependency management, and CI/CD for AI workloads.
- RAG & agents: Working knowledge of LangChain / LangGraph and modern RAG patterns over structured and unstructured data.
- Cost optimization: Demonstrated experience reducing unit cost of AI workloads without regressing quality or latency.
- AI safety & authorization: Hands‑on experience operating AI guardrails, scoped tool permissions, and authorization layers for production AI systems.
Nice to Have
- Experience training small language models for production use.
- GPU performance tuning (vLLM, TensorRT, Triton, or similar).
- Prior staff‑level role at a company with a significant AI infra footprint.
- Experience with ontology‑driven systems or knowledge graphs supporting AI applications.
- Contributions to open‑source ML infrastructure or LLM tooling.
Compensation & Benefits
- Base pay range: $200,000 – $250,000. Additional benefits and bonuses may apply.
- Regular full‑time employees are eligible for benefits.
Statement of Equal Opportunity
At AppFolio, we value diversity in backgrounds and perspectives. We are a proud Equal Opportunity Employer and welcome applicants regardless of race, color, religion, sex, sexual orientation, gender identity, national origin, age, marital status, ancestry, physical or mental disability, or veteran status.