[Remote] Principal Machine Learning Engineer, ML Platform

Work from home · Full-time role · Hiring

Note: This is a remote position open to candidates in the USA. Shippo is on a mission to make every merchant successful through excellent shipping and logistics technology. The company is seeking a Principal Machine Learning Engineer for its ML Platform to build a standardized, production-grade ML platform that improves model reliability and speeds up product development.

Responsibilities

  • Set technical strategy and drive a multi-quarter roadmap for ML platform capabilities aligned to Shippo’s business priorities
  • Own cross-team architecture decisions, RFCs, and design reviews for ML lifecycle and inference
  • Raise the engineering bar through mentorship, production readiness standards, and reusable platform primitives
  • Be accountable for platform adoption, reliability, and cost-performance outcomes
  • Build and operate core ML platform components:
      + ML lifecycle foundation (experiment tracking, reproducibility, artifact management, model registry, versioning, and controlled promotion workflows using MLflow or equivalent)
      + Training and experimentation enablement (standardized environments, reusable pipelines/templates, evaluation harnesses, and repeatable workflows that let data scientists move from exploration to production with confidence)
      + Kubernetes-native model serving for real-time inference (safe rollout and rollback, autoscaling, reliability practices, and cost controls)
      + Batch inference and scoring pipelines (repeatable backfills, retraining triggers, consistent packaging between training and inference)
      + Observability for ML systems (service health metrics, alerting, and model-quality signals such as drift and data quality)
      + Developer experience (templates, reference implementations, documentation, and self-service workflows)
  • Evaluate and recommend inference frameworks and deployment patterns, and document tradeoffs for Shippo’s workloads
  • Identify and resolve performance bottlenecks across the inference stack (model runtime, compute utilization, networking, serialization, and autoscaling behavior)
  • Establish ML engineering standards across training, evaluation, testing, model packaging, CI/CD, production readiness, and incident response
  • Partner with Data Science teams to bridge research and production environments by creating repeatable frameworks, shared standards for code quality and reproducibility, and self-serve paths to deploy models safely
  • Collaborate with Data and Engineering teams to ensure the platform supports real workflows, drives adoption, and meets reliability expectations
  • Mentor engineers through design reviews, architecture guidance, and shared best practices across platform and ML development

Skills

  • 15+ years of software engineering experience, including ownership of production systems (platform, infrastructure, or distributed systems)
  • 4+ years owning ML systems end-to-end in production, including on-call and incident response, and making architecture decisions based on operational constraints (latency, throughput, availability, and cost)
  • Strong experience building and running services on Kubernetes, including deployments, autoscaling, and observability
  • Hands-on experience with ML lifecycle tooling such as MLflow or equivalent (tracking, registry, packaging, and promotion workflows)
  • Demonstrated ability to evaluate inference tradeoffs across batch and real-time serving, CPU versus GPU, latency and throughput, cost, and operational complexity
  • Demonstrated Principal-level technical leadership, including setting technical direction, driving cross-team alignment via RFCs/design reviews, and delivering multi-quarter roadmaps
  • Proven ownership of reliability and operational outcomes for production systems (SLOs, incident response, and measurable improvements in stability and performance)
  • Demonstrated ability to ship incrementally, prioritize production reliability over perfect solutions, and drive adoption through pragmatic platform design
  • Experience working with or evaluating managed ML platforms (Databricks, SageMaker, Vertex AI, or similar), with clear judgement on strengths, limitations, and build-vs-buy decisions
  • Databricks experience (useful, not required), including Databricks workflows and ML tooling integration
  • Experience with inference and serving frameworks
  • Experience with feature store patterns, online and offline consistency, and model evaluation at scale
  • Experience supporting optimization systems and decision engines in production
  • LLM or agent workflow experience, especially evaluation harnesses, deployment patterns, guardrails, and monitoring

Benefits

  • Healthcare coverage for medical, dental, and vision (90% covered by the company, incl. dependents). Pets coverage is also avail
