Every day, our models make financial decisions that affect real people: credit approvals, fraud blocks, transaction risk scores. If a model drifts silently in production, customers get wrongly declined. If a pipeline breaks at 2am, no one catches it until the damage is done.

This role exists to make sure that does not happen.

You will own the infrastructure that takes ML models from a data scientist's notebook into production systems processing millions of events daily, and keeps them running reliably across multiple regulatory jurisdictions. Not maintaining someone else's setup. Building and owning it.

What You Will Work On

Model pipelines: Design and operate automated training, validation, deployment, and rollback workflows across our credit scoring, fraud detection, and transaction risk models
Production monitoring: Build observability for ML-specific failure modes including data drift, prediction drift, and feature skew, not just system uptime
Compliance instrumentation: Maintain full audit trails and model cards required for internal model risk reviews and regulatory examination under the EU AI Act and GDPR
Infrastructure ownership: Run Kubernetes-based ML serving on AWS or Azure, and manage CI/CD pipelines that version code, data, and models together
Reliability and incident response: Define SLAs for latency-sensitive scoring models and own the full response when something breaks in production
Cost management: Optimise cloud spend for GPU training jobs and batch inference workloads. Compute budgets in fintech are scrutinised closely

5 Non-Negotiable Requirements

1. Production ML pipelines you built yourself
You have designed and operated automated training, validation, and deployment pipelines serving real users in a live environment. Not internal tooling. Not a prototype. If the pipeline broke, you were the one who fixed it.

2. Kubernetes in production
You have deployed and managed containerised ML workloads on Kubernetes, including autoscaling, resource limits, and failure recovery. EKS, AKS, or GKE.

3. ML lifecycle ownership
Hands-on model versioning, experiment tracking, and registry management using MLflow, Weights & Biases, or equivalent. You managed promotion gates and rollback procedures, not just tracked experiments.

4. Monitoring for ML-specific failures
You have built observability for data drift, prediction drift, and feature skew, not just CPU and memory. Evidently AI, whylogs, Prometheus, or equivalent. You defined what an alert means and what to do when it fires.

5. Regulated environment experience
You have worked in fintech, banking, or insurance where model decisions required audit trails, explainability artefacts, or sign-off from a risk or compliance function. You know what SR 11-7, the EU AI Act, or GDPR means for an ML pipeline in practice.
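
To make requirement 4 concrete: "defining what an alert means" usually comes down to picking a drift statistic, a threshold, and an action. The sketch below is only an illustration of that idea, not a description of our stack. It computes a Population Stability Index (PSI) over equal-width bins using the standard library. The function names, the bin count, and the 0.2 threshold (a common rule of thumb) are all assumptions chosen for the example.

```python
import math


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample and a live sample of one feature.

    Bins are equal-width over the range of the reference sample
    (an assumption; quantile bins are another common choice).
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # Floor at eps so empty bins do not produce log(0).
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def drift_alert(expected, actual, threshold=0.2):
    """Hypothetical alert rule: fire when PSI exceeds the threshold."""
    return population_stability_index(expected, actual) > threshold


reference = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
print(drift_alert(reference, reference))  # no drift against itself
print(drift_alert(reference, shifted))    # upper-half shift trips the alert
```

The code is the easy part. What we look for is the decision around it: which features are checked, who gets paged, and whether the response is a retrain, a rollback, or an investigation.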

Full Technical Stack

Core: Python, Docker, Kubernetes, GitHub Actions or GitLab CI

ML Platform: MLflow, Apache Airflow or Prefect

Cloud: AWS SageMaker with EKS, or Azure ML with AKS

Monitoring: Prometheus, Grafana, Evidently AI

Data: Spark, PostgreSQL, S3 or Azure Blob

Useful but not required on day one: Terraform, feature stores such as Feast or Tecton, LangChain for LLM pipeline integration, SHAP or LIME for explainability

What This Role Is Not

Not a data science role. You will not be building models.

Not a generic DevOps role. Kubernetes experience without ML context is not sufficient.

Not a research or platform architecture role. All work is production focused with hard reliability and compliance constraints.

How to Apply

This role is open to EU-based candidates only. We are not considering applications from outside the European Union at this time, regardless of remote working arrangements or timezone compatibility.

Submit your CV and record a short video answer to one question:

Describe a machine learning pipeline you built and owned in production. What broke, how did you detect it, and what did you change?

The video format is uncomfortable. We know that. If you still do it, that already tells us something.

How Applications Are Assessed

I want to be upfront about how this works before you invest your time.

Every CV is scored against the 5 non-negotiable requirements only. One point per requirement. 5 out of 5 to proceed. Not 4. If a requirement is listed as a tool or skill without context describing what you built and what it served, it scores 0.

I compare all applications before advancing anyone. If the pool of 5 out of 5 scores is larger than 15, I rank by depth of regulated environment experience and scale of systems owned. The top 15 go forward. If fewer than 15 score 5 out of 5, all of them go forward.
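
For those who prefer to read rules as code, the screening logic above can be sketched roughly as follows. This is an illustration, not the tool I use. The `Application` fields and the numeric ratings for regulated experience and system scale are hypothetical, and ordering by regulated depth first and scale second is an assumption about how the two criteria combine.

```python
from dataclasses import dataclass

REQUIRED_SCORE = 5   # one point per non-negotiable requirement
SHORTLIST_SIZE = 15  # maximum number advanced when the pool is larger


@dataclass
class Application:
    name: str
    requirement_scores: tuple  # five 0/1 values, one per requirement
    regulated_depth: int       # hypothetical rating, higher means deeper
    system_scale: int          # hypothetical rating, higher means larger


def shortlist(applications):
    """Return the applications that go forward to video review."""
    qualified = [
        a for a in applications
        if sum(a.requirement_scores) == REQUIRED_SCORE
    ]
    if len(qualified) <= SHORTLIST_SIZE:
        return qualified  # everyone with 5 out of 5 goes forward
    # Assumption: rank by regulated depth first, then by system scale.
    ranked = sorted(
        qualified,
        key=lambda a: (a.regulated_depth, a.system_scale),
        reverse=True,
    )
    return ranked[:SHORTLIST_SIZE]
```

Note that a requirement mentioned without context scores 0, so an application with four well-evidenced requirements and one bare keyword never reaches the ranking step.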

The video is reviewed by me and the team together. We are not assessing your camera confidence. We are assessing whether your answer is specific, whether you owned what you are describing, and whether your response to a real production failure was sound.

I do not follow up to ask for clarification on an ambiguous CV. What is written is what is scored.

You will hear back from us regardless of outcome. That is a promise, not a pleasantry.

Hubert Warszta
Tech Recruiter | WhyHireWrong?

MLOps Engineer: ML Risk Platform

MLOps Engineer: Own production ML risk pipelines, from data to live scoring, ensuring reliability, observability, and compliance across fintech environments.