HealthTech 8 min read

AI Research Lead in HealthTech: The Two-Round Search

We told them the first candidate was wrong. They hired him anyway. Three months later we got the call we expected. This is the story of a search that failed, why it failed, and what happened when we ran it again with a profile built on what the failure taught us.

  • Search rounds: 2
  • Total active search: 3 months
  • Real-world accuracy gain: +12%
  • Probation result: Passed

The client

Series B HealthTech company in Berlin. Around 120 people.

Diagnostic imaging software for hospitals across the DACH region.

Strong engineering team. Existing computer vision pipeline already in production.

Not a research lab. A regulated medical device company with CE marking obligations.

They had hit the ceiling on their existing models. They needed someone to lead AI research and push model accuracy — while working with product and clinical teams.

Why the client came to us

The CTO had been searching for three months through academic networks and conference contacts. Plenty of researchers. None with production experience in regulated medical devices.

The standard recruiting approach — post on LinkedIn, filter by PhD — was producing the same profile over and over: strong papers, zero shipped products.

They needed someone who understood the difference between a model that wins a benchmark and a model that can carry a CE mark. That filter doesn't exist on LinkedIn.

The core difficulty

There was a fundamental contradiction in this profile.

Research-depth people — PhDs, published in top venues — rarely have production experience.

Production-oriented ML engineers rarely have the research depth for medical imaging.

So the person needed to:

  • have deep research chops (computer vision, medical imaging),
  • speak with engineers AND clinicians,
  • understand regulatory constraints (CE marking, medical devices),
  • go hands-on when needed,
  • eventually build a team of 3–5.

This is the research-to-production chasm. Most people are on one side or the other.

Research-first profile

  • Strong publications
  • Novel architectures
  • Benchmark-focused
  • Rarely shipped to production

Production-first profile

  • Fewer publications
  • Proven architectures
  • Real-world data focused
  • Shipped and validated

We needed the second column with enough of the first to push the science forward.

1

First round: we warned them

First candidate: PhD from a top European university. Published in MICCAI and Nature Medicine. Glowing academic references.

We raised concerns during screening: strong researcher, never shipped anything to production.

The client was excited about the publication record. They hired him anyway.

Three months in, he was still optimizing for paper-worthy accuracy improvements (0.3% gains on benchmarks) while the product team needed models that worked reliably across different scanner types.

He couldn't bridge research perfection and production pragmatism.

They parted ways.

What's important: we didn't disappear

This is a crucial moment.

After the unsuccessful first hire, we didn't walk away.

We went back to the CTO for a real conversation about what exactly didn't work. The conclusions:

  • pure academic experience doesn't translate to product on its own,
  • publication count is not the filter,
  • the real question is “has this person deployed a model that doctors actually used?”,
  • experience in a regulated environment is non-negotiable.

The focus shifted:

  • less publication record,
  • more production deployment experience,
  • higher bar for regulatory understanding,
  • an understanding that a model has to work on a 5-year-old GE scanner, not just on pristine research datasets.

2

Second round: the right hire

Search took 6 weeks.

Candidate found in Amsterdam.

She'd been at a medical device company building ML models for radiology.

Fewer publications. But:

  • shipped 2 FDA-cleared/CE-marked AI products,
  • understood validation protocols,
  • understood clinical feedback loops,
  • spoke the language of both engineers and radiologists.

Not everyone was convinced immediately. Strong hires at this level rarely produce unanimous enthusiasm.

The CTO made the call.

She passed probation.

Within 5 months, she:

  • improved real-world model accuracy by 12%,
  • built a reproducible validation pipeline,
  • hired 2 ML engineers.

“You were right about the first candidate. We should have listened.”

— CTO

  • Search rounds: 2
  • Total active search: 3 months
  • Profiles reviewed: 35+
  • Probation: Passed

What this person actually does

This wasn't a pure researcher. This was a research-to-production bridge.

Model Development

Push accuracy on diagnostic imaging models. Design experiments. Choose architectures. But always with production constraints in mind.

Validation & Regulatory

Build reproducible validation pipelines that satisfy CE marking requirements. Understand what “clinically validated” actually means — not just statistically significant.

Clinical Collaboration

Work directly with radiologists. Translate clinical feedback into model improvements. Understand that a 0.3% benchmark gain means nothing if the model fails on a GE scanner from 2019.

Production Engineering

Ensure models work across scanner types, image qualities, and hospital IT environments. Not just research-grade data.

Team Building

Hire 2–3 ML engineers. Set research direction. Build a function that bridges the lab and the product.

Business impact

Two rounds. Three months of active search. One failed hire. But the outcome justified everything:

  • Real-world diagnostic accuracy improved by 12%, measured on clinical data rather than benchmarks
  • Reproducible validation pipeline built, which is critical for CE marking renewal
  • Two ML engineers hired, so the research function is now self-sustaining
  • Clinical team went from skeptical to collaborative because the new hire spoke their language
  • CTO regained confidence in AI research after the first hire's failure

Our value

  • Warned the client about the first candidate. They didn't listen, but we were right.
  • Didn't disappear after the failure.
  • Recalibrated the profile with the CTO.
  • Found the balance between research depth and production pragmatism.

Two rounds. Three months. How you handle failure defines whether you're a partner or a vendor.

Research depth. Production reality. Regulated environments.

We stay until the right hire is in the seat.
