Research Scientist - Vision-Language Modeling

epsilon-health · Epsilon Health
San Francisco, CA · Full-time

About this role

About Us

We're tackling one of healthcare's most critical challenges in medical imaging and diagnostics. Our company operates at the intersection of cutting-edge AI and clinical practice, building technology that directly impacts patient outcomes. We've assembled one of the industry's most comprehensive and diverse medical imaging datasets, and we have proven product-market fit with a substantial customer pipeline already in place.

Role Overview

We're seeking a Research Scientist with deep expertise in vision-language modeling to join our ML team. You'll be at the forefront of developing and deploying state-of-the-art multimodal models for clinical use in radiology settings. This role focuses on training and fine-tuning vision-language models (VLMs) that generate accurate, grounded radiology reports across multiple imaging modalities, including X-ray, CT, and MRI. You'll work with one of the largest and most diverse medical imaging datasets in the industry, advancing the state of the art in grounded medical report generation, model alignment, and inference-time reasoning while maintaining the clinical rigor required for healthcare deployment.

Key Responsibilities

  • Design, train, and scale vision-language foundation models for radiology applications.

  • Develop and implement advanced post-training strategies including preference optimization (DPO, IPO, KTO), reinforcement learning from human feedback (RLHF), and other alignment techniques to improve clinical accuracy and reduce hallucinations.

  • Research and deploy inference-time compute scaling techniques such as chain-of-thought reasoning, self-refinement, and test-time training to enhance model performance on complex diagnostic cases.

  • Pioneer grounded report generation capabilities, enabling models to spatially localize findings within medical images using bounding boxes or segmentation masks.

  • Design rigorous evaluation frameworks that assess generated reports for medical accuracy and writing style.

  • Contribute hands-on to all stages of model development including dataset curation, architecture design, distributed training, post-training optimization, and production deployment.

  • Stay current with cutting-edge research in vision-language modeling, medical AI, and model alignment techniques.

  • Drive research and technical excellence through conference publications and technical blog posts, establishing best practices for training robust medical VLMs at scale.

Qualifications

  • 6+ years of academic or industry experience in vision-language modeling, multimodal learning, or related fields

  • Deep expertise in training and fine-tuning large vision-language models (e.g., LLaVA, Flamingo, CogVLM, Qwen-VL, or similar architectures)

  • Strong foundation in modern post-training techniques including:

    • Preference optimization methods (DPO, IPO, ORPO, KTO)

    • RLHF and reward modeling

    • Inference-time compute scaling and reasoning strategies

    • Constitutional AI and other alignment techniques

  • Track record of implementing complex models from research papers and adapting them to new domains

  • Proficiency in PyTorch or JAX, with experience training large models on multi-GPU/distributed systems

  • Experience with autoregressive language modeling and instruction tuning

  • Hands-on experience with medical imaging applications, particularly radiology report generation

  • Strong software engineering skills and ability to write production-quality code

Preferred Qualifications

  • Publications at top-tier conferences (NeurIPS, ICML, ICLR, CVPR, ACL, EMNLP, MICCAI)

  • Experience with grounded generation tasks (visual grounding, referring expression comprehension)

  • Knowledge of evaluation methodologies for long-form generation, including factuality assessment and hallucination detection

  • Experience with 3D medical image processing and temporal modeling

  • Familiarity with clinical NLP and medical knowledge representation

  • Experience with model interpretability, explainability, and uncertainty quantification in safety-critical applications

Frequently Asked Questions

Is the salary disclosed for the Research Scientist - Vision-Language Modeling position at epsilon-health?
The salary for this Research Scientist - Vision-Language Modeling role at epsilon-health is not publicly listed. Click "Apply Now" to learn more about the compensation package on their official careers page.
Where is the Research Scientist - Vision-Language Modeling position at epsilon-health located?
This Research Scientist - Vision-Language Modeling role at epsilon-health is based in San Francisco, CA. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the Research Scientist - Vision-Language Modeling role at epsilon-health full-time or part-time?
This is listed as a full-time position. It is posted as a Research Scientist - Vision-Language Modeling role in the Epsilon Health department at epsilon-health.
Which team or department does the Research Scientist - Vision-Language Modeling at epsilon-health belong to?
This Research Scientist - Vision-Language Modeling position is part of the Epsilon Health department at epsilon-health. See the full job description for more information about the team structure and responsibilities.
How do I apply for the Research Scientist - Vision-Language Modeling position at epsilon-health?
Click the "Apply Now" button on this page. You will be redirected to epsilon-health's official application portal, hosted on Ashby, where you can submit your application directly.
When was the Research Scientist - Vision-Language Modeling job at epsilon-health posted?
This Research Scientist - Vision-Language Modeling position at epsilon-health was posted on Oct 31, 2025. Apply as soon as possible; early applications are often reviewed first.