Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Posts

The Evaluation Stack for Health AI

6 minute read

Published: March 15, 2026

Why medical knowledge benchmarks aren’t enough for real-world health assistants, and what an evaluation stack should look like as Health AI moves from static vignettes to conversational, action-taking systems.

Launching Amazon Health AI for Prime Members

2 minute read

Published: March 10, 2026

After months of work, Health AI is launching on Amazon.com and in the Amazon app, giving customers a personal AI health assistant to understand their health information, explore care options, and connect directly to care.

Launching Amazon One Medical Health AI

1 minute read

Published: January 23, 2026

Today we launched Amazon One Medical Health AI, an agentic assistant providing 24/7 health guidance, appointment booking, lab reading, and medication management.

From Stochastic Parrots to Software as a Doctor

8 minute read

Published: December 07, 2025

How LLMs are evolving from stochastic parrots into regulated medical devices — exploring the FDA frameworks, regulatory pathways, and business implications of software as a doctor.

AWS re:Invent 2025

less than 1 minute read

Published: December 06, 2025

Attended AWS re:Invent 2025 in Las Vegas.

Amazon Machine Learning Conference (AMLC 2025)

4 minute read

Published: November 06, 2025

Attended Amazon’s internal Machine Learning Conference (AMLC 2025) and gave a talk on LLM trustworthiness in medical product question answering.

Launching Buy For Me on the Amazon Shopping App

1 minute read

Published: April 03, 2025

Buy For Me, my first agentic AI system at Amazon, is now live for US customers — an AI agent that browses, selects, and purchases products from third-party brand websites on behalf of customers.

PhDone

less than 1 minute read

Published: October 03, 2019

Successfully defended my PhD thesis at MIT in the Harvard–MIT Program in Health Sciences and Technology.

Research Featured on MIT News Front Page

less than 1 minute read

Published: September 12, 2019

My PhD research on detecting pain levels from brain signals was featured on the front page of MIT News.

Featured in La Razón: Spanish Innovation at the MIT Media Lab

less than 1 minute read

Published: May 19, 2019

I was featured in a double-page spread by the national Spanish newspaper La Razón, highlighting leading researchers at the MIT Media Lab.

Spanish TV Interview at MIT with Iñaki Gabilondo

less than 1 minute read

Published: June 26, 2018

I was interviewed at MIT by Spanish journalist Iñaki Gabilondo for a Movistar+ documentary episode on MIT scientists and the future of technology.

From the Lab to the Market: Bridging Technology and Business at Harvard Business School and MIT Sloan

1 minute read

Published: June 15, 2018

During my PhD, I explored what it takes to move technology beyond the lab, studying venture scaling and private equity finance at Harvard Business School, and healthcare innovation and deep-tech commercialization at MIT Sloan.

Tim Cook at the MIT Media Lab

less than 1 minute read

Published: June 08, 2017

Tim Cook, Apple’s CEO, visited our research group (Affective Computing group) at the MIT Media Lab, where our team presented our research on AI and emotion recognition.

Collaborating with BCG Digital Ventures and Intermountain Health on Venture Design

less than 1 minute read

Published: July 08, 2016

Invited to represent the MIT Media Lab in a multidisciplinary healthcare venture design and strategic forecasting workshop with BCG Digital Ventures.

La Caixa Fellowship for Postgraduate Studies

less than 1 minute read

Published: June 06, 2014

Awarded the La Caixa Fellowship for postgraduate studies abroad, presented by King Juan Carlos I of Spain at a ceremony in Madrid.

Carpe Diem Enterprise Trust Bursary — University of Cambridge

less than 1 minute read

Published: January 22, 2014

Awarded the Carpe Diem Enterprise Trust educational bursary to support my studies in the Master’s in Bioscience Enterprise programme at the University of Cambridge.

ACGI Medal for Excellence — Imperial College London

less than 1 minute read

Published: June 28, 2013

Awarded the ACGI Medal for Excellence by the Faculty of Engineering at Imperial College London upon graduating with First Class Honours in Biomedical Engineering.

Beca Rioja Talento — Fundación Riojana para la Innovación

less than 1 minute read

Published: February 04, 2011

Awarded the Beca Rioja Talento 2010-2011 by the Fundación Riojana para la Innovación, Spain.

A Summer of Research as an Amgen Scholar at the University of Cambridge

1 minute read

Published: September 01, 2010

Selected for the Amgen Scholars Programme to conduct summer research at the University of Cambridge’s Wolfson Brain Imaging Centre.

Spanish Mathematics and Physics Olympiad Medals

less than 1 minute read

Published: March 01, 2008

Awarded a Bronze medal at the Spanish Mathematics Olympiad and a Silver medal at the Spanish Physics Olympiad in 2008.

portfolio

Scalable Trust, Safety, and Quality Infrastructure for LLM-Based Shopping Assistants

Continuous, production-scale measurement of accuracy, policy compliance, and user trust for Amazon’s generative shopping assistant, replacing manual review with automated evaluation and actionable quality signals across millions of model responses per week.

Buy For Me — Agentic Shopping System

A browser-based agentic system that helps customers complete purchases across third-party websites using multimodal perception, reasoning, and tool-enabled actions.

Multimodal Agentic Assistants for Primary Care

A patient-facing, multimodal assistant designed to support text- and voice-based interactions in primary care, integrating language, audio, and structured workflows under safety and regulatory constraints.

AI-Driven Patient Message Routing & Automation (One Medical)

An AI-driven system for routing and automating patient portal messages at One Medical, reducing clinician burden and improving operational efficiency at national scale.

Video Avatars & Multimodal Experiences

Applied research, design and prototyping of real-time multimodal video avatar systems that shaped Amazon’s strategy for agentic, embodied AI experiences.

Foresight — Population Health Predictions for Proactive Care

A population health management initiative under Google Care Studio focused on using machine learning to identify high-risk outpatients and support proactive care through closed-loop prediction and measurement workflows.

Project Nightingale

A large-scale clinical data infrastructure and analytics initiative focused on aggregating, standardizing, and enabling analysis of population-scale EHR data to support clinical workflows and machine learning in healthcare.

Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes

A low-cost, citizen-science platform for measuring and analyzing air pollution particles using do-it-yourself atomic force microscopes and human-in-the-loop image analysis.

Detection of Real-World Driving-Induced Affective State Using Physiological Signals

Multi-view multi-task machine learning for detecting driver stress and affective states from physiological signals during real-world driving.

Objective Pain Detection from Brain Signals using fNIRS

Personalized machine learning on wearable neuroimaging signals to objectively detect pain in non-communicative patients.

MIT Happy Robot: Real-Time Affective Interaction & Social Robotics

Real-time facial emotion recognition integrated with expressive social robotics for engaging, affect-aware human–robot interaction.

Injection Study

We investigated the use of electrodermal activity (EDA), heart rate variability (HRV), and facial expression analysis as potential endpoints for determining quantitative pain scores.

Deep Reinforcement Learning for Safe Opioid Dosing in Critical Care

A clinically interpretable deep reinforcement learning system for personalized morphine dosing in the ICU, balancing pain control and physiological safety.

SNAPSHOT: Modeling Sleep and Mental Health in Social Networks

A large-scale NIH-funded study combining multimodal sensing and machine learning to model sleep, stress, and mental health dynamics in real-world social networks.

AI-Assisted PD-L1 Scoring for Immunotherapy Decision Support

Developed computer vision models to automatically quantify PD-L1 expression in immunohistochemistry slides.

Detection Limits of Automated MRI Morphometry in Rodent Brains (University of Cambridge, 2010)

A methodological study of voxel-based morphometry sensitivity for detecting subtle neuroanatomical changes in rodent MRI.

ML-Enabled Phone-Based ECG Signal Quality Assessment (University of Oxford, 2011)

Automated assessment of ECG signal quality using signal quality indices and classifier fusion for mobile and low-resource healthcare.

MAIC

Co-founded an early-stage startup through the Antler accelerator, incorporated and based in Singapore, focused on applying AI, computer vision, and IoT to workforce and task management in the construction industry.

publications

Instability in clinical risk stratification models using deep learning

Published in Machine Learning for Health (ML4H), 2022

Recommended citation: Lopez-Martinez, Daniel; Yakubovich, Alex; Seneviratne, Martin; Lelkes, Adam; Tyagi, Akshit; Kemp, Jonas; Steinberg, Ethan; Downing, N. Lance; Li, Ron C.; Morse, Keith E.; Shah, Nigam H.; Chen, Ming-Jun (2022). "Instability in clinical risk stratification models using deep learning." Machine Learning for Health (ML4H) 2022.
Download Paper

Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models

Published in CVPR Responsible Generative AI Workshop, 2024

Recommended citation: Lopez-Martinez, Daniel (2024). "Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models." CVPR 2024 Responsible Generative AI Workshop.
Download Paper

Trustworthiness in medical product question answering by large language models

Published in KDD Workshop on Evaluation and Trustworthiness of Generative AI Models, 2024

Recommended citation: Lopez-Martinez, Daniel (2024). "Trustworthiness in medical product question answering by large language models." KDD 2024 Workshop on Evaluation and Trustworthiness of Generative AI Models.
Download Paper

Detecting sensitive medical responses in general purpose large language models

Published in Machine Learning for Health Symposium, 2024

Recommended citation: Lopez-Martinez, Daniel; Bafna, Abhishek (2024). "Detecting sensitive medical responses in general purpose large language models." Machine Learning for Health Symposium 2024.
Download Paper

Systems for automated interaction with user interfaces

Published in United States Patent Office (pending), 2025

Describes systems and methods for automated agents to interact with user interfaces on behalf of a user. The system uses multiple ML-based agents (including LLM-based planners) to perceive the state of a webpage, determine and execute actions (e.g., navigation, variant selection, cart operations, checkout), and verify that each action aligns with user intent, constraints, and security guardrails (such as controlled handling of payment information). The approach enables reliable end-to-end task execution across heterogeneous third-party interfaces and supports agentic purchasing workflows such as Buy For Me.

Recommended citation: Lopez Martinez, Daniel; et al. Systems for Automated Interaction with User Interfaces. U.S. Patent Application (pending), filed 2025. Applicant: Amazon Technologies, Inc.

Systems for generation of prompts for evaluation of language models

Published in United States Patent Office, 2025

Describes systems and methods to generate synthetic prompts for red-teaming large language models (LLMs). A first ML model (e.g., a Q-learning model) learns prompt modifications that increase the probability of eliciting responses that violate constraints, while additional models score prompts and responses and generate rationales to guide subsequent prompt generation and improve evaluation coverage.

Recommended citation: Lopez Martinez, Daniel. Systems for Generation of Prompts for Evaluation of Language Models. U.S. Patent Application Publication No. US 2025/0378274 A1, published Dec. 11, 2025 (filed Jun. 7, 2024). Applicant: Amazon Technologies, Inc.
View Patent

talks

Detection limits of automated MRI morphometry for phenotyping in rodent brains for applications in neurological disorders

Published: September 06, 2010

I presented a poster at the Amgen Scholars European Symposium 2010 at the University of Cambridge, describing research conducted during my summer internship at the Wolfson Brain Imaging Centre under the supervision of Adrian Carpenter and Steve Sawiak. The symposium brought together undergraduate researchers from institutions across Europe to share summer projects and participate in a series of academic talks and poster sessions.

Advanced MRI Techniques for Early Detection of Brain Metastases in Small Cell Lung Cancer

Published: September 16, 2011

I delivered an oral presentation at the Cancer Research UK Cambridge Research Institute (CRI) summarizing the results of my summer research internship in the laboratory of Professor John Griffiths at the University of Cambridge. The project focused on evaluating advanced magnetic resonance imaging methods for the early detection of brain metastases in small cell lung cancer (SCLC).

Signal Quality Indices and Data Fusion for Determining Acceptability of Electrocardiograms

Published: September 19, 2011

Gari Clifford (University of Oxford) and I gave an oral presentation at the Computing in Cardiology (CinC) Conference 2011 in Hangzhou, China, presenting our joint work conducted at the Oxford Institute of Biomedical Engineering (IBME). The talk covered our algorithm for assessing the diagnostic acceptability of electrocardiograms (ECGs) collected in noisy or low-resource ambulatory environments.

Modeling Loop Formation in Cortical Circuits Using Spike Timing Dependent Plasticity

Published: August 17, 2012

I delivered an oral presentation in the Kreiman Laboratory at Harvard University, summarizing the results of my summer research internship under the supervision of Professor Gabriel Kreiman. The computational neuroscience project focused on understanding how spike timing dependent plasticity (STDP) shapes the architecture of recurrent cortical circuits and the conditions under which specific connectivity patterns emerge.

Machine Learning Methods for Analyzing Multisensory Integration with Magnetoencephalography

Published: October 07, 2012

I delivered an oral presentation at the Magnetoencephalography Laboratory of the McGovern Institute for Brain Research at MIT, summarizing the results of my research internship under the supervision of Dr Dimitrios Pantazis. The project focused on developing machine learning methods to process magnetoencephalography data and on understanding how the human brain binds visual and auditory information into a unified percept as part of a National Science Foundation supported effort.

Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes

Published: August 25, 2015

I delivered a demo presentation at the LEGO2NANO Summer School at the Shenzhen Open Innovation Lab (SZOIL), showcasing the atomic force microscope our team developed and, in particular, my work on imaging air pollution particles and creating a crowdsourcing-based air pollution measurement platform built around this technology.

Building Bridges to Develop New Medical Technologies

Published: January 25, 2016

I delivered an invited talk at the Building Bridges to Develop New Medical Technologies workshop hosted by the Real Colegio Complutense at Harvard University. The event brought together engineering and medical researchers from Boston and Spain to foster international collaboration and cross-disciplinary innovation in biomedical science and technology.

Wearable Technologies for Multiple Sclerosis: The Future Role of Stress Measurement

Published: May 27, 2016

I delivered an oral presentation at the International Conference on Smart Portable, Wearable, Implantable and Disability Oriented Devices and Systems (SPWID 2016) in Valencia, Spain. The talk presented our work on wearable technologies for managing stress in individuals with multiple sclerosis (MS), based on research conducted at the MIT Media Lab.

Patient-Centered Symptom and Vital Sign Tracking for Lyme Disease Care

Published: June 19, 2016

I presented our team’s work at Lyme Innovation, the first-ever Lyme-disease–focused hackathon, held at the Microsoft NERD Center in Cambridge and organized by Spaulding Rehabilitation’s Dean Center for Tick-Borne Illness, the Veterans Affairs Center for Innovation, MIT Hacking Medicine, UC Berkeley, and Harvard Medical School. The three-day event brought together clinicians, scientists, engineers, entrepreneurs, and patients to develop new solutions for Lyme disease. Our project was selected as one of the finalists and received a $5,000 award.

LymeDot: Using Open Data and Mobile AI for Symptom Tracking in Lyme Disease

Published: September 28, 2016

I presented at the White House Open Data Innovation Summit in Washington, D.C., where our team was invited to showcase a project developed during a Boston-based health hackathon. The project, LymeDot, explored how mobile technology and open data could help patients with Lyme disease track symptoms over time, support clinical decision-making, and empower individuals managing complex, chronic conditions.

Drug Development in the Connected World: A New Paradigm for Drug Discovery

Published: October 07, 2016

I delivered a talk at the MIT Media Lab during an industry symposium on how connectivity, wearable sensing, and data-driven approaches are transforming the drug development pipeline — from discovery through clinical trials and post-market surveillance.

Automatic Detection of Nociceptive Stimuli and Pain Intensity from Facial Expressions

Published: May 17, 2017

I presented a poster at the 2017 Annual Meeting of the American Pain Society in Pittsburgh, describing collaborative work between the MIT Affective Computing group and MedImmune on automatic pain detection using computer vision and machine learning.

Personalized Automatic Estimation of Self Reported Pain Intensity from Facial Expressions

Published: July 26, 2017

I delivered an oral presentation at the Computer Vision and Pattern Recognition (CVPR 2017) Workshop on Deep Affective Learning and Context Modeling, where I presented our work on personalized estimation of self-reported pain intensity from facial expressions. The project introduced a two-stage machine learning framework that combines recurrent neural networks with a personalized Hidden Conditional Random Field model to estimate Visual Analog Scale (VAS) pain scores from facial landmarks.

ZenAuto: Emotionally Intelligent Transport

Published: September 14, 2017

I presented our startup concept, ZenAuto, at the Lee Kuan Yew Global Business Plan Competition (LKYGBPC) in Singapore, one of Asia’s leading deep-tech entrepreneurship challenges. The competition brings together next-generation founders from around the world to showcase innovations with the potential to reshape cities, industries, and society. Our work was selected for presentation on the competition stage alongside teams from top global universities.

Physiological and Behavioral Profiling for Nociceptive Pain Estimation Using Personalized Multitask Learning

Published: December 08, 2017

I presented a poster at the NeurIPS Machine Learning for Health (ML4H) Workshop 2017, describing our work on personalized pain estimation from multimodal data. The project introduced a method for building physiological and behavioral profiles based on individual responses to heat pain, and for using these profiles within a personalized multi-task neural network architecture.

Skin Conductance Deconvolution for Pain Estimation

Published: March 05, 2018

I presented a poster at the International Conference on Biomedical and Health Informatics (BHI 2018) in Las Vegas, describing our work on estimating pain intensity from skin conductance signals. The project, conducted at the MIT Media Lab, focused on leveraging noninvasive physiological sensing to quantify nociceptive responses when self-report is not feasible.

Continuous Pain Intensity Estimation from Autonomic Signals with Recurrent Neural Networks

Published: July 21, 2018

I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC), describing our work on continuously estimating experimental heat pain intensity from autonomic physiological signals. The project sought to develop an objective pain monitoring method that provides high temporal resolution estimates using data that can be collected noninvasively from wearable sensors.

Multi-Task Multiple Kernel Machines for Personalized Pain Recognition from fNIRS

Published: August 23, 2018

I delivered an oral presentation at the International Conference on Pattern Recognition (ICPR 2018) in Beijing, China, presenting our work on personalized pain detection using functional near-infrared spectroscopy (fNIRS) brain signals. The paper received the Best Student Paper Award.

Machine Learning for Pain Medicine: Physiological and Behavioral Profiling for Nociceptive Pain Estimation

Published: April 10, 2019

I presented a poster at the Harvard–MIT Health Sciences and Technology (HST) Forum at Harvard Medical School, describing my research on personalized machine learning approaches for estimating nociceptive pain. The work, conducted at the MIT Media Lab, explored how individual differences in physiological and behavioral responses to pain can be leveraged to improve continuous pain intensity estimation.

Deep Reinforcement Learning for Optimal Critical Care Pain Management

Published: July 26, 2019

I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC 2019) in Berlin, summarizing our work on using deep reinforcement learning to support optimal pain management in the intensive care unit (ICU). The project introduced a sequential decision-making framework that learns clinically interpretable morphine dosing strategies personalized to each patient’s evolving physiological and pain state, based on retrospective ICU data from the MIMIC-III database.

Detecting Real World Driving Induced Affective State Using Physiological Signals

Published: September 03, 2019

I delivered an oral presentation at the International Conference on Affective Computing and Intelligent Interaction (ACII 2019) in Cambridge, UK, during the International Workshop on Social and Emotion AI for Industry (SEAIxI). The presentation summarized our work on detecting real-world, driving-induced affective states using physiological signals, based on our paper presented at the conference.

Machine Learning for Predicting Renal Replacement Therapy Onset in Chronic Kidney Disease

Published: September 18, 2022

I presented our work at the Applications of Medical AI (AMAI) Workshop at MICCAI 2022, where our paper received the Best Paper Award. This work introduces a dynamic prediction model capable of identifying chronic kidney disease patients at high risk of requiring renal replacement therapy up to one year in advance.

Panel Discussion: Careers in Academia and Industry

Published: September 19, 2022

I was invited to participate as a panelist in the Careers in Academia and Industry session at MICCAI 2022. This flagship event brought together researchers, innovators, and industry leaders to discuss professional pathways, career development, and the evolving relationship between academic research and real-world applications.

Instability in Clinical Risk Stratification Models Using Deep Learning

Published: November 28, 2022

I presented a poster at the Machine Learning for Health (ML4H) Symposium 2022 in New Orleans, based on research conducted at Google Health. The work investigates how randomness in training deep learning models, despite identical data, architecture, and hyperparameters, can lead to meaningfully different patient-level predictions in clinical risk stratification tasks.

Trustworthiness in Medical Product Question Answering by Large Language Models

Published: August 26, 2024

I presented a poster at the KDD 2024 Workshop on GenAI Evaluation in Barcelona, corresponding to the paper “Trustworthiness in medical product question answering by large language models”. The work introduces a claim-level evaluation framework to assess whether large language models provide medically accurate and label-consistent answers when responding to questions about prescription drugs and medical products.

Detecting sensitive medical responses in general purpose large language models

Published: December 16, 2024

I presented a poster at the Machine Learning for Health Symposium (ML4H) 2024 in Vancouver, corresponding to the paper “Detecting sensitive medical responses in general purpose large language models”. The work investigates how to identify sensitive or potentially harmful medical responses produced by general-purpose large language models.

AI-Enabled Virtual Care with Digital Avatar Assistants

Published: June 26, 2025

I delivered a talk at Amazon’s Image and Video Generation Workshop 2025 in Seattle, presenting our work at Amazon Health on building AI-enabled virtual care experiences using LLM-powered digital avatar assistants.

Pioneering Agentic Systems: From Shopping to Health

Published: September 25, 2025

I delivered an invited talk for the Amazon North America Stores GenAI Learning Series, presenting a deep dive into the design, architecture, evaluation, and deployment of large-scale agentic systems across Amazon. The talk bridged my work across Shopping Conversations Foundations and Amazon Health AI / One Medical, highlighting the development of agentic LLM systems from consumer shopping experiences (specifically Buy For Me) to clinical and healthcare workflows.

Trustworthiness in Medical Product Question Answering by Large Language Models

Published: November 04, 2025

I gave an invited talk at the Machine Learning for Healthcare Roundtable during the Amazon Machine Learning Conference 2025, presenting our work on evaluating the trustworthiness of large language models (LLMs) in medical product question answering.

The Future of Healthcare AI

Published: April 24, 2026

Perspective on why healthcare AI is at an inflection point, spanning regulation, market dynamics, and research directions for evaluation and agentic systems.

Memory and Personalization in Health AI

Published: April 24, 2026

Overview of practical approaches to memory and personalization for health AI systems, including long-term context, retrieval, and longitudinal user experiences.

teaching

Personalized Machine Learning

Graduate course, MIT Media Lab, Massachusetts Institute of Technology, 2017

This graduate course explores how machine learning models can be adapted to individuals rather than populations, with a particular focus on health and human data. The class covers modern techniques for personalization, including active learning, domain adaptation and deep models, and guides students in developing their own personalized ML applications.

Daniel Lopez-Martinez
