Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
The Evaluation Stack for Health AI
Published:
Why medical knowledge benchmarks aren’t enough for real-world health assistants, and what an evaluation stack should look like as Health AI moves from static vignettes to conversational, action-taking systems.
Launching Amazon Health AI for Prime Members
Published:
After months of work, Health AI is launching on Amazon.com and in the Amazon app, giving customers a personal AI health assistant to understand their health information, explore care options, and connect directly to care.
Launching Amazon One Medical Health AI
Published:
Today we launched Amazon One Medical Health AI, an agentic assistant providing 24/7 health guidance, appointment booking, lab reading, and medication management.
From Stochastic Parrots to Software as a Doctor
Published:
How LLMs are evolving from stochastic parrots into regulated medical devices — exploring the FDA frameworks, regulatory pathways, and business implications of software as a doctor.
AWS re:Invent 2025
Published:
Attended AWS re:Invent 2025 in Las Vegas.
Amazon Machine Learning Conference (AMLC 2025)
Published:
Attended Amazon’s internal Machine Learning Conference (AMLC 2025) and gave a talk on LLM trustworthiness in medical product question answering.
Launching Buy For Me on the Amazon Shopping App
Published:
Buy For Me, my first agentic AI system at Amazon, is now live for US customers — an AI agent that browses, selects, and purchases products from third-party brand websites on behalf of customers.
PhDone
Published:
Successfully defended my PhD thesis at MIT in the Harvard–MIT Program in Health Sciences and Technology.
Research Featured on MIT News Front Page
Published:
My PhD research on detecting pain levels from brain signals was featured on the front page of MIT News.
Featured in La Razón: Spanish Innovation at the MIT Media Lab
Published:
I was featured in a double-page spread by the national Spanish newspaper La Razón, highlighting leading researchers at the MIT Media Lab.
Spanish TV Interview at MIT with Iñaki Gabilondo
Published:
I was interviewed at MIT by Spanish journalist Iñaki Gabilondo for a Movistar+ documentary episode on MIT scientists and the future of technology.
Tim Cook at the MIT Media Lab
Published:
Tim Cook, Apple’s CEO, visited the Affective Computing group, our research group at the MIT Media Lab, where our team presented our research on AI and emotion recognition.
Collaborating with BCG Digital Ventures and Intermountain Health on Venture Design
Published:
Invited to represent the MIT Media Lab in a multidisciplinary healthcare venture design and strategic forecasting workshop with BCG Digital Ventures.
La Caixa Fellowship for Postgraduate Studies
Published:
Awarded the La Caixa Fellowship for postgraduate studies abroad, presented by King Juan Carlos I of Spain at a ceremony in Madrid.
Carpe Diem Enterprise Trust Bursary — University of Cambridge
Published:
Awarded the Carpe Diem Enterprise Trust educational bursary to support my studies in the Master’s in Bioscience Enterprise programme at the University of Cambridge.
ACGI Medal for Excellence — Imperial College London
Published:
Awarded the ACGI Medal for Excellence by the Faculty of Engineering at Imperial College London upon graduating with First Class Honours in Biomedical Engineering.
Beca Rioja Talento — Fundación Riojana para la Innovación
Published:
Awarded the Beca Rioja Talento 2010-2011 by the Fundación Riojana para la Innovación, Spain.
A Summer of Research as an Amgen Scholar at the University of Cambridge
Published:
Selected for the Amgen Scholars Programme to conduct summer research at the University of Cambridge’s Wolfson Brain Imaging Centre.
Spanish Mathematics and Physics Olympiad Medals
Published:
Awarded a Bronze medal at the Spanish Mathematics Olympiad and a Silver medal at the Spanish Physics Olympiad in 2008.
portfolio
Scalable Trust, Safety, and Quality Infrastructure for LLM-Based Shopping Assistants
Continuous, production-scale measurement of accuracy, policy compliance, and user trust for Amazon’s generative shopping assistant, replacing manual review with automated evaluation and actionable quality signals across millions of model responses per week.
Buy For Me — Agentic Shopping System
A browser-based agentic system that helps customers complete purchases across third-party websites using multimodal perception, reasoning, and tool-enabled actions.
Multimodal Agentic Assistants for Primary Care
A patient-facing, multimodal assistant designed to support text- and voice-based interactions in primary care, integrating language, audio, and structured workflows under safety and regulatory constraints.
AI-Driven Patient Message Routing & Automation (One Medical)
An AI-driven system for routing and automating patient portal messages at One Medical, reducing clinician burden and improving operational efficiency at national scale.
Video Avatars & Multimodal Experiences
Applied research, design and prototyping of real-time multimodal video avatar systems that shaped Amazon’s strategy for agentic, embodied AI experiences.
Foresight — Population Health Predictions for Proactive Care
A population health management initiative under Google Care Studio focused on using machine learning to identify high-risk outpatients and support proactive care through closed-loop prediction and measurement workflows.
Project Nightingale
A large-scale clinical data infrastructure and analytics initiative focused on aggregating, standardizing, and enabling analysis of population-scale EHR data to support clinical workflows and machine learning in healthcare.
Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes
A low-cost, citizen-science platform for measuring and analyzing air pollution particles using do-it-yourself atomic force microscopes and human-in-the-loop image analysis.
Detection of Real-World Driving-Induced Affective State Using Physiological Signals
Multi-view multi-task machine learning for detecting driver stress and affective states from physiological signals during real-world driving.
Objective Pain Detection from Brain Signals using fNIRS
Personalized machine learning on wearable neuroimaging signals to objectively detect pain in non-communicative patients.
MIT Happy Robot: Real-Time Affective Interaction & Social Robotics
Real-time facial emotion recognition integrated with expressive social robotics for engaging, affect-aware human–robot interaction.
Injection Study
We investigated the use of electrodermal activity (EDA), heart rate variability (HRV), and facial expression analysis as potential endpoints for determining quantitative pain scores.
Deep Reinforcement Learning for Safe Opioid Dosing in Critical Care
A clinically interpretable deep reinforcement learning system for personalized morphine dosing in the ICU, balancing pain control and physiological safety.
SNAPSHOT: Modeling Sleep and Mental Health in Social Networks
A large-scale NIH-funded study combining multimodal sensing and machine learning to model sleep, stress, and mental health dynamics in real-world social networks.
AI-Assisted PD-L1 Scoring for Immunotherapy Decision Support
Developed computer vision models to automatically quantify PD-L1 expression in immunohistochemistry slides.
Detection Limits of Automated MRI Morphometry in Rodent Brains (University of Cambridge, 2010)
A methodological study of voxel-based morphometry sensitivity for detecting subtle neuroanatomical changes in rodent MRI.
ML-Enabled Phone-Based ECG Signal Quality Assessment (University of Oxford, 2011)
Automated assessment of ECG signal quality using signal quality indices and classifier fusion for mobile and low-resource healthcare.
MAIC
Co-founded an early-stage startup through the Antler accelerator, incorporated and based in Singapore, focused on applying AI, computer vision, and IoT to workforce and task management in the construction industry.
publications
Instability in clinical risk stratification models using deep learning
Published in Machine Learning for Health (ML4H), 2022
Recommended citation: Lopez-Martinez, Daniel; Yakubovich, Alex; Seneviratne, Martin; Lelkes, Adam; Tyagi, Akshit; Kemp, Jonas; Steinberg, Ethan; Downing, N. Lance; Li, Ron C.; Morse, Keith E.; Shah, Nigam H.; Chen, Ming-Jun (2022). "Instability in clinical risk stratification models using deep learning." Machine Learning for Health (ML4H) 2022.
Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models
Published in CVPR Responsible Generative AI Workshop, 2024
Recommended citation: Lopez-Martinez, Daniel (2024). "Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models." CVPR 2024 Responsible Generative AI Workshop.
Trustworthiness in medical product question answering by large language models
Published in KDD Workshop on Evaluation and Trustworthiness of Generative AI Models, 2024
Recommended citation: Lopez-Martinez, Daniel (2024). "Trustworthiness in medical product question answering by large language models." KDD 2024 Workshop on Evaluation and Trustworthiness of Generative AI Models.
Detecting sensitive medical responses in general purpose large language models
Published in Machine Learning for Health Symposium, 2024
Recommended citation: Lopez-Martinez, Daniel; Bafna, Abhishek (2024). "Detecting sensitive medical responses in general purpose large language models." Machine Learning for Health Symposium 2024.
Systems for automated interaction with user interfaces
Published in United States Patent Office (pending), 2025
Describes systems and methods for automated agents to interact with user interfaces on behalf of a user. The system uses multiple ML-based agents (including LLM-based planners) to perceive the state of a webpage, determine and execute actions (e.g., navigation, variant selection, cart operations, checkout), and verify that each action aligns with user intent, constraints, and security guardrails (such as controlled handling of payment information). The approach enables reliable end-to-end task execution across heterogeneous third-party interfaces and supports agentic purchasing workflows such as Buy For Me.
Recommended citation: Lopez Martinez, Daniel; et al. Systems for Automated Interaction with User Interfaces. U.S. Patent Application (pending), filed 2025. Applicant: Amazon Technologies, Inc.
Systems for generation of prompts for evaluation of language models
Published in United States Patent Office, 2025
Describes systems and methods to generate synthetic prompts for red-teaming large language models (LLMs). A first ML model (e.g., a Q-learning model) learns prompt modifications that increase the probability of eliciting responses that violate constraints, while additional models score prompts and responses and generate rationales to guide subsequent prompt generation and improve evaluation coverage.
Recommended citation: Lopez Martinez, Daniel. Systems for Generation of Prompts for Evaluation of Language Models. U.S. Patent Application Publication No. US 2025/0378274 A1, published Dec. 11, 2025 (filed Jun. 7, 2024). Applicant: Amazon Technologies, Inc.
talks
Detection limits of automated MRI morphometry for phenotyping in rodent brains for applications in neurological disorders
Published:
I presented a poster at the Amgen Scholars European Symposium 2010 at the University of Cambridge, describing research conducted during my summer internship at the Wolfson Brain Imaging Centre under the supervision of Adrian Carpenter and Steve Sawiak. The symposium brought together undergraduate researchers from institutions across Europe to share summer projects and participate in a series of academic talks and poster sessions.
Advanced MRI Techniques for Early Detection of Brain Metastases in Small Cell Lung Cancer
Published:
I delivered an oral presentation at the Cancer Research UK Cambridge Research Institute (CRI) summarizing the results of my summer research internship in the laboratory of Professor John Griffiths at the University of Cambridge. The project focused on evaluating advanced magnetic resonance imaging methods for the early detection of brain metastases in small cell lung cancer (SCLC).
Signal Quality Indices and Data Fusion for Determining Acceptability of Electrocardiograms
Published:
Gari Clifford (University of Oxford) and I gave an oral presentation at the Computing in Cardiology (CinC) Conference 2011 in Hangzhou, China, on our joint work conducted at the Oxford Institute of Biomedical Engineering (IBME). The talk covered our algorithm for assessing the diagnostic acceptability of electrocardiograms (ECGs) collected in noisy or low-resource ambulatory environments.
Modeling Loop Formation in Cortical Circuits Using Spike Timing Dependent Plasticity
Published:
I delivered an oral presentation in the Kreiman Laboratory at Harvard University, summarizing the results of my summer research internship under the supervision of Professor Gabriel Kreiman. The computational neuroscience project focused on understanding how spike-timing-dependent plasticity (STDP) shapes the architecture of recurrent cortical circuits and the conditions under which specific connectivity patterns emerge.
Machine Learning Methods for Analyzing Multisensory Integration with Magnetoencephalography
Published:
I delivered an oral presentation at the Magnetoencephalography Laboratory of the McGovern Institute for Brain Research at MIT, summarizing the results of my research internship under the supervision of Dr Dimitrios Pantazis. The project, part of a National Science Foundation-supported effort, focused on developing machine learning methods to process magnetoencephalography data and on understanding how the human brain binds visual and auditory information into a unified percept.
Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes
Published:
I delivered a demo presentation at the LEGO2NANO Summer School at the Shenzhen Open Innovation Lab (SZOIL), showcasing the atomic force microscope our team developed and, in particular, my work on imaging air pollution particles and creating a crowdsourcing-based air pollution measurement platform built around this technology.
Building Bridges to Develop New Medical Technologies
Published:
I delivered an invited talk at the Building Bridges to Develop New Medical Technologies workshop hosted by the Real Colegio Complutense at Harvard University. The event brought together engineering and medical researchers from Boston and Spain to foster international collaboration and cross-disciplinary innovation in biomedical science and technology.
Wearable Technologies for Multiple Sclerosis: The Future Role of Stress Measurement
Published:
I delivered an oral presentation at the International Conference on Smart Portable, Wearable, Implantable and Disability Oriented Devices and Systems (SPWID 2016) in Valencia, Spain. The talk presented our work on wearable technologies for managing stress in individuals with multiple sclerosis (MS), based on research conducted at the MIT Media Lab.
Patient-Centered Symptom and Vital Sign Tracking for Lyme Disease Care
Published:
I presented our team’s work at Lyme Innovation, the first-ever Lyme-disease–focused hackathon, held at the Microsoft NERD Center in Cambridge and organized by Spaulding Rehabilitation’s Dean Center for Tick-Borne Illness, the Veterans Affairs Center for Innovation, MIT Hacking Medicine, UC Berkeley, and Harvard Medical School. The three-day event brought together clinicians, scientists, engineers, entrepreneurs, and patients to develop new solutions for Lyme disease. Our project was selected as one of the finalists and received a $5,000 award.
LymeDot: Using Open Data and Mobile AI for Symptom Tracking in Lyme Disease
Published:
I presented at the White House Open Data Innovation Summit in Washington, D.C., where our team was invited to showcase a project developed during a Boston-based health hackathon. The project, LymeDot, explored how mobile technology and open data could help patients with Lyme disease track symptoms over time, support clinical decision-making, and empower individuals managing complex, chronic conditions.
Drug Development in the Connected World: A New Paradigm for Drug Discovery
Published:
I delivered a talk at the MIT Media Lab during an industry symposium on how connectivity, wearable sensing, and data-driven approaches are transforming the drug development pipeline — from discovery through clinical trials and post-market surveillance.
Automatic Detection of Nociceptive Stimuli and Pain Intensity from Facial Expressions
Published:
I presented a poster at the 2017 Annual Meeting of the American Pain Society in Pittsburgh, describing collaborative work between the MIT Affective Computing group and MedImmune on automatic pain detection using computer vision and machine learning.
Personalized Automatic Estimation of Self Reported Pain Intensity from Facial Expressions
Published:
I delivered an oral presentation at the Computer Vision and Pattern Recognition (CVPR 2017) Workshop on Deep Affective Learning and Context Modeling, where I presented our work on personalized estimation of self-reported pain intensity from facial expressions. The project introduced a two-stage machine learning framework that combines recurrent neural networks with a personalized Hidden Conditional Random Field model to estimate Visual Analog Scale (VAS) pain scores from facial landmarks.
ZenAuto: Emotionally Intelligent Transport
Published:
I presented our startup concept, ZenAuto, at the Lee Kuan Yew Global Business Plan Competition (LKYGBPC) in Singapore, one of Asia’s leading deep-tech entrepreneurship challenges. The competition brings together next-generation founders from around the world to showcase innovations with the potential to reshape cities, industries, and society. Our work was selected for presentation on the competition stage alongside teams from top global universities.
Physiological and Behavioral Profiling for Nociceptive Pain Estimation Using Personalized Multitask Learning
Published:
I presented a poster at the NeurIPS Machine Learning for Health (ML4H) Workshop 2017, describing our work on personalized pain estimation from multimodal data. The project introduced a method for building physiological and behavioral profiles based on individual responses to heat pain, and for using these profiles within a personalized multi-task neural network architecture.
Skin Conductance Deconvolution for Pain Estimation
Published:
I presented a poster at the International Conference on Biomedical and Health Informatics (BHI 2018) in Las Vegas, describing our work on estimating pain intensity from skin conductance signals. The project, conducted at the MIT Media Lab, focused on leveraging noninvasive physiological sensing to quantify nociceptive responses when self-report is not feasible.
Continuous Pain Intensity Estimation from Autonomic Signals with Recurrent Neural Networks
Published:
I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC), describing our work on continuously estimating experimental heat pain intensity from autonomic physiological signals. The project sought to develop an objective pain monitoring method that provides high temporal resolution estimates using data that can be collected noninvasively from wearable sensors.
Multi-Task Multiple Kernel Machines for Personalized Pain Recognition from fNIRS
Published:
I delivered an oral presentation at the International Conference on Pattern Recognition (ICPR 2018) in Beijing, China, presenting our work on personalized pain detection using functional near-infrared spectroscopy (fNIRS) brain signals. The paper received the Best Student Paper Award.
Machine Learning for Pain Medicine: Physiological and Behavioral Profiling for Nociceptive Pain Estimation
Published:
I presented a poster at the Harvard–MIT Health Sciences and Technology (HST) Forum at Harvard Medical School, describing my research on personalized machine learning approaches for estimating nociceptive pain. The work, conducted at the MIT Media Lab, explored how individual differences in physiological and behavioral responses to pain can be leveraged to improve continuous pain intensity estimation.
Deep Reinforcement Learning for Optimal Critical Care Pain Management
Published:
I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC 2019) in Berlin, summarizing our work on using deep reinforcement learning to support optimal pain management in the intensive care unit (ICU). The project introduced a sequential decision-making framework that learns clinically interpretable morphine dosing strategies personalized to each patient’s evolving physiological and pain state, based on retrospective ICU data from the MIMIC-III database.
Detecting Real World Driving Induced Affective State Using Physiological Signals
Published:
I delivered an oral presentation at the International Conference on Affective Computing and Intelligent Interaction (ACII 2019) in Cambridge, UK, during the International Workshop on Social and Emotion AI for Industry (SEAIxI). The presentation summarized our work on detecting real-world, driving-induced affective states using physiological signals, based on our paper presented at the conference.
Machine Learning for Predicting Renal Replacement Therapy Onset in Chronic Kidney Disease
Published:
I presented our work at the Applications of Medical AI (AMAI) Workshop at MICCAI 2022, where our paper received the Best Paper Award. This work introduces a dynamic prediction model capable of identifying chronic kidney disease patients at high risk of requiring renal replacement therapy up to one year in advance.
Panel Discussion: Careers in Academia and Industry
Published:
I was invited to participate as a panelist in the Careers in Academia and Industry session at MICCAI 2022. This flagship event brought together researchers, innovators, and industry leaders to discuss professional pathways, career development, and the evolving relationship between academic research and real-world applications.
Instability in Clinical Risk Stratification Models Using Deep Learning
Published:
I presented a poster at the Machine Learning for Health (ML4H) Symposium 2022 in New Orleans, based on research conducted at Google Health. The work investigates how randomness in training deep learning models, despite identical data, architecture, and hyperparameters, can lead to meaningfully different patient-level predictions in clinical risk stratification tasks.
Trustworthiness in Medical Product Question Answering by Large Language Models
Published:
I presented a poster at the KDD 2024 Workshop on GenAI Evaluation in Barcelona, corresponding to the paper “Trustworthiness in medical product question answering by large language models”. The work introduces a claim-level evaluation framework to assess whether large language models provide medically accurate and label-consistent answers when responding to questions about prescription drugs and medical products.
Detecting sensitive medical responses in general purpose large language models
Published:
I presented a poster at the Machine Learning for Health Symposium (ML4H) 2024 in Vancouver, corresponding to the paper “Detecting sensitive medical responses in general purpose large language models”. The work investigates how to identify sensitive or potentially harmful medical responses produced by general-purpose large language models.
AI-Enabled Virtual Care with Digital Avatar Assistants
Published:
I delivered a talk at Amazon’s Image and Video Generation Workshop 2025 in Seattle, presenting our work at Amazon Health on building AI-enabled virtual care experiences using LLM-powered digital avatar assistants.
Pioneering Agentic Systems: From Shopping to Health
Published:
I delivered an invited talk for the Amazon North America Stores GenAI Learning Series, presenting a deep dive into the design, architecture, evaluation, and deployment of large-scale agentic systems across Amazon. The talk bridged my work across Shopping Conversations Foundations and Amazon Health AI / One Medical, highlighting the development of agentic LLM systems from consumer shopping experiences (specifically Buy For Me) to clinical and healthcare workflows.
Trustworthiness in Medical Product Question Answering by Large Language Models
Published:
I gave an invited talk at the Machine Learning for Healthcare Roundtable during the Amazon Machine Learning Conference 2025, presenting our work on evaluating the trustworthiness of large language models (LLMs) in medical product question answering.
The Future of Healthcare AI
Published:
Perspective on why healthcare AI is at an inflection point, spanning regulation, market dynamics, and research directions for evaluation and agentic systems.
Memory and Personalization in Health AI
Published:
Overview of practical approaches to memory and personalization for health AI systems, including long-term context, retrieval, and longitudinal user experiences.
teaching
Personalized Machine Learning
Graduate course, MIT Media Lab, Massachusetts Institute of Technology, 2017
This graduate course explores how machine learning models can be adapted to individuals rather than populations, with a particular focus on health and human data. The class covers modern techniques for personalization, including active learning, domain adaptation and deep models, and guides students in developing their own personalized ML applications.
