Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
From Stochastic Parrots to Software as a Doctor
Published:
LLMs have been characterized as stochastic parrots: probabilistic systems that merely predict the next word and remix text without understanding. But the frontier is shifting. Today, the question is no longer whether LLMs can imitate clinical expertise, but how we transform them into regulated medical devices that can interview patients, form preliminary diagnoses, triage safely, and even prescribe.
AWS re:Invent 2025
Published:
Amazon Machine Learning Conference (AMLC 2025)
Published:
I’ve spent the last few days in Seattle at Amazon’s internal Machine Learning Conference (AMLC). If last year was defined by the frontier of GenAI capabilities, this year the focus shifted decisively toward agents, reliability, and real-world deployment. The conversation has moved from “Can we do X?” to “How do we evaluate, govern, and safely operationalize X at scale?” It felt like a distinctly Amazonian event: pragmatic, execution-oriented, and full of hallway discussions about shipping real systems and delivering customer impact.
portfolio
Scalable Trust, Safety, and Quality Infrastructure for LLM-Based Shopping Assistants
Continuous, production-scale measurement of accuracy, policy compliance, and user trust for Amazon’s generative shopping assistant, replacing manual review with automated evaluation and actionable quality signals across millions of model responses per week.
Buy For Me — Agentic Shopping System
A browser-based agentic system that helps customers complete purchases across third-party websites using multimodal perception, reasoning, and tool-enabled actions.
Multimodal Agentic Assistants for Primary Care
A patient-facing, multimodal assistant designed to support text- and voice-based interactions in primary care, integrating language, audio, and structured workflows under safety and regulatory constraints.
Video Avatars & Multimodal Experiences
Applied research and prototyping of video-based, multimodal AI experiences combining speech, vision, and structured interaction flows to enable natural, task-oriented user interactions.
Foresight — Population Health Predictions for Proactive Care
A population health management initiative under Google Care Studio focused on using machine learning to identify high-risk outpatients and support proactive care through closed-loop prediction and measurement workflows.
Project Nightingale
A large-scale clinical data infrastructure and analytics initiative focused on aggregating, standardizing, and enabling analysis of population-scale EHR data to support clinical workflows and machine learning in healthcare.
Injection Study
We investigated the use of electrodermal activity (EDA), heart rate variability (HRV), and facial expression analysis as potential endpoints for determining quantitative pain scores.
Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes
A low-cost, citizen-science platform for measuring and analyzing air pollution particles using do-it-yourself atomic force microscopes and human-in-the-loop image analysis.
MIT Happy Robot
Emotion recognition and personalized interactions.
AI-Assisted PD-L1 Scoring for Immunotherapy Decision Support
Developed computer vision models to automatically quantify PD-L1 expression in immunohistochemistry slides.
SNAPSHOT: Modeling Sleep and Mental Health in Social Networks
A large-scale NIH-funded study combining multimodal sensing and machine learning to model sleep, stress, and mental health dynamics in real-world social networks.
MAIC
Co-founded an early-stage startup through the Antler accelerator, incorporated and based in Singapore, focused on applying AI, computer vision, and IoT to workforce and task management in the construction industry.
publications
Instability in clinical risk stratification models using deep learning
Published in Machine Learning for Health (ML4H), 2022
Recommended citation: Lopez-Martinez, Daniel; Yakubovich, Alex; Seneviratne, Martin; Lelkes, Adam; Tyagi, Akshit; Kemp, Jonas; Steinberg, Ethan; Downing, N. Lance; Li, Ron C.; Morse, Keith E.; Shah, Nigam H.; Chen, Ming-Jun (2022). "Instability in clinical risk stratification models using deep learning." Machine Learning for Health (ML4H) 2022.
Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models
Published in CVPR Responsible Generative AI Workshop, 2024
Recommended citation: Lopez-Martinez, Daniel (2024). "Guardrails for avoiding harmful medical product recommendations and off-label promotion in generative AI models." CVPR 2024 Responsible Generative AI Workshop.
Trustworthiness in medical product question answering by large language models
Published in KDD Workshop on Evaluation and Trustworthiness of Generative AI Models, 2024
Recommended citation: Lopez-Martinez, Daniel (2024). "Trustworthiness in medical product question answering by large language models." KDD 2024 Workshop on Evaluation and Trustworthiness of Generative AI Models.
Detecting sensitive medical responses in general purpose large language models
Published in Machine Learning for Health Symposium, 2024
Recommended citation: Lopez-Martinez, Daniel; Bafna, Abhishek (2024). "Detecting sensitive medical responses in general purpose large language models." Machine Learning for Health Symposium 2024.
Systems for automated interaction with user interfaces
Published in United States Patent Office (pending), 2025
Describes systems and methods for automated agents to interact with user interfaces on behalf of a user. The system uses multiple ML-based agents (including LLM-based planners) to perceive the state of a webpage, determine and execute actions (e.g., navigation, variant selection, cart operations, checkout), and verify that each action aligns with user intent, constraints, and security guardrails (such as controlled handling of payment information). The approach enables reliable end-to-end task execution across heterogeneous third-party interfaces and supports agentic purchasing workflows such as Buy For Me.
Recommended citation: Lopez Martinez, Daniel; et al. Systems for Automated Interaction with User Interfaces. U.S. Patent Application (pending), filed 2025. Applicant: Amazon Technologies, Inc.
Systems for generation of prompts for evaluation of language models
Published in United States Patent Office, 2025
Describes systems and methods to generate synthetic prompts for red-teaming large language models (LLMs). A first ML model (e.g., a Q-learning model) learns prompt modifications that increase the probability of eliciting responses that violate constraints, while additional models score prompts and responses and generate rationales to guide subsequent prompt generation and improve evaluation coverage.
Recommended citation: Lopez Martinez, Daniel. Systems for Generation of Prompts for Evaluation of Language Models. U.S. Patent Application Publication No. US 2025/0378274 A1, published Dec. 11, 2025 (filed Jun. 7, 2024). Applicant: Amazon Technologies, Inc.
talks
Detection limits of automated MRI morphometry for phenotyping in the rodent brains for applications in neurological disorders
Published:
I presented a poster at the Amgen Scholars European Symposium 2010 at the University of Cambridge, describing research conducted during my summer internship at the Wolfson Brain Imaging Centre under the supervision of Adrian Carpenter and Steve Sawiak. The symposium brought together undergraduate researchers from institutions across Europe to share summer projects and participate in a series of academic talks and poster sessions.
Advanced MRI Techniques for Early Detection of Brain Metastases in Small Cell Lung Cancer
Published:
I delivered an oral presentation at the Cancer Research UK Cambridge Research Institute (CRI) summarizing the results of my summer research internship in the laboratory of Professor John Griffiths at the University of Cambridge. The project focused on evaluating advanced magnetic resonance imaging methods for the early detection of brain metastases in small cell lung cancer (SCLC).
Signal Quality Indices and Data Fusion for Determining Acceptability of Electrocardiograms
Published:
Gari Clifford (University of Oxford) and I gave an oral presentation at the Computing in Cardiology (CinC) Conference 2011 in Hangzhou, China, presenting our joint work conducted at the Oxford Institute of Biomedical Engineering (IBME). The talk covered our algorithm for assessing the diagnostic acceptability of electrocardiograms (ECGs) collected in noisy or low-resource ambulatory environments.
Modeling Loop Formation in Cortical Circuits Using Spike Timing Dependent Plasticity
Published:
I delivered an oral presentation in the Kreiman Laboratory at Harvard University, summarizing the results of my summer research internship under the supervision of Professor Gabriel Kreiman. The computational neuroscience project focused on understanding how spike-timing-dependent plasticity (STDP) shapes the architecture of recurrent cortical circuits and the conditions under which specific connectivity patterns emerge.
Machine Learning Methods for Analyzing Multisensory Integration with Magnetoencephalography
Published:
I delivered an oral presentation at the Magnetoencephalography Laboratory of the McGovern Institute for Brain Research at MIT, summarizing the results of my research internship under the supervision of Dr Dimitrios Pantazis. The project, part of a National Science Foundation-supported effort, focused on developing machine learning methods to process magnetoencephalography data and on understanding how the human brain binds visual and auditory information into a unified percept.
Crowdsourced Air Pollution Measurement Using DIY Atomic Force Microscopes
Published:
I delivered a demo presentation at the LEGO2NANO Summer School at the Shenzhen Open Innovation Lab (SZOIL), showcasing the atomic force microscope our team developed and, in particular, my work on imaging air pollution particles and creating a crowdsourcing-based air pollution measurement platform built around this technology.
Building Bridges to Develop New Medical Technologies
Published:
I delivered an invited talk at the Building Bridges to Develop New Medical Technologies workshop hosted by the Real Colegio Complutense at Harvard University. The event brought together engineering and medical researchers from Boston and Spain to foster international collaboration and cross-disciplinary innovation in biomedical science and technology.
Wearable Technologies for Multiple Sclerosis: The Future Role of Stress Measurement
Published:
I delivered an oral presentation at the International Conference on Smart Portable, Wearable, Implantable and Disability Oriented Devices and Systems (SPWID 2016) in Valencia, Spain. The talk presented our work on wearable technologies for managing stress in individuals with multiple sclerosis (MS), based on research conducted at the MIT Media Lab.
Patient-Centered Symptom and Vital Sign Tracking for Lyme Disease Care
Published:
I presented our team’s work at Lyme Innovation, the first-ever Lyme-disease–focused hackathon, held at the Microsoft NERD Center in Cambridge and organized by Spaulding Rehabilitation’s Dean Center for Tick-Borne Illness, the Veterans Affairs Center for Innovation, MIT Hacking Medicine, UC Berkeley, and Harvard Medical School. The three-day event brought together clinicians, scientists, engineers, entrepreneurs, and patients to develop new solutions for Lyme disease. Our project was selected as one of the finalists and received a $5,000 award.
LymeDot: Using Open Data and Mobile AI for Symptom Tracking in Lyme Disease
Published:
I presented at the White House Open Data Innovation Summit in Washington, D.C., where our team was invited to showcase a project developed during a Boston-based health hackathon. Our project, LymeDot, explored how mobile technology and open data could help patients with Lyme disease track symptoms over time, support clinical decision-making, and empower individuals managing complex, chronic conditions.
Automatic Detection of Nociceptive Stimuli and Pain Intensity from Facial Expressions
Published:
I presented a poster at the 2017 Annual Meeting of the American Pain Society in Pittsburgh, describing collaborative work between the MIT Affective Computing group and MedImmune on automatic pain detection using computer vision and machine learning.
Personalized Automatic Estimation of Self Reported Pain Intensity from Facial Expressions
Published:
I delivered an oral presentation at the Computer Vision and Pattern Recognition (CVPR 2017) Workshop on Deep Affective Learning and Context Modeling, where I presented our work on personalized estimation of self-reported pain intensity from facial expressions. The project introduced a two-stage machine learning framework that combines recurrent neural networks with a personalized Hidden Conditional Random Field model to estimate Visual Analog Scale (VAS) pain scores from facial landmarks.
ZenAuto: Emotionally Intelligent Transport
Published:
I presented our startup concept, ZenAuto, at the Lee Kuan Yew Global Business Plan Competition (LKYGBPC) in Singapore, one of Asia’s leading deep-tech entrepreneurship challenges. The competition brings together next-generation founders from around the world to showcase innovations with the potential to reshape cities, industries, and society. Our work was selected for presentation on the competition stage alongside teams from top global universities.
Physiological and Behavioral Profiling for Nociceptive Pain Estimation Using Personalized Multitask Learning
Published:
I presented a poster at the NeurIPS Machine Learning for Health (ML4H) Workshop 2017, describing our work on personalized pain estimation from multimodal data. The project introduced a method for building physiological and behavioral profiles based on individual responses to heat pain, and for using these profiles within a personalized multi-task neural network architecture.
Skin Conductance Deconvolution for Pain Estimation
Published:
I presented a poster at the International Conference on Biomedical and Health Informatics (BHI 2018) in Las Vegas, describing our work on estimating pain intensity from skin conductance signals. The project, conducted at the MIT Media Lab, focused on leveraging noninvasive physiological sensing to quantify nociceptive responses when self-report is not feasible.
Continuous Pain Intensity Estimation from Autonomic Signals with Recurrent Neural Networks
Published:
I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC), describing our work on continuously estimating experimental heat pain intensity from autonomic physiological signals. The project sought to develop an objective pain monitoring method that provides high temporal resolution estimates using data that can be collected noninvasively from wearable sensors.
Multi-Task Multiple Kernel Machines for Personalized Pain Recognition from fNIRS
Published:
I delivered an oral presentation at the International Conference on Pattern Recognition (ICPR 2018) in Beijing, China, presenting our work on personalized pain detection using functional near-infrared spectroscopy (fNIRS) brain signals. The paper received the Best Student Paper Award.
Machine Learning for Pain Medicine: Physiological and Behavioral Profiling for Nociceptive Pain Estimation
Published:
I presented a poster at the Harvard–MIT Health Sciences and Technology (HST) Forum at Harvard Medical School, describing my research on personalized machine learning approaches for estimating nociceptive pain. The work, conducted at the MIT Media Lab, explored how individual differences in physiological and behavioral responses to pain can be leveraged to improve continuous pain intensity estimation.
Deep Reinforcement Learning for Optimal Critical Care Pain Management
Published:
I delivered an oral presentation at the Engineering in Medicine and Biology Conference (EMBC 2019) in Berlin, summarizing our work on using deep reinforcement learning to support optimal pain management in the intensive care unit (ICU). The project introduced a sequential decision-making framework that learns clinically interpretable morphine dosing strategies personalized to each patient’s evolving physiological and pain state, based on retrospective ICU data from the MIMIC-III database.
Detecting Real World Driving Induced Affective State Using Physiological Signals
Published:
I delivered an oral presentation at the International Conference on Affective Computing and Intelligent Interaction (ACII 2019) in Cambridge, UK, during the International Workshop on Social and Emotion AI for Industry (SEAIxI). The presentation summarized our work on detecting real-world, driving-induced affective states using physiological signals, based on our paper presented at the conference.
Machine Learning for Predicting Renal Replacement Therapy Onset in Chronic Kidney Disease
Published:
I presented our work at the Applications of Medical AI (AMAI) Workshop at MICCAI 2022, where our paper received the Best Paper Award. This work introduces a dynamic prediction model capable of identifying chronic kidney disease patients at high risk of requiring renal replacement therapy up to one year in advance.
Panel Discussion: Careers in Academia and Industry
Published:
I was invited to participate as a panelist in the Careers in Academia and Industry session at MICCAI 2022. This flagship event brought together researchers, innovators, and industry leaders to discuss professional pathways, career development, and the evolving relationship between academic research and real-world applications.
Instability in Clinical Risk Stratification Models Using Deep Learning
Published:
I presented a poster at the Machine Learning for Health (ML4H) Symposium 2022 in New Orleans, based on research conducted at Google Health. The work investigates how randomness in training deep learning models, despite identical data, architecture, and hyperparameters, can lead to meaningfully different patient-level predictions in clinical risk stratification tasks.
Trustworthiness in Medical Product Question Answering by Large Language Models
Published:
I presented a poster at the KDD 2024 Workshop on GenAI Evaluation in Barcelona, corresponding to the paper “Trustworthiness in medical product question answering by large language models”. The work introduces a claim-level evaluation framework to assess whether large language models provide medically accurate and label-consistent answers when responding to questions about prescription drugs and medical products.
Detecting sensitive medical responses in general purpose large language models
Published:
I presented a poster at the Machine Learning for Health Symposium (ML4H) 2024 in Vancouver, corresponding to the paper “Detecting sensitive medical responses in general purpose large language models”. The work investigates how to identify sensitive or potentially harmful medical responses produced by general-purpose large language models.
AI-Enabled Virtual Care with Digital Avatar Assistants
Published:
I delivered a talk at Amazon’s Image and Video Generation Workshop 2025, presenting our work at Amazon Health on building AI-enabled virtual care experiences using digital avatar assistants.
Pioneering Agentic Systems: From Shopping to Health
Published:
I delivered an invited talk for the Amazon North America Stores GenAI Learning Series, presenting a deep dive into the design, architecture, evaluation, and deployment of large-scale agentic systems across Amazon. The talk bridged my work across Shopping Conversations Foundations and Amazon Health AI / One Medical, highlighting the development of agentic LLM systems from consumer shopping experiences (specifically BuyForMe) to clinical and healthcare workflows.
Trustworthiness in Medical Product Question Answering by Large Language Models
Published:
I gave an invited talk at the Machine Learning for Healthcare Roundtable during the Amazon Machine Learning Conference 2025, presenting our work on evaluating the trustworthiness of large language models (LLMs) in medical product question answering.
teaching
Personalized Machine Learning
Graduate course, MIT Media Lab, Massachusetts Institute of Technology, 2017
This graduate course explores how machine learning models can be adapted to individuals rather than populations, with a particular focus on health and human data. The class covers modern techniques for personalization, including active learning, domain adaptation and deep models, and guides students in developing their own personalized ML applications.
