Biosignals like ECG, EEG, and PPG capture the body's physiological and behavioural "languages", but analyzing them at scale requires dedicated approaches. Foundation models, which have transformed NLP and computer vision, are now emerging for biosignal analysis, promising to unlock patterns across vast amounts of sensor readings.
This curated list covers foundation models (FMs), datasets, and tools for biosignals based on our comprehensive survey of the field.
Survey Paper: Foundation Models for Biosignals: A Survey - A comprehensive review of foundation model development and applications for biosignals.
*timeline marked by the first online date of each work
The field of biosignal foundation models can be organized into three converging directions:
- Pretraining Biosignal FMs from Scratch: Pretraining dedicated foundation models on large biosignal corpora
- Adapting Time Series FMs to Biosignals: Repurposing general time series foundation models for biosignal tasks
- Leveraging LLMs for Biosignals: Leveraging (multi-modal) large language models for biosignal analysis
- Landscape
- Surveys & Reviews
- Datasets & Benchmarks & Tools
- Pretraining Biosignal Foundation Models from Scratch
- Adapting Time Series Foundation Models to Biosignals
- Leveraging Large Language Models for Biosignal Analysis
- Open Challenges
- Contributing
- [IEEE TPAMI, 2025] Foundation models defining a new era in vision: a survey and outlook [paper]
- [ACM SIGKDD, 2024] Foundation models for time series analysis: a tutorial and survey [paper]
- [Arxiv, 2024] A survey of time series foundation models: generalizing time series representation with large language model [paper]
- [Arxiv, 2024] Biomedical foundation model: a survey [paper]
- [Arxiv, 2024] Deep time series models: a comprehensive survey and benchmark [paper]
- [Arxiv, 2024] Foundation models for video understanding: a survey [paper]
- [IJCAI, 2024] Large language models for time series: a survey [paper]
- ECG: Electrocardiography (heart electrical activity)
- EEG: Electroencephalography (brain electrical activity)
- PPG: Photoplethysmography (blood volume changes)
- IMU: Inertial Measurement Unit (motion sensors)
- EMG: Electromyography (muscle electrical activity)
- ABP: Arterial Blood Pressure
- PCG: Phonocardiography (heart sounds)
- Resp: Respiration/Respiratory signals (breathing patterns)
| Dataset | Modality | # Individuals | Duration (hr) | Link |
|---|---|---|---|---|
| UKBiobank | ECG, PPG, IMU, Health Metrics | - | - | [dataset] |
| MC-MED | ECG, PPG, Resp, Vital Signs | 70K | - | [dataset] |
| MIMIC-III-WDB | ECG, PPG, ABP, Resp, Vital Signs | 30K | 3M | [dataset] |
| VitalDB | ECG, PPG, ABP, Resp, Vital Signs | 6K | - | [dataset] |
| PulseDB | ECG, PPG, ABP | 5K | 50M | [dataset] |
| MESA | ECG, PPG, EEG | 2K | - | [dataset] |
| VTaC | ECG, PPG, ABP | 2K | - | [dataset] |
| CODE | ECG | 2M | - | [dataset] |
| MIMIC-IV-ECG | ECG | 160K | 2K | [dataset] |
| PhysioNet2020 | ECG | 40K | 90 | [dataset] |
| eICU | Vital Signs | 139K | - | [dataset] |
| HiRID | Vital Signs | 34K | - | [dataset] |
| UCSF-PPG | PPG | 21K | 600K | [dataset] |
| TUEG | EEG | 15K | 27K | [dataset] |
| HBN-EEG | EEG | 3K | 3K | [dataset] |
| MOABB | EEG | >1K | - | [dataset] |
| SEED Series | EEG, Gaze Metrics | - | - | [dataset] |
| emg2pose | EMG | 193 | 370 | [dataset] |
| emg2qwerty | EMG | 108 | 346 | [dataset] |
| HUNT4 | IMU | 35K | - | [dataset] |
| Capture24 | IMU | 151 | 4K | [dataset] |
| Ego4D | IMU, Audio, Gaze Metrics | 923 | 4K | [dataset] |
| COVID-19 Sounds | Audio | 36K | 552 | [dataset] |
- PhysioNet Challenges: Annual international competitions providing standardized evaluation frameworks for various biosignal analysis tasks including ECG interpretation, sleep staging, and arrhythmia detection [benchmark]
- MOABB (Mother of All BCI Benchmarks): Comprehensive benchmarking framework for EEG-based brain-computer interface algorithms with standardized pipelines and evaluation protocols [benchmark]
| Toolbox/Package | Modality | Link |
|---|---|---|
| MNE-Python | EEG | [tool] |
| NeuroKit | ECG, PPG, Resp, EMG | [tool] |
| HeartPy | ECG, PPG | [tool] |
| pyHRV | ECG, PPG | [tool] |
| BIOBSS | ECG, PPG, IMU | [tool] |
| BioSPPy | ECG, PPG, EEG, Resp, EMG, PCG | [tool] |
| PyPhysio | ECG, PPG, IMU | [tool] |
| EEGLAB | EEG | [tool] |
| Vital-Sqi | ECG, PPG | [tool] |
| PhysioKit | PPG, Resp | [tool] |
| PPGFeat | PPG | [tool] |
| WFDB Toolbox | ECG, PPG, EEG, Resp, ABP, EMG | [tool] |
| PyPPG | PPG | [tool] |
Building dedicated foundation models using large biosignal corpora
- [Nature Communications, 2026] A unified time-frequency foundation model for sleep decoding
[paper][code]
- [Nature Communications, 2026] A foundation model for continuous glucose monitoring data
[paper][code]
- [Nature, 2026] Insulin resistance prediction from wearables and routine blood biomarkers
[paper][code]
- [Nature Machine Intelligence, 2026] Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals
[paper][code]
- [Nature Medicine, 2026] A multimodal sleep foundation model for disease prediction
[paper] [code]
- [Arxiv, 2025] CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models
[paper] [code]
- [NEJM AI, 2025] An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains
[paper] [code]
- [JAMIA Open, 2025] ECG-FM: An Open Electrocardiogram Foundation Model
[paper] [code]
- [Arxiv, 2025] BenchECG and xECG: a benchmark and baseline for ECG foundation models
[paper] [code]
- [Arxiv, 2025] ProtoECGNet: Case-Based Interpretable Deep Learning for Multi-Label ECG Classification with Contrastive Learning
[paper] [code]
- [Arxiv, 2025] OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records
[paper]
- [NeurIPS, 2025] SensorLM: Learning the Language of Wearable Sensors
[paper]
- [NeurIPS, 2025] PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation
[paper] [code]
- [NeurIPS, 2025] BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals
[paper] [code]
- [NeurIPS, 2025] NeurIPT: Foundation Model for Neural Interfaces
[paper] [code]
- [NeurIPS, 2025] LUNA: Efficient and Topology-Agnostic Foundation Model for EEG Signal Analysis
[paper] [code]
- [ICML, 2025] EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping
[paper] [code]
- [ICML, 2025] From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining
[paper] [code]
- [ICML, 2025] Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners
[paper] [code]
- [ICML, 2025] Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions
[paper]
- [Nature, 2025] A generic noninvasive neuromotor interface for human-computer interaction
[paper] [code]
- [UbiComp, 2025] Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings
[paper] [code]
- [ICLR, 2025] CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding
[paper] [code]
- [ICLR, 2025] Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model*
[paper] [code]
- [ICLR, 2025] NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
[paper] [code]
- [ICLR, 2025] PaPaGei: Open Foundation Models for Optical Physiological Signals
[paper] [code]
- [ICLR, 2025] RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data
[paper] [code]
- [SPI Health Data Science, 2025] ECG-LM: Understanding Electrocardiogram with a Large Language Model
[paper]
- [NPJ Digital Medicine, 2024] Self-supervised learning for human activity recognition using 700,000 person-days of wearable data
[paper] [code]
- [NPJ Cardiovascular Health, 2024] Foundation models for cardiovascular disease detection via biosignals from digital stethoscopes
[paper]
- [Cell Reports Medicine, 2024] Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG
[paper]
- [NeurIPS, 2024] Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
[paper] [code]
- [ICLR, 2024] Large-scale Training of Foundation Models for Wearable Biosignals
[paper]
- [ICLR, 2024] Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI
[paper] [code]
- [AAAI Spring Symposium, 2024] EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model
[paper]
- [IEEE ISBI, 2024] Neuro-GPT: Towards A Foundation Model for EEG
[paper] [code]
- [Applied Intelligence, 2024] SelfPAB: large-scale pre-training on accelerometer data for human activity recognition*
[paper] [code]
- [Physiological Measurement, 2024] Siamquality: a convnet-based foundation model for photoplethysmography signals
[paper] [code]
- [Arxiv, 2024] AnyECG: Foundational Models for Multitask Cardiac Analysis in Real-World Settings
[paper]
- [Arxiv, 2024] Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance
[paper] [code]
- [Arxiv, 2024] BrainWave: A Brain Signal Foundation Model for Clinical Applications
[paper] [code]
- [Arxiv, 2024] Foundation Models for ECG: Leveraging Hybrid Self-Supervised Learning for Advanced Cardiac Diagnostics
[paper]
- [Arxiv, 2024] Scaling Wearable Foundation Models
[paper]
- [Arxiv, 2024] Wearable Accelerometer Foundation Models for Health via Knowledge Distillation
[paper]
- [Arxiv, 2024] Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals
[paper] [code]
- [NPJ Digital Medicine, 2023] A foundational vision transformer improves diagnostic performance for electrocardiograms
[paper] [code]
- [NeurIPS, 2023] Brant: Foundation Model for Intracranial Neural Signal
[paper] [code]
- [NeurIPS, 2023] Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling*
[paper] [code]
*Self-supervised approaches not explicitly claimed as foundation models
Repurposing general time series foundation models for biomedical-domain-specific tasks
Mainstream Time Series Foundation Models
| Model | Venue | Year | Dataset Scale | Model Size | Tasks | Paper | Code |
|---|---|---|---|---|---|---|---|
| Time-MoE | ICLR | 2025 | 309B | 2.4B | F | [paper] | [code] |
| Timer-XL | ICLR | 2025 | - | - | F | [paper] | [code] |
| ChatTime | AAAI | 2025 | 1M | 350M | F | [paper] | [code] |
| TimePFN | AAAI | 2025 | 1.5M | - | F | [paper] | [code] |
| TTM | NeurIPS | 2024 | 1B | 5M | F | [paper] | [code] |
| Time-FFM | NeurIPS | 2024 | - | - | F | [paper] | [code] |
| UniTS | NeurIPS | 2024 | - | 8M | F D I | [paper] | [code] |
| Moirai-MOE | NeurIPS Workshop | 2024 | - | 11M-86M | F | [paper] | [code] |
| Moirai | ICML | 2024 | 27B | 14 / 91 / 311M | F | [paper] | [code] |
| MOMENT | ICML | 2024 | 1B | 385M | F C D I | [paper] | [code] |
| TimesFM | ICML | 2024 | 100B | 200M | F | [paper] | [code] |
| Timer | ICML | 2024 | 28B | 67M | F D I | [paper] | [code] |
| DAM | ICLR | 2024 | - | - | F I | [paper] | - |
| Chronos | TMLR | 2024 | 84B | 20 / 46 / 200 / 710M | F | [paper] | [code] |
| GTT | ACM CIKM | 2024 | 2B | 7 / 19 / 57M | F | [paper] | [code] |
| Mamba4Cast | - | 2024 | - | 27M | F | [paper] | [code] |
| TimeRAF | - | 2024 | 320M | - | F | [paper] | - |
| TSMamba | - | 2024 | - | - | F | [paper] | - |
| TimeDiT | - | 2024 | 5B | 33 / 120 / 460 / 680M | F D | [paper] | - |
| ViTime | - | 2024 | - | 74 / 95M | F | [paper] | [code] |
| Lag-Llama | - | 2023 | 360M | 200M | F | [paper] | [code] |
| TimeGPT-1 | - | 2023 | 100B | - | F D | [paper] | [code] |
Task Legend: F = Forecasting, C = Classification, D = Anomaly Detection, I = Data Imputation
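For readers who want to filter the model table programmatically, the legend can be expressed as a small lookup. This is only an illustrative sketch: the names `TASK_LEGEND` and `expand_tasks` are hypothetical helpers and not part of any listed model's API, and the space-separated input assumes the Tasks column is written as it appears in the table.

```python
# Mapping copied from the Task Legend; the helper names are hypothetical.
TASK_LEGEND = {
    "F": "Forecasting",
    "C": "Classification",
    "D": "Anomaly Detection",
    "I": "Data Imputation",
}


def expand_tasks(codes):
    """Expand a space-separated code string such as "F D I" into task names."""
    return [TASK_LEGEND[code] for code in codes.split()]


if __name__ == "__main__":
    # Example: the Tasks cell "F D I" from the table.
    print(expand_tasks("F D I"))
```

An unknown code raises a `KeyError`, which may be the preferable behaviour when checking new table rows for typos.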
- [ICML, 2025] Efficient Personalized Adaptation for Physiological Signal Foundation Model [paper] - Applied diffusion to generate LoRA weights for adaptation of FMs
- [ICLR, 2025] PaPaGei: Open Foundation Models for Optical Physiological Signals [paper] [code] - Benchmarked against Chronos and MOMENT for PPG analysis, demonstrated transferable features from general time series FMs
- [ML4H, 2024] Are time series foundation models ready for vital sign forecasting in healthcare? [paper] - Adapted Lag-Llama, TimesFM, MOMENT, Moirai, and Chronos for vital sign forecasting using a feature-extraction approach
- [ICMI, 2024] Low-rank adaptation of time series foundational models for out-of-domain modality forecasting [paper] - LoRA techniques applied to Moirai and Chronos for cross-domain biosignal adaptation
- [Arxiv, 2024] Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models [paper] - Applied LoRA, QLoRA, and other PEFT techniques to Lag-Llama for biomedical time series adaptation
- [Arxiv, 2024] Generalized Prompt Tuning: Adapting frozen univariate time series foundation models for multivariate healthcare time series [paper] [code] - Gen-P-Tuning module for adapting univariate time series FMs (MOMENT) to multivariate clinical data
- [Arxiv, 2024] Repurposing Foundation Model for Generalizable Medical Time Series Classification [paper] - FORMED method repurposing TimesFM for classification tasks with post-processing modules
- [ICML, 2025] H-Tuning: Toward Low-Cost and Efficient ECG-based Cardiovascular Disease Detection with Pre-Trained Models [paper] [code]
- [ICML, 2025] Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG [paper] [code]
Using (multi-modal) LLMs for biosignal analysis and interpretation
Four functional roles of LLMs:
- [Nature Communications, 2026] Transforming wearable data into personal health insights using large language model agents
[paper][code] - An LLM agent for analyzing wearable daily statistics tables and tables of discrete activity events
- [NeurIPS, 2025] GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images
[paper] [code] - Convert ECG to images and feature vectors for multi-modal LLM understanding
- [ICML, 2025] EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping
[paper] [code] - EEG-Modal-LLM alignment for knowledge integration
- [ICML, 2025] Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners
[paper] [code] - ECG-text-report integration via contrastive masked autoencoding
- [IEEE TBD, 2025] Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction
[paper] - Heart failure risk prediction using LLM-informed dual attention network with ECG time sequences
- [SPI Health Data Science, 2025] ECG-LM: Understanding Electrocardiogram with a Large Language Model
[paper] - Comprehensive ECG understanding using LLMs with signal analysis, text integration, and conversational capabilities
- [Arxiv, 2025] EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
[paper] [code] - EEG emotion recognition with LLM backbone and multimodal fusion for personalized emotion analysis
- [Arxiv, 2025] LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors
[paper] [code] - Multimodal LLM for human activity analysis using IMU data and natural language
- [Cell Reports Medicine, 2024] Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG
[paper] [code] - Foundation ECG diagnosis model with comprehensive LLM integration across all functional roles
- [ICML, 2024] Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
[paper] [code] - ECG diagnosis using LLMs with retrieval-augmented generation and knowledge integration
- [ICLR, 2024] BELT-2: Bootstrapping EEG-to-Language Representation Alignment for Multi-task Brain Decoding
[paper] [code] - EEG-to-language alignment using transformer backbone with multimodal integration
- [TMLR, 2024] ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text
[paper] [code] - Foundation ECG model pretrained with LLM-enhanced cardiological text for semantic integration
- [CHIL, 2024] Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data
[paper] [code] - Health prediction using LLMs with activity measures as text input
- [ACM IMWUT, 2024] Sensor2Text: Enabling Natural Language Interactions for Daily Activity Tracking Using Wearable Sensors
[paper] - Natural language interactions for activity tracking using wearable sensor data with multimodal fusion
- [IEEE CSCAIoT, 2024] Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs!
[paper] - Multi-object tracking using LLMs with IMU text sequences and knowledge integration
- [IEEE FMSys, 2024] HARGPT: Are LLMs Zero-Shot Human Activity Recognizers?
[paper] [code] - Zero-shot human activity recognition using LLMs with IMU time sequence data
- [IEEE TNSRE, 2024] Integrating Large Language Model, EEG, and Eye-Tracking for Word-Level Neural State Classification in Reading Comprehension
[paper] - Word-level neural state classification using EEG time sequences with LLM knowledge integration
- [IEEE BHI, 2024] PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models
[paper] - Personalized health insights using LLMs with activity measures and knowledge integration
- [Arxiv, 2024] Conversational Health Agents: A Personalized LLM-Powered Agent Framework
[paper] [code] - Conversational health agent using activity measures as text for personalized health interactions
- [Arxiv, 2024] ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis
[paper] [code] - Large ECG-language model combining signal analysis and time sequences for conversational diagnosis
- [Arxiv, 2024] Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
[paper] [code] - EEG-to-text generation using feature sequences with LLM knowledge
- [Arxiv, 2024] MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation
[paper] [code] - ECG report generation using multimodal instruction tuning with time sequences
- [Arxiv, 2024] Teach Multimodal LLMs to Comprehend Electrocardiographic Images
[paper] [code] - Multimodal ECG analysis using image plots with vision-language model integration
- [Arxiv, 2024] Towards a Personal Health Large Language Model
[paper] - Personal health LLM agent using activity measures as text input for personalized health insights
- [EMNLP, 2023] Can Brain Signals Reveal Inner Alignment with Human Languages?
[paper] [code] - EEG analysis using LoRA-enhanced LLMs with multimodal fusion capabilities
- [ACM ISWC, 2023] IMUGPT 2.0: Language-Based Cross Modality Transfer for Sensor-Based Human Activity Recognition
[paper] [code] - IMU signal analysis using generative transformer architecture for activity recognition
- [MIDL, 2023] Frozen Language Model Helps ECG Zero-Shot Learning
[paper] - ECG zero-shot learning using frozen language models for knowledge enhancement
- [AAAI, 2022] Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification
[paper] [code] - EEG-to-text decoding using BART backbone with sentiment classification capabilities
Based on our survey, key challenges include:
- Standardization: Harmonizing datasets from diverse sources and acquisition protocols
- Interpretability: Making foundation models clinically interpretable and trustworthy
- Efficiency: Deploying large models in resource-constrained healthcare environments
- Benchmarking: Establishing comprehensive evaluation frameworks for clinical relevance
- Security & Safety: Protecting sensitive biosignal data from privacy breaches and ensuring model robustness against adversarial attacks in clinical settings
We welcome contributions! Please:
- Follow the format: `**[Venue Year]** Title [[paper]](link) [[code]](link)`
- Focus on foundation models for biosignals (not general ML papers)
- Ensure papers fit one of the three main directions
- Open an issue or submit a PR
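To make the entry format listed above easier to follow, here is a minimal sketch of a helper that assembles one entry. The function name `format_entry`, its parameters, and the `example.org` URLs are hypothetical and used only for illustration. Treating the code link and the trailing description as optional is an assumption based on existing entries that omit them.

```python
def format_entry(venue, year, title, paper_url, code_url=None, note=None):
    """Build one list entry: **[Venue Year]** Title [[paper]](link) [[code]](link)."""
    entry = f"**[{venue} {year}]** {title} [[paper]]({paper_url})"
    if code_url:
        # Assumption: entries without public code simply omit the [[code]] link.
        entry += f" [[code]]({code_url})"
    if note:
        # Assumption: an optional short description follows after " - ".
        entry += f" - {note}"
    return entry


if __name__ == "__main__":
    # Placeholder values only; replace with the real venue, title, and links.
    print(format_entry(
        "ICLR", 2025, "Example Title",
        "https://example.org/paper",
        "https://example.org/code",
    ))
```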
Citation: If you find this repository useful, please cite our survey:
@article{gu2025bfm,
title={Foundation Models for Biosignals: A Survey},
author = {Gu, Xiao and Shu, Yuxuan and Han, Jinpei and Liu, Yuxuan and Liu, Zhangdaihong and Anibal, James and Sangha, Veer and Phillips, Edward and Segal, Bradley and Liu, Yuxuan and Yuan, Hang and Liu, Fenglin and Branson, Kim and Schwab, Patrick and Belgrave, Danielle and Clifton, Lei and Spathis, Dimitris and Lampos, Vasileios and Faisal, A. Aldo and Clifton, David A.},
year={2025},
publisher={TechRxiv}
}

