guxiao0822/Awesome-Biosignal-Foundation-Model

Awesome Foundation Models for Biosignals

Biosignals like ECG, EEG, and PPG capture the body's physiological and behavioural "languages", but analyzing them at scale requires dedicated approaches. Foundation models, which have transformed NLP and computer vision, are now emerging for biosignal analysis, promising to unlock patterns across vast amounts of sensor readings.

This curated list covers foundation models (FMs), datasets, and tools for biosignals based on our comprehensive survey of the field.

📚 Survey Paper: Foundation Models for Biosignals: A Survey - A comprehensive review of foundation model development and applications for biosignals.

πŸ—ΊοΈ Landscape of Biosignal Foundation Models

Figure: Three Directions Overview (timeline marked by the first online date of each work)

The field of biosignal foundation models can be organized into three converging directions:

  1. Pretraining Biosignal FMs from Scratch: Pretraining dedicated foundation models on large biosignal corpora
  2. Adapting Time Series FMs to Biosignals: Repurposing general time series foundation models for biosignal tasks
  3. Leveraging LLMs for Biosignals: Using (multi-modal) large language models for biosignal analysis

📚 Surveys & Reviews

  • [IEEE TPAMI, 2025] Foundation models defining a new era in vision: a survey and outlook [paper]
  • [ACM SIGKDD, 2024] Foundation models for time series analysis: a tutorial and survey [paper]
  • [Arxiv, 2024] A survey of time series foundation models: generalizing time series representation with large language model [paper]
  • [Arxiv, 2024] Biomedical foundation model: a survey [paper]
  • [Arxiv, 2024] Deep time series models: a comprehensive survey and benchmark [paper]
  • [Arxiv, 2024] Foundation models for video understanding: a survey [paper]
  • [IJCAI, 2024] Large language models for time series: a survey [paper]

🔬 Datasets & Benchmarks & Tools

Illustration of representative biosignals

  • ECG - Electrocardiography (heart electrical activity)
  • EEG - Electroencephalography (brain electrical activity)
  • PPG - Photoplethysmography (blood volume changes)
  • IMU - Inertial Measurement Unit (motion sensors)
  • EMG - Electromyography (muscle electrical activity)
  • ABP - Arterial Blood Pressure
  • PCG - Phonocardiography (heart sounds)
  • Resp - Respiration/Respiratory signals (breathing patterns)

Large-Scale Training Datasets

| Dataset | Modality | # Individuals | Duration (hr) | Link |
|---|---|---|---|---|
| UKBiobank | ECG PPG IMU Health Metrics | - | - | [dataset] |
| MC-MED | ECG PPG Resp Vital Signs | 70K | - | [dataset] |
| MIMIC-III-WDB | ECG PPG ABP Resp Vital Signs | 30K | 3M | [dataset] |
| VitalDB | ECG PPG ABP Resp Vital Signs | 6K | - | [dataset] |
| PulseDB | ECG PPG ABP | 5K | 50M | [dataset] |
| MESA | ECG PPG EEG | 2K | - | [dataset] |
| VTaC | ECG PPG ABP | 2K | - | [dataset] |
| CODE | ECG | 2M | - | [dataset] |
| MIMIC-IV-ECG | ECG | 160K | 2K | [dataset] |
| PhysioNet2020 | ECG | 40K | 90 | [dataset] |
| eICU | Vital Signs | 139K | - | [dataset] |
| HiRID | Vital Signs | 34K | - | [dataset] |
| UCSF-PPG | PPG | 21K | 600K | [dataset] |
| TUEG | EEG | 15K | 27K | [dataset] |
| HBN-EEG | EEG | 3K | 3K | [dataset] |
| MOABB | EEG | >1K | - | [dataset] |
| SEED Series | EEG Gaze Metrics | - | - | [dataset] |
| emg2pose | EMG | 193 | 370 | [dataset] |
| emg2qwerty | EMG | 108 | 346 | [dataset] |
| HUNT4 | IMU | 35K | - | [dataset] |
| Capture24 | IMU | 151 | 4K | [dataset] |
| Ego4D | IMU Audio Gaze Metrics | 923 | 4K | [dataset] |
| COVID-19 Sounds | Audio | 36K | 552 | [dataset] |

Evaluation Benchmarks

  • PhysioNet Challenges: Annual international competitions providing standardized evaluation frameworks for various biosignal analysis tasks including ECG interpretation, sleep staging, and arrhythmia detection [benchmark]

  • MOABB (Mother of All BCI Benchmarks): Comprehensive benchmarking framework for EEG-based brain-computer interface algorithms with standardized pipelines and evaluation protocols [benchmark]

Preprocessing Toolboxes and Packages

| Toolbox/Package | Modality | Link |
|---|---|---|
| MNE-Python | EEG | [tool] |
| NeuroKit | ECG PPG Resp EMG | [tool] |
| HeartPy | ECG PPG | [tool] |
| pyHRV | ECG PPG | [tool] |
| BIOBSS | ECG PPG IMU | [tool] |
| BioSPPy | ECG PPG EEG Resp EMG PCG | [tool] |
| PyPhysio | ECG PPG IMU | [tool] |
| EEGLAB | EEG | [tool] |
| Vital-Sqi | ECG PPG | [tool] |
| PhysioKit | PPG Resp | [tool] |
| PPGFeat | PPG | [tool] |
| WFDB Toolbox | ECG PPG EEG Resp ABP EMG | [tool] |
| PyPPG | PPG | [tool] |

πŸ—οΈ Pretraining Biosignal Foundation Models from Scratch

Building dedicated foundation models using large biosignal corpora

  • [Nature Communications, 2026] A unified time-frequency foundation model for sleep decoding EEG EMG EOG [paper] [code]
  • [Nature Communications, 2026] A foundation model for continuous glucose monitoring data Health-metrics [paper] [code]
  • [Nature, 2026] Insulin resistance prediction from wearables and routine blood biomarkers Health-metrics [paper] [code]
  • [Nature Machine Intelligence, 2026] Cardiac health assessment across scenarios and devices using a multimodal foundation model pretrained on data from 1.7 million individuals ECG PPG [paper] [code]
  • [Nature Medicine, 2026] A multimodal sleep foundation model for disease prediction ECG EEG Resp EMG [paper] [code]
  • [Arxiv, 2025] CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models ECG [paper] [code]
  • [NEJM AI, 2025] An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains ECG [paper] [code]
  • [JAMIA Open, 2025] ECG-FM: An Open Electrocardiogram Foundation Model ECG [paper] [code]
  • [Arxiv, 2025] BenchECG and xECG: a benchmark and baseline for ECG foundation models ECG [paper] [code]
  • [Arxiv, 2025] ProtoECGNet: Case-Based Interpretable Deep Learning for Multi-Label ECG Classification with Contrastive Learning ECG [paper] [code]
  • [Arxiv, 2025] OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records ECG [paper]
  • [NeurIPS, 2025] SensorLM: Learning the Language of Wearable Sensors Health-metrics [paper]
  • [NeurIPS, 2025] PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation ECG EMG [paper] [code]
  • [NeurIPS, 2025] BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals EEG MEG [paper] [code]
  • [NeurIPS, 2025] NeurIPT: Foundation Model for Neural Interfaces EEG [paper] [code]
  • [NeurIPS, 2025] LUNA: Efficient and Topology-Agnostic Foundation Model for EEG Signal Analysis EEG [paper] [code]
  • [ICML, 2025] EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping EEG [paper] [code]
  • [ICML, 2025] From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining ECG [paper] [code]
  • [ICML, 2025] Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners ECG [paper] [code]
  • [ICML, 2025] Beyond Sensor Data: Foundation Models of Behavioral Data from Wearables Improve Health Predictions Health-metrics [paper]
  • [Nature, 2025] A generic noninvasive neuromotor interface for human-computer interaction EMG [paper] [code]
  • [UbiComp, 2025] Pulse-PPG: An Open-Source Field-Trained PPG Foundation Model for Wearable Applications Across Lab and Field Settings PPG [paper] [code]
  • [ICLR, 2025] CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding EEG [paper] [code]
  • [ICLR, 2025] Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model* ECG [paper] [code]
  • [ICLR, 2025] NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals EEG [paper] [code]
  • [ICLR, 2025] PaPaGei: Open Foundation Models for Optical Physiological Signals PPG [paper] [code]
  • [ICLR, 2025] RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data IMU [paper] [code]
  • [SPI Health Data Science, 2025] ECG-LM: Understanding Electrocardiogram with a Large Language Model ECG [paper]
  • [NPJ Digital Medicine, 2024] Self-supervised learning for human activity recognition using 700,000 person-days of wearable data IMU [paper] [code]
  • [NPJ Cardiovascular Health, 2024] Foundation models for cardiovascular disease detection via biosignals from digital stethoscopes ECG PCG [paper]
  • [Cell Reports Medicine, 2024] Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG ECG [paper]
  • [NeurIPS, 2024] Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking Respiratory [paper] [code]
  • [ICLR, 2024] Large-scale Training of Foundation Models for Wearable Biosignals ECG PPG [paper]
  • [ICLR, 2024] Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI EEG [paper] [code]
  • [AAAI Spring Symposium, 2024] EEGFormer: Towards Transferable and Interpretable Large-Scale EEG Foundation Model EEG [paper]
  • [IEEE ISBI, 2024] Neuro-GPT: Towards A Foundation Model for EEG EEG [paper] [code]
  • [Applied Intelligence, 2024] SelfPAB: large-scale pre-training on accelerometer data for human activity recognition* IMU [paper] [code]
  • [Physiological Measurement, 2024] Siamquality: a convnet-based foundation model for photoplethysmography signals PPG [paper] [code]
  • [Arxiv, 2024] AnyECG: Foundational Models for Multitask Cardiac Analysis in Real-World Settings ECG [paper]
  • [Arxiv, 2024] Self-Supervised Pre-Training with Joint-Embedding Predictive Architecture Boosts ECG Classification Performance ECG [paper] [code]
  • [Arxiv, 2024] BrainWave: A Brain Signal Foundation Model for Clinical Applications EEG [paper] [code]
  • [Arxiv, 2024] Foundation Models for ECG: Leveraging Hybrid Self-Supervised Learning for Advanced Cardiac Diagnostics ECG [paper]
  • [Arxiv, 2024] Scaling Wearable Foundation Models Health-metrics [paper]
  • [Arxiv, 2024] Wearable Accelerometer Foundation Models for Health via Knowledge Distillation PPG IMU [paper]
  • [Arxiv, 2024] Toward Foundation Model for Multivariate Wearable Sensing of Physiological Signals ECG EEG PPG IMU EMG [paper] [code]
  • [NPJ Digital Medicine, 2023] A foundational vision transformer improves diagnostic performance for electrocardiograms ECG [paper] [code]
  • [NeurIPS, 2023] Brant: Foundation Model for Intracranial Neural Signal EEG [paper] [code]
  • [NeurIPS, 2023] Learning Topology-Agnostic EEG Representations with Geometry-Aware Modeling* EEG [paper] [code]

*Self-supervised approaches not explicitly claimed as foundation models

🔄 Adapting Time Series Foundation Models to Biosignals

Repurposing general time series foundation models for biomedical-domain-specific tasks

Tasks Perspective

📊 Mainstream Time Series Foundation Models

| Model | Venue | Year | Dataset Scale | Model Size | Tasks | Paper | Code |
|---|---|---|---|---|---|---|---|
| Time-MoE | ICLR | 2025 | 309B | 2.4B | F | [paper] | [code] |
| Timer-XL | ICLR | 2025 | - | - | F | [paper] | [code] |
| ChatTime | AAAI | 2025 | 1M | 350M | F | [paper] | [code] |
| TimePFN | AAAI | 2025 | 1.5M | - | F | [paper] | [code] |
| TTM | NeurIPS | 2024 | 1B | 5M | F | [paper] | [code] |
| Time-FFM | NeurIPS | 2024 | - | - | F | [paper] | [code] |
| UniTS | NeurIPS | 2024 | - | 8M | F D I | [paper] | [code] |
| Moirai-MOE | NeurIPS Workshop | 2024 | - | 11M-86M | F | [paper] | [code] |
| Moirai | ICML | 2024 | 27B | 14 / 91 / 311M | F | [paper] | [code] |
| MOMENT | ICML | 2024 | 1B | 385M | F C D I | [paper] | [code] |
| TimesFM | ICML | 2024 | 100B | 200M | F | [paper] | [code] |
| Timer | ICML | 2024 | 28B | 67M | F D I | [paper] | [code] |
| DAM | ICLR | 2024 | - | - | F I | [paper] | - |
| Chronos | TMLR | 2024 | 84B | 20 / 46 / 200 / 710M | F | [paper] | [code] |
| GTT | ACM CIKM | 2024 | 2B | 7 / 19 / 57M | F | [paper] | [code] |
| Mamba4Cast | - | 2024 | - | 27M | F | [paper] | [code] |
| TimeRAF | - | 2024 | 320M | - | F | [paper] | - |
| TSMamba | - | 2024 | - | - | F | [paper] | - |
| TimeDiT | - | 2024 | 5B | 33 / 120 / 460 / 680M | F D | [paper] | - |
| ViTime | - | 2024 | - | 74 / 95M | F | [paper] | [code] |
| Lag-Llama | - | 2023 | 360M | 200M | F | [paper] | [code] |
| TimeGPT-1 | - | 2023 | 100B | - | F D | [paper] | [code] |

Task Legend: F = Forecasting, C = Classification, D = Anomaly Detection, I = Data Imputation

TSFM adaptation

  • [ICML, 2025] Efficient Personalized Adaptation for Physiological Signal Foundation Model [paper] - Applied diffusion to generate LoRA weights for adaptation of FMs
  • [ICLR, 2025] PaPaGei: Open Foundation Models for Optical Physiological Signals [paper] [code] - Benchmarked against Chronos and MOMENT for PPG analysis, demonstrated transferable features from general time series FMs
  • [ML4H, 2024] Are time series foundation models ready for vital sign forecasting in healthcare? [paper] - Adapted Lag-Llama, TimesFM, MOMENT, Moirai, and Chronos for vital sign forecasting using a feature-extraction approach
  • [ICMI, 2024] Low-rank adaptation of time series foundational models for out-of-domain modality forecasting [paper] - LoRA techniques applied to Moirai and Chronos for cross-domain biosignal adaptation
  • [Arxiv, 2024] Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models [paper] - Applied LoRA, QLoRA, and other PEFT techniques to Lag-Llama for biomedical time series adaptation
  • [Arxiv, 2024] Generalized Prompt Tuning: Adapting frozen univariate time series foundation models for multivariate healthcare time series [paper] [code] - Gen-P-Tuning module for adapting univariate time series FMs (MOMENT) to multivariate clinical data
  • [Arxiv, 2024] Repurposing Foundation Model for Generalizable Medical Time Series Classification [paper] - FORMED method repurposing TimesFM for classification tasks with post-processing modules
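
Several entries above rely on low-rank adaptation (LoRA). As a rough illustration of the general idea only, and not the implementation used in any of the listed papers, the sketch below keeps a pretrained weight `W` frozen and adds a trainable low-rank update `B A` scaled by `alpha / r`. The helper names (`matvec`, `lora_forward`, `lora_param_count`) and all toy values are assumptions made for this example; it uses plain Python lists so the arithmetic stays visible.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(x, W, A, B, alpha):
    """Compute W x + (alpha / r) * B (A x) for one input vector.

    W: frozen pretrained weight, shape (d_out, d_in)
    A: trainable down-projection, shape (r, d_in)
    B: trainable up-projection, shape (d_out, r)
    """
    r = len(A)                       # adapter rank
    base = matvec(W, x)              # frozen path
    delta = matvec(B, matvec(A, x))  # low-rank update path
    return [b + (alpha / r) * d for b, d in zip(base, delta)]


def lora_param_count(d_in, d_out, r):
    """Number of trainable parameters in A and B combined."""
    return r * (d_in + d_out)


# Toy example (all values are illustrative assumptions).
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen weight
A = [[1.0, 1.0]]               # rank r = 1
B_zero = [[0.0], [0.0]]        # typical initialisation: update starts at zero
B_trained = [[0.5], [-0.5]]    # stand-in for B after some training
x = [1.0, 2.0]

print(lora_forward(x, W, A, B_zero, alpha=2.0))     # [1.0, 2.0]
print(lora_forward(x, W, A, B_trained, alpha=2.0))  # [4.0, -1.0]
print(lora_param_count(768, 768, r=8), "vs", 768 * 768)
```

With `B` at zero the adapted layer reproduces the frozen layer exactly, which is why adaptation can start from the pretrained behaviour; the last line shows how few parameters are trained compared with the full weight matrix for a hypothetical 768-by-768 layer.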

Other related works

  • [ICML, 2025] H-Tuning: Toward Low-Cost and Efficient ECG-based Cardiovascular Disease Detection with Pre-Trained Models [paper] [code]
  • [ICML, 2025] Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG [paper] [code]

🤖 Leveraging Large Language Models for Biosignal Analysis

Using (multi-modal) LLMs for biosignal analysis and interpretation

LLM for Biosignals

Four functional roles of LLMs (badge legend not shown)

  • [Nature Communications, 2026] Transforming wearable data into personal health insights using large language model agents [paper] [code] - An LLM agent for analyzing wearable daily statistics tables and tables of discrete activity events
  • [NeurIPS, 2025] GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [paper] [code] - Convert ECG to images and feature vectors for multi-modal LLM understanding
  • [ICML, 2025] EEG-Language Pretraining for Highly Label-Efficient Clinical Phenotyping [paper] [code] - EEG-Modal-LLM alignment for knowledge integration
  • [ICML, 2025] Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners [paper] [code] - ECG-text-report integration via contrastive masked autoencoding
  • [IEEE TBD, 2025] Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction [paper] - Heart failure risk prediction using LLM-informed dual attention network with ECG time sequences
  • [SPI Health Data Science, 2025] ECG-LM: Understanding Electrocardiogram with a Large Language Model [paper] - Comprehensive ECG understanding using LLMs with signal analysis, text integration, and conversational capabilities
  • [Arxiv, 2025] EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation [paper] [code] - EEG emotion recognition with LLM backbone and multimodal fusion for personalized emotion analysis
  • [Arxiv, 2025] LLaSA: A Multimodal LLM for Human Activity Analysis Through Wearable and Smartphone Sensors [paper] [code] - Multimodal LLM for human activity analysis using IMU data and natural language
  • [Cell Reports Medicine, 2024] Foundation model of ECG diagnosis: Diagnostics and explanations of any form and rhythm on ECG [paper] [code] - Foundation ECG diagnosis model with comprehensive LLM integration across all functional roles
  • [ICML, 2024] Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement [paper] [code] - ECG diagnosis using LLMs with retrieval-augmented generation and knowledge integration
  • [ICLR, 2024] BELT-2: Bootstrapping EEG-to-Language Representation Alignment for Multi-task Brain Decoding [paper] [code] - EEG-to-language alignment using transformer backbone with multimodal integration
  • [TMLR, 2024] ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text [paper] [code] - Foundation ECG model pretrained with LLM-enhanced cardiological text for semantic integration
  • [CHIL, 2024] Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data [paper] [code] - Health prediction using LLMs with activity measures as text input
  • [ACM IMWUT, 2024] Sensor2Text: Enabling Natural Language Interactions for Daily Activity Tracking Using Wearable Sensors [paper] - Natural language interactions for activity tracking using wearable sensor data with multimodal fusion
  • [IEEE CSCAIoT, 2024] Are You Being Tracked? Discover the Power of Zero-Shot Trajectory Tracing with LLMs! [paper] - Multi-object tracking using LLMs with IMU text sequences and knowledge integration
  • [IEEE FMSys, 2024] HARGPT: Are LLMs Zero-Shot Human Activity Recognizers? [paper] [code] - Zero-shot human activity recognition using LLMs with IMU time sequence data
  • [IEEE TNSRE, 2024] Integrating Large Language Model, EEG, and Eye-Tracking for Word-Level Neural State Classification in Reading Comprehension [paper] - Word-level neural state classification using EEG time sequences with LLM knowledge integration
  • [IEEE BHI, 2024] PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models [paper] - Personalized health insights using LLMs with activity measures and knowledge integration
  • [Arxiv, 2024] Conversational Health Agents: A Personalized LLM-Powered Agent Framework [paper] [code] - Conversational health agent using activity measures as text for personalized health interactions
  • [Arxiv, 2024] ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis [paper] [code] - Large ECG-language model combining signal analysis and time sequences for conversational diagnosis
  • [Arxiv, 2024] Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder [paper] [code] - EEG-to-text generation using feature sequences with LLM knowledge
  • [Arxiv, 2024] MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation [paper] [code] - ECG report generation using multimodal instruction tuning with time sequences
  • [Arxiv, 2024] Teach Multimodal LLMs to Comprehend Electrocardiographic Images [paper] [code] - Multimodal ECG analysis using image plots with vision-language model integration
  • [Arxiv, 2024] Towards a Personal Health Large Language Model [paper] - Personal health LLM agent using activity measures as text input for personalized health insights
  • [EMNLP, 2023] Can Brain Signals Reveal Inner Alignment with Human Languages? [paper] [code] - EEG analysis using LoRA-enhanced LLMs with multimodal fusion capabilities
  • [ACM ISWC, 2023] IMUGPT 2.0: Language-Based Cross Modality Transfer for Sensor-Based Human Activity Recognition [paper] [code] - IMU signal analysis using generative transformer architecture for activity recognition
  • [MIDL, 2023] Frozen Language Model Helps ECG Zero-Shot Learning [paper] - ECG zero-shot learning using frozen language models for knowledge enhancement
  • [AAAI, 2022] Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification [paper] [code] - EEG-to-text decoding using BART backbone with sentiment classification capabilities

‼️ Open Challenges

Based on our survey, key challenges include:

  • Standardization: Harmonizing datasets from diverse sources and acquisition protocols
  • Interpretability: Making foundation models clinically interpretable and trustworthy
  • Efficiency: Deploying large models in resource-constrained healthcare environments
  • Benchmarking: Establishing comprehensive evaluation frameworks for clinical relevance
  • Security & Safety: Protecting sensitive biosignal data from privacy breaches and ensuring model robustness against adversarial attacks in clinical settings

🤝 Contributing

We welcome contributions! Please:

  • Follow the format: **[Venue Year]** Title [[paper]](link) [[code]](link)
  • Focus on foundation models for biosignals (not general ML papers)
  • Ensure papers fit one of the three main directions
  • Open an issue or submit a PR
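
Before opening a PR, a contributor could sanity-check a new entry against the format stated above. The snippet below is a minimal sketch rather than a tool shipped with this repository: the regular expression, the `parse_entry` helper, and the `example.org` URLs are hypothetical, and it assumes the `[[code]](link)` part is optional because many listed entries have no code link.

```python
import re

# Pattern for: **[Venue Year]** Title [[paper]](link) [[code]](link)
# Assumption: the trailing [[code]](link) part may be omitted.
ENTRY_RE = re.compile(
    r"^\*\*\[(?P<venue>.+?) (?P<year>\d{4})\]\*\* "
    r"(?P<title>.+?) "
    r"\[\[paper\]\]\((?P<paper>\S+?)\)"
    r"(?: \[\[code\]\]\((?P<code>\S+?)\))?$"
)


def parse_entry(line):
    """Return the entry fields as a dict, or None if the line does not match."""
    match = ENTRY_RE.match(line.strip())
    return match.groupdict() if match else None


# Hypothetical entries with placeholder URLs.
with_code = (
    "**[ICLR 2025]** Some Biosignal Model "
    "[[paper]](https://example.org/paper) [[code]](https://example.org/code)"
)
without_code = "**[Arxiv 2024]** Another Model [[paper]](https://example.org/p2)"
malformed = "[ICLR, 2025] Some Biosignal Model [paper]"

for line in (with_code, without_code, malformed):
    print(parse_entry(line))
```

Lines that do not follow the stated format return `None`, so the helper can be used to flag entries that need reformatting; it does not check whether the links resolve or whether the paper fits one of the three directions.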

Citation: If you find this repository useful, please cite our survey:

@article{gu2025bfm,
  title={Foundation Models for Biosignals: A Survey},
  author = {Gu, Xiao and Shu, Yuxuan and Han, Jinpei and Liu, Yuxuan and Liu, Zhangdaihong and Anibal, James and Sangha, Veer and Phillips, Edward and Segal, Bradley and Liu, Yuxuan and Yuan, Hang and Liu, Fenglin and Branson, Kim and Schwab, Patrick and Belgrave, Danielle and Clifton, Lei and Spathis, Dimitris and Lampos, Vasileios and Faisal, A. Aldo and Clifton, David A.},
  year={2025},
  publisher={TechRxiv}
}
