Variations in nucleotide sequences often lead to significant changes in fitness. Nucleotide Foundation Models (NFMs) have emerged as a new paradigm in fitness prediction, enabling increasingly accurate estimation of fitness directly from sequence. However, assessing the advantages of these models remains challenging due to the use of diverse and specific experimental datasets, and their performance often varies markedly across different nucleic acid families, complicating **fair comparisons**.
@@ -38,7 +32,7 @@ This suggests fundamental differences in the nature of the representations learn
## Baseline Models

-Our benchmark evaluates a total of 27 nucleotide foundation models, which are categorized into four main architectural classes: **BERT-like**, **GPT-like**, **Hyena**, and **LLaMA-based**.
+Our benchmark evaluates a total of 29 nucleotide foundation models, which are categorized into four main architectural classes: **BERT-like**, **GPT-like**, **Hyena**, and **LLaMA-based**.

| Model | Params | Max Length | Tokenization | Architecture |
@@ -154,14 +138,3 @@ This script will generate detailed performance reports, including metrics aggreg
We thank all the researchers and experimentalists who developed the original assays and foundation models that made this benchmark possible. We also acknowledge the invaluable contributions of the communities behind **ProteinGym** and **RNAGym**, which heavily inspired this work.

Please consider citing the corresponding papers of the models and datasets you use from this benchmark.
-
-## Citation
-
-If you use NABench in your work, please cite the following paper:
-
-```bibtex
-@article{nawork2024,
-title={NABench: Large-Scale Benchmarks of Nucleotide Foundation Models for Fitness Prediction},
-author={Antiquus S. Hippocampus and Natalia Cerebro and Amelie P. Amygdale and Ji Q. Ren and Yevgeny LeNet},
-year={2024},
-journal={ICLR 2026 Conference Track on Datasets and Benchmarks}