Publications

Selected publications. For a complete list, see my Google Scholar profile.

Self-Distillation of Hidden Layers for Self-Supervised Representation Learning
Scott Lowe, Anthony Fuller, Sageev Oore, Evan Shelhamer, Graham Taylor
arXiv preprint arXiv:2603.15553, 2026
Self-distilling intermediate layer representations to improve self-supervised visual learning.
CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
ZeMing Gong, Austin Wang, Xiaoliang Huo, Joakim Bruslund Haurum, Scott Lowe, Graham Taylor, Angel Chang
International Conference on Learning Representations (ICLR), 2025
Contrastive learning across images, DNA barcodes, and taxonomic labels for zero-shot insect classification.
Harnessing Artificial Intelligence to Fill Global Shortfalls in Biodiversity Knowledge
Laura Pollock, Justin Kitzes, Sara Beery, Kaitlyn Gaynor, Marta Jarzyna, Oisin Mac Aodha, Bernd Meyer, David Rolnick, Graham Taylor, Devis Tuia, Tanya Berger-Wolf
Nature Reviews Biodiversity, 2025
A review of how AI is transforming biodiversity observation, from CNNs to foundation models.
BarcodeMamba: State Space Models for Biodiversity Analysis
Tiancheng Gao, Graham Taylor
Neural Information Processing Systems (NeurIPS) Workshop on Foundation Models for Science, 2024
State-space models applied to DNA barcode sequences for taxonomic classification at scale.
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
Zahra Gharaee, Scott Lowe, ZeMing Gong, Pablo Millan Arias, Nicholas Pellegrino, Austin Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke, Graham Taylor, Paul Fieguth, Angel Chang
Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024
A multimodal dataset of over 5 million insect specimens with images, DNA barcodes, and taxonomic labels.
Agglomerative Token Clustering
Joakim Bruslund Haurum, Sergio Escalera, Graham Taylor, Thomas Moeslund
European Conference on Computer Vision (ECCV), 2024
Bottom-up hierarchical clustering for merging vision transformer tokens without any learnable parameters, especially effective at low keep rates.
BarcodeBERT: Transformers for Biodiversity Analysis
Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin Wang, Scott Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel Chang, Graham Taylor
Neural Information Processing Systems (NeurIPS) Workshop on Self-Supervised Learning: Theory and Practice, 2023
Self-supervised transformer pretraining on invertebrate DNA barcodes for taxonomic identification, matching BLAST accuracy at species level while running far faster.
Bandit-Driven Batch Selection for Robust Learning under Label Noise
Michal Lisicki, Mihai Nica, Graham Taylor
Neural Information Processing Systems (NeurIPS) Workshop on Optimization for Machine Learning, 2023
Combinatorial bandits guiding SGD batch selection for robustness to label noise, without the overhead of auxiliary networks.
A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset
Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott Lowe, Jaclyn McKeown, Chris Ho, Joschka McLeod, Yi-Yun Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel Chang, Graham Taylor, Paul Fieguth
Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2023
A million-image dataset of expert-labelled insect specimens paired with DNA barcodes and hierarchical taxonomic labels.
Which Tokens to Use? Investigating Token Reduction in Vision Transformers
Joakim Bruslund Haurum, Sergio Escalera, Graham Taylor, Thomas Moeslund
International Conference on Computer Vision (ICCV) Workshop on New Ideas in Vision Transformers, 2023
Comparing ten token reduction methods for vision transformers and finding Top-K pruning a surprisingly strong baseline.
Empirically Validating Conformal Prediction on Modern Vision Architectures Under Distribution Shift and Long-tailed Data
Kevin Kasa, Graham Taylor
International Conference on Machine Learning (ICML) Workshop on Structured Probabilistic Inference & Generative Modeling, 2023
Empirically showing that conformal prediction safety guarantees frequently break down on modern vision models under distribution shift and long-tailed data.
The Catalog Problem: Clustering and Ordering Variable-Sized Sets
Mateusz Jurewicz, Graham Taylor, Leon Derczynski
International Conference on Machine Learning (ICML), 2023
A fully differentiable architecture for predicting a varying number of ordered clusters from variable-sized sets, applied to product catalogue structuring.
Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
Cong Wei, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham Taylor, Florian Shkurti
Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Learning instance-dependent sparse attention masks for vision transformers, cutting self-attention compute by roughly half with negligible accuracy loss.
Bounding generalization error with input compression: An empirical study with infinite-width networks
Angus Galloway, Anna Golubeva, Mahmoud Salem, Mihai Nica, Yani Ioannou, Graham Taylor
Transactions on Machine Learning Research (TMLR), 2022
An empirical study of an input-compression generalization bound using infinite-width networks to estimate mutual information between inputs and representations.
On Evaluation Metrics for Graph Generative Models
Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor
International Conference on Learning Representations (ICLR), 2022
Scalar, domain-agnostic metrics for graph generative models built from features of untrained random graph neural networks.