Publications

Selected publications. For a complete list, see my Google Scholar profile.

Self-Distillation of Hidden Layers for Self-Supervised Representation Learning

Scott Lowe, Anthony Fuller, Sageev Oore, Evan Shelhamer, Graham Taylor

arXiv preprint arXiv:2603.15553, 2026

Self-distilling intermediate layer representations to improve self-supervised visual learning.

arXiv

CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale

ZeMing Gong, Austin Wang, Xiaoliang Huo, Joakim Bruslund Haurum, Scott Lowe, Graham Taylor, Angel Chang

International Conference on Learning Representations (ICLR), 2025

Contrastive learning across images, DNA barcodes, and taxonomic labels for zero-shot insect classification.

arXiv GitHub Project

Harnessing Artificial Intelligence to Fill Global Shortfalls in Biodiversity Knowledge

Laura Pollock, Justin Kitzes, Sara Beery, Kaitlyn Gaynor, Marta Jarzyna, Oisin Mac Aodha, Bernd Meyer, David Rolnick, Graham Taylor, Devis Tuia, Tanya Berger-Wolf

Nature Reviews Biodiversity, 2025

A review of how AI is transforming biodiversity observation, from CNNs to foundation models.

Paper

BarcodeMamba: State Space Models for Biodiversity Analysis

Tiancheng Gao, Graham Taylor

Neural Information Processing Systems (NeurIPS) Workshop on Foundation Models for Science, 2024

State-space models applied to DNA barcode sequences for taxonomic classification at scale.

arXiv GitHub

BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity

Zahra Gharaee, Scott Lowe, ZeMing Gong, Pablo Millan Arias, Nicholas Pellegrino, Austin Wang, Joakim Bruslund Haurum, Iuliia Zarubiieva, Lila Kari, Dirk Steinke, Graham Taylor, Paul Fieguth, Angel Chang

Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2024

A multimodal dataset of over 5 million insect specimens with images, DNA barcodes, and taxonomic labels.

arXiv GitHub

Agglomerative Token Clustering

Joakim Bruslund Haurum, Sergio Escalera, Graham Taylor, Thomas Moeslund

European Conference on Computer Vision (ECCV), 2024

Bottom-up hierarchical clustering for merging vision transformer tokens without any learnable parameters, especially effective at low keep rates.

arXiv GitHub

BarcodeBERT: Transformers for Biodiversity Analysis

Pablo Millan Arias, Niousha Sadjadi, Monireh Safari, ZeMing Gong, Austin Wang, Scott Lowe, Joakim Bruslund Haurum, Iuliia Zarubiieva, Dirk Steinke, Lila Kari, Angel Chang, Graham Taylor

Neural Information Processing Systems (NeurIPS) Workshop on Self-Supervised Learning: Theory and Practice, 2023

Self-supervised transformer pretraining on invertebrate DNA barcodes for taxonomic identification, matching BLAST accuracy at species level while running far faster.

arXiv GitHub

Bandit-Driven Batch Selection for Robust Learning under Label Noise

Michal Lisicki, Mihai Nica, Graham Taylor

Neural Information Processing Systems (NeurIPS) Workshop on Optimization for Machine Learning, 2023

Combinatorial bandits guiding SGD batch selection for robustness to label noise, without the overhead of auxiliary networks.

arXiv

A Step Towards Worldwide Biodiversity Assessment: The BIOSCAN-1M Insect Dataset

Zahra Gharaee, ZeMing Gong, Nicholas Pellegrino, Iuliia Zarubiieva, Joakim Bruslund Haurum, Scott Lowe, Jaclyn McKeown, Chris Ho, Joschka McLeod, Yi-Yun Wei, Jireh Agda, Sujeevan Ratnasingham, Dirk Steinke, Angel Chang, Graham Taylor, Paul Fieguth

Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, 2023

A million-image dataset of expert-labelled insect specimens paired with DNA barcodes and hierarchical taxonomic labels.

arXiv GitHub

Which Tokens to Use? Investigating Token Reduction in Vision Transformers

Joakim Bruslund Haurum, Sergio Escalera, Graham Taylor, Thomas Moeslund

International Conference on Computer Vision (ICCV) Workshop on New Ideas in Vision Transformers, 2023

Comparing ten token reduction methods for vision transformers and finding Top-K pruning a surprisingly strong baseline.

arXiv GitHub

Empirically Validating Conformal Prediction on Modern Vision Architectures Under Distribution Shift and Long-tailed Data

Kevin Kasa, Graham Taylor

International Conference on Machine Learning (ICML) Workshop on Structured Probabilistic Inference & Generative Modeling, 2023

Empirically showing that conformal prediction safety guarantees frequently break down on modern vision models under distribution shift and long-tailed data.

arXiv

The Catalog Problem: Clustering and Ordering Variable-Sized Sets

Mateusz Jurewicz, Graham Taylor, Leon Derczynski

International Conference on Machine Learning (ICML), 2023

A fully differentiable architecture for predicting a varying number of ordered clusters from variable-sized sets, applied to product catalogue structuring.

Paper

Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers

Cong Wei, Brendan Duke, Ruowei Jiang, Parham Aarabi, Graham Taylor, Florian Shkurti

Conference on Computer Vision and Pattern Recognition (CVPR), 2023

Learning instance-dependent sparse attention masks for vision transformers, cutting self-attention compute by roughly half with negligible accuracy loss.

arXiv GitHub

Bounding Generalization Error with Input Compression: An Empirical Study with Infinite-Width Networks

Angus Galloway, Anna Golubeva, Mahmoud Salem, Mihai Nica, Yani Ioannou, Graham Taylor

Transactions on Machine Learning Research (TMLR), 2022

An empirical study of an input-compression generalization bound using infinite-width networks to estimate mutual information between inputs and representations.

arXiv

On Evaluation Metrics for Graph Generative Models

Rylee Thompson, Boris Knyazev, Elahe Ghalebi, Jungtaek Kim, Graham Taylor

International Conference on Learning Representations (ICLR), 2022

Scalar, domain-agnostic metrics for graph generative models built from features of untrained random graph neural networks.

arXiv GitHub