Pascal Notin — Scifaro

Few-shot Protein Fitness Prediction via In-context Learning and Test-time Training

Accurately predicting protein fitness with minimal experimental data is a persistent challenge in protein engineering. We introduce PRIMO (PRotein In-context Mutation Oracle), a transformer-based framework that leverages in-context learning…

Biomolecules · Quantitative Biology 2025-12-03 Felix Teufel, Aaron W. Kollasch, Yining Huang, Ole Winther, Kevin K. Yang, Pascal Notin, Debora S. Marks

Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction

Retrieving homologous protein sequences is essential for a broad range of protein modeling tasks such as fitness prediction, protein design, structure modeling, and protein-protein interactions. Traditional workflows have relied on a…

Quantitative Methods · Quantitative Biology 2025-06-11 Ruben Weitzman, Peter Mørch Groth, Lood Van Niekerk, Aoi Otani, Yarin Gal, Debora Marks, Pascal Notin

The CausalBench challenge: A machine learning contest for gene network inference from single-cell perturbation data

In drug discovery, mapping interactions between genes within cellular systems is a crucial early step. Such maps are not only foundational for understanding the molecular mechanisms underlying disease biology but also pivotal for…

Machine Learning · Computer Science 2025-05-20 Mathieu Chevalley, Jacob Sackett-Sanders, Yusuf Roohani, Pascal Notin, Artemy Bakulin, Dariusz Brzezinski, Kaiwen Deng, Yuanfang Guan, Justin Hong, Michael Ibrahim, Wojciech Kotlowski, Marcin Kowiel, Panagiotis Misiakos, Achille Nazaret, Markus Püschel, Chris Wendler, Arash Mehrjou, Patrick Schwab

Multi-megabase scale genome interpretation with genetic language models

Understanding how molecular changes caused by genetic variation drive disease risk is crucial for deciphering disease mechanisms. However, interpreting genome sequences is challenging because of the vast size of the human genome, and…

Genomics · Quantitative Biology 2025-01-15 Frederik Träuble, Lachlan Stuart, Andreas Georgiou, Pascal Notin, Arash Mehrjou, Ron Schwessinger, Mathieu Chevalley, Kim Branson, Bernhard Schölkopf, Cornelia van Duijn, Debora Marks, Patrick Schwab

Multi-Scale Representation Learning for Protein Fitness Prediction

Designing novel functional proteins crucially depends on accurately modeling their fitness landscape. Given the limited availability of functional annotations from wet-lab experiments, previous methods have primarily relied on…

Machine Learning · Computer Science 2024-12-03 Zuobai Zhang, Pascal Notin, Yining Huang, Aurélie Lozano, Vijil Chenthamarakshan, Debora Marks, Payel Das, Jian Tang

DiscoBAX: Discovery of Optimal Intervention Sets in Genomic Experiment Design

The discovery of therapeutics to treat genetically-driven pathologies relies on identifying genes involved in the underlying disease mechanisms. Existing approaches search over the billions of potential interventions to maximize the…

Quantitative Methods · Quantitative Biology 2023-12-08 Clare Lyle, Arash Mehrjou, Pascal Notin, Andrew Jesson, Stefan Bauer, Yarin Gal, Patrick Schwab

RITA: a Study on Scaling Up Generative Protein Sequence Models

In this work we introduce RITA: a suite of autoregressive generative models for protein sequences, with up to 1.2 billion parameters, trained on over 280 million protein sequences belonging to the UniRef-100 database. Such generative models…

Quantitative Methods · Quantitative Biology 2022-07-18 Daniel Hesslow, Niccoló Zanichelli, Pascal Notin, Iacopo Poli, Debora Marks

Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval

The ability to accurately model the fitness landscape of protein sequences is critical to a wide range of applications, from quantifying the effects of human variants on disease likelihood, to predicting immune-escape mutations in viruses…

Machine Learning · Computer Science 2022-05-30 Pascal Notin, Mafalda Dias, Jonathan Frazer, Javier Marchena-Hurtado, Aidan Gomez, Debora S. Marks, Yarin Gal

GeneDisco: A Benchmark for Experimental Design in Drug Discovery

In vitro cellular experimentation with genetic interventions, using for example CRISPR technologies, is an essential step in early-stage drug discovery and target validation that serves to assess initial hypotheses about causal associations…

Machine Learning · Computer Science 2021-10-25 Arash Mehrjou, Ashkan Soleymani, Andrew Jesson, Pascal Notin, Yarin Gal, Stefan Bauer, Patrick Schwab

Improving black-box optimization in VAE latent space using decoder uncertainty

Optimization in the latent space of variational autoencoders is a promising approach to generate high-dimensional discrete objects that maximize an expensive black-box property (e.g., drug-likeness in molecular generation, function…

Machine Learning · Computer Science 2021-07-02 Pascal Notin, José Miguel Hernández-Lobato, Yarin Gal

Improving compute efficacy frontiers with SliceOut

Pushing forward the compute efficacy frontier in deep learning is critical for tasks that require frequent model re-training or workloads that entail training a large number of models. We introduce SliceOut -- a dropout-inspired scheme…

Machine Learning · Computer Science 2021-04-02 Pascal Notin, Aidan N. Gomez, Joanna Yoo, Yarin Gal