Rubinacci Lab · FIMM · University of Helsinki
We develop efficient statistical and computational methods to decode human genetic variation at scale, applying them to biobanks to understand the genetic basis of disease.
The team at our lab retreat · Kirkkonummi, Finland · 2026
What we study
Large genomic rearrangements (deletions, duplications, inversions) shape disease risk. We integrate multi-omics data to decode their functional consequences at biobank scale.
We identify shared haplotype segments across populations to illuminate human history and enable novel disease mapping via identity-by-descent analysis.
We design algorithms that process millions of genomes efficiently. Our tools have shaped multiple UK Biobank releases and are widely adopted globally.
lcWGS + imputation is now competitive with SNP arrays. We push accuracy to ultra-rare variants at coverages as low as 0.1× for under $1 per genome.
We phase rare and singleton variants accurately without family data, enabling compound heterozygous disease detection.
We integrate genetic variation with transcriptomics, proteomics, and other molecular layers to trace how genomic changes propagate to disease.
ESHG 2025 · Milan, Italy
Our work for the community
We actively contribute through conference talks, workshops, and open-source software, and present our work at ESHG, ASHG, and other international conferences.
Methods papers should be accompanied by robust, well-documented software usable by anyone, from large biobanks to individual labs with limited compute.
We introduced an lcWGS imputation method that scales sublinearly to millions of haplotypes. It has been applied to 150,119 UK Biobank genomes.
Documentation →
Statistical phasing method that achieves <5% switch error for ultra-rare variants.
Documentation →
Method allowing SNP array imputation to scale to millions of reference individuals.
Documentation →
From our blog
For two decades, population-based phasing methods reported 50% switch error at singleton variants — equivalent to random guessing. SHAPEIT5 changes this, and here's how.
The UK Biobank's WGS release was an extraordinary resource — but existing imputation methods couldn't cope at that scale. Here's how GLIMPSE2 was redesigned from the ground up.
Updates