The 10,000th HLA sequence submitted to the IPD-IMGT/HLA Database
At the end of 2025, we reached an important milestone: more than 10,000 HLA sequences have now been fully characterized and submitted to the international HLA reference database, IPD-IMGT/HLA. This achievement reflects over a decade of continuous refinement of long-read sequencing workflows, which enable full-gene characterization across all HLA loci used in donor typing.
From discovery to database submission
A novel allele is reported when a detected sequence differs from all previously recorded reference sequences. Depending on the HLA locus, this occurs in 20 to 73 out of 100,000 genotyped samples (see figure below). Because such alleles cannot be definitively assigned, donors initially receive provisional “XX” genotypes at the affected loci, which may limit their usability in clinical donor searches.
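To put these rates into perspective, the short sketch below converts a rate per 100,000 samples into "one in N samples" and into an expected count for a given number of samples. The function names and the sample size of one million are illustrative assumptions, not part of our reporting pipeline.

```python
def one_in_n(rate_per_100k: float) -> float:
    """Convert a rate per 100,000 samples into 'one in N samples'."""
    return 100_000 / rate_per_100k


def expected_novel(rate_per_100k: float, samples: int) -> float:
    """Expected number of novel-allele observations among `samples` samples."""
    return rate_per_100k * samples / 100_000


# Lower and upper bound of the per-locus rates quoted in the text.
for rate in (20, 73):
    print(
        f"{rate} per 100,000: about 1 in {one_in_n(rate):,.0f} samples, "
        f"about {expected_novel(rate, 1_000_000):,.0f} per million samples (hypothetical volume)"
    )
```

At the lower bound this corresponds to roughly one novel allele in 5,000 samples for a given locus, and at the upper bound to roughly one in 1,370.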
To resolve these cases, we perform a full-gene sequencing workflow covering 3,500 to 16,000 base pairs, depending on the locus. Each candidate allele is processed twice, using Illumina and Oxford Nanopore sequencing as complementary platforms with respect to accuracy and phasing. Only when both datasets yield fully concordant sequences do we submit the allele to the IPD-IMGT/HLA Database for official nomenclature assignment. In addition to novel alleles, we also submit sequence confirmations and extensions, thereby strengthening the reliability and completeness of the database (https://doi.org/10.1111/tan.13057).
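The concordance requirement can be illustrated with a minimal sketch. It assumes that each platform has already produced one full-length consensus sequence for the candidate allele, covering the same region and strand. The function names and the toy sequences are hypothetical, and the actual workflow involves many more steps, such as read mapping, phasing and quality control.

```python
from typing import Optional


def normalize(seq: str) -> str:
    """Remove whitespace and uppercase, so formatting differences are ignored."""
    return "".join(seq.split()).upper()


def first_mismatch(seq_a: str, seq_b: str) -> Optional[int]:
    """Return the 0-based position of the first difference, or None if identical."""
    a, b = normalize(seq_a), normalize(seq_b)
    for pos, (base_a, base_b) in enumerate(zip(a, b)):
        if base_a != base_b:
            return pos
    if len(a) != len(b):
        # One sequence is a prefix of the other: they diverge where the shorter ends.
        return min(len(a), len(b))
    return None


def is_concordant(illumina_consensus: str, nanopore_consensus: str) -> bool:
    """True only if both consensus sequences are identical over their full length."""
    return first_mismatch(illumina_consensus, nanopore_consensus) is None


# Toy sequences for illustration only (not real HLA data).
illumina = "ATGGCCGTCATGGCG"
nanopore = "atggccgtcatggcg"
print(is_concordant(illumina, nanopore))
print(first_mismatch("ATGGCCGTC", "ATGGACGTC"))
```

In this sketch, a single differing base is enough to block a submission, which mirrors the "fully concordant" criterion described above.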
A global contribution to immunogenetics
With more than 10,000 submissions (approximately 6,500 distinct novel HLA alleles and 3,500 confirmations or sequence extensions), we are one of the major contributors to the IPD-IMGT/HLA Database, which currently lists 43,758 HLA alleles (release 3.63; January 2025). Through this work, we expand the global understanding of immunogenetic diversity and contribute to improved donor matching and better outcomes for patients requiring stem cell transplantation.
Explanation
Allele: Each gene is defined by its DNA sequence. Different versions of a gene's sequence are called alleles, and each allele carries specific variations in the genetic code. Eye color and flower color are popular examples of traits shaped by alleles.
HLA: HLA (human leukocyte antigen) refers to a group of highly variable genes on chromosome 6 that encode cell‑surface proteins essential for helping the immune system distinguish self from non‑self.
Locus: A locus (plural: loci) in genetics is the specific, fixed position of a gene or DNA sequence on a chromosome. At DKMS, the term locus is used to describe a genetic target.
Phasing: Phasing in sequencing is the process of determining which genetic variants occur together on the same parental chromosome. Long, continuous sequencing reads, as obtained by nanopore sequencing, enable us to phase the sequence of HLA alleles.
Figure: Number of identified (non-distinct) novel alleles at DKMS Life Science Lab per 100,000 genotyped samples in 2025. The genes MICA and MICB are not shown as they sum to over 600 novel alleles per 100,000 samples.