Selected Publications

Genetic variant effect prediction algorithms are used extensively in clinical genomics and research to determine the likely consequences of amino acid substitutions on protein function. We derive three independent, functionally determined human mutation datasets, UniFun, BRCA1-DMS and TP53-TA, and employ them, alongside previously described datasets, to assess the pre-eminent variant effect prediction tools. Apparent accuracies of variant effect prediction tools were influenced significantly by the benchmarking dataset. Benchmarking with the assay-determined datasets yielded considerably lower accuracy than observed for other, potentially more conflicted datasets.
Human Genomics

Constitutional biological processes involve the generation of DNA double-strand breaks (DSBs). We re-analyse public RAFT data to derive sites of DSBs at the single-nucleotide level across the built genome for human HEK293T cells. This refined mapping, combined with accessory ENCODE data tracks and ribosomal DNA-related sequence annotations, will likely be of value for the design of clinically relevant targeted assays such as those for cancer susceptibility, diagnosis, treatment-matching and prognostication.
Genomics Data

ROVER is a DNA variant caller that identifies genetic variants from PCR-targeted massively parallel sequencing (MPS) datasets generated by the Hi-Plex protocol. ROVER permits stringent filtering of sequencing chemistry-induced errors by requiring reported variants to appear in both reads of overlapping pairs above certain thresholds of occurrence. ROVER was developed in tandem with Hi-Plex and has been used successfully to screen for genetic mutations in the breast cancer predisposition gene PALB2. Its successor, UNDR ROVER, provides the same rapid and accurate genetic variant calling as its predecessor with greatly reduced computational costs.
BMC Bioinformatics
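
The read-pair agreement filter described in the ROVER abstract can be illustrated with a minimal sketch. This is not ROVER's actual code: the function name call_variants, the parameters min_count and min_fraction, and the representation of each read as a set of variant labels are all assumptions made for illustration. It counts a variant only when both reads of an overlapping pair report it, then keeps variants whose support passes an absolute and a proportional threshold.

```python
from collections import Counter


def call_variants(read_pairs, min_count=2, min_fraction=0.15):
    """Hypothetical sketch of a both-reads-must-agree variant filter.

    read_pairs: list of (read1_variants, read2_variants) for one amplicon,
    where each element is a set of hashable variant labels.
    min_count and min_fraction are illustrative thresholds, not ROVER defaults.
    """
    total_pairs = len(read_pairs)
    if total_pairs == 0:
        return {}

    support = Counter()
    for read1_variants, read2_variants in read_pairs:
        # A variant is counted only if it appears in BOTH reads of the pair,
        # which discards errors confined to a single read.
        for variant in set(read1_variants) & set(read2_variants):
            support[variant] += 1

    # Keep variants supported by enough pairs, in absolute and relative terms.
    return {
        variant: count
        for variant, count in support.items()
        if count >= min_count and count / total_pairs >= min_fraction
    }
```

In this sketch a variant seen in only one read of a pair contributes nothing, so a sequencing error would have to occur identically in both reads, and in enough pairs, to be reported.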

Recent Posts

Projects

predictein

Prediction of protein structure for non-model organisms

Pan Prostate Sample Analysis

Analysis of a large number of WGS samples using a common pipeline.

Teaching

Contact