HRD Prediction Workflow Using WES Data in Tumor-Only Mode

Can you describe the steps for HRD prediction using different software in tumor-only mode with WES data?


Certainly! Homologous Recombination Deficiency (HRD) prediction using Whole Exome Sequencing (WES) data in tumor-only mode involves several steps. Here’s a general workflow using different software tools:

1. Data Preparation

  • Sequencing Data: Obtain raw WES data (FASTQ files) from the tumor sample.
  • Quality Control: Use tools like FastQC to assess the quality of the raw sequencing data.
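
The quality-control step can be sketched as a small wrapper. This assumes FastQC is on the PATH and that paired files follow the _R1/_R2 naming used in the alignment command; the helper names mate_of and run_fastqc and the qc_reports directory are illustrative, not part of any tool.

```shell
# Derive the R2 file name from an R1 file name (assumes the _R1/_R2 convention).
mate_of() {
  printf '%s\n' "$1" | sed 's/_R1/_R2/'
}

# Run FastQC on both mates of a pair; reports go to the given directory.
run_fastqc() {
  r1="$1"
  outdir="${2:-qc_reports}"
  r2="$(mate_of "$r1")"
  mkdir -p "$outdir"
  fastqc --threads 2 --outdir "$outdir" "$r1" "$r2"
}

# Usage (not executed here): run_fastqc sample_R1.fastq.gz qc_reports
```

FastQC writes one HTML report per input file, so both mates can be reviewed before alignment.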

2. Read Alignment

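  • Align Reads: Use an aligner like BWA-MEM to map the reads to an indexed reference genome (e.g., GRCh38). Add a read group with -R, because GATK tools reject BAM files that lack one; the ID, SM, and LB values below are placeholders.
    bwa mem -t 8 -R '@RG\tID:sample\tSM:sample\tLB:sample\tPL:ILLUMINA' reference_genome.fa sample_R1.fastq.gz sample_R2.fastq.gz > aligned_reads.sam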

3. Post-Alignment Processing

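  • Convert SAM to BAM: Use SAMtools to sort the alignments by coordinate and write them as BAM in one step. samtools sort reads SAM directly and picks the output format from the .bam extension, so a separate samtools view call (and its obsolete -S flag) is not needed.
    samtools sort -@ 4 -o sorted_reads.bam aligned_reads.sam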
  • Mark Duplicates: Use Picard to mark duplicate reads.
    picard MarkDuplicates I=sorted_reads.bam O=dedup_reads.bam M=metrics.txt
  • Index BAM File: Index the BAM file using SAMtools.
    samtools index dedup_reads.bam
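
After duplicate marking, it is usually worth checking the duplication rate that Picard wrote to metrics.txt. The following is a minimal sketch, assuming the standard tab-separated metrics layout in which the table header follows the "## METRICS CLASS" line; the function name dup_rate is illustrative.

```shell
# Print the PERCENT_DUPLICATION value(s) from a Picard MarkDuplicates metrics file.
# The metrics table starts on the line after "## METRICS CLASS"; columns are tab-separated.
dup_rate() {
  awk -F '\t' '
    /^## METRICS CLASS/ { want_header = 1; next }
    want_header {
      # Header row: locate the column by name rather than by fixed position.
      for (i = 1; i <= NF; i++) if ($i == "PERCENT_DUPLICATION") col = i
      want_header = 0; in_table = 1; next
    }
    in_table && NF == 0 { in_table = 0 }   # a blank line ends the table
    in_table && col { print $col }
  ' "$1"
}

# Usage (not executed here): dup_rate metrics.txt
```

The value is a fraction between 0 and 1, printed once per library row in the metrics table.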

4. Variant Calling

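  • Call Variants: With no matched normal, use a somatic caller that supports tumor-only mode, such as GATK Mutect2. HaplotypeCaller is a germline caller and is not suited to subclonal or low-purity tumor variants. A germline resource and a panel of normals help flag germline variants and recurrent artifacts; the resource and interval file names below are placeholders.
    gatk Mutect2 -R reference_genome.fa -I dedup_reads.bam --germline-resource af-only-gnomad.vcf.gz --panel-of-normals pon.vcf.gz -L exome_targets.interval_list -O raw_variants.vcf

5. Variant Filtering

  • Filter Variants: Apply GATK FilterMutectCalls to the raw Mutect2 calls. Hard filters on QD, FS, and MQ are intended for HaplotypeCaller germline calls and do not apply here.
    gatk FilterMutectCalls -R reference_genome.fa -V raw_variants.vcf -O filtered_variants.vcf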
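
GATK filtering tools annotate the FILTER column rather than removing records, so downstream steps usually keep only the records marked PASS. A minimal sketch for an uncompressed VCF follows; the function name keep_pass is illustrative.

```shell
# Keep header lines and records whose FILTER column (7th field) is PASS.
keep_pass() {
  awk -F '\t' '/^#/ || $7 == "PASS"' "$1"
}

# Usage (not executed here): keep_pass filtered_variants.vcf > pass_variants.vcf
```

For compressed or indexed VCF files, bcftools view -f PASS does the same job.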

6. HRD Score Calculation

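  • HRD Tools: Use specialized tools for HRD prediction. Some popular tools include:
    • HRDetect: A machine learning-based classifier that combines substitution signatures, indel microhomology, rearrangement signatures, and an HRD index. It was developed on whole-genome data, so WES supplies only part of its input features.
    • scarHRD: An R package that calculates an HRD score as the sum of three genomic scar counts: large-scale state transitions (LST), loss of heterozygosity (LOH), and telomeric allelic imbalance (TAI). It works from allele-specific copy-number segments rather than from variant calls.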
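
To make the LOH component concrete, here is a simplified sketch that counts long LOH segments in an allele-specific segment table. It assumes a tab-separated file with a header row and the column order SampleID, Chromosome, Start_position, End_position, total_cn, A_cn, B_cn, ploidy (the layout described in the scarHRD README, taken here as an assumption). Unlike the full HRD-LOH definition, it neither merges adjacent segments nor excludes whole-chromosome LOH; the function name count_long_loh is illustrative.

```shell
# Count candidate HRD-LOH segments: minor allele lost (B_cn == 0), major allele
# present (A_cn > 0), and segment longer than a size limit (default 15 Mb).
# Columns: SampleID Chromosome Start_position End_position total_cn A_cn B_cn ploidy
count_long_loh() {
  awk -F '\t' -v limit="${2:-15000000}" '
    NR > 1 && $7 == 0 && $6 > 0 && ($4 - $3) > limit { n++ }
    END { print n + 0 }   # n + 0 prints 0 when nothing matched
  ' "$1"
}

# Usage (not executed here): count_long_loh sample_segments.txt
```

This is only a teaching aid for what the score measures; the real calculation should come from the HRD tool itself.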

Example with ScarHRD:

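  • Prepare Input: scarHRD takes allele-specific copy-number segments, not a VCF. Derive them from the BAM file with an allele-specific copy-number caller, then write a tab-separated segment table with the columns SampleID, Chromosome, Start_position, End_position, total_cn, A_cn, B_cn, and ploidy. Callers such as Sequenza expect a matched normal, so tumor-only data needs a caller that works against a pool of normals.
  • Run scarHRD: Use the scar_score function from the scarHRD R package to calculate the HRD score (sample_segments.txt is a placeholder file name).
    scar_score("sample_segments.txt", reference = "grch38", seqz = FALSE)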

7. Interpretation and Reporting

  • Interpret Results: Analyze the HRD scores and other relevant metrics to determine the HRD status of the tumor.
  • Generate Report: Compile the findings into a comprehensive report for clinical or research purposes.
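
The interpretation step can be sketched as a sum and a cutoff. The default cutoff of 42 is the value commonly cited for the three-component score on SNP-array and large-panel assays; it is an assumption here and has not been validated for WES-derived, tumor-only scores, so treat it as a parameter to calibrate. The function names hrd_sum and hrd_call are illustrative.

```shell
# Sum the three scar components (LOH, TAI, LST) into one HRD score.
hrd_sum() {
  echo $(( $1 + $2 + $3 ))
}

# Label a score against a cutoff (default 42, an assumption to recalibrate).
hrd_call() {
  score="$1"
  cutoff="${2:-42}"
  if [ "$score" -ge "$cutoff" ]; then
    echo "HRD-high"
  else
    echo "HRD-low"
  fi
}

# Usage: hrd_call "$(hrd_sum 15 12 20)"
```

Keeping the cutoff as an argument makes it easy to report results under more than one threshold.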

Additional Considerations

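  • Tumor Purity and Ploidy: Estimate tumor purity and ploidy before scoring, because both shift the observed allele-specific copy numbers and therefore the LOH, TAI, and LST counts. PureCN is designed for tumor-only data and uses a pool of normals, while FACETS is built around tumor-normal pairs.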
  • Annotation: Use tools like ANNOVAR or VEP to annotate the variants for better interpretation.
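
To illustrate why purity matters, the standard two-population mixture model gives the expected allele fraction of a somatic variant as p*m / (p*C + 2*(1 - p)), where p is purity, C is the tumor total copy number at the locus, and m is the number of mutated copies. A minimal sketch under that model; the function name expected_vaf is illustrative.

```shell
# Expected variant allele fraction of a somatic variant in a tumor/normal mixture.
# Arguments: purity (0-1), tumor total copy number, number of mutated copies.
expected_vaf() {
  awk -v p="$1" -v c="$2" -v m="$3" \
    'BEGIN { printf "%.3f\n", (p * m) / (p * c + 2 * (1 - p)) }'
}

# Usage: expected_vaf 0.5 2 1
```

At 50% purity, a heterozygous variant in a diploid region is expected at an allele fraction of 0.25 rather than 0.5, which is why copy-number and LOH calls need a purity estimate.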


This workflow outlines HRD prediction from WES data in tumor-only mode. The steps run in sequence, and each tool consumes the output of the previous one, so file names and reference builds must stay consistent throughout.