Predicting HRD Scores Using scarHRD with CNVkit in Tumor-Only Mode

Answer:

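Using scarHRD to predict HRD (Homologous Recombination Deficiency) scores with CNVkit in tumor-only mode involves several steps. The key point is that scarHRD needs allele-specific copy numbers, so CNVkit's segments have to be combined with SNP allele frequencies before scoring. Here’s a detailed guide: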

Prerequisites:

Install CNVkit: Ensure you have CNVkit installed. You can install it using pip:
```
pip install cnvkit
```

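Install scarHRD: scarHRD is an R package distributed on GitHub (sztup/scarHRD) rather than on Bioconductor or CRAN, so you need R and the devtools package. If the installation stops on a missing dependency (for example sequenza), install that package first and retry. You can install scarHRD using:
```
if (!requireNamespace("devtools", quietly = TRUE))
    install.packages("devtools")
devtools::install_github("sztup/scarHRD")
```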

Steps:

1. Prepare Your Data

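Tumor BAM file: You need a coordinate-sorted BAM file of your tumor sample.
Reference Genome: Ensure you have the reference genome FASTA (e.g., hg19 or hg38) that your BAM file was aligned to.
Target regions: For hybrid-capture or exome data, you need a BED file of the captured (bait) regions.
SNP calls: You need a VCF of variants called in the tumor sample. CNVkit uses the heterozygous SNPs in it to estimate B-allele frequencies, which are required for allele-specific copy numbers.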
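
If you are unsure which build a BAM file was aligned to, the chromosome 1 length in its header is a quick check. The following is a minimal sketch, not part of CNVkit or scarHRD: guess_build is a hypothetical helper name, and it only distinguishes the two standard human chromosome 1 lengths (249250621 for GRCh37/hg19, 248956422 for GRCh38/hg38).

```shell
#!/usr/bin/env bash
# Sketch: guess the genome build from the chromosome 1 length in a SAM/BAM header.
# Reads header text on stdin, e.g. from: samtools view -H tumor.bam
guess_build() {
    awk -F'\t' '
        $1 == "@SQ" {
            name = ""; len = 0
            # Pull the sequence name (SN) and length (LN) tags from the @SQ line.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SN:/) name = substr($i, 4)
                if ($i ~ /^LN:/) len = substr($i, 4) + 0
            }
            # Chromosome 1 may be named with or without the "chr" prefix.
            if (name == "chr1" || name == "1") {
                if (len == 249250621) build = "GRCh37/hg19"
                else if (len == 248956422) build = "GRCh38/hg38"
            }
        }
        END { print (build != "" ? build : "unknown") }
    '
}
```

Typical usage would be to pipe the header in: samtools view -H tumor.bam | guess_build. An answer of "unknown" means chromosome 1 was absent or had a different length, in which case check the @SQ lines by hand.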

2. Run CNVkit in Tumor-Only Mode

CNVkit can be run in tumor-only mode to generate copy number profiles.

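Create a reference: Since you are running in tumor-only mode, you can create a "flat" reference, which assumes equal coverage in every bin. The reference command still needs the genome FASTA and the target and antitarget BED files (produced from your bait regions with cnvkit.py target and cnvkit.py antitarget). The file names below are placeholders:
```
cnvkit.py reference -o flat_reference.cnn -f reference.fa -t targets.bed -a antitargets.bed
```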
Run CNVkit: Use the flat reference to process your tumor BAM file:
```
cnvkit.py batch tumor.bam -r flat_reference.cnn -d cnvkit_output/
```
This writes several output files to the cnvkit_output/ directory, including tumor.cnr (bin-level log2 ratios) and tumor.cns (segmented log2 ratios).

3. Convert CNVkit Output for scarHRD

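scarHRD does not read log2-ratio segments such as a SEG file. With seqz = FALSE it expects a tab-delimited table of allele-specific integer copy numbers with the columns SampleID, Chromosome, Start_position, End_position, total_cn, A_cn, B_cn, and ploidy.

Call allele-specific copy numbers: Run cnvkit.py call on the .cns file with the tumor VCF and an estimate of tumor purity. When a VCF is given, the output contains baf, cn1, and cn2 columns next to the total copy number (cn):
```
cnvkit.py call cnvkit_output/tumor.cns -v tumor.vcf -m clonal --purity 0.65 -o tumor.call.cns
```
The value 0.65 is only a placeholder; use the purity estimated for your sample. Then reshape tumor.call.cns into the eight-column table described above, mapping cn to total_cn and cn1 and cn2 to A_cn and B_cn.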
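
scarHRD needs an eight-column, allele-specific table rather than CNVkit's native columns. The reshaping can be sketched with awk, starting from a .call.cns file produced by cnvkit.py call with a VCF. This is a sketch under stated assumptions, not a tool from either package: cns_to_scarhrd is a hypothetical helper, mapping cn1 to A_cn and cn2 to B_cn is an assumption, segments with an empty baf value are skipped, and ploidy is approximated as the length-weighted mean total copy number.

```shell
#!/usr/bin/env bash
# Sketch: reshape a CNVkit .call.cns file into the table scarHRD reads with seqz = FALSE.
# Assumptions: the file has baf, cn, cn1 and cn2 columns; cn1 -> A_cn and cn2 -> B_cn.
cns_to_scarhrd() {
    local cns="$1" sample="$2"
    # The file is read twice: pass 1 computes ploidy, pass 2 writes the rows.
    awk -F'\t' -v OFS='\t' -v sample="$sample" '
        FNR == 1 {
            # Map header names to column indexes and check the required ones exist.
            for (i = 1; i <= NF; i++) col[$i] = i
            n = split("chromosome start end baf cn cn1 cn2", need, " ")
            for (i = 1; i <= n; i++)
                if (!(need[i] in col)) {
                    print "missing column: " need[i] > "/dev/stderr"
                    exit 1
                }
            if (NR != FNR) {
                # Start of pass 2: ploidy is the length-weighted mean total copy number.
                ploidy = (total > 0) ? sprintf("%.2f", wsum / total) : "NA"
                print "SampleID", "Chromosome", "Start_position", "End_position", "total_cn", "A_cn", "B_cn", "ploidy"
            }
            next
        }
        # Skip segments with no SNP support (empty baf); their allele split is not informative.
        $col["baf"] == "" { next }
        NR == FNR {
            len = $col["end"] - $col["start"]
            wsum += len * $col["cn"]
            total += len
            next
        }
        {
            print sample, $col["chromosome"], $col["start"], $col["end"], int($col["cn"]), int($col["cn1"]), int($col["cn2"]), ploidy
        }
    ' "$cns" "$cns"
}
```

For example, cns_to_scarhrd tumor.call.cns tumor > tumor_scarhrd.txt writes the table that is then passed to scar_score. If you have a ploidy estimate from another tool, it is probably better to substitute it for the approximation used here.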

4. Run scarHRD

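Now you can pass the allele-specific segment table (saved here as tumor_scarhrd.txt, with the columns SampleID, Chromosome, Start_position, End_position, total_cn, A_cn, B_cn, and ploidy) to scarHRD in R.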

Load scarHRD in R:
```
library(scarHRD)
```
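Run scarHRD: Use the scar_score function to calculate the HRD score. Set seqz = FALSE because the input is a segment table of allele-specific copy numbers, not a Sequenza .seqz file:
```
scarHRD::scar_score("tumor_scarhrd.txt", reference = "grch37", seqz = FALSE)
```
scarHRD names the genome builds "grch37" (hg19) and "grch38" (hg38). Replace "grch37" with "grch38" if you are using the hg38 reference genome.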

Example R Script:

Here’s a complete example R script to run scarHRD:

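```
# Load the scarHRD library
library(scarHRD)

# Path to the allele-specific segment table (SampleID, Chromosome,
# Start_position, End_position, total_cn, A_cn, B_cn, ploidy)
seg_file <- "path/to/tumor_scarhrd.txt"

# Genome build as scarHRD names it: "grch37" (hg19) or "grch38" (hg38)
reference_genome <- "grch37"

# Calculate the HRD scores; seqz = FALSE because the input is not a .seqz file
hrd_results <- scarHRD::scar_score(seg_file, reference = reference_genome, seqz = FALSE)

# Print the results (HRD-LOH, telomeric AI, LST, and their sum)
print(hrd_results)
```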

Notes:

Ensure that the paths to your files are correct.
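The reference genome specified in scarHRD should match the one used in your BAM file alignment (grch37 for hg19, grch38 for hg38).
scarHRD reports three genomic scar counts and their sum (HRD-sum): HRD-LOH (loss-of-heterozygosity regions longer than 15 Mb that do not span a whole chromosome), TAI (allelic imbalances that extend to a telomere), and LST (large-scale state transitions, i.e. breakpoints between adjacent regions of at least 10 Mb).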

By following these steps, you should be able to predict HRD scores using scarHRD with CNVkit in tumor-only mode.