Usage

To use tcdo-pg-tools in a project:

import tcdo_pg_tools

See Commandline Interface section for commandline options.

Examples

Computing AA coverage across multi-enzyme digests

To compute AA seq coverage across different combinations of enzymes, use the coverage-calculator feature. For example:

tcdo_pg_tools coverage-calculator --fragpipe_dir /Volumes/kentsis/proteomics/fragpipe_results/APS010.1_A673_proteogenomics/spike_in --enzymes argc,aspn,gluc,in-house_chymotrypsin,lysc,lysn,proalanase,trypsin --output_tsv protein_coverage.tsv

The fragpipe output directory must contain subdirectories with the result of each enzyme, and these directories must be labeled starting with the enzyme name (e.g., trypsin_diaPASEF)

Merging multiple proteomegenerator fasta

To merge across multiple proteomegenerator fasta file by merging proteins with the same amino acid sequence, use the merge-fasta feature:

tcdo_pg_tools merge-fasta -i input.csv

The input.csv must have three columns: fasta, sample, condition. fasta is the path to the protein fasta file, sample is the sample name, and condition is the condition for a given sample (e.g., tumor, normal).

You can use the –upset flag to output an upset plot likeso:

tcdo_pg_tools merge-fasta -i input.csv --upset

The upset plot will be plot across the condition column.

Merging proteomegenerator results across multiple samples

To merge multiple proteomegenerator results, use merge-pg-results, which behaves in the same way as merge-fasta, except it filters on the protein.tsv output of philosopher to identify unique proteins. So you can run:

tcdo_pg_tools merge-pg-results -i input.csv --upset

Where here the input.csv must have four columns: fasta, protein_table, sample, condition. fasta is the path to the protein fasta file, protein_table is the protein.tsv file that is output by Philosopher (in the Fragpipe output directory), sample is the sample name, and condition is the condition for a given sample (e.g., tumor, normal).