Exporting Crosslink-Spectrum-Matches to ProXL
from pyXLMS import __version__
print(f"Installed pyXLMS version: {__version__}")

Installed pyXLMS version: 1.7.0

from pyXLMS import parser
from pyXLMS import exporter

All exporting functionality is available via the exporter submodule. We also import the parser submodule to read crosslink-spectrum-matches.
parser_result = parser.read(
    "../../data/ms_annika/XLpeplib_Beveridge_QEx-HFX_DSS_R1_CSMs.txt",
    engine="MS Annika",
    crosslinker="DSS",
)
csms = parser_result["crosslink-spectrum-matches"]

Reading MS Annika CSMs...: 100%|██████████| 826/826 [00:00<00:00, 10466.88it/s]

We read crosslink-spectrum-matches (CSMs) from a single MS Annika .txt file using the generic parser. For easier access, we also assign our crosslink-spectrum-matches to the variable csms.
from pyXLMS.transform import targets_only
csms = targets_only(csms)

Crosslink-spectrum-matches should be filtered so that they contain only target-target matches. The ProXL exporter does not export decoy proteins because they are not available for most crosslink search engines, so decoy matches should be filtered out before exporting. You may also want to apply additional filter steps such as validation; that choice is up to you.
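To make the filtering step concrete, here is a minimal sketch of a target-target filter written against plain dictionaries rather than pyXLMS itself. The keys alpha_decoy and beta_decoy, the toy data, and the helper name keep_target_target are assumptions made for illustration only; for real data, targets_only() as shown above is the function to use.

```python
from typing import Any, Dict, List


def keep_target_target(csms: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only matches where neither peptide is flagged as a decoy.

    Assumption: each match carries boolean "alpha_decoy" and
    "beta_decoy" entries. Matches with missing decoy information
    are dropped by this sketch.
    """
    return [
        csm
        for csm in csms
        if csm.get("alpha_decoy") is False and csm.get("beta_decoy") is False
    ]


# Toy data, not real search results
toy_csms = [
    {"scan_nr": 1, "alpha_decoy": False, "beta_decoy": False},  # target-target
    {"scan_nr": 2, "alpha_decoy": True, "beta_decoy": False},   # decoy-target
    {"scan_nr": 3, "alpha_decoy": True, "beta_decoy": True},    # decoy-decoy
]

filtered = keep_target_target(toy_csms)
print([csm["scan_nr"] for csm in filtered])
```

Only the first toy match survives, because it is the only one where both peptides are targets.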
_xml = exporter.to_proxl(
    csms,
    fasta_filename="../../data/_fasta/Cas9_plus10.fasta",
    search_engine="MS Annika",
    search_engine_version="3.0.1",
    score="higher_better",
    crosslinker="DSS",
    filename="CSMs_exported_to_ProXL.xml",
    schema_validation="online",
)

Successfully created ProXL XML and validated it against online XML schema!

The function exporter.to_proxl() exports a list of crosslink-spectrum-matches to ProXL format for downstream analysis. ProXL is accessible at yeastrc.org/proxl_public. The to_proxl() function requires several input parameters:
- csms : list of dict of str, any
  - A list of crosslink-spectrum-matches.
- fasta_filename : str
  - The name/path of the fasta file for reading protein sequences.
- search_engine : str
  - Name of the used crosslink search engine.
- search_engine_version : str
  - Version identifier of the used crosslink search engine.
- score : str, one of "higher_better" or "lower_better"
  - Whether a higher or a lower CSM score is considered better.
- crosslinker : str
  - Name of the used cross-linking reagent, for example “DSSO”. If this is not a common crosslinking reagent, the parameter crosslinker_mass also needs to be specified!
The export additionally requires that charge and score fields are set for all crosslink-spectrum-matches. You can read more about the to_proxl() function and all its parameters here: docs.
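The requirement that charge and score are set can be checked before exporting. The sketch below works on plain dictionaries and assumes the fields are stored under the keys "charge" and "score", with None marking a missing value; the helper name find_incomplete_csms and the toy data are hypothetical.

```python
from typing import Any, Dict, List


def find_incomplete_csms(csms: List[Dict[str, Any]]) -> List[int]:
    """Return the indices of matches whose charge or score is missing (None)."""
    return [
        index
        for index, csm in enumerate(csms)
        if csm.get("charge") is None or csm.get("score") is None
    ]


# Toy data, not real search results
toy_csms = [
    {"charge": 3, "score": 142.7},
    {"charge": None, "score": 88.1},  # charge missing
    {"charge": 2, "score": None},     # score missing
]

missing = find_incomplete_csms(toy_csms)
if missing:
    print(f"{len(missing)} CSM(s) lack charge or score: indices {missing}")
```

Comparing against None rather than testing truthiness keeps a legitimate score of 0.0 from being reported as missing.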
Specifying filename=None will only return the XML as a string and not write it to disk!
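When the XML is returned as a string, writing it out is left to the caller. As a rough sketch, the string could be checked for well-formedness with Python's standard library before saving it. The placeholder string below stands in for the exporter's return value, and its element names are made up, not taken from the ProXL schema. This check only confirms that the text parses as XML; it does not replace schema validation.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Placeholder standing in for the string returned by the exporter;
# the element names are invented and are NOT the ProXL schema.
xml_text = "<example><item id='1'/></example>"

# Write into a temporary directory so the sketch leaves no files behind
# in the working directory.
out_path = Path(tempfile.mkdtemp()) / "export_copy.xml"
if is_well_formed(xml_text):
    out_path.write_text(xml_text, encoding="utf-8")
```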