Frequently Asked Questions

Here are some common questions and answers about pyXLMS that might help you.

What is pyXLMS?

pyXLMS is a python package and web application for mass spectrometry-based protein-protein crosslinking data. The main focus of pyXLMS is reading results from crosslink search engines and providing a unified python interface and data structures to interact with them programmatically. In addition, pyXLMS aims to provide writers/exporters for the most commonly used crosslinking down-stream analysis tools to bridge the gap between search engine results and biologically relevant insights. We recommend reading the introduction in the pyXLMS user guide for more details.

What is pyXLMS not?

pyXLMS is not a down-stream analysis tool in itself. While it does provide basic preprocessing functions (e.g. filtering, grouping, annotation) and quality control (e.g. score distributions, validation), it is not aimed at directly generating new biological insights. pyXLMS should be seen as an intermediate layer between crosslink search engines and down-stream analysis tools, connecting the two and offering extra filtering and quality control in between. We also recommend reading through the limitations!

Why was pyXLMS created?

Development of pyXLMS started in 2022 after I (M.J.B.) realized that crosslink analysis was limited by which crosslink down-stream analysis tool supported which crosslink search engine as input. At that time I was the main developer of MS Annika and recognized that I would need to provide exporters for all of the down-stream analysis tools that did not support MS Annika as input. However, this problem is not unique to MS Annika and applies to most other crosslink search engines as well. Transforming the results from crosslink search engines to the required input format of the various down-stream analysis tools is cumbersome and prone to errors. Output and input formats are also often not sufficiently documented and need clarification from the developers. While mzIdentML - the standard format proposed by HUPO PSI - already supported crosslink results, adoption was slow, and even today many crosslink search engines and down-stream analysis tools do not support mzIdentML output or input. It was clear to me that a solution was needed that supported mzIdentML and all other formats as input and that could output the different formats required by the different down-stream analysis tools. Ultimately, pyXLMS has become that, at least to a large degree. Of course it is impossible to support all formats, but we have already incorporated a lot of the commonly used search engines and down-stream analysis tools. On a personal note, pyXLMS is what I wish existed 5 years ago - it would have made my PhD a lot easier! 😉

Is pyXLMS an alternative to mzIdentML?

No! We still recommend using mzIdentML if possible! The aim of pyXLMS is simply to extend support for cases where mzIdentML is either not available, not supported, or not preferred. We also want to give researchers the choice to work with whatever files they want!

Which crosslink aggregation levels does pyXLMS support?

pyXLMS supports crosslink-spectrum-matches and crosslinks. Crosslinks can either be peptide pairs or residue pairs, depending on the grouping.
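
As a rough illustration of how the two groupings differ, the sketch below groups a few made-up crosslink-spectrum-matches once by peptide pair and once by residue pair. It is plain python with hypothetical field names and invented data, not the data structures or the aggregation code that pyXLMS itself uses.

```python
from collections import defaultdict

# Made-up crosslink-spectrum-matches (CSMs); all field names are hypothetical.
# pos_* is the crosslink position in the peptide, res_* the position in the protein.
csms = [
    {"pep_a": "KPEPTIDE", "pos_a": 1, "prot_a": "P1", "res_a": 10,
     "pep_b": "ELVISKR", "pos_b": 6, "prot_b": "P2", "res_b": 55},
    {"pep_a": "KPEPTIDE", "pos_a": 1, "prot_a": "P1", "res_a": 10,
     "pep_b": "ELVISKR", "pos_b": 6, "prot_b": "P2", "res_b": 55},
    # Missed cleavage: a different peptide that covers the same protein residue.
    {"pep_a": "MKPEPTIDE", "pos_a": 2, "prot_a": "P1", "res_a": 10,
     "pep_b": "ELVISKR", "pos_b": 6, "prot_b": "P2", "res_b": 55},
]


def group(items, key):
    """Collect items into lists that share the same grouping key."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


def peptide_pair_key(csm):
    # Same peptide sequences and same crosslink positions within the peptides.
    return (csm["pep_a"], csm["pos_a"], csm["pep_b"], csm["pos_b"])


def residue_pair_key(csm):
    # Same proteins and same crosslinked residue positions within the proteins.
    return (csm["prot_a"], csm["res_a"], csm["prot_b"], csm["res_b"])


peptide_pairs = group(csms, peptide_pair_key)
residue_pairs = group(csms, residue_pair_key)
print(len(peptide_pairs), len(residue_pairs))
```

In this toy data set the two peptides that differ only by a missed cleavage form separate peptide pairs but collapse into a single residue pair, because they link the same protein residues.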

Does pyXLMS support protein-protein interactions?

No, only indirectly! Currently none of the supported down-stream analysis tools require protein-protein interactions as input, therefore we decided to not include them in pyXLMS. However, some of the supported down-stream analysis tools output protein-protein interactions using the inputs generated by pyXLMS.

Will pyXLMS support protein-protein interactions in the future?

Maybe. If we see use cases where protein-protein interactions are explicitly required we might add them. However, in any case we do not aim to implement protein-protein interaction aggregation or validation, which are very sophisticated topics (but we are happy to take in outside contributions).

Does pyXLMS support linear peptides, monolinks, and looplinks?

No! The focus of pyXLMS is peptide-peptide crosslinks. Please refer to other tools for linear/monolinked/looplinked peptides, e.g. psm_utils.

Does the pyXLMS web application support all of the features of the python package?

No! Some features are exclusively available in the python package. Features that are not supported in the web app are listed in the user guide under Documentation ➡️ Web Application ➡️ Feature Support.

The export options do not list the down-stream analysis tool that I want to use but the documentation says it is supported, why?

The export options shown in the web application depend on what you uploaded. For example, if you uploaded crosslink-spectrum-matches, the web application will only show you export options that support crosslink-spectrum-matches as input. Export options that are not listed require a different aggregation level (e.g. crosslinks). You can either upload a file containing crosslinks or you can directly aggregate your crosslink-spectrum-matches to crosslinks in the Load Data tab.

Do I have to be an expert in python programming in order to use the pyXLMS package in python?

No! Usage of the pyXLMS python package is straightforward and the user guide has examples for almost all functions in the package!

What are the requirements for running pyXLMS?

pyXLMS requires at least python version 3.7 but we recommend a python version that is still in the active support cycle.
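
If you are unsure which interpreter you are running, a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper name `python_is_supported` is made up for this example and is not part of pyXLMS.

```python
import sys

# Minimum version as stated in the answer above.
MIN_VERSION = (3, 7)


def python_is_supported(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum."""
    return tuple(version_info[:2]) >= minimum


print(sys.version.split()[0], "supported:", python_is_supported())
```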

Where can I find pinned package versions to exactly reproduce the results from the manuscript?

You can find pinned package versions in this repository in the uv.lock file. We recommend using uv for project and dependency management. We also provide a template repository for projects using pyXLMS.

How long will it take to process my data with pyXLMS?

It is hard to give an accurate estimate without knowing specific details. Most functionality in pyXLMS will complete instantly or within a few seconds. Reading large files (e.g. > 100 000 crosslink-spectrum-matches or crosslinks) may take a bit longer. Re-annotation of positions and validation may also take a few minutes depending on the size of the data and the FASTA file. Almost all pyXLMS functions show a progress bar to track the current status. However, this is only available in the python package and not in the web application. We therefore recommend using the python package for large analyses.

I have data from a crosslink search engine that is not yet supported in pyXLMS – what can I do?

If you are comfortable with programming in python, you can simply write your own parser using the functions pyXLMS.data.create_csm() and pyXLMS.data.create_crosslink(). If you are not familiar with python, please contact us (ideally with some example data) and we will do our best to write a new parser for the requested crosslink search engine!
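
The reading half of such a parser could look roughly like the sketch below. It assumes a made-up CSV export of a hypothetical search engine, so the column names and the dictionary keys are illustrative only. It deliberately stops short of calling pyXLMS: in a real parser each converted row would be passed to pyXLMS.data.create_csm() or pyXLMS.data.create_crosslink(), whose exact arguments should be taken from the API documentation.

```python
import csv
import io

# Made-up export of a hypothetical search engine; column names are assumptions.
EXAMPLE = """peptide_a,xl_pos_a,peptide_b,xl_pos_b,scan,score
KPEPTIDE,1,ELVISKR,6,1001,87.5
MKPEPTIDE,2,ELVISKR,6,1002,63.0
"""


def parse_rows(handle):
    """Read the rows and convert each field to the type a parser would need."""
    records = []
    for row in csv.DictReader(handle):
        records.append(
            {
                "peptide_a": row["peptide_a"],
                "xl_position_a": int(row["xl_pos_a"]),
                "peptide_b": row["peptide_b"],
                "xl_position_b": int(row["xl_pos_b"]),
                "scan_nr": int(row["scan"]),
                "score": float(row["score"]),
            }
        )
    return records


records = parse_rows(io.StringIO(EXAMPLE))
# A real parser would hand each record to pyXLMS.data.create_csm() here
# instead of keeping plain dictionaries.
for record in records:
    print(record)
```

Converting strings to integers and floats at read time keeps type errors close to the input file, which makes malformed rows easier to trace.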

I want to use the pyXLMS python package – where should I start?

Please start by reading the user guide under Documentation ➡️ Working with pyXLMS.

Can I build a new tool based on pyXLMS, or another python package that does similar things?

Yes, absolutely! pyXLMS is licensed under the permissive MIT license and we encourage building upon pyXLMS!

How do I contribute to pyXLMS?

We are happy if you want to contribute! Please read further here.

Something went wrong – where can I get help?

Please open an issue here or contact us directly!

Do you wear wigs?

Not while implementing pyXLMS!