PeakPicking Vendor files & empty scan #635

Closed
IvanSilbern opened this issue Sep 9, 2019 · 7 comments

Comments

@IvanSilbern

Dear Developers,

Thank you very much for your amazing work. I have some problems running msconvert from the Windows command line; I hope you can help me and point out what I am doing wrong.

I have a Thermo .raw file acquired on a Thermo Fusion Lumos instrument. It contains MS2 and SPS-MS3 spectra in profile mode, and the goal is to convert it to the .mgf file format.

I am using Windows 10 command line and ProteoWizard release: 3.0.19248 (37b2e98)
Build date: Sep 5 2019 21:34:03

  1. running
msconvert test_file.raw --mgf --filter "peakPicking vendor" --filter "titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState>"

gives an error message while writing the output file:

[SpectrumWorkerThreads::work] error in thread: [SpectrumList_Thermo::spectrum()] Error
retrieving spectrum "controllerType=0 controllerNumber=1 scan=29316"
[RawFileThreadImpl::getMassList()] failed to centroid scan

The output file is created, but it only contains scans up to 29315; all later scans are missing. The command line then freezes and does not return the prompt.
However, the first scan reported looks normal:

BEGIN IONS
TITLE=test_file.628.628.2
RTINSECONDS=2960.344929
PEPMASS=1220.495361328125 102622.8203125
CHARGE=2+
145.0496674 3544.6081542969
147.7623596 1837.2149658203
156.7328644 1640.818359375
156.8530579 1448.8192138672
156.9769287 1674.1666259766
158.9644928 3548.841796875
163.0605774 10220.19140625
166.9441833 2255.6904296875
[not showing all ions]
1220.99585 34895.9140625
1221.495728 28782.04296875
1221.975342 8339.6279296875
1305.610474 2594.3481445313
1629.73645 3305.6770019531
1630.727905 3509.3596191406
1953.827148 2065.6315917969
END IONS

Scan 29316 looks empty when I check it in the Thermo Xcalibur browser. I am not sure why it was created and not filtered out.
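
To confirm where a truncated output stops, the scan numbers can be read back from the TITLE lines. The following is a minimal Python sketch, assuming titles follow the `<RunId>.<ScanNumber>.<ScanNumber>.<ChargeState>` pattern set by the titleMaker filter above; the function name `scans_in_mgf` is hypothetical and not part of ProteoWizard.

```python
import re

# Assumption: titles look like RunId.Scan.Scan.Charge, as produced by the
# titleMaker filter used in this thread. Text after the first space
# (for example File:"..." or NativeID:"...") is ignored.
TITLE_RE = re.compile(
    r"^TITLE=(?P<run>.+?)\.(?P<scan>\d+)\.(?P=scan)\.(?P<charge>\d+)(?:\s|$)"
)


def scans_in_mgf(lines):
    """Return the scan numbers found in TITLE= lines, in file order."""
    scans = []
    for line in lines:
        match = TITLE_RE.match(line.strip())
        if match:
            scans.append(int(match.group("scan")))
    return scans


if __name__ == "__main__":
    sample = [
        "BEGIN IONS",
        "TITLE=test_file.628.628.2",
        "END IONS",
        "BEGIN IONS",
        'TITLE=test_file.29315.29315.2 File:"test_file.raw"',
        "END IONS",
    ]
    found = scans_in_mgf(sample)
    print("last scan written:", max(found))
```

For a real file, passing an open file handle (for example `open("test_file.mgf")`) instead of the sample list should work the same way, since the function only iterates over lines.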

  2. Using msconvert GUI
    [screenshot: 2019-09-09 12_54_07-MSConvertGUI (64-bit)]

it runs through without an issue, the output file is created, but the result is rather strange:

BEGIN IONS
test_file.628.628.2 File:"test_file.raw", NativeID:"controllerType=0 controllerNumber=1 scan=628"
RTINSECONDS=2960.344929
PEPMASS=1220.495361328125 377777.30212399998
CHARGE=2+
130.6846985 0.0
130.6863028 0.0
130.6879071 0.0
145.0370789 0.0
145.0389546 0.0
145.0408303 0.0
145.0502095 3478.3859863281
145.0595887 0.0
145.0614649 0.0
145.0633411 0.0
147.7511843 0.0
147.7531129 0.0
147.7550416 0.0
147.7627565 1817.9001464844
147.770469 0.0
147.7723979 0.0
[...]
158.9494898 0.0
158.9516417 0.0
158.9537938 0.0
158.9645544 3546.9838867188
158.9757711 0.0
158.9779236 0.0
1.968864375e-312 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
[...]
0 0.0
0 0.0
0 0.0
0 0.0
END IONS

First, there are flanking zero-intensity ions around each centroided peak. Second, the ions above ~160 m/z are missing or reported as 0.
The troublesome scan 29316 and all scans after it are reported:

TITLE=test_file.29316.29316.2 File:"test_file.raw", NativeID:"controllerType=0 controllerNumber=1 scan=29316"
RTINSECONDS=13276.56953
PEPMASS=1878.165405273438
CHARGE=2+
130.6847357 0.0
130.68634 0.0
130.6879443 0.0
9.911105182e-307 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
0 0.0
END IONS
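
The two symptoms shown above (zero-intensity points, and m/z values at or near 0) can be counted per spectrum to see how widespread they are in a converted file. This is a rough Python sketch, not an msconvert feature; the function name `suspicious_peaks` and the `mz_floor` threshold are hypothetical choices made for illustration.

```python
def suspicious_peaks(lines, mz_floor=1.0):
    """Count, per spectrum, peaks with zero intensity or an m/z below mz_floor.

    mz_floor is an illustrative threshold, not an msconvert setting.
    Returns a list of (title, n_peaks, n_zero_intensity, n_low_mz) tuples.
    """
    report = []
    title, counts = None, None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            title, counts = "", [0, 0, 0]
        elif line == "END IONS":
            if counts is not None:
                report.append((title, counts[0], counts[1], counts[2]))
            title, counts = None, None
        elif counts is not None:
            if line.startswith("TITLE="):
                title = line[len("TITLE="):]
                continue
            parts = line.split()
            if len(parts) < 2:
                continue  # header lines such as CHARGE=2+
            try:
                mz, intensity = float(parts[0]), float(parts[1])
            except ValueError:
                continue  # header lines such as PEPMASS=... with two fields
            counts[0] += 1
            if intensity == 0.0:
                counts[1] += 1
            if mz < mz_floor:
                counts[2] += 1
    return report


if __name__ == "__main__":
    sample = [
        "BEGIN IONS",
        "TITLE=test_file.29316.29316.2",
        "RTINSECONDS=13276.56953",
        "PEPMASS=1878.165405273438",
        "CHARGE=2+",
        "130.6847357 0.0",
        "130.68634 0.0",
        "9.911105182e-307 0.0",
        "0 0.0",
        "END IONS",
    ]
    for row in suspicious_peaks(sample):
        print(row)
```

A spectrum in which every peak has zero intensity, or in which many m/z values fall below the floor, matches the pattern seen for scan 29316 here and is worth inspecting in the vendor software.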

  3. running
    msconvert test_file.raw --mgf --filter "peakPicking cwt" --filter "titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState>"

creates an output very similar to the one from the GUI, but with fewer zeros reported, so the output is much smaller.

  4. adding zeroSamples in either the command line or the GUI results in the following error:
    --filter "zeroSamples removeExtra"

[SpectrumList_ZeroSamplesFilter] Error filtering intensity data: managed and native storage have different sizes

So far I feel lost: I cannot explain the different output I get from the GUI and the command line, or how to proceed. Am I missing something when running msconvert from the command line? I would appreciate any help!

Thank you very much in advance,
Ivan

@chambm
Member

chambm commented Sep 9, 2019

If the scan looks blank in QualBrowser it's a safe bet the file is corrupt. Once that scan is accessed things go wonky. Try running a convert job with an index filter that includes everything EXCEPT that scan. There very well may be other corrupt scans too so you may need to exclude a range of scans. For example if you have 500 scans and scan 124 is corrupt, use filters:
--filter "index 0-122,124-"

edit: fixed off by one error

@IvanSilbern
Author

Thanks for your reply!
However, I cannot use the --filter "index 0-29315, 29317-" option with "vendor" peakPicking: if I filter out the scan before --filter "peakPicking vendor", msconvert complains that another filter comes before vendor peakPicking, applies cwt instead, and produces the strange mgf seen in cases [2] and [3].
If I filter out the corrupt scan after --filter "peakPicking vendor", I get the same error as in [1].

It is not that this scan and the following scans are very important, but it would be nice if msconvert could skip scans that generate an error during conversion and continue with other scans (if it is manageable).

@IvanSilbern
Author

I've found out that if I add --filter "defaultArrayLength 1-" after --filter "peakPicking vendor", msconvert skips the corrupt scan and exits normally. So, this works now:
msconvert test_file.raw --mgf --filter "peakPicking vendor" --filter "defaultArrayLength 1-" --filter "titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState>"

@chambm
Member

chambm commented Sep 10, 2019

If the vendor API mishandles the corrupt scan and messes up the process memory, everything that happens afterward is what's called "undefined behavior". It may look like everything is fine, but things may silently be corrupted. It's better to do the index filter as I said. It doesn't need to be the first filter. It can come after peakPicking.

@IvanSilbern
Author

If I put the index filter after --filter "peakPicking vendor" like:
msconvert test_file.raw --mgf --filter "peakPicking vendor" --filter "index 1-29315, 29317-" --filter "titleMaker <RunId>.<ScanNumber>.<ScanNumber>.<ChargeState>"

msconvert creates the same error as in [1]

[SpectrumWorkerThreads::work] error in thread: [SpectrumList_Thermo::spectrum()] Error
retrieving spectrum "controllerType=0 controllerNumber=1 scan=29316"
[RawFileThreadImpl::getMassList()] failed to centroid scan

However, I am satisfied with the solution I've posted before. Thanks!

@chambm
Member

chambm commented Sep 10, 2019

The indices are 0 based and scan numbers are 1 based, so index 29315 is scan 29316. :) Sorry I didn't make that clear before (and the ranges are inclusive on both ends). Doh. I see I got it wrong in my reply! I was even trying to pay attention to that too...
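
Since the off-by-one between scan numbers and indices is easy to get wrong, the filter string can be generated rather than typed. Below is a small Python sketch, assuming scan numbers start at 1 and are contiguous so that index = scan number - 1; the function name `index_filter_excluding` is hypothetical and not part of ProteoWizard.

```python
def index_filter_excluding(bad_scans):
    """Build an msconvert "index" filter argument that skips the given scans.

    Assumption: scan numbers start at 1 and are contiguous, so the 0-based
    index of a spectrum is its scan number minus 1. Ranges are inclusive
    on both ends, and the final range is left open ("N-").
    """
    bad_indices = sorted({scan - 1 for scan in bad_scans})
    if not bad_indices:
        raise ValueError("at least one scan number is required")
    if bad_indices[0] < 0:
        raise ValueError("scan numbers must be >= 1")

    ranges = []
    start = 0  # first index of the current run of good spectra
    for idx in bad_indices:
        if idx > start:
            ranges.append(f"{start}-{idx - 1}")
        start = idx + 1
    ranges.append(f"{start}-")
    return "index " + ",".join(ranges)


if __name__ == "__main__":
    # Scan 124 is index 123, so indices 0-122 and 124 onward are kept.
    print(index_filter_excluding([124]))
    # The corrupt scan from this issue.
    print(index_filter_excluding([29316]))
```

The returned string is meant to be passed as the value of a --filter option, placed after --filter "peakPicking vendor" as described above.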

@IvanSilbern
Author

Sorry, my mistake too. --filter "peakPicking vendor" --filter "index 0-29314,29316-" works! Thanks for your help!

@chambm chambm closed this as completed Sep 10, 2019