CyTOF .fcs files and proper gating
Posted: Fri Dec 13, 2013 8:30 pm
I am looking for some input on any precautions people take when gating CyTOF generated .fcs as opposed to regular flow files.
After speaking with DVS (conversation below), I know that the files are originally generated with only zero and positive integer values, but the values are then randomized by the software.
I ran an .fcs file back through the software, told it to restore the originally acquired values (plot on the left), and compared the result with the automatically randomized data (plot on the right). If I gate between zero and one I get different population values, obviously because the randomization changes how events are distributed within that interval.
Would simply never gating between zero and one prevent this problem? If the same file is randomized on different occasions using the same algorithm, is it possible that we would get significantly different gate stats (e.g. % parent, median) between the different randomizations? If so, would using rectangular gates instead of freeform gates prevent this?
Thanks for any thoughts you guys might have,
Rob
P.S. Here are some answers I got from the DVS people about .fcs file generation. Their answers (italicized in the original post) follow each question.
1. Has DVS released any technical note regarding how the push data is converted to FCS?
No tech note yet (but that is a really good idea). Here is a brief description. It is a two-stage process. First, cellular events are separated from the background. This is done using a simple threshold, shown in the Acquisition window/Analysis tab as the Lower convolution threshold, applied to the total (sum of all channels) ion signal. This stage is key to localizing the cellular events in time, assigning "start push-end push" boundaries to every event. In the second stage, the ion signal is integrated from the "start push" to the "end push" for each channel separately, and the "end push" minus "start push" value is kept as the event length. These data are converted to FCS.
2. The original CyTOF data would be all integers of ion counts per cell, right?
Correct.
3. If that’s the case, what kind of randomization is used to get the non-integer values that I see in the .fcs?
The default randomization during direct acquisition is the same as in the FCS Analysis window: the randomization is applied to all values using the uniform negative distribution. That means that the ion count value 1 is uniformly distributed in the interval ]0,1] (the funny brackets mean "excluding zero and including one"). The same goes for any other ion count value: for example, 10 counts will be uniformly distributed in the interval ]9,10].
4. The software allows us to generate .fcs files with custom data randomization settings after the fact, but I could not find what the default values are that come with the on-the-fly analysis in the acquisition window.
We will get back to you on this, we should make it clear in the SW Manual.
5. Does the conversion to .fcs bin the data into a certain number of channels in a particular range, like Diva does, or are the 'true' ion counts reflected in the CyTOF .fcs file?
We do not bin data artificially at all. The randomization can be switched off in post-processing of the FCS file, and the raw data will be available with all the complications of its presentation. Notice that in our Plotviewer program we employ an unequal binning procedure which delegates different screen "real estate" to different ion values. Therefore, only one bin is dedicated to the ]0,1] interval, and likewise one to the ]10,11] interval (for example). In this case, the randomization is not required and the problem does not exist.
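To make the randomization question concrete, here is a minimal one-channel sketch of the "uniform negative" scheme as DVS describes it in answer 3, applied twice to the same raw counts. It is not DVS's code: the function names, the seeds, and the simulated counts are all made up for illustration, and it assumes that a zero count is treated like any other value (so it would land in ]-1,0]), which the answers above do not state explicitly.

```python
import random
import statistics

def randomize(counts, rng):
    # Map each integer count n to a value in ]n-1, n], per DVS's description.
    # rng.random() is in [0, 1), so n - rng.random() is > n-1 and <= n
    # (up to floating-point rounding).
    return [n - rng.random() for n in counts]

def gate_fraction(values, lo, hi):
    # Fraction of events inside the half-open gate ]lo, hi].
    return sum(lo < v <= hi for v in values) / len(values)

# Hypothetical raw ion counts for one channel (not real CyTOF data).
data_rng = random.Random(0)
raw = [data_rng.randint(0, 30) for _ in range(10000)]

# Two independent randomizations of the same raw file.
rand_a = randomize(raw, random.Random(1))
rand_b = randomize(raw, random.Random(2))

for name, values in [("raw", raw), ("rand_a", rand_a), ("rand_b", rand_b)]:
    print(name,
          "gate ]0,5]:", gate_fraction(values, 0, 5),
          "gate ]0,0.5]:", gate_fraction(values, 0, 0.5),
          "median:", statistics.median(values))
```

Under these assumptions, a gate whose edges sit on integer values (and that includes its upper edge) captures the same events in the raw file and in every randomization, because each count n stays inside ]n-1,n]. A gate edge placed inside a unit interval, as in the ]0,0.5] case or with most freeform gates, depends on the random draw. The median can shift, but by less than one count.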