
Transformation methods and flow data




Posts: 53

Joined: Wed Feb 25, 2015 3:03 pm

Post Fri Jun 21, 2019 5:16 pm


Dear all,
We have been comparing bone marrow (BM) samples that we classify as control BM based on genetics (PCR-negative for mutations), morphology (absence of abnormal cells) and flow cytometry, using manual FlowJo gating strategies. We have been using the Cytofkit clustering tools to assess the flow data. In the GUI we selected the default autoLgcl (auto-logicle) transformation and generated clustering data. While playing around we also ran the clustering with the Arcsinh transformation selected and, unsurprisingly, got different clustering results. In a nutshell, Arcsinh gave clustering results that fit better with what we know about the samples, which was unexpected, as it isn't the recommended method for flow data.
In trying to understand how the different methods transform the data, we ran some flow data files through Hao Chen's transformation comparison Shiny app (https://chenhao.shinyapps.io/TransformationComparation_shinyAPP/). This showed us that Arcsinh generates a peak of negative values at the x axis, while autoLgcl generates a spread-out continuum/tail of negative values (see attached image).

We would appreciate some feedback on whether using Arcsinh on flow data is acceptable, and whether the resulting clustering would be considered reliable.
Autolgcl vs cytofArcsinh.png



Posts: 79

Joined: Wed Dec 21, 2016 9:22 pm

Location: Marseille, France

Post Sat Jun 22, 2019 1:45 pm

Re: Transformation methods and flow data


If I understand correctly, your data are fluorescence flow cytometry data. Typical transformations for such data are logicle and asinh. Both have parameters. The autologicle transformation deduces its parameters from the data (i.e. per channel). The inverse hyperbolic sine (asinh) transformation is usually used with a single parameter, called the cofactor, which scales the data down before the asinh function is applied. For data from instruments with an 18-bit range (i.e. values up to about 262144), the cofactor is typically between 150 and 250.
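To make the role of the cofactor concrete, here is a minimal Python sketch. It is not cytofkit code: the function name `asinh_transform` and the sample intensities are made up for illustration. It simply divides each intensity by the cofactor and then applies asinh.

```python
import math

def asinh_transform(values, cofactor):
    """Scale raw intensities down by the cofactor, then apply asinh."""
    return [math.asinh(v / cofactor) for v in values]

# Hypothetical raw intensities on an 18-bit scale (up to about 262144).
raw = [-200.0, 0.0, 150.0, 10_000.0, 262_144.0]

# Compare a mass-cytometry-style cofactor (5) with a flow-style one (150).
for cofactor in (5, 150):
    transformed = asinh_transform(raw, cofactor)
    print(cofactor, [round(t, 2) for t in transformed])
```

With a cofactor of 150, a raw value of 150 maps to asinh(1), about 0.88, so the region around zero stays roughly linear. With a cofactor of 5, the same raw value maps to asinh(30), about 4.09, which means even small fluctuations around zero are already stretched onto the log-like part of the curve.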

The cytofkit GUI offers only a subset of the transformations (i.e. more are available through cytofkit's command-line functions). For flow cytometry, I would recommend logicle (either auto or fixed). cytofAsinh is a custom transformation that works well for mass cytometry. This function pools all intensities below 1 and assigns each of them a random number drawn from a Gaussian distribution. It then divides the intensities by a cofactor of 5 and applies asinh. The pooling and randomization are the reason why you are getting a peak at the low end, which I think is at zero. IMHO this artificial peak at the low end has no justification.
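The behaviour described above can be paraphrased in a few lines of Python. This is a sketch of the description in this post, not a port of the cytofkit R source: the name `cytof_asinh_sketch`, the noise standard deviation of 0.01 and the sample values are assumptions, so please check the actual cytofAsinh code for the real details.

```python
import math
import random

def cytof_asinh_sketch(values, cofactor=5.0, noise_sd=0.01, seed=0):
    """Paraphrase of the described behaviour (assumed details, see lead-in):
    intensities below 1 are replaced by small Gaussian noise around zero,
    then every value is divided by the cofactor and passed through asinh."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    out = []
    for v in values:
        if v < 1:
            v = rng.gauss(0.0, noise_sd)  # pooling: the original value is discarded
        out.append(math.asinh(v / cofactor))
    return out

# Hypothetical compensated flow intensities, including a spread of negatives.
raw = [-350.0, -40.0, -3.0, 0.5, 800.0, 12_000.0]
print([round(t, 3) for t in cytof_asinh_sketch(raw)])
```

Whatever their original spread, the four values below 1 all end up in a very narrow band around zero, which is the spike seen at the low end of the histogram. For mass cytometry this is harmless because there are no negative intensities to begin with; for compensated flow data it throws away the shape of the negative population.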

Even without this effect on negative values, an inadequate cofactor leads to a double peak around zero (Ray & Pyne 2012). This is similar in spirit to what you may know from Moore et al. 2012. So setting a correct cofactor (i.e. around 250 for flow) is important. Unfortunately, cytofkit's GUI does not yet allow the cofactor to be set. It's probably on the TODO list.
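The double peak can be reproduced with a toy model. The sketch below assumes the negative population is a Gaussian centred on zero with a standard deviation of 100 raw units (an arbitrary, illustrative value) and computes the density of the asinh-transformed values by a change of variables. The function names are hypothetical and nothing here comes from cytofkit.

```python
import math

def transformed_density(y, cofactor, sd):
    """Density of y = asinh(x / cofactor) when x ~ Normal(0, sd)."""
    x = cofactor * math.sinh(y)  # invert the transform
    normal_pdf = math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
    return normal_pdf * cofactor * math.cosh(y)  # multiply by dx/dy

def peak_position(cofactor, sd, y_max=8.0, steps=800):
    """Grid search for the density maximum on [0, y_max].
    The density is symmetric in y, so a peak at y > 0 has a mirror at -y."""
    grid = [i * y_max / steps for i in range(steps + 1)]
    return max(grid, key=lambda y: transformed_density(y, cofactor, sd))

sd = 100.0  # assumed spread of the negative population, in raw units
for cofactor in (5, 250):
    print(cofactor, round(peak_position(cofactor, sd), 2))
```

In this simplified model, a cofactor of 250 leaves a single peak at zero, whereas a cofactor of 5 moves the maximum away from zero, giving two mirrored peaks on either side. More generally, within this toy model the split appears as soon as the cofactor is smaller than the spread of the negative population, which is one way to see why a cofactor suited to mass cytometry is too small for fluorescence data.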

In my experience, if the peak of the negative cells is at zero, asinh is as good as autologicle. If not, autologicle performs better (but not always).

You could have a look at the code of cytofAsinh and the other functions.

