Hypothesis testing (stats) on viSNE data


jcvillasboas

Contributor

Posts: 41

Joined: Fri Apr 03, 2015 3:22 pm

Location: Rochester - MN

Post Wed Jan 04, 2017 2:57 pm

Hypothesis testing (stats) on viSNE data

Dear CyTOF community,

I am comparing the T-cell phenotype in tissue obtained from patients with cancer with that of normal subjects. For this purpose, I concatenated all manually gated T-cell events from those groups into 2 separate FCS files [all cancer cases (n=8) vs all control cases (n=8)]. I am sampling between 50K and 100K events per viSNE run and manually gating the T-cell subsets from within the viSNE map. Questions for the forum:

1. Is there a need to repeat multiple (3?) viSNE runs on the same data set to obtain an average?

2. What would be the most appropriate statistical test to compare the proportion of events (as % of parent) found in any given subset identified on a viSNE run? Would something simple such as a Chi-square or Fisher's exact test suffice, or do I need more advanced statistics, such as adjustment for multiple comparisons and bootstrapping?

Thanks in advance,

-JC
<<

bc2zbUVA

Contributor

Posts: 22

Joined: Thu Nov 19, 2015 4:23 pm

Post Wed Jan 04, 2017 4:06 pm

Re: Hypothesis testing (stats) on viSNE data

I would suggest you look at RchyOptimyx (https://www.bioconductor.org/packages/r ... timyx.html) and Citrus (https://github.com/nolanlab/citrus) as both of these are designed to run statistical testing for this very question. Citrus will run its own clustering on your data, and RchyOptimyx was designed to work downstream of FlowType (https://www.bioconductor.org/packages/r ... wType.html) but you can pass in any clusters of your choosing. In either case, these tools will handle the multiple hypothesis testing correction for you (which you should do in this case). I am curious if there are any other tools that do this if anyone else has suggestions.


Edit: I would also suggest you review the literature on cluster stability (http://onlinelibrary.wiley.com/doi/10.1 ... 1/abstract) and the stochastic nature of tSNE (http://distill.pub/2016/misread-tsne/). Both of these will illustrate the importance of doing multiple tSNE and clustering runs.
<<

jcvillasboas

Contributor

Posts: 41

Joined: Fri Apr 03, 2015 3:22 pm

Location: Rochester - MN

Post Wed Jan 04, 2017 5:01 pm

Re: Hypothesis testing (stats) on viSNE data

Thank you, Brian. Citrus sounds like a great idea, especially since it has already been incorporated into Cytobank Premium. Thanks for pointing out the additional references. I had seen the Melchiotti et al. paper but will take another look. Best. JC
<<

sgranjeaud

Master

Posts: 79

Joined: Wed Dec 21, 2016 9:22 pm

Location: Marseille, France

Post Fri Jan 06, 2017 8:59 pm

Re: Hypothesis testing (stats) on viSNE data

Dear JC,

Concerning statistical testing, I am in favour of the techniques used in transcriptomics analyses. In the recent paper advertised in this forum, "Establishing High Dimensional Immune Signatures from Peripheral Blood via Mass Cytometry in a Discovery Cohort of Stage IV Melanoma Patients", you will see a lot of heatmaps representing percentages. These representations are more comprehensive (or compact) than a bunch of boxplots. The usual goal of an analysis is to compare groups of samples (cancer vs control in your case) for each cell population that was detected (or identified or searched for). This can be achieved using t-tests or Wilcoxon tests on the per-sample percentages. Such tests take into account the variability within each group of samples. To cope with multiple testing, False Discovery Rate estimation (or correction) must be carried out.

We usually use the MeV program to carry out all the stages of such an analysis once the populations or clusters of cells have been found and percentages have been extracted. IMHO, the post-analysis turns out to be the same whether the populations came from a CyTOF or a classical cytometry experiment, and whether they were found by a computational clustering method or by classical manual gating. The MeV software is a graphical data analysis tool. We have just set up a web site for the French community: http://impact.marseille.inserm.fr/. The web site is in French but the slides are in English. You will find the standard pipeline we use, and some explanation of the rationale behind each step, in a recent presentation I gave at a French conference.

* rationale of the pipeline (briefly, we apply a log transform to the percentages before centring) http://impact.marseille.inserm.fr/tutos ... njeaud.pdf

* step by step slide show http://impact.marseille.inserm.fr/tutos ... ercent.pdf

* short video demonstrating the various steps on a real dataset from a classical cytometry experiment http://impact.marseille.inserm.fr/tutos ... erview.mp4 (comments are in French, but I guess you will see how easy it is to carry out the analysis)

* download the software at https://sourcesup.renater.fr/frs/?group_id=2569 (Windows 64 bits including Java).
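
The "log transform before centring" step mentioned in the first link can be sketched as follows. This is only an illustration under stated assumptions: the pseudo-count added before taking the logarithm (to avoid log of zero), the choice of log base 2, and centring each population on its mean across samples are all assumptions made here, not details taken from the slides. The population names and values are invented.

```python
import math


def log_center(percent_by_population, pseudo=0.01):
    """Log2-transform percentages, then centre each population on its mean.

    `pseudo` is an assumed pseudo-count that keeps log2 defined for 0 %.
    """
    centred = {}
    for name, percents in percent_by_population.items():
        logged = [math.log2(p + pseudo) for p in percents]
        mean = sum(logged) / len(logged)
        centred[name] = [x - mean for x in logged]
    return centred


# Hypothetical percentages, one value per sample.
example = {
    "subset_A": [12.0, 6.0, 24.0, 3.0],
    "subset_B": [0.5, 0.0, 1.0, 2.0],
}

for name, row in log_center(example).items():
    print(name, [round(x, 2) for x in row])
```

After this transform, a value of +1 means a sample has roughly twice the population's typical percentage and -1 roughly half, which puts rare and abundant populations on a comparable scale for a heatmap.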

If you have questions, feel free to contact me.

Apart from my own view, I think there are brilliant statisticians at the Mayo Clinic who could help you.

Concerning Citrus, pointed out by Brian: if you look under the hood, you will find hierarchical clustering and a statistical method (SAM) that takes multiple testing into account. Both were used extensively in the era of DNA chips.

Brian, thanks for the reference concerning cluster stability.

Hope my answer will help you,
Samuel
<<

jcvillasboas

Contributor

Posts: 41

Joined: Fri Apr 03, 2015 3:22 pm

Location: Rochester - MN

Post Mon Jan 09, 2017 3:03 pm

Re: Hypothesis testing (stats) on viSNE data

Dear Samuel,

Thank you very much for the detailed response. MeV seems like a great tool. I will give it a try and let you know if I run into questions. I truly appreciate the links and your input.

Best regards,

J.C.
<<

jeccles

Participant

Posts: 5

Joined: Thu Jul 14, 2016 4:07 am

Post Mon Jul 17, 2017 4:09 pm

Re: Hypothesis testing (stats) on viSNE data

Hey, if anyone on here has really gotten into CITRUS and is an expert, I've got some questions! Specifically, I'm wondering how you set it up so that each sample is assigned its own unperturbed reference. The original paper says this is possible. From Cytobank and the standalone GUI, it looks like it just compares averaged groups of samples, but doesn't connect each subject's samples across groups. I made a little illustration to show what I mean:
[Attachment: CITRUS.png]

The tool would be a lot more powerful if it could normalize test samples to their own reference before averaging them, so please let me know if you have experience doing this! Thanks-
<<

markrobinsonca

Participant

Posts: 9

Joined: Wed Apr 19, 2017 7:35 pm

Post Tue Jul 18, 2017 8:33 am

Re: Hypothesis testing (stats) on viSNE data

Dear Jacob,

Basically, you want subject-specific (e.g., patient-specific) effects in the model. The natural way to do this in statistical models is to include a random effect, where the patient-specific effect is drawn from a distribution (for example, a normal distribution). (You can also fit it as a fixed effect, but this consumes degrees of freedom.) You can do all the same testing (e.g., differences between conditions, differences between time points), but with the patient-specific effects (or batch effects, etc.) adjusted for.
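
To make the idea of adjusting for subject-specific effects concrete, here is a minimal sketch of the simplest special case: two conditions per subject, analysed through within-subject differences with an exact sign-flip permutation test. Note that this is a swapped-in, simpler technique, not a random-effects model and not what CITRUS or the workflow paper does; a mixed model would be fitted with dedicated statistical software. The data values and the function name are invented for illustration.

```python
from itertools import product


def paired_sign_flip_p(reference, treated):
    """Exact two-sided sign-flip permutation test on paired differences.

    Differencing within each subject removes that subject's baseline,
    so only the within-subject change contributes to the test.
    """
    diffs = [t - r for r, t in zip(reference, treated)]
    observed = abs(sum(diffs))
    hits = total = 0
    # Under the null, each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-9:
            hits += 1
    return hits / total


# Hypothetical % of one population per subject: unperturbed vs stimulated.
reference = [5.1, 12.4, 8.0, 20.3, 3.2, 15.6, 9.9, 7.4]
stimulated = [6.0, 13.9, 8.7, 22.0, 3.9, 17.1, 10.2, 8.8]

print("paired p-value:", paired_sign_flip_p(reference, stimulated))
```

In this invented data set the baselines differ widely between subjects while the change within each subject is small and consistent, which is the situation where comparing group averages loses power and a subject-aware analysis helps.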

We have almost that exact use case in our "differential analysis of CyTOF" workflow paper:
https://f1000research.com/articles/6-748/v1
(See the "Differential analysis" section)

In my view, the CITRUS model is the wrong way around (response = condition, covariates = cytometry data). To me, it is more natural to treat the cytometry data as the response (whether it be cell counts or marker signal) and have batch, patient, experimental condition be the covariates, because the covariates directly affect the cytometry measurements. We discuss this a bit in the workflow.

Best wishes, Mark
