### Methods for Dimensionality Reduction

Please be as geeky as possible. Reference, reference, reference.

Also, please note that this is a mixed bag of math gurus and the mathematically challenged, so choose your words wisely.

5 posts
• Page **1** of **1**

Hi guys, I have a very broad question and I would be forever grateful if someone could help me out. I don't really understand the difference between dimensionality reduction methods such as PCA, SPADE, viSNE, and ACCENSE. What are the major differences between them, and when would you want to use one over another? I'm giving a CyTOF presentation tomorrow and I'm realizing that my knowledge of data analysis for high-dimensional data is sorely lacking.

Thanks!

Elyse

Dear Elyse, good question.

These tools represent two broad approaches to exploring data: clustering (SPADE) and dimensionality reduction (PCA and SNE).

PCA and SNE both perform dimensionality reduction, but in different ways. Simply put, PCA performs a linear dimensionality reduction while SNE performs a nonlinear one. Due to the nature of cytometry data, SNE therefore tends to produce clearer insights into the underlying structure of the data than PCA does. Both viSNE and ACCENSE use the same SNE algorithm for dimensionality reduction and visualization. The difference between these tools is that, beyond visualization, ACCENSE also allows for an automated classification of cells into subpopulations and the export of such data in a tabular format for statistical analyses.
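To make "linear" concrete: in PCA, each cell's new coordinate is a weighted sum of its mean-centered marker values, and the same weights apply to every cell. The sketch below uses plain Python and made-up two-marker data. It finds the first principal component by power iteration (one of several ways to compute it, chosen here only to avoid dependencies) and projects each cell onto it. The function names and the data are illustrative assumptions and are not taken from any of the tools discussed.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (mean, unit direction) of the first principal component.

    Uses power iteration on the sample covariance matrix.
    """
    n = len(rows)
    d = len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def project(row, mean, direction):
    """Linear projection: a weighted sum of the centered marker values."""
    return sum((x - m) * w for x, m, w in zip(row, mean, direction))


# Hypothetical cells measured on two markers that rise together (y is about 2x).
cells = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
mean, pc1 = first_principal_component(cells)
scores = [project(c, mean, pc1) for c in cells]
```

Here `pc1` holds the fixed weights and `scores` is the one-dimensional PCA coordinate of each cell. SNE has no such fixed set of weights: it places cells so that neighbors in the original space stay neighbors in the map, which is what makes it nonlinear.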

SPADE clusters the data and visualizes the clusters on a minimum spanning tree, but does not directly support statistical comparisons. Citrus, on the other hand, allows for statistical comparisons between samples based on clustering of the data.
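A minimum spanning tree can be drawn in 2D without any dimensionality reduction because it is a graph: distances between clusters are computed in the full marker space, and the tree only records which clusters are connected. A rough sketch of that tree-building step follows, using Prim's algorithm on hypothetical cluster centroids. It leaves out the clustering itself, the function name and data are assumptions, and it is not SPADE's actual code.

```python
import math


def minimum_spanning_tree(centroids):
    """Prim's algorithm: return (i, j) edges that connect all centroids
    with the smallest total Euclidean distance."""
    n = len(centroids)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        # Find the shortest edge from the tree to a node not yet in it.
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                d = math.dist(centroids[i], centroids[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        in_tree.add(j)
        edges.append((i, j))
    return edges


# Hypothetical cluster centroids in a 4-marker space.
centroids = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [5.0, 5.0, 5.0, 5.0],
]
tree_edges = minimum_spanning_tree(centroids)
```

The edge list is all a plotting tool needs to lay the tree out on a page. Where each node ends up is a layout choice, so the x and y positions in the picture are not coordinates in the way a PCA or SNE axis is.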

I hope this helps!

Petter

Thanks for the reply! I forgot that SPADE was not actually a dimensionality reduction algorithm... thanks for clearing that up. However, I'm still not clear on a few points:

1) I don't really understand how SPADE clustering can be represented on a 2D plot without dimensionality reduction... and I don't understand why SPADE doesn't support statistical comparisons. And what does it mean when I read that SPADE clustering is 'hierarchical'? I think I need someone to explain how SPADE clusters data as though I'm a layman (which I basically am... I keep getting hung up on the terminology).

2) Why is non-linear dimensionality reduction more appropriate for cytometry data? Can you go into a little detail as to what linear vs non-linear means?

I am a very visual person, and I'm having a lot of trouble visualizing how each of these methods works.

Hi Elyse,

SPADE doesn't *directly* support statistical comparisons. SPADE does the clustering, but doesn't then go ahead and do the next step of actually comparing one sample to the next (or one group of samples to the next). Citrus does.

In most (all?) versions of SPADE, there are ways for you to export the data in tables, which you could then bring into R or some other statistics program to do the comparison. But SPADE doesn't do that comparison for you.
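As a rough illustration of that "next step", here is a minimal sketch of one possible comparison on exported numbers: a two-sided permutation test on the percentage of cells falling in a single cluster, per sample, in two groups. The values, group names, and function name are all hypothetical. The layout of a real SPADE export is not assumed here, and this is not the method Citrus uses.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign samples to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical % of cells in one cluster, one value per sample.
healthy = [4.1, 3.8, 4.5, 4.0, 3.9]
patient = [7.2, 6.8, 7.9, 7.1, 6.5]
p = permutation_p_value(healthy, patient)
```

With many clusters you would repeat this per cluster and then correct for multiple testing, which is the kind of bookkeeping a dedicated statistics package handles for you.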

Mike

Just a few other comparative observations on these tools:

- SPADE, Citrus, and ACCENSE do overt clustering; PCA and viSNE plot single cells in 2-D (or 3-D) space, and the user can visualize what clusters are there.

- Citrus is built to test for group-level significant differences in specific clusters; as already discussed for SPADE, the other programs are more about visualization than statistical testing.

- SPADE allows easy visualization of fold-change, and is thus very appropriate for signaling data.

- SPADE is integrated in Cytobank, so it is most readily available to non-computational users. ACCENSE has a free downloadable version available (http://www.cellaccense.com/). PCA can be run with conventional statistical packages.
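For anyone unsure what the fold-change mentioned above refers to, here is a minimal sketch under one simple assumed definition: the log2 ratio of a marker's median intensity in a stimulated sample to its median in the basal sample, within one cluster. The numbers and the function name are made up, and the definition a given tool uses may differ.

```python
import math
from statistics import median


def log2_fold_change(basal, stimulated):
    """log2 of (median stimulated / median basal); assumes positive medians."""
    return math.log2(median(stimulated) / median(basal))


# Hypothetical intensities of one signaling marker in one cluster.
basal = [10.0, 12.0, 11.0, 9.0, 13.0]
stimulated = [40.0, 44.0, 48.0, 42.0, 46.0]
fc = log2_fold_change(basal, stimulated)
```

A value of 0 means no change, positive values mean the marker went up on stimulation, and each unit is a doubling.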

Regards,

holden
