
data representation

<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Sat Jan 30, 2021 9:28 pm

data representation

Hi All,
I would like to represent data from Cytobank (where each marker of interest on the tSNE has its own separate intensity bar) in a unified way, with expression shown as normalized intensity on a single scale for all markers (as in Figure 1B of Wagner et al., DOI: https://doi.org/10.1016/j.cell.2019.03.005).
Could anyone give me a hand with R code for this?

I would like to attach an example figure, but I cannot.
<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Sat Jan 30, 2021 9:54 pm

Re: data representation

Dear All,
In addition: in our data analysis we pool the samples within each group ("PBS" vs "Treatment") to increase the event count (number of cells available for analysis) and statistical significance.
What would be the best way to show that the data are NOT skewed by one particular sample in the treatment group, and that all samples within a group behave in the same way (i.e. the marker distributions are comparable)?
Any ideas or suggestions?
I would be really grateful!
best,
Julia
<<

tomash

Contributor

Posts: 25

Joined: Sun Oct 19, 2014 10:15 pm

Post Sun Jan 31, 2021 2:07 am

Re: data representation

Hi Julia,

I think we can help with some scripts from our R package 'Spectre' (https://wiki.centenary.org.au/display/SPECTRE). I've just made a modification of our 'simple discovery workflow' that should take care of the plotting that you are interested in, including the re-scaling between 0 and 1. Before sharing it, one quick question: when you export files from Cytobank, do you keep the arcsinh transformed data, or does it come out with the raw values? This will just determine whether I leave the arcsinh transformation step or not.

You wrote:
"In data analysis we are pooling the samples together "PBS" vs "Treatment" to increase event size (number of available cells for analysis) and statistical significance. What would be the best way to represent that data is NOT skewed due to a particular sample in the treatment group, but all samples within the group behave in the same way (marker distribution is comparable)?"

The way I have understood this question is that you want the marker expression colouring to be comparable between individual samples when they are plotted separately (as opposed to having each plot use its own min/max range). Is this correct? If so, the plotting functions in Spectre will let you set 'global' ranges, so that all the plots are directly comparable in terms of both the X/Y distribution of cells on the tSNE plot and the colour ranges.

Tom
<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Sun Jan 31, 2021 12:23 pm

Re: data representation

Hi Tom,
Thank you very much for your reply!
When I export FCS files from Cytobank, they come out without the arcsinh transformation.
So yes, the arcsinh transformation step would be necessary.

I am looking forward to the script!
My mail is julia.majewska@weizmann.a.cil

Again, thank you a lot!

For the second part of the question:
For the data analysis step, we pool all PBS samples together (for example, 6 PBS-treated mice) and we also pool all treated mice, so that after pooling we have 2 samples: 1st PBS, 2nd treatment.
The cells of interest are rare events, so the aim of pooling is to increase the event count and the significance of the analysis.
If we analysed each sample separately, we would see whether a particular sample behaves as an outlier, as a dot on a graph. However, we don't see that because we work on pooled samples.
The question is how to show that the distribution of expression for the different markers is comparable within each treatment group, and therefore that pooling is justified.
We have done it with violin plots showing the expression of each marker for all mice (treated vs untreated). I was wondering if there is any other way (I think tSNE is not the most readable).
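
One possible complement to the violin plots is a per-mouse summary that flags any mouse whose marker distribution departs from the rest of its group. The following is only a minimal sketch in base R, not a validated method: the data frame `cells`, the marker names, the simulated values and the cut-off of 3 are all made up for illustration.

```r
# Hypothetical data: one row per cell, values already arcsinh-transformed.
set.seed(1)
cells <- data.frame(
  mouse = rep(paste0("PBS_", 1:6), each = 200),
  CD45  = rnorm(1200, mean = 3, sd = 0.5),
  CD11b = rnorm(1200, mean = 2, sd = 0.7)
)
# Make one mouse a deliberate outlier for CD11b so there is something to detect.
is_m6 <- cells$mouse == "PBS_6"
cells$CD11b[is_m6] <- cells$CD11b[is_m6] + 1.5

markers <- c("CD45", "CD11b")

# 1) Per-mouse median of each marker.
per_mouse <- aggregate(cells[markers], by = list(mouse = cells$mouse), FUN = median)

# 2) Distance of each mouse from the median across mice, in MAD units.
robust_z <- sapply(markers, function(m) {
  x <- per_mouse[[m]]
  (x - median(x)) / mad(x)
})
rownames(robust_z) <- per_mouse$mouse

# 3) Kolmogorov-Smirnov distance between each mouse and the other mice pooled,
#    shown here for a single marker.
ks_to_rest <- sapply(unique(cells$mouse), function(id) {
  in_mouse <- cells$mouse == id
  unname(ks.test(cells$CD11b[in_mouse], cells$CD11b[!in_mouse])$statistic)
})

print(round(robust_z, 2))
print(round(ks_to_rest, 2))

# Mice with a large robust z-score (3 is an arbitrary cut-off) deserve a closer look.
print(which(abs(robust_z) > 3, arr.ind = TRUE))
```

A table or heatmap of `robust_z` (mice by markers) gives a compact view of whether any single animal would dominate the pooled sample. With only a handful of mice per group, such flags are a prompt for inspection rather than a formal test.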
<<

sgranjeaud

Master

Posts: 125

Joined: Wed Dec 21, 2016 9:22 pm

Location: Marseille, France

Post Sun Jan 31, 2021 2:33 pm

Re: data representation

Hi,

Usually, and this seems to be the case in Figure 1B, a scaling is applied to each channel separately, based on the 99th percentile instead of the maximum value. This means that the highest 1% of intensities will be saturated in each channel. You can try a few other values, such as saturating the top 3% or 5%. Once the scaling is applied, all the channels can be compared/analyzed using a single colour scale.
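
A minimal sketch of this per-channel scaling in base R is given below. The function name `rescale_to_percentile`, the marker names and the simulated intensities are made up for illustration; this is not the code used for the published figure.

```r
# Rescale one channel to [0, 1], using a chosen upper percentile as the ceiling.
# Values above the ceiling are clipped (saturated) to 1.
rescale_to_percentile <- function(x, upper = 0.99, lower = 0) {
  lo <- unname(quantile(x, probs = lower, na.rm = TRUE))
  hi <- unname(quantile(x, probs = upper, na.rm = TRUE))
  if (hi <= lo) return(rep(0, length(x)))  # constant channel: nothing to scale
  scaled <- (x - lo) / (hi - lo)
  pmin(pmax(scaled, 0), 1)
}

# Hypothetical arcsinh-transformed intensities for two channels with very
# different dynamic ranges.
set.seed(1)
expr <- data.frame(
  CD3  = asinh(rexp(1000, rate = 1 / 50) / 5),
  CD19 = asinh(rexp(1000, rate = 1 / 500) / 5)
)

# Apply the scaling to every channel independently.
scaled <- as.data.frame(lapply(expr, rescale_to_percentile, upper = 0.99))

# Both channels now share the same 0-1 range and can use one colour scale.
print(sapply(scaled, range))
```

Setting `upper = 0.97` or `upper = 0.95` saturates the top 3% or 5% instead.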

If you haven't read them yet, you should have a look at the paragraph "Visual representation with UMAP" of the Nowicka et al. pipeline, or the paragraph "Dimensionality reduction" of CATALYST's vignette.

Hope this helps.
<<

tomash

Contributor

Posts: 25

Joined: Sun Oct 19, 2014 10:15 pm

Post Fri Feb 05, 2021 4:15 am

Re: data representation

Hi Julia,

There is a script on this page: https://github.com/ImmuneDynamics/Spect ... 20cytobank with some demo data (and a demo output). You'll need to install the dev version of Spectre (https://wiki.centenary.org.au/display/S ... of+Spectre), as the re-scale function isn't in the main version just yet.

Give it a shot. You can largely follow the guidance from this page, though it won't be exactly the same: https://wiki.centenary.org.au/x/4C8MCQ.

The default co-factor here is 500, but I assume you'll be using 5-15, so you can change it accordingly.
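
For reference, the arcsinh transformation with a co-factor is simply asinh(x / cofactor). The sketch below uses a made-up helper name (`arcsinh_transform`) and made-up raw values; it is not the Spectre function itself, only an illustration of how the co-factor changes the result.

```r
# arcsinh transformation: divide by the co-factor, then take the inverse
# hyperbolic sine.
arcsinh_transform <- function(x, cofactor = 5) {
  asinh(x / cofactor)
}

# Hypothetical raw intensities spanning several orders of magnitude.
raw <- c(0, 5, 50, 500, 5000)

comparison <- data.frame(
  raw          = raw,
  cofactor_5   = round(arcsinh_transform(raw, cofactor = 5), 2),
  cofactor_500 = round(arcsinh_transform(raw, cofactor = 500), 2)
)
print(comparison)
```

A small co-factor spreads out the low-intensity values, while a large one keeps them nearly linear and compressed close to zero.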

Additionally, I've got two versions of the columns being set up: arcsinh only, and arcsinh with re-scaling between 0 and 1. You can pick whichever you like (or both) for plotting. In the demo I've just used the arcsinh data.

The way the 'multiplots' work is that the data is split up by some factor (e.g. file, group, etc.), and then the expression levels and X/Y ranges are set to be consistent between the resulting plots, so they can be compared directly.
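
The idea of consistent ranges can be sketched in base R as follows. This is not the Spectre implementation: the data frame, column names and colour palette are invented for illustration, and only base graphics are used.

```r
# Hypothetical tSNE coordinates and one marker for two groups.
set.seed(1)
cells <- data.frame(
  group  = rep(c("PBS", "Treatment"), each = 500),
  tSNE1  = c(rnorm(500, 0, 5), rnorm(500, 2, 6)),
  tSNE2  = c(rnorm(500, 0, 5), rnorm(500, -1, 4)),
  marker = c(rnorm(500, 1, 0.5), rnorm(500, 2, 0.8))
)

# Global ranges are computed ONCE on the full data set, before splitting.
global <- list(
  x      = range(cells$tSNE1),
  y      = range(cells$tSNE2),
  colour = range(cells$marker)
)

# Map marker values to colours using the global colour range.
n_col <- 100
cols  <- colorRampPalette(c("navy", "yellow", "red"))(n_col)
colour_for <- function(v) {
  idx <- 1 + floor((v - global$colour[1]) / diff(global$colour) * (n_col - 1))
  cols[idx]
}

# Split by group and draw one panel per group with identical axes and colours.
panels <- split(cells, cells$group)
op <- par(mfrow = c(1, length(panels)))
for (name in names(panels)) {
  p <- panels[[name]]
  plot(p$tSNE1, p$tSNE2,
       col = colour_for(p$marker), pch = 16, cex = 0.5,
       xlim = global$x, ylim = global$y,
       main = name, xlab = "tSNE1", ylab = "tSNE2")
}
par(op)
```

Because the limits come from the full data set, a given colour or axis position means the same thing in every panel. Splitting by mouse instead of by group works the same way.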

Give it a shot and let me know if this is the kind of thing you need.

Tom
<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Fri Apr 02, 2021 1:51 pm

Re: data representation

Hi Tom,

I am trying to run your code, but I have run into the following problem with the Spectre installation.

I am using RStudio on the Weizmann server, and I run these two lines of code:
if(!require('devtools')) {install.packages('devtools')}
library('devtools')
but then when I run
install_github("immunedynamics/spectre")
I get the following error message:

/usr/bin/ld: /tmp/RtmplCP1Tq/R.INSTALL1f7c48b24d59/gert/libgit2/lib/libgit2.a(annotated_commit.c.o): unrecognized relocation (0x2a) in section `.text'
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
make: *** [gert.so] Error 1
ERROR: compilation failed for package ‘gert’
* removing ‘/home/labs/valerykri/juliam/R/x86_64-pc-linux-gnu-library/3.5/gert’
Error: Failed to install 'Spectre' from GitHub:
(converted from warning) installation of package ‘gert’ had non-zero exit status

By the way, if you are willing to share your email address, it might be easier to continue there.
Thanks a lot,
Julia
<<

tomash

Contributor

Posts: 25

Joined: Sun Oct 19, 2014 10:15 pm

Post Fri Apr 02, 2021 11:48 pm

Re: data representation

Hi Julia,

Hmm that looks annoying -- I haven't come across that particular error before. There's a couple of options to tackle it, and we also have a script to do the plotting which only requires a few packages, mainly ggplot2, so you don't need to install Spectre (https://github.com/sydneycytometry/tSNEplots). Feel free to shoot me an email at thomas.ashhurst@sydney.edu.au and we can sort it out!

Tom
