
Data Analysis creating flow set

<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Wed Oct 21, 2020 9:21 pm

Data Analysis creating flow set

Hi Everyone,
I am new to data analysis in R.
I would like to use the package HDCytoData.
1. How do I create the flowSet object so that it includes the panel columns (e.g. Y89Di, Pr141Di) together with the metadata file columns (e.g. group_id, patient_id, sample_id, population_id)?

2. So far I have created the flowSet with the 'read.flowSet' function.
When I check whether all panel columns are in the flowSet object with
all(panel$fcs_colname %in% colnames(fs)), I get FALSE.
However,
> colnames(fs)
[1] "Y89Di" "Pr141Di" "Nd142Di" "Nd143Di" "Nd144Di" "Nd146Di" "Sm147Di" "Nd148Di" "Sm149Di"
[10] "Nd150Di" "Eu151Di" "Sm152Di" "Eu153Di" "Sm154Di" "Gd155Di" "Gd156Di" "Tb159Di" "Gd160Di"
[19] "Dy162Di" "Dy163Di" "Dy164Di" "Ho165Di" "Er166Di" "Er168Di" "Tm169Di" "Yb171Di" "Yb172Di"
[28] "Yb173Di" "Yb176Di" "Bi209Di"
> panel$fcs_colname
[1] "Y89Di" "Pr141Di" "Nd142Di" "Nd143Di" "Nd144Di" "Nd146Di" "Sm147Di" "Nd148Di" "Sm149Di"
[10] "Nd150Di" "Eu151Di" "Sm152Di" "Sm153Di" "Sm154Di" "Gd155Di" "Gd156Di" "Tb159Di" "Gd160Di"
[19] "Dy162Di" "Dy163Di" "Dy164Di" "Ho165Di" "Er166Di" "Er168Di" "Tm169Di" "Yb171Di" "Yb172Di"
[28] "Yb173Di" "Yb176Di" "Bi209Di"

I don't understand why, or how it should be fixed.
<<

jimbomahoney

Master

Posts: 83

Joined: Wed Feb 27, 2019 11:21 am

Post Thu Oct 22, 2020 7:45 am

Re: Data Analysis creating flow set

Hi Julia,

It sounds like you might be following this workflow?

Which dataset are you using?

Re: Your first question, it's my understanding that the panel and metadata are separate objects (dataframes) from the flowset, but can be combined into a special structure called an SCE e.g. see the following section from the link I posted (points of interest in bold):

Data organization
We will store all data used and returned throughout differential analysis in an object of the SingleCellExperiment (SCE) class. For this, CATALYST provides the wrapper prepData() to construct a SCE object from the following inputs:

x: a flowSet containing the raw measurement data, or a character string that specifies the path to a set of .fcs files.

panel: a data.frame containing, for each marker, i) its column name in the input raw data, ii) its targeted protein markers, and, optionally, iii) its class (type, state, or none).

md: a data.frame with columns describing the experimental design.

Argument features specifies which columns (channels) to retain from the input data. By default, all measurement parameters will be kept (features = NULL). Here, we only keep the channels listed in panel.

It is important to carefully check whether variables are of the desired type (factor, numeric, character), since input methods may convert columns into different data types. This is taken care of by the prepData() SCE constructor. For the statistical modeling, we want to make the condition variable a factor with the reference (Ref) being the reference level. The order of factor levels can be defined with the levels parameter of the factor function or via relevel().
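
The two ways of fixing the reference level mentioned above can be sketched in base R as follows. The condition labels here ("Ref", "Case") are hypothetical placeholders, not values from any particular dataset.

```r
# Hypothetical condition labels; "Ref" is the intended reference level.
condition <- c("Ref", "Case", "Ref", "Case")

# Option 1: set the level order explicitly when creating the factor.
f1 <- factor(condition, levels = c("Ref", "Case"))

# Option 2: create the factor first (levels are sorted alphabetically,
# so "Case" would come first), then move "Ref" to the front.
f2 <- relevel(factor(condition), ref = "Ref")

levels(f1)
levels(f2)
```

In both cases the first level is the one a model will treat as the baseline, which is why it is worth checking levels() before fitting anything.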


I wonder if you're running into one of the most common issues in R - differing data types (e.g. text could be characters or factors).

I'm not an expert, but I wonder if the following might work (as a test):

  Code:
all(as.character(panel$fcs_colname) %in% as.character(colnames(fs)))
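
Comparing the two printouts in the question entry by entry, position 13 reads "Eu153Di" in colnames(fs) but "Sm153Di" in panel$fcs_colname, and that single difference is enough to make the all(... %in% ...) check return FALSE. A minimal sketch of how such a mismatch could be located with setdiff(), using only a short excerpt of the channel names (the variable names fs_cols and panel_cols are made up for illustration):

```r
# Excerpt of the names printed by colnames(fs) in the question.
fs_cols <- c("Eu151Di", "Sm152Di", "Eu153Di", "Sm154Di")

# The same positions as printed by panel$fcs_colname.
panel_cols <- c("Eu151Di", "Sm152Di", "Sm153Di", "Sm154Di")

# Names listed in the panel that do not exist in the flowSet.
missing_in_fs <- setdiff(panel_cols, fs_cols)
missing_in_fs     # "Sm153Di"

# Names in the flowSet that the panel does not mention.
missing_in_panel <- setdiff(fs_cols, panel_cols)
missing_in_panel  # "Eu153Di"
```

With the real objects, the equivalent call would be setdiff(panel$fcs_colname, colnames(fs)). Which of the two spellings is the right one depends on the panel actually used, so that part needs checking against the experiment.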
<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Sun Oct 25, 2020 9:39 pm

Re: Data Analysis creating flow set

Hi, Thanks for your reply.
Actually the problem is with the SCE. I think I have done it as it should be, but it is giving me an error:

> sce <- prepData(fSet, panel_f, md1, features = panel_f$fcs_colname)
Error in prepData(fSet, panel_f, md1, features = panel_f$fcs_colname) :
all(unlist(md_cols) %in% names(md)) is not TRUE
> names(md1)
[1] "file_name" "sample_id" "condition"
> md1_cols
$file
[1] "file_name"
$id
[1] "sample_id"
$factors
[1] "condition"
> class(names(md1))
[1] "character"
> class(md1_cols)
[1] "list"
> unlist(md1_cols)
file id factors
"file_name" "sample_id" "condition"
> unlist(md1_cols) == names(md1)
file id factors
TRUE TRUE TRUE
<<

sgranjeaud

Master

Posts: 123

Joined: Wed Dec 21, 2016 9:22 pm

Location: Marseille, France

Post Wed Oct 28, 2020 3:27 pm

Re: Data Analysis creating flow set

Hi,
I think you should open an issue (although you are asking for help) at
https://github.com/HelenaLC/CATALYST
Best.
<<

markrobinsonca

Participant

Posts: 11

Joined: Wed Apr 19, 2017 7:35 pm

Post Wed Oct 28, 2020 3:55 pm

Re: Data Analysis creating flow set

I agree with Samuel .. better place to get ahold of us would be the issues of the github repo (https://github.com/HelenaLC/CATALYST/issues) or the Bioconductor support site (https://support.bioconductor.org/). I don't know if many of us monitor this list. I stumbled on it by accident.

In your code example, you mention md1_cols, but that is never used, right?

I'm taking a guess, but maybe what you need is:

sce <- prepData(fSet, panel_f, md1, features = panel_f$fcs_colname, md_cols = md1_cols)

Best, Mark
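
The check from the error message can be reproduced in base R without CATALYST, which may help show why the call failed even though md1_cols matched names(md1). This is only a sketch: the default value of md_cols written below is an assumption from memory of the CATALYST documentation and should be confirmed with ?prepData.

```r
# Metadata column names as printed by names(md1) above.
md_names <- c("file_name", "sample_id", "condition")

# ASSUMPTION: default value of prepData()'s md_cols argument.
# Confirm against ?prepData for the installed CATALYST version.
default_md_cols <- list(file = "file_name", id = "sample_id",
                        factors = c("condition", "patient_id"))

# Custom mapping as printed for md1_cols above.
md1_cols <- list(file = "file_name", id = "sample_id", factors = "condition")

# The same test that appears in the error message, once per mapping.
ok_default <- all(unlist(default_md_cols) %in% md_names)
ok_custom  <- all(unlist(md1_cols) %in% md_names)
ok_default  # FALSE
ok_custom   # TRUE

# Columns the assumed default expects but the metadata lacks.
missing_cols <- setdiff(unlist(default_md_cols), md_names)
missing_cols
```

If the assumption about the default holds, the original call failed because md1_cols was defined but never passed, so the function looked for a column that md1 does not have.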
<<

juliam

Participant

Posts: 18

Joined: Wed Oct 07, 2020 8:13 am

Post Sun Nov 01, 2020 11:24 am

Re: Data Analysis creating flow set

Hi Mark,
Thanks a lot for the advice. I actually solved it after reading many different scripts.
Next time I will post analysis problems at the sites you recommended.

Besides my preliminary analysis in R, we are also trying in parallel to analyse the data in Wolfram Mathematica (12.1.1.0).

To that end, the raw FCS files were normalized in software supported by Fluidigm, and then I gated out beads, doublets and dead cells.
Next I wanted to concentrate on EpCam+, CD45+, or CD31+ cells, so I gated these populations in Cytobank and subsequently exported the FCS files (referring to either CD31, EpCam or CD45 cells) with the option "split by populations".
When importing the FCS files into Mathematica we encounter a large number of events with very large positive values (~10^30) or very large negative ones (~ -10^30). This occurs in several channels.
In Cytobank I do NOT see such 'extreme' values.
Does anyone have an idea what is happening?

As always, I am very grateful for any advice or suggestions.
Julia
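
One way to cross-check the exported files independently of Mathematica would be to count implausibly large values per channel in R. The sketch below uses a small made-up matrix in place of real data. With flowCore, the matrix for one file could be obtained with exprs(read.FCS(path)). The threshold of 1e6 is an arbitrary assumption, not a documented limit.

```r
# Hypothetical expression matrix standing in for one exported FCS file
# (rows are events, columns are channels).
expr <- cbind(
  Y89Di   = c(0.5, 12.3, 1e30, 3.1),
  Pr141Di = c(2.0, -1e30, 4.4, 0.0),
  Nd142Di = c(1.1, 0.7, 9.8, 5.2)
)

# ASSUMPTION: anything beyond this magnitude is treated as implausible.
threshold <- 1e6

# Number of extreme events in each channel.
n_extreme <- colSums(abs(expr) > threshold)
n_extreme

# Minimum and maximum per channel, for comparison with Cytobank.
channel_range <- apply(expr, 2, range)
channel_range
```

If R reads the same files without extreme values, the import step on the Mathematica side would be the first suspect. If R shows them too, the export step would be.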
<<

sgranjeaud

Master

Posts: 123

Joined: Wed Dec 21, 2016 9:22 pm

Location: Marseille, France

Post Sun Nov 01, 2020 3:29 pm

Re: Data Analysis creating flow set

I used Mathematica 30 years ago for image and signal processing and symbolic calculation. I don't see any reason to use it nowadays. R and Python have rich libraries. If you want feedback, you would be better off going with R or Python.
<<

BjornZ

Contributor

Posts: 43

Joined: Fri Jul 10, 2015 1:04 am

Post Fri Nov 13, 2020 6:00 pm

Re: Data Analysis creating flow set

Hi Julia,

I wrote the Mathematica FCS importers and exporters (almost 15 years ago, wow). There's a bug in the exporter that I have not yet corrected that might cause this. If you e-mail me I can assist: zbjornson at primitybio.com. In any case, it's a good reminder for me to fix the bug.

Needless to say, I'm biased, but I love Mathematica and find R especially painful to use.

Zach
