Gating output-addl analysis-issue of dependent pops
Hi all,
I have a general topic to throw out for discussion, especially to the stats and informatics people.
When we perform experiments for people as part of the service center, we process the samples, run the samples, do preliminary analysis (standard FlowJo gating, especially as QC), and then give all the files back to the customer.
Often, we're asked to output a data table of Freq. of Parent and/or Count from the FlowJo gating hierarchy. Some people use this as a quick first pass, but others use it as a big part of their data for statistical analysis.
Unfortunately, we occasionally have people who want to use that FlowJo table without considering the gating hierarchy, or which populations are dependent and which are independent. For example, we often do CD3 +/- as a simple split gate in a histogram. Clearly, in this case, the CD3+ and CD3- pops are interdependent: they sum to 100% of the parent, so either one fully determines the other.
In some other cases, we may do a bivariate like CD4 vs CD8 and gate the single-positive populations. In this case, CD4+ and CD8+ often do *not* sum to 100% (so, not *completely* interdependent, but I'm not sure I'd call them completely independent either, since they come from the same parent gate).
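To make the two cases concrete, here is a minimal Python sketch. All of the percentages are made up for illustration (they are not from any real experiment), and the `pearson` helper is just a hand-rolled correlation so the example needs nothing beyond the standard library. It shows that a split gate gives a correlation of -1 by construction, while the single-positive gates from a bivariate plot are only partially constrained by their shared parent.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical Freq. of Parent values (percent) for five samples.
cd3_pos = [62.0, 70.5, 55.2, 68.1, 73.4]
# Split gate: CD3- is the complement of CD3+ by construction.
cd3_neg = [100.0 - p for p in cd3_pos]

# Hypothetical CD4 and CD8 single-positive gates on the same parent.
cd4_sp = [60.1, 55.3, 64.0, 58.7, 52.9]
cd8_sp = [30.2, 36.8, 27.5, 31.0, 39.6]
# Whatever the two gates leave behind (double-negative, double-positive,
# or events that fall outside both gates).
remainder = [100.0 - a - b for a, b in zip(cd4_sp, cd8_sp)]

r_split = pearson(cd3_pos, cd3_neg)
r_bivariate = pearson(cd4_sp, cd8_sp)

print("CD3+ vs CD3- correlation:", round(r_split, 6))
print("CD4+ vs CD8+ correlation:", round(r_bivariate, 6))
print("Percent of parent outside both single-positive gates:", remainder)
```

In the split-gate case, the second column carries no information beyond the first. In the bivariate case the two columns are still tied together, because they compete for the same parent events, but the leftover fraction gives them some room to vary separately.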
Obviously, this is an issue of gating hierarchies, as a single cell will be a member of several gates as you go down the hierarchy. This is in contrast to many types of clustering, where a single cell can be a part of one and only one cluster and therefore there are none of these redundancies or interdependencies.
Since this is something that comes up, I thought I'd throw this out as a discussion topic to the community: what do you do in these cases? Is there *one* answer? Or, is this where the informatics/stats people need to work particularly closely with the bench scientists to decide what goes and what stays in the analysis?
In the above CD3 split gate example, are *both* CD3pos *and* CD3neg something to include in your modeling, or should you use one or the other but not both?
Also, you could imagine a case where, say, total CD4+ Count or (CD4+ percent of total CD3+) may not change significantly between Case and Control, but some subpopulation (%Naive, or %Th2) would. In other cases, total CD4+ Count or (CD4+ percent of total CD3+) may change, but within that altered CD4+ fraction, the subfractions of %Naive or %Th2 might not.
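A small worked example of those two scenarios, as a sketch only: the event counts below are invented, the sample names are hypothetical labels, and `freq` is just the percent-of-parent arithmetic written out. The third column (Naive as a percent of CD3+) is my own addition, to show how a frequency taken against a gate higher up the hierarchy can mix the two effects together.

```python
def freq(child_count, parent_count):
    """Percent of parent: the arithmetic behind a frequency-of-parent column."""
    return 100.0 * child_count / parent_count

# Hypothetical event counts; none of these numbers come from real data.
samples = {
    "control":          {"CD3": 10000, "CD4": 6000, "Naive": 3000},
    # Scenario 1: CD4+ fraction unchanged, Naive subfraction drops.
    "case_subset_only": {"CD3": 10000, "CD4": 6000, "Naive": 1500},
    # Scenario 2: CD4+ fraction drops, Naive subfraction unchanged.
    "case_parent_only": {"CD3": 10000, "CD4": 3000, "Naive": 1500},
}

summary = {}
for name, counts in samples.items():
    summary[name] = {
        "CD4_of_CD3": freq(counts["CD4"], counts["CD3"]),
        "Naive_of_CD4": freq(counts["Naive"], counts["CD4"]),
        "Naive_of_CD3": freq(counts["Naive"], counts["CD3"]),
    }
    print(name, summary[name])
```

With these made-up counts, the two case samples have the same Naive count and the same Naive percent of CD3+, yet they differ from control for different reasons. Only the combination of parent and subset frequencies tells the two scenarios apart.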
Thoughts?
Mike