
Phenograph - defining cluster marker 'positivity'

Posted: Wed Sep 05, 2018 8:49 am
by jamesaries
Dear Phenograph users,

I had a silly question regarding how you define whether your cluster is 'positive' or not for a non-dichotomously expressed marker.

Phenograph clusters my data nicely and distinguishing CD3 from CD19 and CD4 and CD8 is not a problem, but for something like Tbet (transcription factor) I'm not sure what the best thing to do is?

From the heat maps and median expression values (from the exported csv file) I can see which clusters are clearly positive and negative, but for the others I'm not so sure. One of my approaches was to rank the median expression of all the clusters and define clusters that are 'positive' as those above the median. I'm not sure if this is too crude an approach. My clusters are all live cells, from PBMCs.
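
For concreteness, here is a minimal sketch of the rank-and-split idea described above, using only the Python standard library. The cluster names and median values are hypothetical and chosen purely for illustration; the function name `label_by_median_split` is likewise made up for this sketch.

```python
from statistics import median

def label_by_median_split(cluster_medians):
    """Call a cluster 'positive' if its median exceeds the median of all cluster medians."""
    cutoff = median(cluster_medians.values())
    return {
        name: ("positive" if value > cutoff else "negative")
        for name, value in cluster_medians.items()
    }

# Hypothetical per-cluster median Tbet values (illustration only, not real data)
example_medians = {"c1": 0.1, "c2": 0.2, "c3": 0.3, "c4": 0.4, "c5": 2.5, "c6": 3.0}
labels = label_by_median_split(example_medians)
print(labels)
```

Note that this split calls roughly half of the clusters 'positive' by construction, regardless of how the marker is actually distributed. In the toy values, c4 is labelled positive even though it sits much closer to the low clusters than to c5 and c6.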

Thanks, and apologies if there is something obvious out there about this!


Re: Phenograph - defining cluster marker 'positivity'

Posted: Wed Sep 05, 2018 3:16 pm
by mleipold
Hi James,

I guess my first question is, do you have a biological reason to assume that Tbet would be uniformly bimodal?

By this, I mean clearly positive or clearly negative (with no "mid" or variable expression on different cell types). Relatively few non-lineage markers satisfy this: even something that's nicely bimodal like CD27 has variable expression... Memory B cells are positive, but their Median signal is still lower than that of Naive or CM T cells.

Put differently: Tbet may just be a smear in some cases... biology is messy!


Re: Phenograph - defining cluster marker 'positivity'

Posted: Mon Sep 10, 2018 10:25 am
by vtosevski
Hi James,

The question is, I think, not only for Phenograph users but for anyone who uses clustering algorithms to define "populations" in their data. Therefore, let me put 2 answers forward.

1. You could take a pragmatic (but conceptually non-robust) approach and define a cut-off between positive and negative for any parameter (Tbet included), as you would for regular gate-drawing in 2 dimensions. In the past I have used the flowDensity::deGate function for this purpose. Then any cluster that has a median Tbet value below the threshold you deem "negative" for Tbet, and any cluster with median Tbet above the threshold you deem "positive". The more overclustered your data is, the better this approach will work, but it is not very robust, as I mentioned.

2. There is an important aspect of your workflow that you're not mentioning, and that is whether the Tbet parameter was used as an input for the clustering algorithm or not. If yes, the partitioning of the high-dimensional space *could* conceivably result in Tbet+ subsets being recognized on their own. In that instance the above approach will work reasonably well. If not, there's no guarantee that any "Tbet median as a cluster feature" approach will work unless you have a lot of Tbet positive cells. Your Tbet positive cells could be scattered across multiple clusters, and unless they happen to correlate heavily with the underlying structure defined by the clustering algorithm, no median or similar measure (apart from max :) ) will ever pick those up. Even when picked up in this scenario, such a cluster could not be considered Tbet+ since it would be a mixture of Tbet+ and Tbet- cells, but it could be used as a "signature" of sorts, I guess...
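
The two points above can be sketched together in a few lines of Python (standard library only). This is a minimal illustration under stated assumptions, not a reimplementation of flowDensity::deGate: the threshold is simply assumed to be given (for example from deGate or a manual gate), the function name `summarize_clusters` is hypothetical, and the toy values are invented. Alongside the median-based call, it reports the fraction of cells above the threshold, which exposes mixed clusters that the median hides.

```python
from collections import defaultdict
from statistics import median

def summarize_clusters(values, labels, threshold):
    """Per cluster: median marker value, fraction of cells above threshold, median-based call."""
    by_cluster = defaultdict(list)
    for value, label in zip(values, labels):
        by_cluster[label].append(value)

    summary = {}
    for label, vals in by_cluster.items():
        med = median(vals)
        frac = sum(v > threshold for v in vals) / len(vals)
        summary[label] = {
            "median": med,
            "frac_positive": frac,
            "call": "positive" if med > threshold else "negative",
        }
    return summary

# Hypothetical per-cell Tbet values and cluster assignments (illustration only)
tbet = [0.1, 0.2, 0.3, 2.0, 2.5, 3.0, 0.1, 0.2, 0.3, 0.4, 2.8]
cluster = ["A"] * 3 + ["B"] * 3 + ["C"] * 5
threshold = 1.0  # assumed cut-off, e.g. taken from a manual gate

result = summarize_clusters(tbet, cluster, threshold)
for name, stats in sorted(result.items()):
    print(name, stats)
```

In this toy example cluster C is called negative by its median even though one in five of its cells lies above the threshold, which is the "scattered positives" situation described in point 2.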

It is hard to say anything more useful, as I don't know exactly what you are trying to achieve or what your data looks like, but in general the above holds true.


Re: Phenograph - defining cluster marker 'positivity'

Posted: Mon Sep 10, 2018 8:16 pm
by jamesaries
Dear Mike and Vinko,

Thanks very much for your comments. Yes, I have been using Tbet as one of the clustering markers.

This approach seems like a good one.

On the one hand the beauty of clustering is visualising the variety and range of expression in a way that flow just can't do, but I still mentally find it helpful to think of clusters that are 'positive' and those that are 'negative'.