flow cytometry data transformation definitions

From: Ryan Brinkman <rbrinkman@bccrc.ca>
Date: Mon Dec 11 2006 - 15:26:24 EST
The Ontology for Biomedical Investigations (OBI; http://obi.sourceforge.net/) project is
developing an integrated ontology for the description of biological and medical
experiments and investigations. This includes a set of 'universal' terms that are
applicable across biological and technological domains, and domain-specific terms
relevant only to a given domain. This ontology will support the consistent annotation of
biomedical investigations, regardless of the particular field of study. The ontology will
model the design of an investigation, the protocols and instrumentation used, the
material used, the data generated and the type of analysis performed on it. This project was
formerly called the Functional Genomics Investigation Ontology (FuGO) project.

As part of this effort, OBI is currently collecting terms for data transformations used in
different domains, including flow cytometry. Data transformation is defined for this
purpose as "The process of redefining data based on some predefined rules. The values are
redefined based on a specific formula or technique.  This includes mapping data elements
from a source data format into destination data."

The following is a list of definitions for data transformations believed to be used in
flow cytometry. If you have any suggestions or comments on this list, they would be
appreciated by December 15th.



fluorescence_compensation: Subtraction of the fluorescence due to one fluorochrome from
the fluorescence due to another fluorochrome to account for the overlap of the emission
spectra.
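
As a rough illustration of the subtraction described above, here is a minimal two-channel
sketch in Python. The function name, the example values and the single spillover_fraction
are hypothetical; compensation across many fluorochromes is normally done with a full
spillover matrix, which this sketch does not attempt.

```python
def compensate_pair(primary, secondary, spillover_fraction):
    """Subtract from each primary value the share of the secondary signal that
    spills into the primary detector (spillover_fraction is an assumed constant)."""
    return [p - spillover_fraction * s for p, s in zip(primary, secondary)]


# Hypothetical per-event signals in two overlapping channels.
channel_a = [120.0, 300.0, 45.0]
channel_b = [200.0, 50.0, 10.0]
print(compensate_pair(channel_a, channel_b, 0.15))
```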

data_filtering: Data transformation process that takes a dataset and produces a subset.

gating: The deterministic filtering of a dataset based solely on unaggregated intrinsic
characteristics of members of the dataset.
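
To make the distinction concrete, the sketch below treats data_filtering as "keep a subset"
and gating as the special case where the decision for each event depends only on that
event's own values. The rectangle gate, the event layout (a dict per event) and the
parameter names FSC and SSC are assumptions chosen for illustration.

```python
def filter_events(events, keep):
    """data_filtering: return the subset of events for which keep(event) is true."""
    return [event for event in events if keep(event)]


def rectangle_gate(fsc_range, ssc_range):
    """gating: build a deterministic predicate that looks only at one event's
    own FSC and SSC values (no aggregate statistics of the dataset)."""
    (fsc_lo, fsc_hi), (ssc_lo, ssc_hi) = fsc_range, ssc_range

    def keep(event):
        return fsc_lo <= event["FSC"] <= fsc_hi and ssc_lo <= event["SSC"] <= ssc_hi

    return keep


events = [
    {"FSC": 150.0, "SSC": 80.0},
    {"FSC": 900.0, "SSC": 75.0},
    {"FSC": 300.0, "SSC": 60.0},
]
gated = filter_events(events, rectangle_gate((100.0, 500.0), (50.0, 100.0)))
print(len(gated))
```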

normalization: A data transformation process that involves scaling of measured values to
remove an effect biasing a statistic.

parameter_combination: A data transformation process that involves the creation of a new
parameter whose value is computed by applying a function to the existing parameter
values.

parameter_scaling: A data transformation process that involves the creation of a new
parameter based solely on a single source parameter.

logicle_transformation: Parameter scaling using a logicle function. The logicle function
is defined as logicle(parameter, T, w, m) = root(S(y, T, w, m) - parameter), where root()
is a standard root finding algorithm (e.g., Newton's method) that finds y such that S(y,
T, w, m) = parameter. The S function is defined as follows: if(y >= w): S(y, T, w, m) =
T * e^(-(m-w)) * (e^(y-w) - p^2 * e^(-(y-w)/p) + p^2 - 1), otherwise: S(y, T, w, m) =
- S(2w - y, T, w, m), where the operands T and m are positive real constants, w is a
non-negative real constant, p is the constant determined by w through
w = 2p * ln(p) / (p + 1), e is the base of the natural logarithm and parameter is the
source parameter. The logicle function is defined in Parks D.R., Roederer M., Moore W.A.
(2006).
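
The definition above can be sketched in Python as follows. This is a minimal
illustration, not a reference implementation: it uses bisection rather than Newton's
method as the root finder, it assumes that p is obtained from w by solving
w = 2p * ln(p) / (p + 1), and it assumes that the branch for y < w reflects S about
y = w, i.e. -S(2w - y, T, w, m). The helper names and the example values of T, w and m
(with w and m expressed in natural-log units) are hypothetical.

```python
import math


def solve_p(w):
    """Solve w = 2p*ln(p)/(p+1) for p >= 1 by bisection (w = 0 gives p = 1)."""
    def g(p):
        return 2.0 * p * math.log(p) / (p + 1.0) - w

    lo, hi = 1.0, 2.0
    while g(hi) < 0.0:  # widen the interval until it brackets the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def logicle_s(y, T, w, m, p):
    """S function; values below w are obtained by odd reflection about y = w."""
    if y < w:
        return -logicle_s(2.0 * w - y, T, w, m, p)
    u = y - w
    return T * math.exp(-(m - w)) * (
        math.exp(u) - p * p * math.exp(-u / p) + p * p - 1.0
    )


def logicle(parameter, T, w, m):
    """Return y such that S(y, T, w, m) = parameter (S is strictly increasing)."""
    p = solve_p(w)
    lo, hi, step = w, w, 1.0
    while logicle_s(lo, T, w, m, p) > parameter:  # bracket from below
        lo -= step
        step *= 2.0
    step = 1.0
    while logicle_s(hi, T, w, m, p) < parameter:  # bracket from above
        hi += step
        step *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if logicle_s(mid, T, w, m, p) < parameter:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical constants: top of scale 262144, 4.5 decades wide, 0.5 decades linearized.
T, w, m = 262144.0, 0.5 * math.log(10.0), 4.5 * math.log(10.0)
print(logicle(1000.0, T, w, m))
```

Bisection is used here only because it cannot diverge on a monotonic function; any
standard root finder fits the definition.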

hyperlog_transformation: Parameter scaling using a hyperlog function. The hyperlog (HL)
function is defined as follows: HL(parameter, b, d, r) = root(EH(y, b, d, r) -
parameter), where root() is a standard root finding algorithm (e.g., Newton's method)
that finds y such that EH(y, b, d, r) = parameter. The EH function is defined as follows:
if(y >= 0): EH(y, b, d, r) = 10^(y * d / r) + b * (d / r) * y - 1; otherwise: EH(y, b,
d, r) = - 10^(-y * d / r) + b * (d / r) * y + 1, where r, d, and b are positive real
constants and parameter is the source parameter. The hyperlog function is defined in
Bagwell C.B. (2005). Hyperlog - a flexible log-like transform for negative, zero, and
positive valued data. Cytometry A 64, 34-42.
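
Since the definition names Newton's method as an example root finder, here is a minimal
Python sketch that inverts EH that way. The starting guess (the pure logarithm of
1 + parameter), the use of the odd symmetry EH(-y) = -EH(y) for negative inputs, the
function names and the example constants are choices made for this illustration only.

```python
import math


def eh(y, b, d, r):
    """EH function as defined above, with k = d / r."""
    k = d / r
    if y >= 0:
        return 10.0 ** (k * y) + b * k * y - 1.0
    return -(10.0 ** (-k * y)) + b * k * y + 1.0


def hyperlog(parameter, b, d, r, tol=1e-12, max_iter=100):
    """Return y such that EH(y, b, d, r) = parameter, using Newton's method."""
    if parameter < 0:
        # EH is an odd function, so negative inputs mirror positive ones.
        return -hyperlog(-parameter, b, d, r, tol, max_iter)
    k = d / r
    # Start where the exponential term alone would reach the target; this lies
    # at or to the right of the root, where Newton steps on a convex function
    # move monotonically towards it.
    y = math.log10(1.0 + parameter) / k
    for _ in range(max_iter):
        slope = k * math.log(10.0) * 10.0 ** (k * abs(y)) + b * k
        delta = (eh(y, b, d, r) - parameter) / slope
        y -= delta
        if abs(delta) <= tol * max(1.0, abs(y)):
            break
    return y


# Hypothetical constants chosen only to exercise the function.
print(hyperlog(1000.0, b=35.0, d=4.0, r=10000.0))
```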

biexponential_transformation: Parameter scaling using a bi-exponential function. The
bi-exponential (BiEx) function is defined as BiEx(parameter, a, b, c, d, f) = root(B(y,
a, b, c, d, f) - parameter), where root() is a standard root finding algorithm (e.g.,
Newton's method) that finds y such that B(y, a, b, c, d, f) = parameter. The B function
is defined as B(y, a, b, c, d, f) = a * e^(b * y) - c * e^(-d * y) + f, where e is the
base of the natural logarithm, a, b, c, d are positive real constants, f is a real
constant and parameter is the source parameter.
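
Because a, b, c and d are positive, B is strictly increasing in y, so any bracketing
root finder will do. The Python sketch below uses bisection; the function names and the
example constants are hypothetical. With a = c = 0.5, b = d = 1 and f = 0, B(y) is
sinh(y), so the transformation reduces to the inverse hyperbolic sine, which gives a
convenient sanity check.

```python
import math


def biexp_b(y, a, b, c, d, f):
    """B function as defined above."""
    return a * math.exp(b * y) - c * math.exp(-d * y) + f


def biexponential(parameter, a, b, c, d, f, iterations=200):
    """Return y such that B(y, a, b, c, d, f) = parameter, found by bisection."""
    lo, hi, step = 0.0, 0.0, 1.0
    while biexp_b(lo, a, b, c, d, f) > parameter:  # bracket from below
        lo -= step
        step *= 2.0
    step = 1.0
    while biexp_b(hi, a, b, c, d, f) < parameter:  # bracket from above
        hi += step
        step *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if biexp_b(mid, a, b, c, d, f) < parameter:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# With these constants B(y) = sinh(y), so the result should match asinh.
print(biexponential(2.5, 0.5, 1.0, 0.5, 1.0, 0.0), math.asinh(2.5))
```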


split_scale_transformation: Parameter scaling using a split scale function. The split
scale function consists of a logarithmic transformation function applied to high values
and a linear transformation function applied to low values, with a fixed transition point
t chosen so that the slope (first derivative) of the resulting split scale transformation
function is continuous. The split scale transformation is defined as if(parameter <= t):
split(parameter, a, b, c, r, d) = a * parameter + b, otherwise: split(parameter, a, b, c,
r, d) = log10(c * parameter) * r/d, where parameter is the source parameter and a, b, c,
r, d are real constants chosen to make the transition smooth.
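
One way to satisfy the smoothness requirement is to treat c, r, d and the transition
point t as given and derive a and b from them, by matching the value and the first
derivative of the two branches at t. The Python sketch below does that; it assumes c and
t are positive, and the function names and example constants are hypothetical.

```python
import math


def split_scale_constants(c, r, d, t):
    """Return (a, b) so that value and slope are continuous at the transition t.

    The slope of log10(c * x) * r / d at x = t is r / (d * t * ln 10).
    """
    a = r / (d * t * math.log(10.0))
    b = math.log10(c * t) * r / d - a * t
    return a, b


def split_scale(parameter, a, b, c, r, d, t):
    """Linear branch at or below t, logarithmic branch above it."""
    if parameter <= t:
        return a * parameter + b
    return math.log10(c * parameter) * r / d


# Hypothetical constants.
c, r, d, t = 1.0, 1024.0, 4.0, 10.0
a, b = split_scale_constants(c, r, d, t)
print(split_scale(5.0, a, b, c, r, d, t), split_scale(500.0, a, b, c, r, d, t))
```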

linear_transformation: Parameter scaling using a linear function. The linear function is
defined as linear(parameter, a, b) = a * parameter + b, where a, b are real constants and
parameter is the source parameter.

quadratic_transformation: Parameter scaling using a quadratic function. The quadratic
function is defined as quadratic(parameter, a, b, c) = a*parameter^2 + b*parameter + c,
where a, b, c are real constants and parameter is the source parameter.

log_transformation: Parameter scaling using a logarithmic function. The logarithmic
function is defined for positive parameter values as f(parameter, logbase, r, d) =
log_logbase_(parameter) * r/d, where parameter is the source parameter, logbase is the
base of the logarithm (e.g., 10, e), and r and d are positive real constants. The
function is defined as 0 for non-positive parameter values. NOTE: An alternative is to
define the function as f(parameter) = log_logbase_(parameter). The r and d constants
represent linear scaling of the log transformation and make this definition consistent
with other flow-specific transformations. Also, the definition for non-positive values
is flow-specific.
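
A minimal Python sketch of this definition, including the flow-specific convention of
returning 0 for non-positive values. The function name and the default arguments
(base 10, r = d = 1, which reduces to the plain logarithm mentioned in the NOTE) are
assumptions for illustration.

```python
import math


def log_transform(parameter, logbase=10.0, r=1.0, d=1.0):
    """log_logbase(parameter) * r / d for positive values, 0 otherwise."""
    if parameter <= 0:
        return 0.0
    return math.log(parameter, logbase) * r / d


# Hypothetical scaling: 4 decades mapped onto a range of 1024.
print(log_transform(1000.0, 10.0, r=1024.0, d=4.0))
print(log_transform(-5.0))
```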

linear_parameter_combination: A parameter combination using a first degree polynomial to
linearly combine parameters. The first degree polynomial is defined as f(parameter_1,
parameter_2, ..., parameter_n, a_1, a_2, ..., a_n, b) =
a_1*parameter_1+a_2*parameter_2+...+a_n*parameter_n + b, where a_1 ... a_n are real
constants, b is a real constant, and parameter_1 ... parameter_n are source parameters.
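
For a single event, the combination above is a weighted sum plus an offset. A minimal
Python sketch, with a hypothetical function name and example values:

```python
def linear_parameter_combination(parameters, coefficients, b=0.0):
    """Return a_1*parameter_1 + ... + a_n*parameter_n + b for one event."""
    if len(parameters) != len(coefficients):
        raise ValueError("need exactly one coefficient per parameter")
    return sum(a * p for a, p in zip(coefficients, parameters)) + b


# Hypothetical event with three source parameters.
print(linear_parameter_combination([10.0, 20.0, 30.0], [0.5, 0.25, 1.0], b=2.0))
```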

Thanks,
Ryan

Ryan Brinkman, PhD
Senior Scientist, Terry Fox Laboratory
BC Cancer Research Centre &
Assistant Professor, Medical Genetics, UBC
675 West 10th Avenue
Vancouver, BC V5Z 1L3
Tel: (604) 675-8132
http://www.bccrc.ca/tfl
Received on Tue Dec 12 14:58:00 2006
