Just as the median is a more robust estimator of central tendency than
the mean in the presence of outliers or skewed distributions, the median
absolute deviation ("mad") is a more robust estimator of variability
than the variance or standard deviation in the same setting. It can be
computed easily in R (open source: http://www.r-project.org/), perhaps
using the rflowcyt package to import FCS data.

Pietro Bulian
Servizio di Onco-Ematologia Clinico-Sperimentale
I.R.C.C.S. Centro di Riferimento Oncologico
Via Franco Gallini 2
33081 Aviano (PN) - Italy
phone: +39 0434 659 412
fax: +39 0434 659 409
e-mail: pbulian@cro.it

----- Original Message -----
From: "James Wood" <jcswood@mac.com>
To: cyto-inbox
Sent: Tuesday, February 19, 2008 4:24 AM
Subject: Re: RE : Statistics questions

> For normal distributions only, the standard error of the median is
> about 25% larger than the standard error of the mean. If the
> distribution is skewed, then the standard error of the median is very
> hard to calculate. The following link is to a useful simulation that
> can be used to show how the standard error of the mean and the
> standard error of the median change with the distribution shape. If
> you make a custom skewed distribution in the lower channels and add
> some outliers in the upper channels, you can see for yourself how the
> standard error of the mean can be much larger than the standard error
> of the median.
>
> http://onlinestatbook.com/stat_sim/sampling_dist/index.html
>
> Jim Wood
>
>
> On 2/17/08 3:44 PM, "Mario Roederer" <roederer@drmr.com> wrote:
>
>> Yes, the statistics on the MFI follow very closely that on the
>> frequency.
>>
>> The standard error of the mean of a population (SEM), which defines
>> how precisely you know the "true" mean, is equal to the standard
>> deviation divided by the square root of the number of events. The SD
>> (or CV) tells you how broad the distribution is, but will not change
>> substantially as you collect more and more events.
>> But the SEM decreases with increased events. In your example, if you
>> count 10,000 events vs. 1,000 events, the SEM will be about 30% as
>> large on the first sample -- saying that you know the mean with 3x
>> increased precision.
>>
>> Note that while the use of the SD is most appropriate for Gaussian
>> (normal) distributions, the relationship between increased precision
>> of the mean and increasing numbers of events is independent of the
>> actual distribution.
>>
>> By the way, I advocate use of the median fluorescence intensity rather
>> than the mean, since the median is less subject to outliers
>> (particularly when you are dealing with log-distributions). I don't
>> know (nor did a statistician I asked!) how the variance in the median
>> will relate to the number of events used to calculate it. However, I
>> think the square-root relationship is probably a reasonable one to use
>> as an estimator... i.e., you need 100x the number of events to
>> increase your precision 10x.
>>
>> Finally, note that once you have more than a few dozen events on which
>> you are computing the MFI (or frequency), the statistical error in the
>> MFI (or %) is probably much lower than experimental error, so it's
>> kind of pointless to collect much more than this if your goal is only
>> to increase precision of the measurement.
>>
>> mr
>>
>> On Feb 15, 2008, at 8:50 AM, Carl Simard wrote:
>>
>>> Since we're on the subject of Poisson statistics, number of events
>>> and CV, there's a question I have been asking myself for some time.
>>> Do all these statistical limitations also apply to MFI?
>>>
>>> Just to give a practical example, let's say I'm not interested in the
>>> proportion of cells being positive or negative for a given marker
>>> (the experiment is done on a cultured cell line, and thus all cells
>>> respond practically homogeneously to a given treatment).
>>> Instead, I just want to look at changes in the relative expression of
>>> this marker based on changes in MFI readings. In this case, will the
>>> MFI be more significant if you count, let's say, 10,000 cells versus
>>> 1,000?
>>>
>>> Carl
>>>
>>> -----Original Message-----
>>> From: Howard Shapiro [mailto:hms@shapirolab.com]
>>> Sent: February 13, 2008 21:40
>>> To: Cytometry Mailing List
>>> Subject: Re: Statistics questions
>>>
>>>
>>>
>>> Maciej Simm wrote (in response to Petra Disterer)
>>>
>>>>
>>>>> 2. I've read about coefficient of variation and that one should have
>>>>> more than 400 positive events to have a CV of less than 5%. In my
>>>>> understanding that means that if I have 400 positive events the
>>>>> probability that these positive events are due to chance is less
>>>>> than 5%. I'm not sure that I have understood this correctly.
>>>>
>>>> CV=100/sqrt(400) or 5%, so "yes". This was elegantly described on
>>>> this list before - http://www.cyto.purdue.edu/hmarchiv/2001/0261.htm
>>>>
>>> I'm glad Maciej dug up the pointer to my 2001 posting, which saves me
>>> some writing this time around, but Petra seems to be laboring under a
>>> misconception about Poisson statistics. If you count n events, and
>>> there are no other sources of variance in the measurement, the "mean"
>>> of your measurement is n, i.e., the number of events you count, and
>>> the expected standard deviation of a series of counts of events from
>>> the same sample is the square root of n. Since the coefficient of
>>> variation, in per cent, is 100 times the standard deviation divided
>>> by the mean, i.e., 100 divided by the square root of n, you get 5 per
>>> cent as the minimum possible CV for a count of 400 objects, 10 per
>>> cent for a count of 100 objects, 1 per cent for a count of 10,000
>>> objects, etc.
>>> Poisson statistics therefore tell you how many objects you actually
>>> need to count to get the result to a desired level of precision. They
>>> tell you *nothing* about the probability that the events you count
>>> are or are not due to chance!!!
>>>
>>> A major reason those of us who can afford it use cytometry is that it
>>> is usually difficult for even the keenest-eyed and best-trained human
>>> observer to sit at a microscope and count several hundred of
>>> anything. When I was a medical student, one of the hardest tests my
>>> classmates and I had to do in our role as the de facto "clinical
>>> laboratory" in the emergency department of a busy city hospital was
>>> the blood reticulocyte count. Reticulocytes are immature red cells
>>> that have not completely shed what's left of their protein synthetic
>>> apparatus (ribosomes and endoplasmic reticulum). They take between
>>> one and two days to do this, and, since red cells normally last about
>>> 120 days in circulation, we expect that about one per cent of red
>>> cells in blood will be reticulocytes. Reticulocytes can be
>>> demonstrated on a blood smear by staining them with a dye such as new
>>> methylene blue, which will precipitate the ribosomes into a "network"
>>> (whence comes the term reticulocyte), which, if you are sharp-eyed,
>>> persistent, and lucky, you will see as one or a few blue dots within
>>> the red cell. The reticulocyte count goes up if someone has lost
>>> blood and is replacing it, and down if he or she has a condition such
>>> as vitamin B12 deficiency, in which the marrow isn't generating new
>>> red cells. To do a reticulocyte count on a blood smear, you look at
>>> and count 1,000 red cells, noting the number of reticulocytes you see
>>> while you do this.
>>> If a normal person has about 1 per cent reticulocytes, you can expect
>>> to count 10 of them while you cruise (or bruise) through 1,000 red
>>> cells, meaning the CV of your measurement will be over 30 per cent.
>>> If you do the count the next day and only count 7, or count a
>>> whopping 13, it is not at all unlikely that there has been no real
>>> change in the patient's hematologic status. That's what we learn from
>>> Poisson statistics.
>>>
>>> These days, the Clinical Laboratory Improvement Amendments (CLIA)
>>> have made it illegal for medical students to be used as lab slaves,
>>> at least in the United States, and reticulocytes are typically
>>> counted in a properly certified lab in a flow cytometer, using a dye
>>> such as thiazole orange, which binds to nucleic acid, and analyzing
>>> at least a few tens of thousands of red cells in toto, which yields a
>>> measurement with a respectable CV. Since red cells spit out their
>>> nuclei on the way to becoming reticulocytes, they don't (except in
>>> pathologic situations) contain DNA, so dyes that bind to both DNA and
>>> RNA are usually OK for reticulocyte counting. It only took about five
>>> years for the hematologists to get comfortable with this.
>>>
>>> Reflecting on my career in cytometry, most of it seems to have been
>>> spent automating various parts of the "scut" lab work I was forced to
>>> do as a medical student; as many of you may know, I am now looking at
>>> cytometric diagnosis of TB (which I did do in medical school) and
>>> malaria (which I don't recall ever doing, but might have once or
>>> twice). These diseases were, and are, much bigger problems in
>>> resource-poor countries than in places where laboratories can afford
>>> both flow cytometers and the infrastructure needed to run them.
>>> TB is typically diagnosed by transmitted light microscopy of sputum
>>> smears using the Ziehl-Neelsen stain developed in 1883; malaria is
>>> diagnosed by transmitted light microscopy of blood smears using the
>>> Giemsa stain developed in 1904.
>>>
>>> The vast majority of the people who use these stains don't know how
>>> or why they work; when they try to evaluate modifications of the
>>> staining method, they typically compare slides from clinical samples
>>> on which examination of several hundred high-power microscope fields
>>> on a blood or sputum slide will often turn up fewer than ten
>>> pathogens. Since Poisson statistics have, for the most part, not
>>> impinged on the consciousness of TB and malaria diagnosticians, it is
>>> not generally appreciated that many such comparisons are meaningless.
>>>
>>> Now that LEDs have become cheap, there is a big push toward equipping
>>> TB labs in resource-poor countries with (relatively) inexpensive
>>> fluorescence microscopes, so they can use stains based on auramine O,
>>> which is a blue-excited, green fluorescent dye that stains nucleic
>>> acids (although the texts on TB erroneously describe it as staining
>>> the mycolic acid in the cell wall) instead of the Ziehl-Neelsen
>>> stain. That's going to be a waste of money; true, you can look at a
>>> slide at somewhat lower magnification using fluorescence, but you're
>>> still up against Poisson statistics, and you really need to look at
>>> much more of the slide than is practical even with a fluorescence
>>> microscope. That's what cytometry is for. If it takes the TB
>>> diagnosticians as long to catch on as it took the hematologists, we
>>> can chalk up a million or so preventable deaths to the steep learning
>>> curve. And the same problem, and the same grim numbers, turn up for
>>> malaria.
>>>
>>> The foundations of our science were laid by people very much focused
>>> on human disease (OK, so the original paper on Poisson statistics and
>>> cell counting was written by somebody at the Guinness brewery). The
>>> synthetic dyes that got us from empirical microscopy to cytometry
>>> originated from an attempted synthesis of quinine - for malaria
>>> treatment - that went wrong. Paul Ehrlich, who mastered the use of
>>> those dyes (and caught TB in the process), made the inductive leap
>>> from selective staining of different cell types to selective
>>> chemotherapy; many of the compounds he worked with came from Hoechst,
>>> still a manufacturer of both dyes and drugs.
>>>
>>> Whatever else we do with cytometry, we are all ambassadors to our
>>> colleagues. There are undoubtedly people coming through flow labs who
>>> want little more than to run their samples and get back to their
>>> patients or labs. These folks may not realize, as I hope you do, that
>>> cytometry does more than merely save time and labor. Try to see that
>>> they learn something useful while you have their attention.
>>>
>>> -Howard
>>>
>>> (P.S. A lot of this stuff will be in the new book)

Received on Thu Feb 21 00:38:00 2008
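
For anyone who wants to check the arithmetic in this thread, here is a minimal Python sketch of the three quantities discussed: the minimum (Poisson) CV of a count, the square-root scaling of the SEM with the number of events, and the median absolute deviation. The function names and the example intensities are made up for illustration; they do not come from any cytometry package, and the MAD here is the plain, unscaled one.

```python
import math
import statistics


def poisson_cv_percent(n_events):
    """Minimum CV (%) of a count of n events: 100 * SD / mean = 100 * sqrt(n) / n."""
    return 100.0 * math.sqrt(n_events) / n_events


def sem(values):
    """Standard error of the mean: sample SD divided by sqrt(number of events)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def sem_ratio(n_small, n_large):
    """SEM with n_large events relative to n_small events, assuming the same SD."""
    return math.sqrt(n_small / n_large)


def mad(values):
    """Unscaled median absolute deviation from the median."""
    centre = statistics.median(values)
    return statistics.median([abs(v - centre) for v in values])


if __name__ == "__main__":
    # Counts mentioned in the thread: 10 reticulocytes, 100, 400, 10,000 events.
    for n in (10, 100, 400, 10_000):
        print(f"{n:>6} events counted: CV >= {poisson_cv_percent(n):.1f}%")

    # 10,000 vs. 1,000 events: SEM is about 30% as large.
    print(f"SEM ratio: {sem_ratio(1_000, 10_000):.2f}")

    # Hypothetical intensities with a single outlier in the upper channels.
    intensities = [98, 101, 99, 102, 100, 5000]
    print("mean  :", statistics.mean(intensities))
    print("median:", statistics.median(intensities))
    print("SD    :", round(statistics.stdev(intensities), 1))
    print("MAD   :", mad(intensities))
```

Note that R's mad() multiplies the result by 1.4826 by default, so that it is comparable to the SD for normally distributed data; multiply the value above by the same constant if you want to match it.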