How should the standard deviation or the 95% CI be interpreted when describing a population?



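When reading scientific articles, I see different ways of summarizing continuous variables when describing the study population.

For example, if I want to describe the mean age of a study population, I can report it together with either the standard deviation or the 95% CI. Although the two are obviously related, my question is which one makes the most sense. My take is that the SD provides information about the specific study population, while the 95% confidence interval tells me how well my sample mean "matches" the mean of the total population. So if I want to describe the study population, the SD seems to be my best choice. If the mean age were the outcome variable of my study, however, I could imagine using the 95% CI.

Any thoughts on this?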


It depends on what information you want. The CI describes the population you sampled from, whereas the standard deviation can describe both the sample and the population (if we take the sample standard deviation $s$ to be a good estimator $\hat\sigma$ of the true population value $\sigma$).
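To make the relationship between the two quantities concrete, here is the usual textbook form, written as a sketch. The notation ($n$ observations $x_i$, sample mean $\bar{x}$, and the $t$-quantile $t_{0.975,\,n-1}$) is my own, not the question's, and the interval assumes the sampling distribution of the mean is approximately normal:

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2}, \qquad
\text{95\% CI} = \bar{x} \pm t_{0.975,\,n-1}\,\frac{s}{\sqrt{n}}
$$

Note that $s$ settles around $\sigma$ as $n$ grows, while the width of the CI shrinks roughly like $1/\sqrt{n}$. That is one way to see that the SD is about the spread of individuals and the CI is about the uncertainty in the estimated mean.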

Looking at your question, you seem to be bumping into one of those times when casual vocabulary and statistical vocabulary trample on each other's toes. What you call the "specific study population" is your sample, and the "total population" is simply the population.

With that terminological note, using a confidence interval to describe your sample isn't appropriate, because the sample is finite and "complete", and you use descriptive statistics (standard deviation, etc.) on completely sampled groups. Inferential statistics such as the CI should only be used to make statements about the "incomplete", i.e. the population from which you drew your sample. In this sense, the CI doesn't describe your sample at all, but rather the population you drew your sample from. (Somewhat more precisely, but still simplifying a bit, the CI is an interval computed from the sample using a procedure that, in at least 95% of random samples from any population, will include the population's mean. [Thanks @whuber!] But make sure to look at the comments below for some "fine print" and discussion on the definition of the CI.) You correctly got at this intuition in your question, even if you stumbled a bit on the vocabulary.
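If the "procedure" wording feels abstract, a small simulation may help. The sketch below is only an illustration under assumptions of my own: a normally distributed population with a made-up mean of 40 and SD of 12, samples of size 100, and a normal-approximation critical value ($z \approx 1.96$) in place of the exact $t$-quantile. It repeatedly draws a sample, builds the interval, and counts how often the interval contains the true mean.

```python
import math
import random
import statistics


def ci_covers_mean(rng, pop_mean, pop_sd, n, z):
    """Draw one sample of size n and report whether its CI contains pop_mean."""
    sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
    mean = statistics.fmean(sample)
    # Standard error of the mean, using the sample SD (n - 1 denominator).
    sem = statistics.stdev(sample) / math.sqrt(n)
    return mean - z * sem <= pop_mean <= mean + z * sem


def estimate_coverage(trials=2000, n=100, pop_mean=40.0, pop_sd=12.0, seed=0):
    """Fraction of simulated samples whose 95% CI contains the population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Normal-approximation critical value; a t-quantile would be slightly larger.
    z = statistics.NormalDist().inv_cdf(0.975)
    hits = sum(ci_covers_mean(rng, pop_mean, pop_sd, n, z) for _ in range(trials))
    return hits / trials


coverage = estimate_coverage()
print(f"Fraction of intervals containing the population mean: {coverage:.3f}")
```

The fraction should land close to 0.95. The point is that the 95% belongs to the procedure over many samples, not to any single interval you happen to compute.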

In terms of practical advice, it really depends on what you want to do. If you just want to claim that your sample was well-balanced / representative / whatever, use descriptive statistics on your sample. If you want to make inferences about certain parameters in the broader population, use inferential statistics.
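As a rough illustration of the two uses, here is a minimal sketch that reports both from the same data. The ages are made up, `describe_and_infer` is a hypothetical helper name, and the interval uses a normal approximation; with a sample this small you would normally swap in a $t$-quantile, which gives a somewhat wider interval.

```python
import math
import statistics


def describe_and_infer(values, confidence=0.95):
    """Return descriptive (mean, SD) and inferential (CI) summaries of values."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD: describes spread in the sample
    sem = sd / math.sqrt(n)        # standard error: uncertainty in the mean
    # Normal-approximation critical value (an assumption; see the note above).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return {"n": n, "mean": mean, "sd": sd, "ci": (mean - z * sem, mean + z * sem)}


# Hypothetical ages of ten study participants.
ages = [34, 45, 29, 52, 41, 38, 47, 33, 56, 40]
summary = describe_and_infer(ages)

# Descriptive: "our participants were this old, with this much spread".
print(f"Sample: mean age {summary['mean']:.1f} (SD {summary['sd']:.1f}), n = {summary['n']}")

# Inferential: "the population mean age is plausibly in this range".
low, high = summary["ci"]
print(f"Population mean age, approx. 95% CI: {low:.1f} to {high:.1f}")
```

In a "Table 1" style description of participants you would report the first line; if mean age were itself the quantity being estimated, the second line is the one that answers the question.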

Bottom line: it depends on whether you want to describe your sample or use your sample to make statements about the general population.