
Talk:Confidence interval

From Wikipedia, the free encyclopedia
WikiProject Statistics (Rated C-class, High-importance)

This article is within the scope of WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as High-importance on the importance scale.

WikiProject Mathematics (Rated C-class, High-priority)

This article is within the scope of WikiProject Mathematics, a collaborative effort to improve the coverage of mathematics on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

This article has been rated as C-Class on the project's quality scale.
This article has been rated as High-priority on the project's priority scale.

With regards to the approachability of this article

Why not use the Simple English version of this complicated article (link below)? It seems more accessible for the average reader than the in-depth one here.

https://simple.wikipedia.org/wiki/Confidence_interval

DC (talk) 14:26, 30 March 2016 (UTC)

Thank you for providing the link to the simple.wikipedia.org page. I found it to be more accessible just as you said. Thank you! -Anon 14:54 UTC, 15 Nov 2020

Misunderstandings Section

An anonymous editor has twice removed the text "nor that there is a 95% probability that the interval covers the population parameter" from the section under misunderstandings, with the claim that it is redundant. It is not redundant; I wish it were. There are erroneous accounts of confidence intervals which consider that it is incorrect to speak of the probability of a parameter lying within an interval but legitimate to speak of the probability of an interval covering the parameter. This is sometimes justified by saying that a parameter is a constant while the bounds of the interval are random variables. Both statements are in fact false and it is important to state this clearly. Dezaxa (talk) 16:42, 7 March 2017 (UTC)

I don't see how you've made a distinction between the two. How is saying a given constant value is within a given range any different from saying a given range includes (or "covers") a given constant value? And where in the cited source is that distinction made? 23.242.207.48 (talk) 06:31, 10 March 2017 (UTC)

Update: I've changed "nor" to "i.e.". This way we are not giving the false impression that the two statements are distinct, but it is still clear that either way the statement is phrased it is false. This compromise should satisfy both of our concerns I think. 23.242.207.48 (talk) 21:17, 10 March 2017 (UTC)

The Misunderstandings section (and the 2nd paragraph of the summary) harps on what seems to be a pointless distinction without a difference. Of course a past sampling event either contains the true parameter, or does not. But when we speak of "confidence", that's exactly what we mean: our confidence that the sample DOES contain the parameter. To use the true-coin analogy: a person flips a coin and keeps it hidden. I am justified in having 50% confidence in my claim that the coin landed "heads". It did or did not... but confidence refers to my level of certainty about the real, unknown value.

Now to the confidence interval: if it is correct to say that, when estimated using a particular procedure, 95 out of 100 95%-confidence intervals will contain the true parameter, then surely it must follow that I may be 95% confident that the one interval actually calculated contains that parameter. Go with the hypothetical: if the procedure was conducted 100 times, 95 of those would contain the parameter. But we have selected a subset (size one) of those 100 virtual procedures. 95 times out of 100, taking that subset will yield a sample that includes the true parameter. So I am 95% confident that this is what has happened. It either did or didn't, obviously, which is true for all statistics regarding past events. But confidence doesn't only apply to future events, but to unknown past ones.

Or one more way: I intend to do the procedure 100 times. It's expected that when I'm done, 95 of those 100 intervals will contain the true parameter. It then follows, since I have no reason to expect bias, that there's a 95% chance that the very first time I do the procedure, my interval will contain the true parameter. The fact I don't get around to doing 99 more procedures is irrelevant - I can be 95% confident that the one/first procedure performed does contain the true parameter.

How is this incorrect? (And, semi-related: I wasn't bold enough yet to edit down the pre-TOC introduction, but it unnecessarily duplicates this same Misunderstandings information). Grothmag (talk) 21:41, 6 April 2017 (UTC)
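The long-run claim being debated above can be checked with a short simulation. This is only an illustrative sketch, not part of the original discussion: the normal model, true mean, standard deviation, and sample size are all assumptions chosen for the example.

```python
# Simulate repeated 95% confidence intervals for a normal mean
# (known sigma, z-interval) and count how often they cover the truth.
import random
import statistics

random.seed(1)
TRUE_MU, SIGMA, N, Z95 = 10.0, 2.0, 30, 1.96
TRIALS = 10_000

covered = 0
for _ in range(TRIALS):
    sample = [random.gauss(TRUE_MU, SIGMA) for _ in range(N)]
    mean = statistics.fmean(sample)
    half = Z95 * SIGMA / N ** 0.5        # half-width of the z-interval
    if mean - half <= TRUE_MU <= mean + half:
        covered += 1

print(covered / TRIALS)  # close to 0.95
```

Roughly 95% of the simulated intervals cover the true mean, which is the frequentist guarantee about the procedure; whether that licenses a probability statement about any one realized interval is exactly the point in dispute in this thread.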

I couldn't agree more. As it stands, a "normal" person would see an instant contradiction between the meaning and interpretation section (The confidence interval can be expressed in terms of a single sample: "There is a 90% probability that the calculated confidence interval from some future experiment encompasses the true value of the population parameter.") and the misunderstanding section: "A 95% confidence interval does not mean that for a given realized interval there is a 95% probability that the population parameter lies within the interval". The only explanation currently offered is that an experiment in the past has a known outcome; its probability is 0 or 1, and there is no doubt - the whole concept of a 95% interval is meaningless, while an experiment in the future is unknown. But that's mere word-play. The functional difference between a past experiment and a future one is merely that the past experiment corresponds to a definite, fixed population parameter (which is either in the interval or not, no shades of grey). The point about a definite, fixed population parameter is that we must know for sure whether it's in the range, so the percentage probabilities become meaningless. But (massive "but"), we never do know for sure what its value is, past or future! That's the whole point of statistical analysis. If we knew the population value, we wouldn't have to mess around calculating probabilities, we would live in a world of certainty. Since we don't know the population value in the future (because it hasn't happened yet) and we don't know it in the past (because we can't measure it) there is actually no functional difference between the two situations. Both the past and the future populations have definite, fixed population parameter values, and we are ignorant of both (albeit for different reasons).

I feel strongly that Wikipedia articles should help people understand things, rather than cover every obscure point that anyone has ever managed to get into print; we're not in the business of scoring points by being unnecessarily clever. If it's necessary to include something that is difficult to understand, then we must explain, carefully, why the point is vital, and we must endeavor to explain it appropriately for a normal reader (there is no point in writing an article that can only be understood by those who already have a thorough understanding). My feeling is that this distinction between future and past experiments requires both justification and clarification. I'm not confident enough to remove it, because it's been raised by reputable people, but if someone who knows enough could explain why the distinction is important in the interpretation of real-world experiments, the article would be greatly strengthened. 149.155.219.44 (talk) 09:08, 22 June 2017 (UTC)

In response to Grothmag and the unsigned comment above: the difference is important and is not pointless, nor is it trying to be unnecessarily clever. The crucial difference is between the probability that a method for generating intervals yields an interval that covers the parameter, and the probability that a particular realized interval covers the parameter. If I have a method for generating intervals that I can show will produce intervals that cover the parameter 95 times out of 100 in a long run of trials, then any interval that comes out of this method is a 95% confidence interval. But a particular realized interval, say 1.5 to 1.6, does not have a 95% probability of covering the parameter. It either covers it or it doesn't. Recall that confidence intervals are a frequentist concept, so their probabilities are frequencies. There cannot be such a thing as the frequency with which 1.5 to 1.6 covers the parameter, except trivially zero or one. Of course we don't know for certain whether 1.5 to 1.6 covers the parameter, but to speak of probability as a measure of how sure we are that this is true is to use Bayesian language. If you want that kind of probability you would need to calculate a Bayesian credible interval, not a frequentist confidence interval. When you say "I can be 95% confident that the one/first procedure performed does contain the true value" you seem to be using the word "confident" to mean a Bayesian probability, i.e. a degree of credibility in the truth of a proposition. A 95% probability in this sense does not follow merely from the fact that the interval comes from a method that yields covering intervals 95% of the time. This would only hold true if 1. there are no informative priors; 2. the CI is a sufficient statistic, i.e. it is capturing all the information from your experiment; and 3. there are no nuisance parameters. These are the conditions under which a confidence interval and a credible interval coincide.
Dezaxa (talk) 11:58, 26 September 2017 (UTC)

I have to say, I have been mulling over this point, and the way it is currently explained, there does seem to be a distinction without a difference. I don't think that's actually the case, but it would help if there were some example to illustrate the distinction. The only example I can think of is the error of thinking the confidence interval represents a range of values that covers 95% of some probabilistic distribution of the true parameter. So an error in reasoning would be to think that, having obtained a particular confidence interval, choosing any of the values inside it as the true parameter puts you within x/2 of the true parameter with 95% probability (where x is the length of the CI). In actuality, you do not know how close or far any value is from the true parameter from only a single confidence interval calculation, because the true parameter has no distribution. You could be a little bit off or you could be way off. Or, more generally, you could make the mistake of thinking that any values within your confidence interval were somehow useful, when in fact they were meaningless.

But the way it's explained it doesn't answer the following question: Take the statement "The true value is an element of the set of values contained in the confidence interval." This will be correct 95% of the time for a 95% confidence interval. If you were making a bet based on that statement, you would still win the bet 95% of the time, wouldn't you? So why shouldn't one think of this as a probability? Why is this wrong (if it is wrong)? If it's not wrong, why is that different from saying the probability that the true value is inside the confidence interval is 95%? What's an example that would show the difference? 108.29.37.131 (talk) 21:24, 14 June 2018 (UTC)

I appreciate your efforts to explain, Dezaxa, and I wish I had noticed them earlier to respond in a more timely fashion. I remain unconvinced, partly because your response runs counter to this quote from Wikipedia's entry on "frequentist probability": "An event is defined as a particular subset of the sample space to be considered. For any given event, only one of two possibilities may hold: it occurs or it does not. The relative frequency of occurrence of an event, observed in a number of repetitions of the experiment, is a measure of the probability of that event". Note that a (non zero or one) probability still exists for the event, since probability is a measure of the sample space, not of the event.

Therefore we can still speak of the probability of an event, regardless of the fact it has occurred or has not (the two possibilities). The event had a particular probability in sample space... in this case, the true parameter falling within the "confidence interval" of the interval-generating method is the event in question. As you said yourself, the confidence interval is a frequentist concept. If the method generates intervals including the parameter 95% of the time, then any particular interval, regardless of whether it contains the parameter or not, had a 95% probability of doing so. This is where my objection arises, and where the gambling analogy of 108.29.37.131 fits nicely. This is a probability, in one sense of probability. If we are correct in our confidence interval, then it really should give us "confidence" ... We should be able to say, with 95% or whatever "odds" of being correct, that the true parameter falls within the interval generated by our method. I really don't see how this statement can be incorrect unless our method itself is flawed, in which case our confidence intervals are themselves incorrect (as they would be if nuisance parameters were ignored). Grothmag (talk) 23:27, 12 September 2018 (UTC)

I can only repeat that there is a crucial difference between speaking of probability as the long-run frequency of a hypothetical set of trials and the 'degree of confidence' attaching to a particular, observed instance of a trial. Once a trial has taken place and the result is known, this provides new information that must be conditionalised upon when assessing the probability. If my explanation is not clear enough, you might care to consult the paper referenced in the article: Morey, R. D.; Hoekstra, R.; Rouder, J. N.; Lee, M. D.; Wagenmakers, E.-J. (2016). "The Fallacy of Placing Confidence in Confidence Intervals". Psychonomic Bulletin & Review. 23 (1): 103–123. doi:10.3758/s13423-015-0947-8. PMC 4742505. PMID 26450628. It is worth noting also that the section "Counterexamples - confidence procedure for uniform location" provides an example of how an interval can be a 50% CI and yet not have a 50% probability of covering the parameter. It is really an instance of the same issue. Dezaxa (talk) 05:55, 30 November 2018 (UTC)
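The uniform-location counterexample mentioned above can also be simulated. This sketch is an illustrative reconstruction of the standard version of that example, not text quoted from the article: two observations are drawn uniformly from an interval of width 1 centred on an unknown θ, and the interval from the smaller to the larger observation is a genuine 50% confidence procedure. Yet once the data are seen, a wide realized interval (width greater than 1/2) covers θ with certainty, not with probability 0.5.

```python
# 50% confidence procedure: [min(x1, x2), max(x1, x2)] covers theta
# exactly when the two draws straddle it, which happens half the time.
import random

random.seed(2)
THETA = 0.0
TRIALS = 100_000

cover_all = cover_wide = wide = 0
for _ in range(TRIALS):
    x1 = random.uniform(THETA - 0.5, THETA + 0.5)
    x2 = random.uniform(THETA - 0.5, THETA + 0.5)
    lo, hi = min(x1, x2), max(x1, x2)
    covers = lo <= THETA <= hi
    cover_all += covers
    if hi - lo > 0.5:            # a "wide" realized interval
        wide += 1
        cover_wide += covers

print(cover_all / TRIALS)        # about 0.5: a genuine 50% procedure
print(cover_wide / wide)         # 1.0: wide realizations always cover theta
```

Two draws more than 1/2 apart cannot both lie on the same side of θ (each side has width only 1/2), so every wide realized interval is guaranteed to cover; the 50% figure describes the procedure, not any particular realized interval.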

Dezaxa, thank you. The counter-example in the article is, indeed, a very good demonstration of your point, and though I did check out and enjoy Morey et al. too, that was really just the icing on the cake - I'm convinced. The only thing I might still stick on is the article's statement "once an interval is calculated, this interval either covers the parameter value or it does not; it is no longer a matter of probability" - since (as the Frequentist Probability article indicates) one can still discuss the probability that the interval includes the (presumably still unknown) parameter - events can have (had) probabilities after the fact. One just can't naively call it 95% (or whatever value arising from the confidence procedure). But that's a different issue, and I'll let it go. Grothmag (talk) 01:32, 4 December 2018 (UTC)

Does anyone still want a simple, everyday example for a (say 95%) confidence interval not being the same as there being a 95% probability of the parameter being contained within it? Consider using Google Maps (other mapping software is available) on your phone. You might use it dozens of times per week and most of the time the circle it draws contains your actual position. Sometimes, perhaps you have emerged from an underground public transport system, it draws a small circle but you are many miles (other distance units are available) away. Get off a plane and it might say you were still at the departure city. If Google correctly encompasses you 95% of the occasions you use the app it can be regarded as a frequentist experiment with your position as the parameter being estimated. The mapping circles are genuine 95% confidence intervals as they will truly contain the parameter value (where you actually are) 95% of the times you run the experiment (use the app). I hope it's clear that for any single use of the app you are either correctly placed inside the circle (i.e. it is 100% probable that you are within the circle) or you are outside ( = 0% probable that your position is within the circle). Many people assert that a 95% confidence interval means they can be "95% confident" about a single interval. In this example the app user would be closer to the correct meaning if they said "95% of the time Google is spot on but 5% of the time it is completely wrong". The "confidence" refers to how often the experimental method (such as the mapping application) correctly encompasses the parameter value. — Preceding unsigned comment added by 82.0.253.251 (talk) 18:53, 11 December 2019 (UTC)

The article contradicts itself

Due to this edit, the introduction is currently spreading precisely the misunderstanding that the article later warns about. The introduction says:

Given observations and a confidence level γ, a valid confidence interval has a probability γ of containing the true underlying parameter.

In direct contradiction, the article later rightly warns:

A 95% confidence level does not mean that for a given realized interval there is a 95% probability that the population parameter lies within the interval [...].

Joriki (talk) 11:23, 12 May 2020 (UTC)

Another incorrect statement:

Therefore, there is a 5% probability that the true incidence ratio may lie out of the range of 1.4 to 2.6 values.

According to the textbook Introduction to Data Science, the statement in question is false: a confidence level of γ does not mean we can say the interval contains the underlying true parameter with probability γ. How many references do we need before we can remove that misleading claim from the introduction?

TheKenster (talk) 20:06, 3 November 2020 (UTC)

I am a statistics expert. In light of Wikipedia:Be_bold, I have corrected the mistakes mentioned in this subsection, as well as a couple others that I caught. Stellaathena (talk) 23:30, 17 November 2020 (UTC)

Needs a simple non-technical explanation to begin the article.

I think the first section of this article should more clearly and simply explain what a confidence interval is, and what motivates their use. I propose something like:

The purpose of the field of statistics is to use functions of a sample (known as statistics) to make predictions about parameters of a sampled population. Because in general a sample will not contain all the information about a population, there will be a degree of uncertainty in the predictions made, and so estimated parameters will be described by random variables rather than fixed values. A confidence interval (or CI) gives, for a specified probability, an interval such that the true value of the parameter lies within it with that probability, given the sample.

--Effervecent (talk) 14:35, 15 May 2020 (UTC)

Effervecent Be bold and do it. Firestar464 (talk) 12:03, 30 March 2021 (UTC)

Cleanup tag

@Botterweg14: can you list the parts you think might be inaccurate? I don't have the bandwidth to rewrite at the moment, but I can help fix inaccuracies. Wikiacc () 00:28, 17 September 2020 (UTC)
