2 Statistical Inference II: Interval Estimation, Hypothesis Testing, and Population Comparisons

Scientific decisions should be based on sound analysis and accurate information. This chapter provides the theory and interpretation of confidence intervals, hypothesis tests, and population comparisons, which are statistical constructs (tools) used to ask and answer questions about the transportation phenomena under study. Despite their enormous utility, confidence intervals are often ignored in transportation practice, and hypothesis tests and population comparisons are frequently misused and misinterpreted. The techniques discussed in this chapter can be used to formulate, test, and make informed decisions regarding a large number of hypotheses. Such questions as the following serve as examples. Does crash occurrence at a particular intersection support the notion that it is a hazardous location? Do traffic-calming measures reduce traffic speeds? Does route guidance information implemented via a variable message sign system successfully divert motorists from congested areas? Did the deregulation of the air-transport market increase the market share for business travel? Does altering the levels of operating subsidies to transit systems change their operating performance? To address these and similar types of questions, transportation researchers and professionals can apply the techniques presented in this chapter.

2.1 Confidence Intervals

In practice, the statistics calculated from samples, such as the sample average $\bar{X}$, variance $s^2$, standard deviation $s$, and others reviewed in the previous chapter, are used to estimate population parameters. For example, the sample average $\bar{X}$ is used as an estimator of the population mean $\mu_x$, the sample variance $s^2$ is an estimate of the population variance $\sigma^2$, and so on. Recall from Section 1.6 that desirable or "good" estimators satisfy four important properties: unbiasedness, efficiency, consistency, and sufficiency. However, regardless of the properties an estimator satisfies, estimates vary across samples, and there is at least some probability that an estimate will differ from the population parameter it is meant to estimate. Unlike the point estimators reviewed in the previous chapter, the focus here is on interval estimates. Interval estimates allow inferences to be drawn about a population by providing an interval, with a lower and an upper boundary, within which an unknown parameter will lie with a prespecified level of confidence. The logic behind an interval estimate is that an interval calculated using sample data contains the true population parameter with some level of confidence (the long-run proportion of times that the true population parameter is contained in the interval). Such intervals are called confidence intervals (CIs) and can be constructed for an array of levels of confidence. The lower value is called the lower confidence limit (LCL) and the upper value the upper confidence limit (UCL). For a given sample, the wider a confidence interval, the more confident the researcher is that it contains the population parameter (overall confidence is relatively high). In contrast, a relatively narrow confidence interval is less likely to contain the population parameter (overall confidence is relatively low).

All the parametric methods presented in the first four sections of this chapter make specific assumptions about the probability distributions of sample estimators, or make assumptions about the nature of the sampled populations. In particular, the assumption of an approximately normally distributed population (and sample) is usually made. As such, it is imperative that these assumptions, or requirements, be checked prior to applying the methods. When the assumptions are not met, the nonparametric statistical methods provided in Section 2.5 are more appropriate.

2.1.1 Confidence Interval for μ with Known σ²

The central limit theorem (CLT) suggests that whenever a sufficiently large random sample is drawn from any population with mean $\mu$ and standard deviation $\sigma$, the sample mean $\bar{X}$ is approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. The standardized variable $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is therefore approximately standard normal, and it can easily be verified that $Z$ has a 0.95 probability of falling in the range [-1.96, 1.96] (see Table C.1 in Appendix C). A probability statement regarding $Z$ is given as

$$P\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95. \qquad (2.1)$$

With some basic algebraic manipulation the probability statement of Equation 2.1 can be written in a different, yet equivalent form:

$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95. \qquad (2.2)$$
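
The algebraic manipulation mentioned above can be sketched step by step. This is a reconstruction under the assumption that the standardized variable is $Z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$; each line is equivalent to the previous one, so the probability of the event stays at 0.95 throughout.

```latex
\begin{align*}
-1.96 \le \frac{\bar{X}-\mu}{\sigma/\sqrt{n}} \le 1.96
&\iff -1.96\,\frac{\sigma}{\sqrt{n}} \le \bar{X}-\mu \le 1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(multiply by } \sigma/\sqrt{n} > 0\text{)} \\
&\iff -\bar{X}-1.96\,\frac{\sigma}{\sqrt{n}} \le -\mu \le -\bar{X}+1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(subtract } \bar{X}\text{)} \\
&\iff \bar{X}-1.96\,\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X}+1.96\,\frac{\sigma}{\sqrt{n}}
  && \text{(multiply by } -1\text{, reversing the inequalities)}
\end{align*}
```

Note that the random quantity in the final statement is the interval, not $\mu$: the endpoints change from sample to sample while $\mu$ stays fixed.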

Equation 2.2 reveals that, with a large number of intervals computed from different random samples drawn from the population, the proportion of values of $\bar{X}$ for which the interval $(\bar{X} - 1.96\sigma/\sqrt{n},\ \bar{X} + 1.96\sigma/\sqrt{n})$ captures $\mu$ is 0.95. This interval is called the 95% confidence interval estimator of $\mu$. A shortcut notation for this interval is

$$\bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}}. \qquad (2.3)$$
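
The long-run reading of the 95% interval can be checked with a small simulation. The sketch below is illustrative only: the population values (mean 60, standard deviation 5), the sample size of 36, and the function name `coverage` are assumptions made for the example, not values from the text. It draws many samples from a normal population, builds the known-variance interval for each, and reports the fraction of intervals that contain the true mean.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, trials, level=0.95, seed=0):
    """Fraction of known-sigma intervals (x_bar +/- z*sigma/sqrt(n)) containing mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 when level = 0.95
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the (hypothetical) population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


# Hypothetical population: mean 60, standard deviation 5, samples of size 36.
print(coverage(mu=60.0, sigma=5.0, n=36, trials=5000))
```

The printed fraction should be close to 0.95, and changing `level` should move it accordingly. Any single interval either contains the true mean or does not; the 0.95 describes the procedure over repeated sampling.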

Obviously, probabilities other than 95% can be used. For example, a 90% confidence interval is

$$\bar{X} \pm 1.645\frac{\sigma}{\sqrt{n}}.$$

In general, any confidence level can be used in estimating confidence intervals. The confidence level is $1 - \alpha$, and $Z_{\alpha/2}$ is the value of $Z$ such that the area in each of the tails under the standard normal curve is $\alpha/2$. Using this notation, the confidence interval estimator of $\mu$ can be written as

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}. \qquad (2.4)$$

Because the confidence level $1 - \alpha$ is the complement of the risk $\alpha$ that the confidence interval fails to include the actual value of $\mu$, it generally ranges between 0.90 and 0.99, reflecting 10% and 1% levels of risk of not including the true population parameter, respectively.

Example 2.1

A 95% confidence interval is desired for the mean vehicular speed on Indiana roads (see Example 1.1 for more details). First, the assumption of normality is checked; if this assumption is satisfied we can proceed with the analysis. The sample size is n = 1296, and the sample mean is $\bar{X}$ = 58.86. Suppose a long history of prior studies has shown the population standard deviation to be $\sigma$ = 5.5. Using Equation 2.4, the confidence interval can be obtained:

$$\bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}} = 58.86 \pm 1.96\frac{5.5}{\sqrt{1296}} = 58.86 \pm 0.30 = [58.56, 59.16].$$

The result indicates that the 95% confidence interval for the unknown population parameter $\mu$ consists of lower and upper bounds of 58.56 and 59.16. This suggests that intervals constructed in this way would contain the true and unknown population parameter about 95 times out of 100, on average. The confidence interval is rather "tight," meaning that the range of possible values is relatively small. This is a result of the low assumed standard deviation (or variability in the data) of the population examined, together with the large sample size.

The 90% confidence interval, using the same standard deviation, is [58.61, 59.11], and the 99% confidence interval is [58.47, 59.25]. As the confidence interval becomes wider, there is greater and greater confidence that the interval contains the true unknown population parameter.
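
The calculation in Example 2.1 can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `z_interval` is hypothetical, while the inputs (n = 1296, sample mean 58.86, population standard deviation 5.5) are those given in the example. The critical value is taken from `statistics.NormalDist().inv_cdf` rather than from a printed table, so results may differ from hand calculations in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, level=0.95):
    """Known-variance interval of Equation 2.4: x_bar +/- Z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of the standard normal
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Inputs from Example 2.1, evaluated at three confidence levels.
for level in (0.90, 0.95, 0.99):
    low, high = z_interval(x_bar=58.86, sigma=5.5, n=1296, level=level)
    print(f"{level:.0%} interval: [{low:.2f}, {high:.2f}]")
```

Running the loop shows the interval widening as the confidence level increases, which is the trade-off between confidence and precision described above.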

2.1.2 Confidence Interval for the Mean with Unknown Variance

In the previous section, a procedure was discussed for constructing confidence intervals around the mean of a normal population when the variance of the population is known. In the majority of practical sampling situations, however, the population variance is rarely known and is instead estimated from the data. When the population variance is unknown and the population is normally distributed, a $(1 - \alpha)100\%$ confidence interval for $\mu$ is given by

$$\bar{X} \pm t_{\alpha/2}\frac{s}{\sqrt{n}}, \qquad (2.5)$$

where $s$ is the square root of the estimated variance ($s^2$) and $t_{\alpha/2}$ is the value of the t distribution with $n - 1$ degrees of freedom (for a discussion of the t distribution, see Appendix A).

Example 2.2

Continuing with the previous example, a 95% confidence interval for the mean speed on Indiana roads is computed, assuming that the population variance is not known and is instead estimated from the data; the calculation below uses $s$ = 4.41. As before, the sample size is n = 1296 and the sample mean is $\bar{X}$ = 58.86. Using Equation 2.5, the confidence interval can be obtained as

$$\bar{X} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 58.86 \pm 1.96\frac{4.41}{\sqrt{1296}} = [58.62, 59.10].$$

Interestingly, inspection of probabilities associated with the t distribution (see Table C.2 in Appendix C) shows that the t distribution converges to the standard normal distribution as $n \to \infty$. Although the t distribution is the correct distribution to use whenever the population variance is unknown, when the sample size is sufficiently large the standard normal distribution can be used as an adequate approximation to the t distribution.
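
The convergence of the t distribution to the standard normal can be illustrated numerically. The sketch below is one possible approach and not the method behind Table C.2: it uses only the Python standard library, integrating the t density with Simpson's rule and finding the critical value by bisection. All function names are hypothetical. In practice a statistical library would normally supply the quantile directly (for example `scipy.stats.t.ppf`). The last lines apply Equation 2.5 to the inputs of Example 2.2.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0, integrating the density on [0, x] with Simpson's rule."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3  # 0.5 is the area to the left of zero


def t_critical(alpha, df):
    """Value t with upper-tail area alpha/2, found by bisection on [0, 50]."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(x_bar, s, n, level=0.95):
    """Equation 2.5: x_bar +/- t_{alpha/2} * s / sqrt(n), with n - 1 degrees of freedom."""
    half_width = t_critical(1 - level, n - 1) * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Critical values for 95% confidence approach the normal value as df grows.
z = NormalDist().inv_cdf(0.975)
for df in (5, 30, 1295):
    print(f"df={df}: t={t_critical(0.05, df):.4f}  z={z:.4f}")

# Inputs from Example 2.2.
low, high = t_interval(x_bar=58.86, s=4.41, n=1296)
print(f"95% t interval: [{low:.2f}, {high:.2f}]")
```

With 1295 degrees of freedom the t critical value differs from the normal value only in the third decimal place, which is why 1.96 serves as an adequate approximation in the example.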

2.1.3 Confidence Interval for a Population Proportion

Sometimes, interest centers on a qualitative (nominal scale) variable, rather than a quantitative (interval or ratio scale) variable. There might be interest in the relative frequency of some characteristic in a population such as, for example, the proportion of people in a population who are transit users. In such cases, the population proportion $p$ is estimated by $\hat{p}$, which has an approximately normal distribution provided that n is sufficiently large ($np \ge 5$ and $nq \ge 5$, where $q = 1 - p$). The mean of the sampling distribution of $\hat{p}$ is the population proportion $p$ and the standard deviation is $\sqrt{pq/n}$.

A large-sample $(1 - \alpha)100\%$ confidence interval for the population proportion $p$ is given by

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}, \qquad (2.6)$$

where the estimated sample proportion, $\hat{p}$, is equal to the number of "successes" in the sample divided by the sample size, n, and $\hat{q} = 1 - \hat{p}$.
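
The large-sample interval for a proportion can be sketched in the same way. The function name `proportion_interval` and the survey counts (120 transit users among 400 respondents) are made up for illustration and do not come from the text. Because the true proportion is unknown, the sketch checks the sample-size condition using the estimated proportion, which is an assumption of this example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, level=0.95):
    """Large-sample interval: p_hat +/- Z_{alpha/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Normal approximation check, with p_hat standing in for the unknown p.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical survey: 120 transit users among 400 respondents.
low, high = proportion_interval(successes=120, n=400)
print(f"95% interval for p: [{low:.3f}, {high:.3f}]")
```

Raising an error when the condition fails is a design choice for the sketch; an analyst might instead switch to an exact or otherwise adjusted interval for small samples.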

Example 2.3

