Lecture 3 Research Methods Lecture 10

Additional Main Reading
- Business Statistics by Sonia Taylor, 2001, Palgrave, New York
- Statistics for Economics, Accounting and Business by Mike Barrow, 3rd Edition, Prentice Hall.
Topics:
- The Normal Distribution
- Confidence Interval
The Normal Distribution
- “Bell shaped”
- Symmetrical
- Mean, median and mode are equal
- Asymptotic
[Figure: density $f(X)$ of a normally distributed $X$; the mean, median and mode coincide at $\mu$]

Mean and Variance
- Mean X = 10, Variance X = 4: $X \sim N(10, 4)$
- Mean Y = 50, Standard Deviation Y = 3: $Y \sim N(50, 9)$
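The $N(\mu, \sigma^2)$ notation above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; it is not part of the original slides. Note that `NormalDist` takes the standard deviation as its second argument, whereas the $N(\cdot, \cdot)$ notation quotes the variance.

```python
from statistics import NormalDist

# X ~ N(10, 4): variance 4, so the standard deviation is 2.
X = NormalDist(mu=10, sigma=2)
# Y ~ N(50, 9): standard deviation 3, so the variance is 9.
Y = NormalDist(mu=50, sigma=3)

# Mean, median and mode coincide for a normal distribution.
print(X.mean, X.median, X.mode)
print(X.variance, Y.variance)

# Symmetry: the density is the same at equal distances either side of the mean.
print(X.pdf(10 - 1.5), X.pdf(10 + 1.5))
```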
- Mean of X = Expected value of X: $E(X) = \mu_X$
- Variance of X: $V(X) = \sigma_X^2$
- $E(4X) = 4E(X)$
- $E(4 + X) = 4 + E(X)$
- $E(X + Y) = E(X) + E(Y)$
- $V(3X) = 3^2 V(X) = 9V(X)$
- $V(3 + X) = V(3) + V(X) = V(X)$, since $V(3) = 0$
- $V(X + Y) = V(X) + V(Y)$ (for independent X and Y)

How to Find Probabilities?
- Probability is the area under the curve!
- $P(c \le X \le d)$, $P(X \le c)$, $P(X \ge d)$
- Areas are difficult to estimate, so they need to be pre-calculated
- But there are different sizes of normal distribution because means and variances differ
- Therefore use the standardised normal distribution
[Figure: density $f(X)$ with the points $c$, $\mu_X$ and $d$ marked on the X axis]

The Standardised Normal Distribution
- Standardise using: $Z = \dfrac{X - \mu_X}{\sigma_X}$
- $Z \sim N(0, 1)$
- Proof!!
- Hence only one table is needed
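A sketch of the proof, using only the rules for $E$ and $V$ listed earlier and writing $\mu_X = E(X)$ and $\sigma_X^2 = V(X)$. It also relies on a standard fact that is not shown on the slides: a linear transformation of a normal variable is again normal.

```latex
\begin{align*}
E(Z) &= E\!\left(\frac{X - \mu_X}{\sigma_X}\right)
      = \frac{E(X) - \mu_X}{\sigma_X} = 0, \\
V(Z) &= V\!\left(\frac{X - \mu_X}{\sigma_X}\right)
      = \frac{V(X)}{\sigma_X^2}
      = \frac{\sigma_X^2}{\sigma_X^2} = 1.
\end{align*}
```

So the standardised variable has mean 0 and variance 1 whatever the mean and variance of $X$, which is why a single table is enough.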
Dr N. Gooroochurn, Nov 03
Lecture 3
The Standardised Normal Distribution
- Table A: upper-tail area, $P(Z \ge z)$
- Table B: area between 0 and z, $P(0 \le Z \le z)$
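The two table layouts can be reproduced from the standard normal cumulative distribution function. The sketch below uses Python's `statistics.NormalDist`; the function names `table_a` and `table_b` are hypothetical helpers chosen to mirror the slide labels, and the value of z is purely illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def table_a(z):
    """Upper-tail area P(Z >= z), the quantity tabulated in Table A."""
    return 1.0 - Z.cdf(z)

def table_b(z):
    """Area between 0 and z, P(0 <= Z <= z), the quantity tabulated in Table B."""
    return Z.cdf(z) - 0.5

z = 1.0  # illustrative value, not taken from the slides
print(round(table_a(z), 4), round(table_b(z), 4))

# For z >= 0 the two table entries always add up to 0.5.
print(round(table_a(z) + table_b(z), 4))
```

Either table is sufficient on its own, because each entry can be obtained from the other by subtracting from 0.5.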
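X is the weight of students and follows a normal distribution with mean 56 kg and variance 81. Find $P(X \le 60)$.
(i) Standardise the data and find the value of Z.
$Z = \dfrac{X - \mu_X}{\sigma_X} = \dfrac{60 - 56}{9} = 0.44$
(ii) From the Z table find the probability value.
From Table A, when z = 0.44, p = 0.3300, so $P(X \le 60) = 1 - 0.33 = 0.67$
From Table B, when z = 0.44, p = 0.1700, so $P(X \le 60) = 0.50 + 0.17 = 0.67$
[Figure: normal distribution with $\mu = 56$ and the area to the left of X = 60 shaded; standardised normal distribution with $\mu_Z = 0$, $\sigma_Z = 1$ and the area to the left of Z = 0.44 shaded]

Example 2a
If the mean is 5 and the standard deviation is 10, find $P(X \ge 6.2)$.
$Z = \dfrac{X - \mu}{\sigma} = \dfrac{6.2 - 5}{10} = 0.12$
$P(X \ge 6.2) = P(Z \ge 0.12) = 0.4522$
[Figure: normal distribution with $\mu = 5$, $\sigma = 10$ and the area to the right of X = 6.2 shaded; standardised normal distribution with $\mu_Z = 0$, $\sigma_Z = 1$ and the area to the right of Z = 0.12 shaded]

Example 2b
Find $P(2.9 \le X \le 7.1)$, again with mean 5 and standard deviation 10.
$Z = \dfrac{X - \mu}{\sigma} = \dfrac{2.9 - 5}{10} = -0.21$ and $Z = \dfrac{7.1 - 5}{10} = 0.21$
The area between 0 and 0.21 is 0.0832, and by symmetry the same on the other side, so $P(2.9 \le X \le 7.1) = 0.0832 + 0.0832 = 0.1664$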
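The worked examples can be cross-checked without tables. This is a minimal sketch using Python's `statistics.NormalDist`, not the method used in the lecture. Because it does not round z to two decimal places before the lookup, the first result differs from the table answer in the third decimal place.

```python
from statistics import NormalDist

# Weights example: mean 56, variance 81 (standard deviation 9).
p_weights = NormalDist(mu=56, sigma=9).cdf(60)   # P(X <= 60)

# Examples 2a and 2b: mean 5, standard deviation 10.
X = NormalDist(mu=5, sigma=10)
p_2a = 1 - X.cdf(6.2)                            # P(X >= 6.2)
p_2b = X.cdf(7.1) - X.cdf(2.9)                   # P(2.9 <= X <= 7.1)

print(round(p_weights, 4), round(p_2a, 4), round(p_2b, 4))
```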
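[Figure: Example 2b. Normal distribution with $\mu = 5$, $\sigma = 10$ and the area between X = 2.9 and X = 7.1 shaded; on the standardised scale ($\mu_Z = 0$, $\sigma_Z = 1$) this is the area between Z = -0.21 and $Z = \dfrac{7.1 - 5}{10} = 0.21$, an area of 0.0832 on each side of the mean, so $P(2.9 \le X \le 7.1) = 0.1664$]

Sampling Distributions: Some properties
Unbiasedness
- Unbiased: $\mu = \mu_{\bar{X}}$
- Biased: $\mu \ne \mu_{\bar{X}}$
[Figure: two sampling distributions of $\bar{X}$, one centred on $\mu$ (unbiased) and one centred on $\mu_{\bar{X}} \ne \mu$ (biased)]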
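Variability
[Figure: two sampling distributions centred on $\mu$, one with low variance and one with high variance]

Effect of Large Sample
[Figure: sampling distributions of $\bar{X}$ centred on $\mu$ for a larger and a smaller sample size; the larger sample size gives the narrower curve]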
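The effect of sample size on the spread of the sample mean can be illustrated by simulation. The sketch below assumes a hypothetical normal population with mean 50 and standard deviation 10 (values chosen for illustration, not taken from the slides) and a fixed random seed so that the run is repeatable.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000, mu=50.0, sigma=10.0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and return their means."""
    return [mean(rng.gauss(mu, sigma) for _ in range(n)) for _ in range(reps)]

small = sample_means(n=5)
large = sample_means(n=50)

# Both sets of sample means centre on the population mean (unbiasedness) ...
print(round(mean(small), 2), round(mean(large), 2))
# ... but the larger samples give a much tighter spread (lower variance).
print(round(stdev(small), 2), round(stdev(large), 2))
```

The two spreads should come out close to $\sigma/\sqrt{n}$, roughly 4.47 for n = 5 and 1.41 for n = 50.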
Central Limit Theorem
As sample size gets large enough… the sampling distribution of the mean becomes almost normal regardless of shape of population.

How Large is Large Enough?
- For most distributions, n > 30
- For fairly symmetric distributions, n > 15
- For normal distribution, the sampling distribution of the mean is always normally distributed
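A small simulation can make the theorem concrete. The sketch below assumes a strongly skewed population (exponential with rate 1, chosen here only for illustration), draws many samples of size 30, and checks how often the standardised sample mean falls inside the central 95% range of the standard normal distribution.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

n, reps = 30, 10000
mu = sigma = 1.0  # an exponential population with rate 1 has mean 1 and sd 1

def standardised_mean():
    """Standardised mean of one sample of size n from the skewed population."""
    x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (x_bar - mu) / (sigma / n ** 0.5)

z_values = [standardised_mean() for _ in range(reps)]

# If the sampling distribution is close to normal, about 95% of the
# standardised sample means should fall between -1.96 and 1.96.
share = sum(-1.96 <= z <= 1.96 for z in z_values) / reps
print(round(share, 3))
```

Re-running with a much smaller n (for example n = 3) should show a poorer match, which is the point of the "how large is large enough" guidance.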
Estimation Process
- Population: mean, $\mu$, is unknown
- Random Sample: mean $\bar{X} = 50$ (point estimate)
- Interval estimate: "I am 95% confident that $\mu$ is between 40 & 60."

Point Estimates
Point Estimate is a single value that is obtained from sample data and is used as the best guess of the corresponding population parameter.

Population parameter and its sample point estimate:
- Mean: $\mu_X$, estimated by $\bar{X}$
- Variance: $\sigma^2$, estimated by $s^2$
- Standard Deviation: $\sigma$, estimated by $s$
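As an illustration, the three point estimates can be computed with Python's `statistics` module. The data below are hypothetical and serve only to show the calculation; `variance` and `stdev` use the n - 1 divisor, which is the usual choice when estimating population values from a sample.

```python
from statistics import mean, variance, stdev

# Hypothetical sample data, for illustration only.
sample = [48, 52, 47, 55, 50, 49, 53, 46]

x_bar = mean(sample)     # point estimate of the population mean
s2 = variance(sample)    # sample variance (n - 1 divisor), estimates sigma^2
s = stdev(sample)        # sample standard deviation, estimates sigma

print(x_bar, round(s2, 2), round(s, 2))
```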
Interval Estimates
- It is an interval centered on the point estimate within which we expect the population parameter to lie.
- Provides a range of values
- Takes into consideration variation in sample statistics from sample to sample
- Based on observation from 1 sample
- Gives information about closeness to unknown population parameters
- Stated in terms of level of confidence
  - Never 100% sure
- Referred to as ‘Confidence Interval’

Confidence Interval Estimates
[Diagram: confidence intervals for the mean ($\sigma$ known, $\sigma$ unknown) and for a proportion]
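Standard Error of $\bar{X}$ ($SE_{\bar{X}}$)
- SE: shows the variation of the sample mean $\bar{X}$ (the estimate of $\mu$) across different samples
- $s$: shows the variation of the observations (X’s) within one sample
- $\sigma$ known: $SE_{\bar{X}} = \sqrt{\dfrac{\sigma^2}{n}} = \dfrac{\sigma}{\sqrt{n}}$
- $\sigma$ unknown: $SE_{\bar{X}} = \sqrt{\dfrac{s^2}{n}} = \dfrac{s}{\sqrt{n}}$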
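The two standard error formulas translate directly into code. This is a minimal sketch with hypothetical helper names (`se_known`, `se_unknown`) and made-up numbers, intended only to show which standard deviation enters each formula.

```python
from math import sqrt
from statistics import stdev

def se_known(sigma, n):
    """Standard error of the mean when the population sd (sigma) is known."""
    return sigma / sqrt(n)

def se_unknown(sample):
    """Standard error of the mean using the sample sd s in place of sigma."""
    return stdev(sample) / sqrt(len(sample))

# Hypothetical numbers, for illustration only.
print(se_known(sigma=10, n=25))

sample = [48, 52, 47, 55, 50, 49, 53, 46]
print(round(se_unknown(sample), 3))
```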
SE depends on whether the population standard deviation is known or not.

Differences between the confidence intervals for the mean with $\sigma$ known and $\sigma$ unknown:
1. Calculating the standard error of the mean
2. Whether to use the normal distribution or Student's t-distribution

Confidence Interval for $\mu$ ($\sigma$ known)
Assumptions
- Population SD ($\sigma$) is known
- Population is normally distributed
- If population is not normal, sample is large
- We know the level of confidence
Confidence interval estimate for $\mu$:
$\bar{X} - z \cdot SE_{\bar{X}} \le \mu \le \bar{X} + z \cdot SE_{\bar{X}}$

Elements of Confidence Interval
- Level of confidence: z
  - Confidence with which the interval will contain the unknown population parameter
  - Higher confidence, larger z and wider range
- Precision (range): SE
  - Closeness to the unknown parameter
  - Larger SE, wider range
- Sample size: n
  - Smaller sample, less precise, wider range

Interpretation of Confidence Interval
- Say 95% Confidence Interval
- It does not mean that there is a 95% chance that the true mean will lie within the range
- Probability is for random variables and not for parameters such as $\mu$
- If we take many samples (same size) from a population and calculate the CI for each, $\mu$ will lie within 95% of the calculated intervals
[Figure: sampling distribution of $\bar{X}$ with a central area of 95% (0.95) between $\bar{X} - z \cdot SE_{\bar{X}}$ and $\bar{X} + z \cdot SE_{\bar{X}}$, and 2.5% = 0.025 in each tail]
What value of z will give a probability of 0.025? Look in Table A and the answer is 1.96.
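The interval formula and its repeated-sampling interpretation can be sketched in a few lines. The helper name `z_interval`, the population values and the seed are all assumptions made for this illustration; `NormalDist().inv_cdf` stands in for the table lookup that gives z = 1.96.

```python
import random
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    # A tail area of (1 - confidence) / 2, e.g. 0.025 for 95%, gives z = 1.96.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sigma / sqrt(n)
    return x_bar - z * se, x_bar + z * se

# Hypothetical population, used only to illustrate the interpretation.
mu, sigma, n, reps = 50.0, 10.0, 25, 2000
rng = random.Random(2)  # fixed seed so the run is repeatable

covered = 0
for _ in range(reps):
    x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    low, high = z_interval(x_bar, sigma, n)
    covered += low <= mu <= high

# Share of intervals that contain the true mean; should be close to 0.95.
print(round(covered / reps, 3))
```

Each individual interval either contains $\mu$ or it does not; the 95% describes the long-run share of intervals that do.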
Confidence Interval for $\mu$ ($\sigma$ unknown)
- Assumptions
  - Population standard deviation is unknown
- Use Student’s t Distribution
- Confidence Interval Estimate:
  $\bar{X} - t_{\alpha/2,\,n-1} \cdot SE_{\bar{X}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot SE_{\bar{X}}$
  where $\alpha/2$ is the tail probability and $n - 1$ is the degrees of freedom
- 95%: $\alpha = 0.05$, $\alpha/2 = 0.025$
- But as the sample gets large, the t-distribution approaches the normal distribution and $t \approx z$

Degrees of Freedom (df)
- Number of observations that are free to vary after the sample mean has been calculated
- Example: Mean of 3 numbers is 2
  - $X_1 = 1$ (or any number)
  - $X_2 = 2$ (or any number)
  - $X_3 = 3$ (cannot vary)
- degrees of freedom = n - 1 = 3 - 1 = 2
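The degrees-of-freedom example can be expressed as a short calculation: once the mean and n - 1 of the values are fixed, the last value follows. The function name `last_value` is a hypothetical helper introduced for this sketch.

```python
def last_value(free_values, mean, n):
    """Value the remaining observation must take so that n values have the given mean."""
    assert len(free_values) == n - 1, "exactly n - 1 values can be chosen freely"
    return n * mean - sum(free_values)

# Slide example: three numbers with mean 2; X1 = 1 and X2 = 2 are chosen freely.
print(last_value([1, 2], mean=2, n=3))    # X3 is then fixed

# Any other free choice of two values still pins down the third.
print(last_value([10, -4], mean=2, n=3))
```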
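Student’s t Distribution
- Bell-Shaped
- Symmetric
- ‘Fatter’ Tails
[Figure: densities of the standard normal (Z), t (df = 13) and t (df = 5), all centred at 0]
Table notation: $v$ is the degrees of freedom and the tabulated tail probability is $\alpha/2$.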
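The "fatter tails" can be seen by evaluating the densities directly. Python's standard library has no t distribution, so the sketch below writes out the textbook density formula for Student's t in a hypothetical helper `t_pdf` and compares it with the standard normal density from `statistics.NormalDist`.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma (log-gamma) keeps the coefficient stable for large df.
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

z_pdf = NormalDist().pdf

# Compare the densities out in the tail (at 3) and at the centre (at 0).
curves = [("normal", z_pdf),
          ("t, df=13", lambda x: t_pdf(x, 13)),
          ("t, df=5", lambda x: t_pdf(x, 5))]
for label, density in curves:
    print(label, round(density(3.0), 4), round(density(0.0), 4))
```

The fewer the degrees of freedom, the more density sits in the tails and the less at the centre; with very large df the t density is practically the normal one.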
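Example
A random sample of n = 25 has $\bar{X} = 50$ and S = 8. Set up a 95% confidence interval estimate for $\mu$.
$\bar{X} - t_{\alpha/2,\,n-1} \dfrac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \dfrac{S}{\sqrt{n}}$
$50 - 2.0639 \dfrac{8}{\sqrt{25}} \le \mu \le 50 + 2.0639 \dfrac{8}{\sqrt{25}}$
$46.70 \le \mu \le 53.30$
where $t_{0.025,\,24} = 2.0639 \approx 2.064$.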
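The arithmetic of this example can be checked in a few lines. The critical value 2.0639 is taken from the slide as given (Python's standard library has no t quantile function), so this sketch only verifies the interval calculation itself.

```python
from math import sqrt

n, x_bar, s = 25, 50.0, 8.0
t_crit = 2.0639        # t value for alpha/2 = 0.025 and n - 1 = 24 df, as given on the slide

se = s / sqrt(n)       # standard error of the mean with sigma unknown
margin = t_crit * se
lower, upper = x_bar - margin, x_bar + margin

print(round(se, 2), round(margin, 4))
print(round(lower, 2), round(upper, 2))
```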
Learning Outcomes
You should be able to understand
- The normal distribution
- Point estimate
- Confidence interval
- The Student’s t distribution