The Relative Impact of Interviewer Effects and Sample Design Effects on Survey Precision
Author(s): Colm O'Muircheartaigh and Pamela Campanelli
Source: Journal of the Royal Statistical Society. Series A (Statistics in Society), Vol. 161, No. 1
(1998), pp. 63-77
Published by: Wiley for the Royal Statistical Society
Stable URL: http://www.jstor.org/stable/2983554 .
J. R. Statist. Soc. A (1998) 161, Part 1, pp. 63-77
The relative impact of interviewer effects and sample design effects on survey precision
Colm O'Muircheartaigh†
London School of Economics and Political Science, UK
and Pamela Campanelli
Social and Community Planning Research, London, UK
[Received May 1996. Revised February 1997]
Summary. One of the principal sources of error in data collected from structured face-to-face interviews is the interviewer. The other major component of imprecision in survey estimates is sampling variance. It is rare, however, to find studies in which the complex sampling variance and the complex interviewer variance are both computed. This paper compares the relative impact of interviewer effects and sample design effects on survey precision by making use of an interpenetrated primary sampling unit-interviewer experiment which was designed by the authors for implementation in the second wave of the British Household Panel Study as part of its scientific programme. It also illustrates the use of a multilevel (hierarchical) approach in which the interviewer and sample design effects are estimated simultaneously while being incorporated in a substantive model of interest.
Keywords: Interviewer effect; Interviewer variance; Multilevel models; Response variance
1. Introduction
The interviewer is seen as one of the principal sources of error in data collected from structured face-to-face interviews. Survey statisticians have expressed the effect in formal statistical models of two kinds. In the analysis-of-variance (ANOVA) framework the errors are seen as net biases for the individual interviewers and the effect is seen as the increase in variance due to the variability among these biases. The alternative approach is to consider the interviewer effect to arise from the creation of positive correlations between the response deviations contained in (almost all) survey data; the increase in the variance of a mean is due to the positive covariance among these deviations. Studies of interviewer variability date from the 1940s (see, for example, Mahalanobis (1946)). The ANOVA model in this context was expounded by Kish (1962) and developed by Hartley and Rao (1978) and others; the correlation model was first presented by Hansen et al. (1961) (the Census Bureau model) and extended by Fellegi (1964, 1974).
The other major component of imprecision in survey estimates is sampling variance. It is known that for most complex sample survey designs the precision of estimators is low compared with that of simple random sample designs of the same size. Area clusters typically form the sampling units for complex sample designs and the loss of precision is due to positive correlations between people belonging to the same area clusters.
†Address for correspondence: Methodology Institute, London School of Economics and Political Science, Houghton Street, London, WC2A 2AE, UK.
E-mail: colm@lse.ac.uk
© 1998 Royal Statistical Society 0964-1998/98/161063
There are many other sources of measurement error in surveys. Some (e.g. coder variance) are relatively straightforward to estimate through either replication or interpenetration. Others (e.g. question wording effects) require special interventions in the survey process for their investigation. A comprehensive review may be found in Biemer et al. (1989).
Though there are some studies in which the complex sampling variance and the complex interviewer variance are both computed (Bailey et al. (1978) for the US National Crime Survey and O'Muircheartaigh (1984a, b) for the World Fertility Survey in Lesotho and Peru are examples), such studies are rare. This is due to a combination of design and analytic challenges. The norm for face-to-face interview surveys in both the USA and the UK is to have the workload from a given primary sampling unit (PSU) assigned to a single interviewer and, moreover, to have each interviewer work in only one PSU. This confounds the sampling and non-sampling variances. Such confounding is removed by an interpenetrated design in which respondents are assigned at random to interviewers. Owing to cost considerations, these designs are rarely employed in face-to-face surveys. Even for telephone surveys, where the practical problems are less severe, though non-trivial (see Groves and Magilavy (1986)), such studies are uncommon.
Whereas it has been possible to carry out a simultaneous analysis of interviewer and cluster effects for sample means and other simple statistics, it is only recently that software has become available to estimate interviewer and cluster effects simultaneously while incorporating these effects directly into a substantive model of interest. This is possible through the use of a cross-classified multilevel model using the software package MLn (Rasbash et al., 1995); alternative programs for multilevel analysis are VARCL (Longford, 1988) and HLM (Bryk et al., 1986). (Note that, technically, means and proportions estimated from survey data are ratio estimates as there is uncontrolled variation in the sample size. For the British Household Panel Study (BHPS), with the selection of PSUs with probability proportional to size and equal probabilities overall, this variation is fairly tightly controlled.)
This paper compares the relative impact of interviewer effects and sample design effects on survey precision by making use of an interpenetrated PSU-interviewer experiment which was designed by the authors for implementation in the second wave of the BHPS. Section 2 describes in detail the data and methods used. Section 3 explores the results over all BHPS variables and illustrates the use of a multilevel (hierarchical) approach in which the interviewer and sample design effects are estimated simultaneously while being incorporated in a substantive model of interest. Finally, Section 4 summarizes and discusses our findings and their implications for survey research practice.
2. Data and methods
2.1. The British Household Panel Study and the interpenetrated design
The data source for this project is the BHPS which is conducted by the Economic and Social Research Council (ESRC) Centre for Micro-social Change at the University of Essex, UK. Interviewing on the BHPS began in 1991 and is scheduled to continue in annual waves until at least 1998. The survey used a multistage stratified cluster design covering all of Great Britain. The wave 2 survey instrument comprised a short household level questionnaire followed by a face-to-face 45-minute interview and short self-completion schedule with every adult in the household. Topics covered included household organization, income and wealth, labour market experience, housing costs and conditions, health issues, consumption behaviour, education and training, socioeconomic values and marriage and fertility.
An interpenetrated design was implemented in a sample of PSUs in wave 2 of the survey. Owing to field requirements and travel costs, a constrained form of randomization was adopted in which addresses were allocated to interviewers at random within geographic 'pools'; these pools are sets of two or three PSUs. Every PSU whose centroid was no more than 10 km from the centroid of at least one other PSU was eligible for inclusion in the design. 153 of the 250 PSUs in the BHPS sample were eligible. Mutually exclusive and exhaustive combinations of these 153 eligible PSUs were formed; this process resulted in 70 pools of PSUs, most with two, and some with three, PSUs each. A systematic sample of 35 pools was then selected for inclusion in the interpenetrating sample design. Great Britain was partitioned for the sample design into 18 regions; only two of these did not include at least one selected geographic pool.
Of the 35 geographic pools formed, four proved to be ineligible as the same interviewer was needed to cover all the PSUs in the pool and one proved to be effectively ineligible for analysis as one interviewer was needed to cover three-quarters of the geographic pool. An examination of the 30 areas in which the design was implemented does not indicate any systematic abnormality. To the extent that an abnormality did exist, it would affect our results only if it were to interact with the effect of interviewers or with the design effect.
25 of the 30 usable geographic pools included two interviewers and two PSUs and five included three interviewers and three PSUs. Within PSUs in a given pool, households were randomly assigned to the interviewers working in those PSUs. The sample size for analysis of the 30 geographic pools was 1282 households and 2433 individual respondents.
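
The constrained randomization described above can be sketched in code. This is a minimal illustration only, not the procedure used in the BHPS fieldwork: the function name, the identifiers and the rule of dealing shuffled households to interviewers in turn (which keeps workloads within a PSU nearly equal) are assumptions introduced here.

```python
import random
from collections import defaultdict


def assign_within_pools(households, pool_interviewers, seed=0):
    """Randomly assign households to interviewers within geographic pools.

    households: list of (household_id, pool_id, psu_id) tuples.
    pool_interviewers: dict mapping pool_id to the interviewers in that pool.
    Returns a dict mapping household_id to an interviewer.
    """
    rng = random.Random(seed)

    # Group households by PSU within pool, so that every interviewer in the
    # pool receives a random share of every PSU in that pool.
    by_psu = defaultdict(list)
    for household_id, pool_id, psu_id in households:
        by_psu[(pool_id, psu_id)].append(household_id)

    assignment = {}
    for (pool_id, _psu_id), household_ids in sorted(by_psu.items()):
        interviewers = pool_interviewers[pool_id]
        shuffled = list(household_ids)
        rng.shuffle(shuffled)
        # Assumption: deal the shuffled households out in turn, so workloads
        # within a PSU differ by at most one household.
        for position, household_id in enumerate(shuffled):
            assignment[household_id] = interviewers[position % len(interviewers)]
    return assignment


# Hypothetical 2 x 2 pool: two PSUs, two interviewers, ten households per PSU.
example_households = [(f"h{psu}-{k}", "pool_A", psu) for psu in (1, 2) for k in range(10)]
example = assign_within_pools(example_households, {"pool_A": ["int_1", "int_2"]}, seed=1)
```

Because each interviewer works in every PSU of the pool, interviewer and PSU are crossed within pools rather than confounded, which is what allows the two variance components to be separated.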
2.2. Analytic methods
Our initial focus was on the calculation of intraclass correlation coefficients $\rho$ for each of the components from the interpenetrated design. These included the interviewer ($\rho_i$) and the PSU ($\rho_s$). These coefficients were estimated for all variables in the data set for which there were 700 or more responses. (In general, the multivariate ANOVA (MANOVA) analyses which were used required 74 degrees of freedom. A rough rule of thumb to ensure sufficiently stable estimates is to set n greater than or equal to the degrees of freedom times 10. Applying this rule to the current models suggests an n of approximately 740.) Categorical and most ordinal variables were transformed into binary variables before the analyses; ordinal attitude scales (Likert scales) were, however, treated as continuous. Hierarchical ANOVAs were then carried out for each of these variables using the SPSS MANOVA option. The use of SPSS allowed us to explore this large number of variables more quickly and efficiently than would have been feasible with MLn. These hierarchical ANOVAs were restricted to cases from the 2 × 2 geographic pools as the program would not handle the simultaneous calculation of 2 × 2 and 3 × 3 geographic pools (note, however, that this is feasible with MLn). The elimination of the 3 × 3 geographic pools resulted in a reduction in sample size of 21% at the household level (to 1010 households) and 22% at the individual level (to 1903 individuals).
The sums of squares were partitioned using a 'regression approach' in which each term is corrected for every other term in the model. This makes sense substantively and also facilitates comparison with MLn. It also means that the values for $\rho_i$ and $\rho_s$ which are reported are conditional on each other. (As our design is not balanced, the sums of squares for the various components of the model will not add up to the total sum of squares. Also hierarchical ANOVA assumes a continuous dependent variable. For proportions between 0.20 and 0.80, however, the approximation should be fairly close.) Data from the hierarchical ANOVA runs were then assembled to create a meta data set of $\rho$-estimates constructed from the results of the 820 separate analyses of the original data. Other information was added to this data set such as question type (attitudes, facts, quasi-facts and interviewer checks) and topic area of the questionnaire.
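
To make the idea of an ANOVA-based intraclass correlation concrete, the sketch below computes the textbook one-way estimator, (MSB - MSW) / (MSB + (n0 - 1) MSW), for a single grouping factor. It is a deliberate simplification and not the analysis reported here, which used hierarchical ANOVAs with both interviewer and PSU terms, each corrected for the other; the function name and the toy data are assumptions.

```python
def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation.

    groups: list of lists, one list of numeric responses per unit
    (for example, per interviewer workload). Group sizes may differ.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective group size for unbalanced data.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)
    return (ms_between - ms_within) / (ms_between + (n0 - 1) * ms_within)


# Toy data: responses identical within units give an estimate of 1,
# while units with identical means give a negative estimate.
perfectly_clustered = anova_icc([[1, 1], [3, 3]])
no_clustering = anova_icc([[1, 3], [1, 3]])
```

The second toy case shows why estimated coefficients can fall below 0 even though the underlying correlation is non-negative: the estimator subtracts the within-unit mean square from the between-unit mean square.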
2.3. Cross-classified multilevel models
An alternative conceptualization of the analysis is as a multilevel (hierarchical) model in which the interviewer, PSU and geographic pool are hierarchical partitions and the terms corresponding to them in the model are considered to be random effects. It is only recently that cross-classified multilevel analysis has become feasible (see Goldstein (1995) and Rasbash et al. (1995)); the design is implemented in MLn by viewing one member of the cross-classification as an additional level above the other. A basic multilevel variance components model to capture the interviewer by PSU cross-classification within geographic pool can be defined as
$$y_{i(jk)l} = \alpha + \beta x_{i(jk)l} + u_j + u_k + u_l + e_{i(jk)l} \qquad (1)$$
for the $i$th survey element, within the $j$th PSU crossed by the $k$th interviewer, within the $l$th geographic pool, where $y_{i(jk)l}$ is a function of an appropriate constant $\alpha$, explanatory variable(s) $x$ and associated coefficients $\beta$, and an individual error term $e_{i(jk)l}$. Here $u_j$ is a random departure due to PSU $j$, $u_k$ is a random departure due to interviewer $k$, and $u_l$ is the random departure due to geographic pool $l$. Each of these terms and $e_{i(jk)l}$ are random quantities whose means are assumed to be equal to 0. In cases where the dependent variable is a dichotomy, $y_{i(jk)l}$ would be replaced in equation (1) by $\log\{\pi_{i(jk)l}/(1 - \pi_{i(jk)l})\}$, where
$$\pi_{i(jk)l} = \frac{\exp(\alpha + \beta x_{i(jk)l} + u_j + u_k + u_l)}{1 + \exp(\alpha + \beta x_{i(jk)l} + u_j + u_k + u_l)}.$$
When the dependent variable is continuous, $\rho$ can be calculated directly from the variance estimates in a variance components model (e.g. interviewer variance divided by total variance). When the dependent variable is dichotomous, the variance components are given on the logistic scale and a more complex computation is required. We generate random normal deviates with variance given by the component estimate. These deviates are then transformed (taking the anti-logit) and the variance of these transformed values is calculated directly to give the numerator for $\rho$.
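
The simulation step for a dichotomous variable can be sketched as follows. This is an illustration under stated assumptions rather than the authors' program: the text does not say where the deviates are centred or how the denominator of $\rho$ is formed, so the `centre` argument (defaulting to 0 on the logit scale), the number of draws and the function names are all hypothetical, and only the numerator is computed.

```python
import math
import random
import statistics


def antilogit(z):
    """Map a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def probability_scale_variance(component_variance, centre=0.0, draws=100_000, seed=0):
    """Variance, on the probability scale, implied by a logit-scale component.

    component_variance: variance component estimated on the logistic scale.
    centre: assumed location of the deviates on the logit scale.
    """
    rng = random.Random(seed)
    sd = math.sqrt(component_variance)
    # Draw normal deviates with the given variance and take the anti-logit.
    transformed = [antilogit(centre + rng.gauss(0.0, sd)) for _ in range(draws)]
    # The variance of the transformed values is the numerator for rho.
    return statistics.variance(transformed)


# Hypothetical component estimate of 0.5 on the logistic scale.
numerator = probability_scale_variance(0.5)
```

The result necessarily lies between 0 and 0.25, the largest variance that a quantity confined to the interval from 0 to 1 can have, and it grows with the size of the logit-scale component.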
The treatment of the interviewer and PSU effects as random effects rather than as fixed effects (which is more common in the survey sampling literature) postulates a 'superpopulation' of interviewers from which the interviewers used in the study were drawn and an infinitely large population of PSUs. In the case of interviewers we can consider the inference as being made to the population of potential interviewers from whom the survey interviewers were drawn. For the PSUs the assumption involves essentially ignoring a small finite population correction (see, for example, Kalton (1979)). As we are interested in the relative magnitudes of the components of variance due to the interviewers and the sample design under the same essential survey conditions this treatment will not affect our conclusions materially.
An added advantage of multilevel modelling in general, as recently demonstrated (see Hox et al. (1991) and Wiggins et al. (1992)), is the facility to incorporate covariates directly into the analysis. For our work we are able to examine factors such as the age of the interviewer, gender, length of service, status and whether the same interviewer was present for both wave 1 and wave 2 of the panel survey. We can also include characteristics of the respondents. We plan to add area level characteristics based on a match to census small area statistics in due course. Single-level linear models have of course been used to analyse survey data. Such non-hierarchical models ignore the way in which the clustering in the sample design and the clustering of responses generated by the interviewers may affect the variance-covariance structure of the observations.
3. Results
3.1. Findings from hierarchical analysis of variance
The design effect is the most commonly used measure of the effect of within-PSU homogeneity on survey results; this is deff $= 1 + \rho_s(b - 1)$ where $s$ denotes the clustering in the sampling frame, $\rho_s$ is the intracluster correlation and $b$ is the average number of elements selected from a cluster (the cluster take). We present the results of this analysis in terms of the intraclass correlation coefficients for interviewers and PSUs. Both measure the within-unit (interviewer or PSU) homogeneity of the observations. Within-PSU homogeneity is a characteristic of the true values of the elements in the population. Within interviewer workloads the homogeneity results from the interaction between the interviewer and his or her respondents; the effect on the variance of an estimate may, however, be expressed in a form that is identical with that for the design effect. The interviewer effect is inteff $= 1 + \rho_i(m - 1)$ where $i$ denotes the interviewer, $\rho_i$ is the intra-interviewer correlation and $m$ is the average interviewer workload. The cluster take and the interviewer workload arise as a result of decisions by the designer of the survey; $\rho_s$ and $\rho_i$ are quantities that are intrinsic to the population structure and to the quality of interviewers. As such the latter are more portable than the variance components themselves; the variance components themselves can of course be calculated once the $\rho$-values are known.
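
Since deff and inteff share the same algebraic form, a single helper covers both. The sketch below is illustrative only; the function name and the numerical inputs (a correlation of 0.02 and an average take or workload of 20) are hypothetical values chosen for the example and are not taken from the BHPS results.

```python
def variance_inflation(rho, average_size):
    """Return 1 + rho * (average_size - 1).

    With rho = rho_s and average_size = b (cluster take) this is deff;
    with rho = rho_i and average_size = m (interviewer workload) it is inteff.
    """
    return 1.0 + rho * (average_size - 1.0)


# Hypothetical illustration values (not taken from the paper).
rho_s, b = 0.02, 20
rho_i, m = 0.02, 20
deff = variance_inflation(rho_s, b)
inteff = variance_inflation(rho_i, m)
print(round(deff, 2), round(inteff, 2))
```

With these inputs both factors are about 1.38, so even a correlation as small as 0.02 inflates the variance by more than a third once the cluster take or workload reaches 20.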
During the past 30 years or so evidence has accumulated about the order of magnitude of both the intracluster correlation coefficient and the intra-interviewer correlation coefficient in sample surveys in the USA and elsewhere. Though it is impossible to generalize with confidence, the evidence suggests that values of $\rho_i$ greater than 0.1 are uncommon. (There is difficulty in comparing across studies as each involves different numbers of interviewers, different sample sizes and different types of variables. In addition, some researchers report the negative values of $\rho_i$ which occur and others set these to 0.) Also, as indicated by the means in Table 1, the majority of values tend to be less than 0.02 (all these values are estimates, which accounts for the negative values in Table 1). There is also some evidence, although this is mixed, that different types of variables are affected by interviewers in different ways; attitude items and complex factual items are considered more sensitive to an interviewer effect than simple factual items are (see, for example, Collins and Butcher (1982), Feather (1973), Fellegi (1964), Gray (1956) and Hansen et al. (1961)).
The range of values reported in the literature for $\rho_s$ is similar to that for $\rho_i$, though we would expect $\rho_i$ to have more values near 0. Again, the evidence suggests that values greater than 0.1 are uncommon and that positive values are almost universal. The large values tend to be for certain types of demographic variables, notably tenure and ethnic origin. This is to be expected since adjacent groups of houses in a small area will tend to be of similar type and tenure, and people of similar ethnic origin often live close to each other (Lynn and Lievesley, 1991). Other demographic variables such as sex and marital status tend to show very low values. It is typically found that behavioural and attitudinal variables have $\rho_s$-values that are somewhere between these extremes, with attitudinal variables showing slightly lower values than behavioural variables. In the World Fertility Survey (see Verma et al. (1980)), the median $\rho_s$ across various countries was 0.02 for various nuptiality and fertility variables. The median was much higher (around 0.08) for variables concerning contraceptive knowledge.

Table 1. Summary of other interviewer variance investigations

Study                                                              Values of $\rho_i$        Mean
Neighbour noise and illness (UK) (Gray, 1956)                      -0.018 to 0.10†           0.015†
Television habits (UK) (Gales and Kendall, 1957)                   (0.00) to 0.05, 0.19‡     §
Census (USA) (Hanson and Marks, 1958)                              -0.00 to 0.061‡           0.011‡
Blue-collar workers (USA) (Kish, 1962)
  First study                                                      -0.031 to 0.092           0.020
  Second study: interview                                          -0.005 to 0.044           0.014
  Second study: self-completion                                    -0.024 to 0.040           0.009
Census (Canada) (Fellegi, 1964)                                    (0.00) to 0.026           0.008
Health survey (Canada) (Feather, 1973)                             -0.007 to 0.033           0.006
Mental retardation (USA) (Freeman and Butler, 1976)                -0.296 to 0.216           0.036
Aircraft noise (UK) (O'Muircheartaigh and Wiggins, 1981)           (0.00) to 0.09            0.020
Consumer attitude survey (UK) (Collins and Butcher, 1982)          -0.039 to 0.119           0.013
9 telephone surveys (USA) (Groves and Magilavy, 1986)              -0.042 to 0.171           0.009

†Calculated from F-ratios by using the formula supplied by Kish (1962).
‡Numbers available through Kish (1962).
§Mean cannot be computed: Gales and Kendall (1957) did not report all the variables analysed.

In comparing these two sources of variability, Hansen et al. (1961) found that the interviewer variance was often larger than the sampling variance. Bailey et al. (1978), in contrast, found response variance components that were at least 50% of their sampling variance for only a quarter of their statistics.
We included in the analysis 820 variables, some representing subcategories taken from BHPS items. Of these, 98 were attitude questions, 574 were factual, 88 were interviewer checks (items completed by the interviewers without a formal question) and 60 were quasi-facts (mostly on a self-completion form). Fig. 1 shows the cumulative frequency distributions for $\rho_s$ and $\rho_i$. The orders of magnitude for the two coefficients were strikingly similar.
As these values are themselves estimates they are subject to imprecision; using a test of significance at the 5% level four in 10 of the values of $\rho_s$ and three in 10 of the values of $\rho_i$ were significantly greater than 0. In the case of $\rho_s$ this is not surprising as positive values are expected for most survey variables. What is somewhat surprising is that, within the study, $\rho_i$ is of the same order of magnitude. For these data, because of the way that the investigation was designed, the average interviewer workload and the average cluster take were the same; thus our estimates of $\rho_s$ and $\rho_i$ imply that the effects of the sample design and the interviewers were also about the same.

Fig. 1. Intra-interviewer and intracluster correlations: cumulative distribution of $\rho_i$ and $\rho_s$ (horizontal axis: value of $\rho$, from -0.02 to 0.18; vertical axis: cumulative %)

All types of questions show some significant values of $\rho_i$. For attitude questions, 28% of the values of $\rho_i$ were significantly greater than 0; for factual questions it was 26%; for interviewer checks, a staggering 58%; for the quasi-factual questions, 25% (with the exclusion of the self-completion items). What is interesting is the similarity of the findings for the attitudinal and factual items, which is in contrast with the findings of some studies. There is some variation between types of attitudinal item. Among those items based on Likert scales, 33% showed significant values of $\rho_i$; this compares with 25% of the other attitude items.
We also looked for differences by source of the question. For example, 32% of the items in the individual schedule had $\rho_i$-values which were significantly greater than 0. The same was true for 17% of the self-completion items, 27% of the cover sheet items, 28% of the derived variables from the individual's questionnaire, 32% of the household questionnaire items and 34% of the derived variables from the household questionnaire. The notable difference here in susceptibility to interviewer effects is between the self-completion items and those that are interviewer administered. The fact that there is an interviewer effect at all on the self-completion form is interesting. Kish (1962), for example, found little evidence to suggest such an effect on the written questionnaires that he examined. O'Muircheartaigh and Wiggins (1981), however, did find an effect for a health supplement completed in the presence of the interviewer (as were the BHPS self-completion items).
There was also basically no difference in the proportion of significant $\rho_i$-values between the different sections of the questionnaire: demographics, health, marriage and fertility, employment, employment history, values and income and household allocation (with the percentage significant ranging from 22% to 35%). In contrast the section at the end of the questionnaire for interviewers to record their observations was highly susceptible to interviewer effects. 76% of the items in the interviewer observation section showed significant values of $\rho_i$. There was also a difference between dummy and continuous variables, with a higher proportion of effects being noted for the continuous variables.
Furthermore, there was a clear positive correlation of 0.35 between $\rho_i$ and $\rho_s$. A positive correlation between $\rho_s$ and $\rho_i$ implies that variables that show large intracluster homogeneity (show relatively substantial clustering among true values) are also sensitive to differential effects from interviewers. Such a correlation has not, to our knowledge, been observed before. As the elements in the computation of this correlation are themselves variables, the absence of such evidence in the literature may be because it is necessary to have a large number of variables to estimate such a correlation coefficient with any precision. In our analysis the correlation shows remarkable consistency across types of variables.
Homogeneous clusters contain individuals who are similar to one another; it is reasonable to suggest that individuals with similar values for the variable in question may respond in a similar way to whatever qualities the interviewer brings to bear in the interviewer-respondent interaction. This would mean that variables that manifested intracluster homogeneity would on balance be more likely than other variables to display intra-interviewer homogeneity.
An alternative explanation may be found in some of the early work on interviewers (see Hyman (1954)). Expectations of interviewers are known to influence the responses obtained by interviewers. For a variable to have a relatively large value of $\rho_s$ the individuals within a cluster will have relatively homogeneous values; it is possible that this consistency will affect the interviewers' expectations as the interviewer's workload progresses, leading to enhanced correlations within interviewer workloads.
These explanations are consistent with the technical interpretation of the correlation between the response deviation and the sampling deviation for a single variable postulated in the Census Bureau model and included in Hansen et al. (1961), Fellegi (1964) and Bailey et al. (1978). It is not possible to estimate this correlation directly for a single variable without at least two waves of data collection, though it is included in the standard model estimate of $\rho_i$. Hansen et al. (1961) gave an example of how this correlation may arise for a single variable.
3.2. Findings from multilevel models
For illustration, we include three MLn models, one for each of the main types of variables: interviewer check items, facts and attitudes. These are shown in Tables 2-4 respectively. We have also shown the corresponding non-hierarchical (single-level) model to discover whether our substantive conclusions will be affected when we incorporate the data structure appropriately in the analysis.
The variable modelled in Table 2 is a binary subcategory indicating whether children were present during the demographics section of the interview, as noted by the interviewer. From the hierarchical analyses of variance, the estimated $\rho$-values for this children present subcategory were $\rho_i = 0.171$ and $\rho_s = 0.062$ ($n = 725$).
The hierarchical version of model 1 is a basic variance components model showing the cross-classification of PSU and interviewer. Although the estimated standard errors of the random parameters are included in Table 2, the significance of the random parameters is based on a contrast test. (This is recommended as the distribution of the standard errors for the random parameters may depart considerably from normality, especially in small samples.) We found significant variation between interviewers but not between PSUs. In the model the estimate for variation between geographic pools for this variable was 0; this was not of course the case for all variables. Parameters close to 0 are often constrained to 0 by the MLn program; in this case the parameter remains 0 even when employing the 'second-order estimation procedure'. (In the estimation of random parameters in a logistic regression, MLn uses a weighted generalized least squares estimation procedure which requires the quantities to be estimated to be in the linear part of the model. A series expansion is used to approximate a linear form. Simulation and theory have suggested that the first-order estimation procedures can lead to an underestimation of the parameters. In many models the underestimation is negligible. However, in some models where predicted probabilities are extreme, or where there are few level 1 units per level 2 unit, underestimation can be severe. There is an option in MLn which allows the selection of a second-order estimation procedure. This procedure, however, is less computationally robust. See Woodhouse (1995) for a full description of this matter.) In the standard formulation of the model the individual variation is assumed to have a binomial distribution and is constrained to 1. (The validity of this assumption can be tested in MLn by relaxing this constraint.)

Table 2. Multilevel logistic regression model of the interviewer check item: children present†

                                Model 1                             Model 2                             Model 3
                                Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical      Non-hierarchical  Hierarchical
Fixed effects
Grand mean                      -1.05 (0.08)      -1.05 (0.14)      -3.24 (0.37)      -3.30 (0.41)      -5.42 (0.94)      -5.49 (1.27)
No. of children in household                                         1.20 (0.10)       1.23 (0.11)       1.23 (0.10)       1.25 (0.11)
Respondent's gender (female)                                         0.62 (0.21)       0.59 (0.22)       0.62 (0.21)       0.60 (0.22)
Interviewer's gender (female)                                                                            1.11 (0.43)       1.14 (0.62)
Random effects: variance components by source
Respondent                                         1                                   1                                   1
PSU                                                0.09 (0.12)                         0.08 (0.17)                         0.08 (0.17)
Interviewer                                        0.49 (0.20)‡                        0.89 (0.32)‡                        0.81 (0.31)‡

†Standard errors are given in parentheses.
‡Significant random parameters based on a contrast test.

In model 2, we have included the individual level explanatory variable, number of children in household, as it is desirable to control for any systematic differences between interviewers in the composition of their workloads; an interviewer whose interviews take place in households without children would be expected to differ on this item from those interviewers whose workloads contained a large number of households with children. This control variable has a significant coefficient in the hierarchical model. (For fixed effects significance may be judged by comparing the estimate with its standard error in the usual way.)
Also included is the individual level explanatory variable 'respondent's gender'. We expected that the presence of children during the interview would be a function of the respondent's gender, with women respondents being more likely to have children with them than male respondents are. As can be seen by the values in Table 2, this expectation was confirmed.
It is interesting to note that the random coefficient for interviewers in the hierarchical version of model 2 increases in comparison with model 1. This suggests that it is not haphazard variation in interviewer workloads that explains this interviewer variability, but rather that the variation between interviewers in recording the presence of children is greater when opportunity (i.e. children in the household) is taken into account as well as the respondent's gender. The basic conclusion which can be drawn from model 2 is the same for both the hierarchical and the non-hierarchical versions of the model.
We then added several interviewer explanatory variables. These included interviewer age, gender, status (whether a basic interviewer, supervisor or area manager) and years with the company. Also included was a measure of whether the same interviewer had visited the household for the previous year's interview. Of these various characteristics, only interviewer gender is considered in model 3. It was clearly significant in the non-hierarchical model and only approached significance in the hierarchical model. It is interesting that in this case different conclusions might have been reached depending on which model was considered.
We also investigated the possibility of an interaction between interviewer gender and respondent gender. This coefficient was not significant under either version of model 3.
There are at least two possible explanations for the correlated interviewer effect in this case. First, it is quite likely that there is a difference in the ability of interviewers to arrange the circumstances of the interview so that the respondent is alone at the time: flexibility in making appointments, the degree to which the interviewer emphasizes the need for an undisturbed setting for the interview, etc. There is also the possibility that most of the between-interviewer variability is due to differences in the extent to which, or the circumstances in which, interviewers record the presence of children; one source of variation could be in the definition of others being 'present'.
The key contrast here is between the message that we would obtain from ρ_i and ρ_s and the message from the multilevel analysis. With the former we would be concerned that the standard analysis would give spurious significance to the relationships estimated. In this case at least, however, an interviewer effect, though present for the dependent variable, does not affect the substantive analysis.
Table 3 deals with one of the respondent level factual items, newspaper readership. The variable modelled is a binary subcategory indicating whether or not the respondent typically reads the Independent. From the hierarchical ANOVAs, the estimated ρ-values for this readership subcategory were ρ_i = 0.129 and ρ_s = 0.106 (n = 1268).
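
To make the idea of an ANOVA-based intraclass correlation concrete, the following is a minimal sketch of the one-way, balanced-design estimator. It is a simplification and not the hierarchical ANOVA used in the paper, which has to separate PSU and interviewer components. The function name `anova_icc` and the data layout (one list of responses per interviewer) are assumptions made for illustration.

```python
from statistics import mean


def anova_icc(groups):
    """One-way ANOVA estimate of the intraclass correlation (balanced groups).

    groups: list of equal-length lists of numeric (e.g. 0/1) responses,
    one inner list per interviewer workload (hypothetical layout).
    """
    k = len(groups)        # number of interviewers
    m = len(groups[0])     # common workload size (balance is assumed)
    if k < 2 or m < 2 or any(len(g) != m for g in groups):
        raise ValueError("need at least two balanced groups of size >= 2")
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    # Between-group and within-group mean squares.
    msb = m * sum((gm - grand) ** 2 for gm in group_means) / (k - 1)
    msw = sum(
        (x - gm) ** 2 for g, gm in zip(groups, group_means) for x in g
    ) / (k * (m - 1))
    denom = msb + (m - 1) * msw
    if denom == 0:
        raise ValueError("no variation in the data")
    return (msb - msw) / denom


# Hypothetical binary responses for four interviewers, four respondents each.
example = [[1, 1, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]
rho_hat = anova_icc(example)
```

The estimator can be negative when group means differ less than chance alone would suggest, so small negative values should not be read as errors.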
Unlike the variance components model shown for the interviewer check item (see model 1), the basic variance components model given in model 4 shows a significant variation between PSUs as well as between interviewers. For this also there was no significant variation between geographic pools.
In model 5, we have included the individual level explanatory variable 'respondent's age'. Several other explanatory variables had also been explored in both the hierarchical and the non-hierarchical versions of the model (e.g. gender, social class, identification with a political party and income) but only respondent's age was significant. With this addition, the interviewer random variation is reduced slightly and the PSU random variation remains essentially the same.

Table 3. Multilevel logistic regression model of newspaper readership: whether the respondent reads the Independent†

                                    Model 4                       Model 5                       Model 6
                                    Non-                          Non-                          Non-
                                    hierarchical   Hierarchical   hierarchical   Hierarchical   hierarchical   Hierarchical
Fixed effects
  Grand mean                        -3.04 (0.13)   -2.99 (0.30)   -1.70 (0.35)   -1.94 (0.45)   -2.99 (0.67)   -3.19 (0.90)
  Respondent's age                                                -0.03 (0.01)   -0.03 (0.01)   -0.04 (0.01)   -0.03 (0.01)
  Whether same interviewer as
    previous year                                                                               0.21 (0.28)    0.63 (0.34)
  Interviewer status
    Whether regular interviewer
      (compared with area manager)                                                              1.35 (0.60)    1.06 (0.84)
    Whether supervisor interviewer
      (compared with area manager)                                                              2.25 (0.76)    2.23 (1.25)
Random effects (variance components, by source)
  Respondent                                       1                             1                             1
  PSU                                              1.55 (0.64)‡                  1.48 (0.63)‡                  1.59 (0.66)‡
  Interviewer                                      1.97 (0.71)‡                  1.78 (0.68)‡                  1.67 (0.67)‡

† Standard errors are given in parentheses.
‡ Significant random parameters based on a contrast test.

Of the various interviewer explanatory variables we considered, two approached significance in the hierarchical version of model 6. These were the binary variable for whether the same interviewer had visited the household for the previous year's interview (interviewer continuity) and one of the two dummy variables modelling the three-category interviewer status variable (regular interviewer, supervisor or area manager). Here we can see that the interviewer variance component is again slightly reduced.
Interestingly we would have had a very different interpretation of which characteristics of the interviewer are having a significant effect if we had only run the non-hierarchical model. With the non-hierarchical model, the interviewer continuity variable was clearly not significant and the two interviewer status variables were clearly significant. In addition (although not shown in Table 3), the interviewer age variable approached significance. Middle-aged interviewers were more likely than elderly interviewers to record respondents as readers of the Independent.
Table 4 presents a behavioural intention item looking at whether or not the respondent expects to have any more children. As this is a subjective assessment, the question has been classified in the attitude category for our analysis. From the hierarchical analyses of variance, the estimated ρ-values for this item were ρ_i = 0.075 and ρ_s = 0.048 (n = 1177). As was the case for the variance components model under model 1, model 7 shows a significant variation between interviewers and possible variation between PSUs but not among geographic pools.
In model 8, we have included the three individual level explanatory variables number of children in the household, respondent's gender and respondent's age. Each of these is highly significant in both the hierarchical and the non-hierarchical versions of the model. With the addition of these explanatory variables in the hierarchical model, random variation due to PSUs goes to 0 and random variation due to interviewers increases. The disappearance of the PSU effect may mean that the characteristics that led to the possible PSU effect have been adequately specified in the substantive model. Again, this suggests that it is not haphazard variation in interviewer workloads that is contributing to interviewer variability, but rather that there is variation between interviewers in their measurement of people's intentions to have more children.

Table 4. Multilevel logistic regression model: whether the respondent is likely to have more children†

                                    Model 7                       Model 8                       Model 9
                                    Non-                          Non-                          Non-
                                    hierarchical   Hierarchical   hierarchical   Hierarchical   hierarchical   Hierarchical
Fixed effects
  Grand mean                        -0.39 (0.06)   -0.44 (0.11)   7.73 (0.46)    7.59 (0.46)    8.81 (0.60)    7.39 (0.48)
  No. of children in household                                    -0.85 (0.09)   -0.83 (0.10)   -0.86 (0.10)   -0.84 (0.10)
  Respondent's gender (female)                                    -0.65 (0.19)   -0.63 (0.19)   -0.64 (0.19)   -0.62 (0.19)
  Respondent's age                                                -0.24 (0.01)   -0.23 (0.01)   -0.24 (0.01)   -0.24 (0.01)
  Interviewer's years with company                                                              0.042 (0.020)  0.043 (0.027)
Random effects (variance components, by source)
  Respondent                                       1                             1                             1
  PSU                               -              0.15 (0.09)                   0.00 (0.00)                   0.00 (0.00)
  Interviewer                       -              0.22 (0.10)‡                  0.38 (0.16)‡                  0.34 (0.15)‡

† Standard errors are given in parentheses.
‡ Significant random parameters based on a contrast test.

In the non-hierarchical version of model 9, interviewer experience is a significant predictor with more experienced interviewers being more likely to record a 'yes' to the more children question than inexperienced interviewers are. Although not shown, in the non-hierarchical model, the interviewer continuity variable approached statistical significance. When the same interviewer returned on the second wave of the survey he or she was less likely to record yes to the more children question than a different interviewer was. These findings, however, do not hold for the hierarchical model.
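
The contrast between the two versions of model 9 can be illustrated by comparing each estimate with its standard error, as the text describes for fixed effects. The sketch below computes a Wald z-ratio and a two-sided normal p-value for the interviewer experience coefficient in Table 4. Two things are assumptions here: the 5% threshold implied by "significant", and the assignment of 0.042 (0.020) to the non-hierarchical column and 0.043 (0.027) to the hierarchical column, which is inferred from the text rather than read off the table. The function name `wald_z` is hypothetical.

```python
from math import erfc, sqrt


def wald_z(estimate, std_error):
    """Return the Wald z-ratio and its two-sided normal p-value."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    z = estimate / std_error
    p = erfc(abs(z) / sqrt(2))  # two-sided tail area under N(0, 1)
    return z, p


# Interviewer's years with company, model 9 (Table 4).
# Column assignment is an inference from the text (see lead-in).
z_nonhier, p_nonhier = wald_z(0.042, 0.020)
z_hier, p_hier = wald_z(0.043, 0.027)

print(f"non-hierarchical: z = {z_nonhier:.2f}, p = {p_nonhier:.3f}")
print(f"hierarchical:     z = {z_hier:.2f}, p = {p_hier:.3f}")
```

The point estimates are almost identical; it is the larger standard error in the hierarchical version that moves the ratio below the conventional cut-off.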
Perhaps the most important point here is that, despite the strong interviewer effect, the substantive description represented by the substantive fixed part of the model is unaffected by the interviewers (at least not affected differentially). However, there are differences in the conclusions about the effect of interviewer characteristics depending on whether an interviewer variance term is explicitly included.
In addition to these examples, we conducted a further exploration of the effect of the extra-role characteristics of the interviewers (Sudman and Bradburn, 1974) on model conclusions. For each of the different types of item (attitudes, facts, quasi-facts and interviewer checks), a sample of variables was drawn from among those shown to have highly significant interviewer variability. Across the four categories, 26 items were drawn from 84. A cross-classified multilevel analysis (interviewer by PSU) was conducted on each of these with the interviewer characteristics as the explanatory variables. These included interviewer age, gender, status, years with the company and an indicator of interviewer continuity over time.
Of the 26 models considered, interviewer age was significant in seven of the 26 cases (27%). The comparable percentages of significant effects that were found for the other interviewer characteristics were as follows: interviewer continuity, 12%; gender, 8%; interviewer status, 8%; years with the company, 4%. Although such data should be treated with caution, they may indicate that interviewer age is a general predictor of some of the interviewer variability on the high variability items. Freeman and Butler (1976), for example, found age and gender to be significant predictors of interviewer variance. Collins and Butcher (1982) also investigated the explanatory power of several characteristics of interviewers. Their strongest evidence was for an age effect.
Again we saw differences depending on whether a hierarchical or non-hierarchical model was used. The comparable figures for the non-hierarchical models were age significant in 27% of cases, interviewer continuity in 15%, gender in 12%, interviewer status in 35% and years with the company in 15%. In 11 of the 26 models, different conclusions about the effects of interviewer characteristics on substantive results would have been reached, depending on whether an interviewer variance term was explicitly included in the model.
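
Because the percentages above are all out of 26 models, they can be converted back to whole-number counts as a quick consistency check. The sketch below does this. Only the count of seven for interviewer age is stated in the text; the other counts are implied by rounding and should be read as an inference, not as reported figures. The names `reported`, `implied_count` and `implied` are hypothetical.

```python
N_MODELS = 26  # number of models considered, as stated in the text

# Percentage of models with a significant effect, as reported in the text.
reported = {
    "age":        {"hierarchical": 27, "non_hierarchical": 27},
    "continuity": {"hierarchical": 12, "non_hierarchical": 15},
    "gender":     {"hierarchical": 8,  "non_hierarchical": 12},
    "status":     {"hierarchical": 8,  "non_hierarchical": 35},
    "years":      {"hierarchical": 4,  "non_hierarchical": 15},
}


def implied_count(percent, n=N_MODELS):
    """Nearest whole number of models consistent with a rounded percentage."""
    return round(percent * n / 100)


implied = {
    name: {version: implied_count(pct) for version, pct in row.items()}
    for name, row in reported.items()
}

for name, row in implied.items():
    print(name, row)
```

On this reading the largest shift is for interviewer status, which moves from an implied two models in the hierarchical analyses to nine in the non-hierarchical ones.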
4. Summarizing remarks and discussion
The assumption underlying most statistical software, that the observations are independent and identically distributed (IID), is certainly not appropriate for most sample survey data. Variances computed on this assumption do not take into account the effects of survey design (e.g. inflation due to clustering) and execution (e.g. inflation due to correlated interviewer effects).
There are two different reasons why we might be interested in interviewer effects and sample design effects. The first is to establish whether the sample design (typically clustering in the design) and/or the interviewer (because many respondents are interviewed by each interviewer) have an effect on the variance-covariance structure of the observations. This is the traditional sample survey approach and includes a consideration of the design effect and the interviewer effect following the ANOVA and Census Bureau models. The emphasis is on the estimation of means or proportions and on the standard errors of these estimates; variance components models do not add anything to these analyses.
Our work with a specially designed study in wave 2 of the BHPS permitted us to assess both these inflation components. Across the 820 variables in the study, there was evidence of a significant effect of both the population clustering and the clustering of individuals in interviewer workloads. The intraclass correlation coefficient ρ was used as the measure of homogeneity. We found that sample design effects and interviewer effects were comparable in impact, with overall inflation of the variance as great as five times the unadjusted estimate. The median effect across the 820 variables was an 80% increase in the variance. The magnitude of the intra-interviewer correlation coefficients was comparable across these types, though the most sensitive items tended to be the interviewer check items. There was a tendency for variables that were subject to large design effects to be sensitive also to large interviewer effects and we offer a possible interpretation of this correlation in Section 3.1.
The large values of ρ_i on particular items and the fact that ρ_i is of the same order of magnitude as ρ_s suggest that survey organizations should incorporate the measurement of ρ_i in their designs. If the necessary modifications of the survey design are too expensive to allow this, organizations should at least try to minimize its effect; this could be accomplished by reducing interviewers' workloads. Current practice tends to favour smaller dedicated interviewer forces with large assignments; in the presence of substantial interviewer effects this is a misguided policy.
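
The argument for smaller workloads can be illustrated with the standard approximation for variance inflation under clustering, 1 + ρ(m - 1), where m is the average cluster (here, workload) size. This formula is assumed for the sketch and is not quoted from this section. The value ρ_i = 0.129 is the one reported above for the readership item, while the workload sizes are hypothetical and chosen only to show the trend.

```python
def variance_inflation(rho, cluster_size):
    """Approximate variance inflation 1 + rho * (m - 1) for clusters of size m."""
    if cluster_size < 1:
        raise ValueError("cluster size must be at least 1")
    return 1.0 + rho * (cluster_size - 1)


RHO_I = 0.129  # intra-interviewer correlation reported for the readership item

# Hypothetical average workloads (interviews per interviewer); not from the paper.
workloads = (5, 10, 20, 40)
inflation = {m: variance_inflation(RHO_I, m) for m in workloads}

for m, factor in inflation.items():
    print(f"workload {m:>2}: variance inflated by a factor of {factor:.2f}")
```

Under this approximation the inflation grows linearly with workload size, so for a fixed ρ_i, halving the workload removes roughly half of the excess variance attributable to interviewers.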
The second reason is to ensure that effects on the univariate distributions do not contaminate our estimates of relationships between variables in the population; in this case our objective is to control the effects or to eliminate them from the analysis. The standard approach of the survey sampler is to estimate the parameters assuming that they are IID and to produce design-based variance estimates using resampling methods such as the jackknife or bootstrap; this, however, is only an approximate solution. The explicit modelling of effects is both more precise and more informative. In this situation there are two aspects of interest: whether explicitly including the sample clustering and the interviewer workloads in the model changes the estimates of the relationships (the contamination issue) and whether the clustering and interviewers have an effect on the distribution of values obtained for the dependent variable.
Using software developed for multilevel analysis (hierarchical modelling) we have presented an alternative framework within which to consider the sample design and interviewer effects by incorporating them directly into substantive models of interest. For illustration we chose three binary items: an interviewer check item on whether children were present during the interview, a behavioural item, readership of the Independent, and a subjective item, whether respondents thought it was likely that they would have another child. For each of these items, we found a significant interviewer effect, which persisted when we controlled for inequalities in the interviewers' workloads and various extra-role characteristics of the interviewers. For other items not presented here we found situations where interviewer characteristics did help to explain the interviewer effects. In addition, we found that conclusions about the influence of the various extra-role characteristics would have differed in many cases if we had used only the standard non-hierarchical model rather than a hierarchical model.
In later work we hope to explore further the factors that might provide an explanation of the variance components. From a modelling standpoint the issue is one of specifying appropriately the underlying factors in the substantive models of interest. From a sample survey standpoint the issue is that of incorporating in the analysis a recognition of the special features of the sample design and survey execution that make a particular data set deviate from IID data. Multilevel models have a natural congruence with many important aspects of the survey situation; both the sample design and the fieldwork implementation can be described appropriately as introducing hierarchical levels into the data and thus multilevel analysis provides a framework that makes it possible to include both substantive and design factors in the same analysis.
Acknowledgements
The data were originally collected by the ESRC Research Centre for Micro-social Change at the University of Essex. The data file for this paper was made available through the ESRC Data Archive; the Archive bears no responsibility for the analyses or interpretations presented here.
References
Bailey, L., Moore, T. F. and Bailar, B. A. (1978) An interviewer variance study for the eight impact cities of the National Crime Survey cities sample. J. Am. Statist. Ass., 73, 16-23.
Biemer, P., Groves, R., Lyberg, L., Mathiowetz, N. and Sudman, S. (eds) (1989) Measurement Errors in Surveys. New York: Wiley.
Bryk, A. S., Raudenbush, S. W., Congdon, R. and Seltzer, M. (1986) An Introduction to HLM: Computer Program and User's Guide. Chicago: University of Chicago.
Collins, M. and Butcher, B. (1982) Interviewer and clustering effects in an attitude survey. J. Markt Res. Soc., 25, no. 1, 39-58.
Feather, J. (1973) A study of interviewer variance. Report. Department of Social and Preventive Medicine, University of Saskatchewan, Saskatoon.
Fellegi, I. P. (1964) Response variance and its estimation. J. Am. Statist. Ass., 59, 1016-1041.
Fellegi, I. P. (1974) An improved method of estimating the correlated response variance. J. Am. Statist. Ass., 69, 496-501.
Freeman, J. and Butler, E. W. (1976) Some sources of interviewer variance in surveys. Publ. Opin. Q., 40, 79-91.
Gales, K. and Kendall, M. G. (1957) An inquiry concerning interviewer variability (with discussion). J. R. Statist. Soc. A, 120, 121-147.
Goldstein, H. (1995) Multilevel Statistical Models, 2nd edn. London: Arnold.
Gray, P. G. (1956) Examples of interviewer variability taken from two sample surveys. Appl. Statist., 5, 73-85.
Groves, R. M. and Magilavy, L. J. (1986) Measuring and explaining interviewer effects in centralized telephone surveys. Publ. Opin. Q., 50, 251-256.
Hansen, M. H., Hurwitz, W. N. and Bershad, M. A. (1961) Measurement errors in censuses and surveys. Bull. Int. Statist. Inst., 38, 359-374.
Hanson, R. H. and Marks, E. S. (1958) Influence of the interviewer on the accuracy of survey results. J. Am. Statist. Ass., 53, 635-655.
Hartley, H. O. and Rao, J. N. K. (1978) Estimation of nonsampling variance components in sample surveys. In Survey Sampling and Measurement (ed. N. K. Namboodiri), pp. 35-43. New York: Academic Press.
Hox, J. J., de Leeuw, E. D. and Kreft, I. G. G. (1991) The effect of interviewer and respondent characteristics on the quality of survey data: a multilevel model. In Measurement Errors in Surveys (eds P. P. Biemer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz and S. Sudman). New York: Wiley.
Hyman, H. (1954) Interviewing in Social Research. Chicago: University of Chicago Press.
Kalton, G. (1979) Ultimate cluster sampling. J. R. Statist. Soc. A, 142, 210-222.
Kish, L. (1962) Studies of interviewer variance for attitudinal variables. J. Am. Statist. Ass., 57, 92-115.
Longford, N. T. (1988) VARCL Manual. Princeton: Educational Testing Service.
Lynn, P. and Lievesley, D. (1991) Drawing General Population Samples in Great Britain. London: Social and Community Planning Research.
Mahalanobis, P. C. (1946) Recent experiments in statistical sampling in the Indian Statistical Institute. J. R. Statist. Soc., 109, 325-370.
O'Muircheartaigh, C. A. (1984a) The magnitude and pattern of response variance in the Peru Fertility Survey. World Fertility Survey Scientific Report 45. International Statistical Institute, the Hague.
O'Muircheartaigh, C. A. (1984b) The magnitude and pattern of response variance in the Lesotho Fertility Survey. World Fertility Survey Scientific Report 70. International Statistical Institute, the Hague.
O'Muircheartaigh, C. A. and Wiggins, R. D. (1981) The impact of interviewer variability in an epidemiological survey. Psychol. Med., 11, 817-824.
Rasbash, J., Woodhouse, G., Goldstein, H., Yang, M., Howarth, J. and Plewis, I. (1995) MLn Software. London: Institute of Education.
Sudman, S. and Bradburn, N. (1974) Response Effects in Surveys. Chicago: Aldine.
Verma, V., Scott, C. and O'Muircheartaigh, C. (1980) Sample designs and sampling errors for the World Fertility Survey (with discussion). J. R. Statist. Soc. A, 143, 431-473.
Wiggins, R. D., Longford, N. and O'Muircheartaigh, C. A. (1992) A variance components approach to interviewer effects. In Survey and Statistical Computing (eds A. Westlake, R. Banks, C. Payne and T. Orchard). Amsterdam: North-Holland.
Woodhouse, G. (1995) A Guide to MLn for New Users. London: Institute of Education.