Question Writing Workshop
Stanford University
Medical Education Faculty
Lynn C. Webb, Ed.D.
August 15, 2007
testing@lwebb.com
Agenda
1. Introductions
2. Examination goals
3. Giving away the answers
4. 3 formats for questions
5. Investing in future examinations
6. The joy of feedback
7. Questions / discussion
1. Introductions
• Lynn Webb – Testing consultant
• Previous affiliations
– American Board of Psychiatry &
Neurology
– National Board of Medical Examiners
– ETS (Princeton)
– ACT (Iowa City)
Examples of Tests
• AAN’s RITE
• ACP’s PRITE
• APA’s FOCUS
• AOA’s UHM
• ABPM
• AAPM&R’s SAE-R
• WIP
• CBNC
• ABPN
– Certification
– Recertification
– Subspecialty
– Added qualifications
Non-medical exams
• NCLEX
• Solar Energy
2. Examination Goals
• Test students’ learning
• Validate your teaching
• Prepare students for future testing
Your questions can contribute to these
goals, or detract from them.
Contribute to Goals
• Questions should be aligned with the
specific content
• Questions should be technically sound
– Clearly written
– Straightforward, NOT tricky
Detract from Goals
• Questions don’t address teaching or
learning
• Questions are convoluted or
confusing, testing something other
than the content
• The focus shifts to the question instead
of the content
3. Giving away the answers
• Medical students tend to be test-wise
• When they don’t know an answer,
they retrace your steps in writing the
question to guess the correct answer
• Even if YOU are test-wise, blind spots
exist when it’s your material
• Multiple-choice questions are
especially vulnerable
Let’s take a quiz
• Our content will be highly esoteric
facts about the Alaskan Malamute
• Each quiz question will contain a flaw
in its structure that will assist test-wise
examinees. You won’t need to study
the Alaskan Malamute to pass this quiz.
Boris Badenov Webb
1. The primary function of sled dogs such
as the Alaskan Malamute, Samoyed,
Siberian Husky, and Eskimo Dog is to:
A. rescue drowning humans
B. retrieve tossed objects
C. guard property
D. pull sleds
Test-wise examinees
• Look for hints that the question writer
overlooked between the question and
the correct option
• Sled dogs – Pull sleds (no other options
about sleds)
2. Sled dogs provided vital transport
of a life-saving serum during a
diphtheria epidemic in Nome,
Alaska in:
A. 1492
B. 1776
C. 1925
D. 1952
Several ways to approach question
• First, eliminate options that aren’t
plausible: 1492 and 1776 are too early
for question content
• Now you have 50/50 chance between
C and D.
• Is D plausible? Why not fly in 1952?
• Numeric options – when in doubt,
choose C
3. Credit for the introduction of
the Alaskan Malamute in the
‘lower 48’ is usually given to:
A. Arthur Treadwell Walden and The
Seeleys
B. Admiral Byrd and Arthur Treadwell
Walden
C. The Seeleys and Commodore Perry
D. Arthur Treadwell Walden and Jack
London
Look for repeats
• Arthur Treadwell Walden appears in 3 of
the 4 options (A, B, and D)
More repeats?
• The Seeleys appear in 2 options
(A and C)
Some guys listed only once
• Admiral Byrd, Commodore Perry, and
Jack London each appear only once
Where do repeats converge?
• Both repeated names, Arthur Treadwell
Walden and The Seeleys, appear
together only in option A
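The convergence strategy can be sketched in a few lines of Python. This is only an illustration of how a test-wise examinee might reason; the function name `convergence_guess` and the scoring rule (sum of how often each name recurs) are assumptions, not part of the workshop.

```python
from collections import Counter

def convergence_guess(options):
    """Score each option by how often its names recur across all options."""
    counts = Counter(name for names in options.values() for name in names)
    scores = {label: sum(counts[n] for n in names)
              for label, names in options.items()}
    # The option with the highest score is where the repeats converge.
    return max(scores, key=scores.get), scores

# Options from quiz question 3, split into the names each one contains.
options = {
    "A": ["Arthur Treadwell Walden", "The Seeleys"],
    "B": ["Admiral Byrd", "Arthur Treadwell Walden"],
    "C": ["The Seeleys", "Commodore Perry"],
    "D": ["Arthur Treadwell Walden", "Jack London"],
}
guess, scores = convergence_guess(options)
print(guess, scores)
```

Running a check like this on your own draft shows whether the repeats point straight at the answer you keyed.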
Why do test-wise examinees do that?
• They are retracing your steps in
creating the question
• That’s why you shouldn’t submit your
first draft
• It’s easy to fix this problem
4. Champion Coldfoot Oonanik, the
world’s most titled dog, was how
many inches at the withers?
A. 2.75”
B. 25.0”
C. 27.5”
D. 275”
Test-wise examinees compare options
• 3 options have the digits 2 – 7 – 5
• One option has different digits (B), but
it’s close in size to C
• C must be the answer
• (Author had the answer 27.5, then
moved the decimal place for 2
distractors, then chose another
plausible one near answer)
5. Alaskan Malamutes are known
for a:
A. wooly undercoat, from 1-2” in depth
when the dog is in full coat
B. early warning system concerning
burglaries or robberies
C. interest in jumping on furniture and
sleeping in beds with humans
D. eye color of blue, brown, or one blue
and one brown
Which option fits with stem?
• Grammar issue – “a” only works with
“wooly”
• Also, author qualified answer in A –
“when the dog is in full coat” – not all
year round. The other options aren’t
qualified
6. Alaskan Malamutes’ tails are:
A. wiry
B. short
C. tightly curled
D. plumed and carried over their backs
most of the time
Where did author spend time?
• Correct answer is longest
• Correct answer is most detailed
• Correct answer is qualified
7. Which of the following is characteristic of Alaskan Malamutes?
A. They don’t shed
B. They are always strong and silent
C. They can survive extreme cold and
usually tolerate warm climates
D. They are always ‘one man’ dogs
Where did author spend time?
• Correct answer is longest
• Correct answer is qualified
• Other answers include absolutes
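A rough self-check for two of these clues could look like the Python sketch below. The function name `structural_clues` and the `ABSOLUTES` word list are assumptions made for illustration; a real review by a colleague will catch more than this does.

```python
# Assumed list of absolute words; extend it to suit your own questions.
ABSOLUTES = {"always", "never", "all", "none", "only"}

def structural_clues(options, answer):
    """Return warnings about clues that could give the keyed answer away."""
    warnings = []
    longest = max(options, key=lambda label: len(options[label]))
    if longest == answer:
        warnings.append("correct answer is the longest option")
    with_absolutes = [label for label, text in options.items()
                      if ABSOLUTES & set(text.lower().split())]
    if with_absolutes and answer not in with_absolutes:
        warnings.append("only distractors contain absolutes: "
                        + ", ".join(with_absolutes))
    return warnings

# Quiz question 7, with C as the keyed answer.
q7 = {
    "A": "They don't shed",
    "B": "They are always strong and silent",
    "C": "They can survive extreme cold and usually tolerate warm climates",
    "D": "They are always 'one man' dogs",
}
warnings = structural_clues(q7, "C")
for w in warnings:
    print(w)
```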
Could these clues exist in
your examinations?
Thrombotic microangiopathies are
associated with:
A. Thrombocytopenia
B. Reduced coagulation factors
C. Antibodies to glomerular basement
membranes
D. Anti-neutrophilic cytoplasmic
antibodies
E. Necrotizing vasculitis
A patient carefully reads the labels on all
her food products and calculates that
she is taking in about 3.5 grams of
sodium. This is closest to:
A. 60 mEq of sodium
B. 100 mEq of sodium
C. 160 mEq of sodium
D. 200 mEq of sodium
Numeric options
• You may use C as the correct answer
• It shouldn’t ALWAYS be the correct
answer
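For the sodium question above, the arithmetic can be checked with a short Python sketch. It assumes an atomic weight of 22.99 for sodium and uses the fact that sodium is monovalent, so 1 mmol equals 1 mEq; the function name is illustrative only.

```python
SODIUM_MG_PER_MMOL = 22.99  # atomic weight of sodium; monovalent, so 1 mmol = 1 mEq

def sodium_grams_to_meq(grams):
    """Convert grams of sodium to milliequivalents."""
    return grams * 1000 / SODIUM_MG_PER_MMOL

meq = sodium_grams_to_meq(3.5)
options = {"A": 60, "B": 100, "C": 160, "D": 200}
# Pick the option whose value is nearest to the computed amount.
closest = min(options, key=lambda label: abs(options[label] - meq))
print(round(meq, 1), closest)
```

The result is about 152 mEq, so C is the closest option here on the merits, not just by position.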
4. 3 formats for questions
• Let’s talk about the 3 formats that you
use for the test questions:
– Multiple choice questions (MCQs)
– True/False
– Matching
Multiple-choice questions
The examinee’s task is to choose the
one best answer of 5 options (A-E)
Multiple Choice
• Idea
• Correct answer
• Distractors
Multiple Choice Qs: Ideas
• The idea for the question is sometimes
called the ‘teaching point’
• You need questions for each lecture
hour you present
• Test essential concepts
Source of ideas
• Essential content
• Questions asked during lecture
• Conversations overheard
• Any points upon which you will build
How to start?
• Review syllabus notes
• What are the most essential concepts
in the lecture?
• The purpose of this lecture is to teach
_______________________
• You might have 6 concepts, or you
might have 2 concepts and some subtopics
Sampling
• Testing is always a sampling of
knowledge
• Even the USMLE samples knowledge
• If you make a list of 12 things that MUST
be tested from a lecture, perhaps ½ of
them could be used now and ½ of
them could be used later. (More on
this in Topic 5 – Investing in the Future)
Characteristics of GREAT questions
• Clearly written
– Examinee knows what is being asked
before reading the options
– All the information needed to answer is in
the question (options don’t build upon
each other)
– Simplest phrasing possible
– No extraneous information
MCQs: Correct answer
• Each question should have one
answer that is best
• When you’re writing, the best
sequence is
– Teaching point (What you want to ensure
they know)
– Correct answer
– Distractors
Other Sequence
• (Not recommended)
• Vague concept
• List of options
• Choose one to be correct OR revise
some options to be either correct or
incorrect
Example: FK506 (Tacrolimus):
A. is an anti-proliferative drug
B. binds and inhibits the stimulation of a
calcium activated phosphatase, thereby
blocking the induction of the early
response genes in the immune response
C. is less commonly used than cyclosporin
because of its lower potency
D. is preferred over cyclosporin since it has no
nephrotoxicity
E. is most commonly used to treat autoimmune disorders
If you didn’t know anything
about FK506…
• Which option would you choose?
• Hint: retrace the steps of the question
writer
FK506 (Tacrolimus):
A. is an anti-proliferative drug
B. binds and inhibits the stimulation of a
calcium activated phosphatase, thereby
blocking the induction of the early
response genes in the immune response
C. is less commonly used than cyclosporin
because of its lower potency
D. is preferred over cyclosporin since it has no
nephrotoxicity
E. is most commonly used to treat autoimmune disorders
What to do?
1. Choose a specific teaching point
from the lecture notes – what is it that
you want to be sure the students
know about FK506?
2. Form that teaching point into a
question
3. Write the correct answer
4. Write the distractors
Multiple Choice Qs: Distractors
The distractors should be:
- plausible
- less correct than the answer
They are called ‘distractors’ because
their intent is to DISTRACT examinees
who do not know the concept
Sources for Distractors
• Common misconceptions (“Why do
some of the students think that
___________ “)
• Logical thinking, but it doesn’t apply in
this case
FK506 (Tacrolimus) binds and
inhibits the stimulation of:
A. calcium activated phosphatase
B.
C.
D.
Better format
• But is it an essential teaching point?
(or) What happens when FK506
(Tacrolimus) binds and inhibits the
stimulation of calcium activated
phosphatase?
A. The induction of the early response genes
in the immune response is blocked
B.
C.
D.
E.
Your item here?
Why change format?
• When questions are written clearly, it’s
more likely that responses will reflect
examinees’ knowledge on this topic,
rather than their care in reading the
question
• Typical of the type of questions they
will see on USMLE and specialty
certification examinations
USMLE & others do NOT use:
• Which of the following is TRUE?
• Which of the following is FALSE?
• Each of the following statements
about X is true EXCEPT:
Which one is true (or false)
• Is not a multiple-choice question
format
• It’s a multiple true/false format
• Options tend to be heterogeneous
(mixing apples and oranges) because
the teaching point isn’t focused
True/False Questions
• + Much easier/faster to write
• - Less information about what the
students know
• Students have a 50/50 chance of
answering correctly just by guessing
(a 5-option MCQ gives only a 20% chance)
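The effect of guessing on a whole test can be sketched as follows. The 20-question test length is an assumed figure for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

def guess_probability(n_options):
    """Chance of a correct blind guess when exactly one option is keyed."""
    return Fraction(1, n_options)

def expected_guess_score(n_questions, n_options):
    """Expected number of questions answered correctly by guessing alone."""
    return n_questions * guess_probability(n_options)

tf = expected_guess_score(20, 2)   # True/False: 2 choices per question
mcq = expected_guess_score(20, 5)  # MCQ: 5 options (A-E) per question
print(tf, mcq)
```

On this assumed 20-question test, pure guessing is expected to earn 10 points in True/False format but only 4 in 5-option multiple choice.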
When to write a T/F question?
• When the content is essential, but
shallow
• When the multiple choice question
didn’t work
FK506 (Tacrolimus):
A. is an anti-proliferative drug
B. binds and inhibits the stimulation of a
calcium activated phosphatase, thereby
blocking the induction of the early
response genes in the immune response
C. is less commonly used than cyclosporin
because of its lower potency
D. is preferred over cyclosporin since it has no
nephrotoxicity
E. is most commonly used to treat autoimmune disorders
True/False Qs: Ideas
(same as Multiple Choice Qs)
• Essential content
• Questions asked during lecture
• Conversations overheard
• Any points upon which you will build
True/False Qs: Correct answer
• Be clear
• Don’t be tricky
• If something is generally true (say in
999 of 1,000 cases) do you want them
to say TRUE for the 999 cases or FALSE
for the 1 case?
Examples
• Menarche is usually the first sign of
puberty in females
• Endometrial adenocarcinoma
typically presents as high stage
disease.
Great material for T/F
Can be found in unfocused multiple
choice questions
Which statement is true about bladder
carcinoma?
A. Bladder carcinoma is not associated with
smoking
B. Recurrence and progression are very rare
after local excision of low grade papillary
carcinomas
C. “Field effect” refers to the concept that
toxins in wheat grain pesticides may cause
prostate carcinoma
D. Bladder carcinoma is usually adenocarcinoma
E. Squamous cell carcinoma is associated
with chronic irritative conditions such as
chronic catheterization
Any questions about the
True/False format?
Matching
• A series of 2 or more questions that use
the same set of up to 10 possible
answers (listed once as a heading).
• The answers may be used once, more
than once, or not at all
• There is only ONE correct answer per
question
Matching: Ideas
• Essential content
• Several concepts that are alike –
typically testing recognition more than
application of knowledge
• Lists of diseases, microorganisms,
medications, etc.
Matching: Correct answers
• This format requires more thought
about correct answers
• After preparing a matching set, review
each description again (left side)
comparing to each of the options on
the right. You must be sure that only
ONE is the best answer
Matching: Distractors
• There are typically fewer distractors to
generate for Matching than for MCQs,
because the other correct answers are
included in the distractor list
• Additional distractors that are added
should be common misconceptions
about descriptions on left or
plausible-sounding alternatives from
within the category of things listed.
Any questions about the
matching format?
5. Investing in Future Examinations
• You might decide to work ahead for
future examinations when you’re
working on your assigned questions.
• Work ahead when you have an
abundance of ideas
• “Clone” your questions
Cloning Multiple Choice Qs
Especially easy to do when starting from
a question that had a negative
context (NOT, EXCEPT, etc.)
Which one of the following obstetric
risks is NOT commonly associated
with mullerian anomalies?
A. Intrauterine growth restriction (IUGR)
B. Pre-eclampsia
C. Malpresentation
D. Preterm labor
4 options
• 3 are associated with mullerian
anomalies
• 1 is NOT associated with mullerian
anomalies
Create a positive item
Which of the following obstetric risks is
commonly associated with mullerian
anomalies?
A. Pre-eclampsia (now a distractor,
instead of correct answer)
B. Intrauterine growth restriction (IUGR)
(now the correct answer)
C. (add distractor here)
D. (add distractor here)
Clone new item
• Same item, use same distractors, but
insert new correct answer
Which one of the following obstetric risks
is commonly associated with mullerian
anomalies?
A. Pre-eclampsia (still a distractor,
instead of correct answer)
B. Malpresentation (new correct
answer)
C. (add same distractor here)
D. (add same distractor here)
Clone second item
• Same process
• Keep same distractors
• Insert new correct answer
Which one of the following obstetric risks
is commonly associated with mullerian
anomalies?
A. Pre-eclampsia (still a distractor,
instead of correct answer)
B. Preterm labor (new correct answer)
C. (add same distractor here)
D. (add same distractor here)
Another example
• Change negative context Multiple
choice question to positive, then clone
it.
Negative context question
Post-streptococcal glomerulonephritis is
associated with all of the following EXCEPT:
A. Subepithelial humps
B. Pharyngitis
C. Low serum complement levels
D. Low anti-streptolysin antibody titers*
E. Hematuria
Post-streptococcal glomerulonephritis is
associated with:
A. Low anti-streptolysin antibody titers
(now a distractor)
B. Subepithelial humps (now correct)
C. (add a distractor)
D. (add a distractor)
E. (add a distractor)
Clone 3 more items
• Keep the 4 distractors (1 old, 3 new)
the same
• Change the correct answer each
time, alternatively using:
– Pharyngitis
– Low serum complement levels
– Hematuria
Do students share exams?
• Cloning items is a great way to
discourage memorizing test materials.
The correct answer changes across
administrations of the test.
Many topics have more than
one component
• Think of the questions you used to write
as “All of the above” - Those kinds of
topics are great for cloning
• Write the question with only one of the
possible correct answers
• Work hard to develop plausible
distractors
• Keep the question and distractors;
alternate the correct answers
Use 1 per test
• Store the others for the future
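The cloning routine described above can be sketched in Python. The function name `clone_items` is hypothetical, and the three bracketed distractors are placeholders standing in for the "(add a distractor)" slots; they are not real content. Options are sorted alphabetically so the correct answer does not always land in the same position.

```python
def clone_items(stem, correct_answers, distractors):
    """Build one item per correct answer, reusing the same stem and distractors."""
    return [
        {"stem": stem,
         "answer": answer,
         "options": sorted([answer] + distractors)}
        for answer in correct_answers
    ]

stem = "Post-streptococcal glomerulonephritis is associated with:"
correct_answers = [
    "Subepithelial humps",
    "Pharyngitis",
    "Low serum complement levels",
    "Hematuria",
]
distractors = [
    "Low anti-streptolysin antibody titers",
    "(placeholder distractor 2)",  # to be written by the question author
    "(placeholder distractor 3)",  # to be written by the question author
    "(placeholder distractor 4)",  # to be written by the question author
]

bank = clone_items(stem, correct_answers, distractors)
this_test, stored = bank[0], bank[1:]  # use 1 per test, store the others
print(this_test["answer"], len(stored))
```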
Cloning True/False
• Write the True/False question
• Clone it to the opposite answer
TRUE Original
• The most common cause of congenital
adrenal hyperplasia is 21-hydroxylase
deficiency
• Now, clone to FALSE
Clone to FALSE
The most common cause of congenital
adrenal hyperplasia is 3-beta-HSD
deficiency
Or
The most common cause of congenital
adrenal hyperplasia is 11-hydroxylase
deficiency
Cloning Matching
• Not as easy to do as for Multiple
Choice or True/False
• Easiest when the list of possibilities is
very long
• Try to utilize the options that were not
used in the original set, if possible.
6. The joy of feedback
• Writing questions is difficult
• Writing questions alone is even more
difficult
• It’s unlikely that your first draft of any
question should be submitted
(Clinical presentation was here)
• Which one of the following is the best
explanation for the serum sodium
concentration and the concentrated
urine?
Options in first draft
A. Arginine vasopressin release caused
by hypotonic serum.
B. Arginine vasopressin release caused
by hypertonic urine.
C. Arginine vasopressin release caused
by low effective blood volume.
D. Arginine vasopressin release caused
by a combination of hypertonic urine
and low effective blood volume
Your colleague recommends
tidying the options
• The phrase “Arginine vasopressin
release caused by” repeats in all the
options
• Move it up to the question
The best explanation for the serum
sodium concentration and the
concentrated urine is that arginine
vasopressin release was caused by:
A. hypotonic serum
B. hypertonic urine
C. low effective blood volume
D. a combination of hypertonic urine
and low effective blood volume
1st drafts can include too
many details
• When it’s your patient, many of the
details will come to mind when writing
the test question
• It’s easier for your colleague to weed
out any information that isn’t needed
to answer the question
Sample first draft question
Miss Smith, a previously healthy 22-year-old woman, is brought to the ER by her
overly-protective mother. Miss Smith
has a one day history of dysuria,
urinary frequency, urgency, and suprapubic tenderness. She noted cloudy
urine, but denied fever. Her urinalysis
showed many polymorphonuclear
leukocytes and bacteria. Which of the
following best characterizes her
condition?
Edited question
Miss Smith, a previously healthy 22-year-old woman, presents to the ER with a
one day history of dysuria, urinary
frequency, urgency, and supra-pubic
tenderness. She noted cloudy urine,
but denied fever. Her urinalysis showed
many polymorphonuclear leukocytes
and bacteria. Which of the following
best characterizes her condition?
Questions?
Should we write some
questions together?
What problems have you
faced when working on test
assignments?
THANK YOU
Lynn C. Webb, Ed.D.
testing@lwebb.com