Making an ALARP Decision of Sufficient Testing

(HASE’14)
Mahnaz Malekzadeh
mahnaz.malekzadeh@mdh.se
Mälardalen Real-Time Research Center
Mälardalen University
Outline
●  Motivation
●  Worst-Case Timing Properties
●  System Model
●  ALARP
●  Convergence Algorithm
●  Motivational Example
●  Evaluation
●  Conclusion
Safety-Critical Systems
●  Failure can lead to catastrophic damage to people or the environment.
[Photographs *, **, *** (sources listed under References)]
Motivation
●  Testing is an extremely important part of the development
and certification process.
●  However, it is also one of the most expensive parts.
[Figure: pie chart of the development cycle, with segments including Requirements, Design, System Testing and Acceptance Testing; segment shares shown: 24%, 45%, 6%, 5%, 20%]
Motivation
●  Testers therefore have to determine whether there is
any benefit in running the current testing strategy
further.
●  To date, research effort has focused mainly on
diverse testing strategies.
●  This leaves open the question of when to stop
testing.
Motivation
●  Test Process Challenges
QUALITY
COST
How to make a decision to stop
testing a system?
Motivation
●  Such a decision also plays an important role in the As
Low As Reasonably Practicable (ALARP) principle.
●  The concept of “reasonably practicable” lies at the
heart of the British health and safety system.
●  It is a key part of the general duties of the Health and
Safety at Work etc. Act 1974.
Motivation
●  Risk tolerability depends on the practicability of further
risk reduction.
●  It must be feasible to demonstrate that the cost of
reducing the risk further would outweigh the benefit
gained.
●  ALARP: Currently, this is at best a qualitative decision.
●  We address this decision challenge quantitatively for
the worst-case timing properties of safety-critical
systems.
Worst-Case Response Time
●  Safety-critical systems: it is extremely important to
respond in a timely manner, e.g., the car braking
system.
●  They have to respond within a specific amount
of time, called the "deadline".
●  That is, their Worst-Case Response Time (WCRT) has to be
less than or equal to their deadline.
[Figure: timeline with a deadline; a WCRT that falls before the deadline is marked ✓, a WCRT that falls after it is marked ✗]
Why WCRT?
●  Traditional Response Time Analysis techniques are
based on simplified assumptions about systems and on exact
Worst-Case Execution Time (WCET).
●  They are incapable of capturing the features of complex
real-world safety-critical systems.
●  This results in inaccurate worst-case timing
analysis.
●  In contrast, our testing-based approach depends
neither on an abstract model of the system nor
on the exact WCET.
●  This makes it interesting for real-world scenarios.
System Model
[Figure: application A1 with tasks t11, t12 and t13; application A2 with tasks t21 and t22]
●  A set of applications (Ai) running on an execution
platform.
●  Each application has a set of tasks (tij) scheduled
for execution based on their deadlines.
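The system model above can be sketched as a small data structure. This is only an illustration: the class names, the `deadline` field, the deadline values and the shortest-deadline-first ordering are assumptions, since the slides say only that tasks are scheduled based on their deadlines.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str        # e.g. "t11"
    deadline: float  # assumed: relative deadline in simulator time units


@dataclass
class Application:
    name: str        # e.g. "A1"
    tasks: List[Task] = field(default_factory=list)


def deadline_order(apps):
    """Return every task of every application, shortest deadline first."""
    all_tasks = [task for app in apps for task in app.tasks]
    return sorted(all_tasks, key=lambda task: task.deadline)


# Deadline values are made up for illustration only.
a1 = Application("A1", [Task("t11", 10.0), Task("t12", 25.0), Task("t13", 40.0)])
a2 = Application("A2", [Task("t21", 15.0), Task("t22", 30.0)])
order = [task.name for task in deadline_order([a1, a2])]
```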
Task Set Simulator
●  A task set simulator is used, which allows long simulation
times.
●  It establishes two ground truths:
✓  Static WCRT,
✓  High Water Mark (HWM), obtained from a significantly
long simulation.
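A high water mark is the largest response time observed so far. The sketch below tracks it as a running maximum; the function name and the sample values are hypothetical and are not taken from the simulator.

```python
def high_water_mark(response_times):
    """Return the running maximum of the observed response times."""
    marks = []
    current = float("-inf")
    for rt in response_times:
        current = max(current, rt)  # the mark only ever moves upwards
        marks.append(current)
    return marks


# Illustrative response times only, not measured data.
observed = [3850, 3861, 3855, 3862, 3860]
marks = high_water_mark(observed)
```

The last element of `marks` is the high water mark of the whole run; a longer run can only keep it or raise it.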
ALARP
●  The definition set out by the Court of Appeal:
“‘Reasonably practicable’ is a narrower term than
‘physically possible’
… a computation must be made by the owner in which
the quantum of risk is placed on one scale and the
sacrifice involved in the measures necessary for averting
the risk (whether in money, time or trouble) is placed in
the other, and that, if it be shown that there is a gross
disproportion between them – the risk being
insignificant in relation to the sacrifice – the defendants
discharge the onus on them.”
ALARP
●  In essence, making sure a risk has been reduced
ALARP is about weighing the risk against the sacrifice
needed to further reduce it.
●  Extreme examples:
●  Disproportionate: To spend £1m to prevent five staff
suffering bruised knees.
●  Proportionate: To spend £1m to prevent a major
explosion capable of killing 200 people.
ALARP Triangle
●  High risk, intolerable region: risk cannot be tolerated
(safety-critical tasks).
●  Medium risk, ALARP or tolerable region: risk is tolerable only if
the cost of further risk reduction is grossly disproportionate to the
benefit attained (risk-tolerable tasks).
●  Low risk, broadly acceptable region: negligible risk.
Convergence Algorithm
When has the Maximum Observed Response Time
(MORT) stopped changing?
•  We can never know this for certain, but there may be indications.
•  One indication is whether the model of the response time
distribution is still changing.
Convergence Algorithm
The test information is used by two main techniques:
✓  High Water Mark (HWM), MORT,
✓  Statistical techniques, i.e. the Kullback-Leibler
DIVergence (KL DIV) test.
[Figure: test vectors are applied to the software under test; the resulting response times feed the HWM and KL DIV tests]
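For reference, the standard definition of the Kullback-Leibler divergence between two discrete distributions P and Q over the same bins k is given below. Reading P as the binned response time distribution after the latest round of testing and Q as the distribution before it is an assumption; the slides do not state the direction or any smoothing that is used.

```latex
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{k} P(k) \, \log \frac{P(k)}{Q(k)}
```

The divergence is zero when the two distributions are identical and grows as they differ, which is why a small value suggests that further testing is no longer changing the distribution.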
Convergence Algorithm
ü  MORT is not increasing (HWM),
ü  Nature of the response time distibution is
not changing (KL DIV),
More testing of the same nature is not going
to reveal further useful information.
22
Convergence Algorithm
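[Figure: flowchart of the convergence algorithm]
●  Test vectors are applied to the software under test, producing response times.
●  Response times are binned with BinSize = λ; X = ResponseTimeEachTaskAfterTime_t, Y = α*X + β.
●  HWM test: if passed, Counter++; if not passed, Counter = 0.
●  Counter >= i: if yes, the KL DIV < δ test is applied; if no, testing continues with t = t + Δ.
●  KL DIV < δ: if passed, END; if not passed, testing continues with t = t + Δ.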
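The loop in the flowchart can be sketched as follows. This is one possible reading under stated assumptions, not the authors' implementation: `stopping_round`, `kl_divergence`, the `rounds` input (one list of response times per interval Δ), the comparison against the histogram from the previous round and the `eps` smoothing are all hypothetical. The Y = α*X + β step is left out because the slides do not explain it.

```python
import math
from collections import Counter


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL divergence D(P || Q) of two binned count tables (eps-smoothed)."""
    bins = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + eps * len(bins)
    q_total = sum(q_counts.values()) + eps * len(bins)
    divergence = 0.0
    for b in bins:
        p = (p_counts.get(b, 0) + eps) / p_total
        q = (q_counts.get(b, 0) + eps) / q_total
        divergence += p * math.log(p / q)
    return divergence


def stopping_round(rounds, bin_size, i, delta):
    """Return the index of the round at which testing may stop, or None.

    rounds: non-empty lists of response times, one list per interval.
    """
    histogram = Counter()   # binned response times seen so far
    mort = float("-inf")    # maximum observed response time
    counter = 0             # consecutive rounds without a new maximum
    for n, response_times in enumerate(rounds):
        previous = histogram.copy()
        histogram.update(int(rt // bin_size) for rt in response_times)
        new_mort = max(mort, max(response_times))
        if new_mort > mort:
            counter = 0     # HWM test not passed: a new maximum appeared
        else:
            counter += 1    # HWM test passed: MORT did not increase
        mort = new_mort
        # KL DIV is only consulted after i quiet rounds in a row.
        if counter >= i and kl_divergence(histogram, previous) < delta:
            return n
    return None


# Illustrative input: the new maximum in round 2 resets the counter.
example = [[10, 11], [10, 11], [10, 20], [10, 11], [10, 11], [10, 11]]
stop = stopping_round(example, bin_size=1, i=2, delta=0.05)
```

Requiring `i` quiet rounds before the KL DIV test is evaluated keeps a single lucky interval from ending the test campaign early.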
Binning
[Figure: response times (t1 to t4) grouped into bins of width λ]
Binning is used in:
✓  the simulator, for scalability,
✓  the convergence algorithm, to keep
outliers from affecting the test result.
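Binning with a fixed width λ can be sketched as below. The function names and the sample values are hypothetical; the sketch only shows how raw response times collapse into a small histogram that can then be normalised into a distribution.

```python
from collections import Counter


def bin_response_times(response_times, bin_size):
    """Count response times per bin of width bin_size (the slides' λ)."""
    return Counter(int(rt // bin_size) for rt in response_times)


def to_distribution(counts):
    """Normalise bin counts so that they sum to 1."""
    total = sum(counts.values())
    return {b: c / total for b, c in sorted(counts.items())}


# Illustrative response times only, not measured data.
samples = [3851, 3853, 3858, 3861, 3862, 3999]
counts = bin_response_times(samples, bin_size=10)
dist = to_distribution(counts)
```

Nearby values fall into the same bin, so small jitter in individual response times does not change the shape of the distribution, and the histogram stays small however long the run is.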
Motivational Example
[Figure: KL DIV and MORT plotted against Testing Time]
Evaluation
Criteria:
ü  Closeness of the algorithm Stopping Point
MORT (SPMORT) to the Last MORT (LM)
seen in simulation,
ü  Closeness of SPMORT to a quantified MORT
called ALARP MORT (AM).
Algorithm Achievement (AA)
Algorithm Effort (AE)
Evaluation
[Figure: (a) High priority task and (b) Low priority task: MORT versus Testing Time, with AM, SP, LM and HWM marked and Algorithm Effort and Algorithm Achievement indicated; (c) Normalized algorithm achievement for task set, by bins; (d) Normalized algorithm effort for task set, by bins]
Evaluation
Points are given as (testing time, MORT).
(a) High priority task
✓  AM (2, 3861)
✓  LM (210, 3862)
✓  SP (378, 3862)
(b) Low priority task
✓  AM (194, 36259)
✓  LM (4774, 37087)
✓  SP (378, 36259)
Task set
✓  AA (99.95) in the range [96, 100]
✓  AE (1011) in the range [0, 20000]
Conclusion
ü  The algorithm performs well, that is, testing
stopped after the point at which new significant
information may not achieved at justifiable cost
(ALARP).
ü  Can ensure safe worst-case response time
(different criteria for different criticality).
Future work:
•  To improve the performance and scalability.
•  To investigate how the algorithm can be
tuned for robustness.
32
Thank You
References
*   [Photograph] Retrieved from: http://www.ainonline.com/sites/default/files/uploads/524_saabgripen_pic1.jpg
**  [Photograph] Retrieved from: http://www.carpriceinindia.in/blog/wp-content/uploads/2012/07/Volvo-Looking-To-Be-No-3-LuxuryCar-Dealer-In-India-In-2020.jpg
*** [Photograph] Retrieved from: http://www.bombardier.com/content/dam/Websites/bombardiercom/News/import/883-bombardier-sifangwins-contract-to-build-80-very-high-speed-trains-forchina-3.jpg/_jcr_content/renditions/original