A Survey on Mobile Affective Computing
Shengkai Zhang and Pan Hui
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
{szhangaj, panhui}@cse.ust.hk
Abstract—This survey presents recent progress on Affective Computing (AC) using mobile devices. AC has been one of the most active research topics for decades. A primary limitation of traditional AC research is that it treats emotions as impermeable, a criticism that becomes prominent when emotions are investigated outside social contexts. This is problematic because some emotions are directed at other people and arise from interactions with them. The development of smart mobile and wearable devices (e.g., Apple Watch, Google Glass, iPhone, Fitbit) enables in-the-wild, natural studies of AC from a computer science perspective. This survey emphasizes AC studies and systems that use smart wearable devices. Various models, methodologies, and systems are discussed in order to examine the state of the art. Finally, we discuss remaining challenges and future work.
Index Terms—Affective computing, mobile cloud computing,
wearable devices, smartphones
I. INTRODUCTION
Affective Computing (AC) aspires to narrow the gap between highly emotional humans and emotionally challenged machines by developing computational systems that can recognize, respond to, and express emotions [1]. Long-term research on emotion theory in psychology and neuroscience [2]–[4] has confirmed that emotion has a significant effect on human communication, decision making, perception, and more. In other words, emotion contributes an important part of intelligence. Although research on artificial intelligence and machine learning has made great progress, computer systems still lack proactive interaction with humans because they do not understand users' implicit affective states. Therefore, computer scientists have devoted more and more effort to AC in pursuit of a highly intelligent computer system that enables intelligent and proactive human-computer interaction.
Existing works detect and recognize affect following three major steps: (1) collecting affect-related signals; (2) using machine learning and pattern recognition techniques to classify emotions; (3) evaluating the performance against benchmarks. To evaluate the performance and effectiveness of models and methods, many multimedia emotion databases have been proposed [5]–[13]. However, such databases were constructed from "artificial" material of deliberately expressed emotions, either by asking subjects to perform a series of emotional expressions in front of a camera and/or microphone, or by using film clips to elicit emotions in the lab [14]. Research suggests that deliberate behavior differs from spontaneously occurring behavior. [15] shows that posed emotions in spoken language may differ from corresponding performances in natural settings in the choice of words and in timing. [16] addresses facial behavior, demonstrating that deliberately displayed facial behavior differs from spontaneous facial behavior both in the facial muscles utilized and in their dynamics. Hence, there is a great need for data that include spontaneous displays of affective behavior. Several studies have emerged on the analysis of spontaneous facial expressions [17]–[20] and vocal expressions [21], [22]. These studies recruited a few participants to record their speech and facial expressions for a certain period of time (several days to months). The major problem is the scale of such databases. Recently, researchers have addressed this problem by exploiting crowdsourcing [23]–[28]. Basically, these works design a mechanism (e.g., a game [27] or a web search engine [29]) to collect users' affective states at large scale. [25] designed an Amazon Mechanical Turk [30] setup to generate affective annotations for a corpus.
An alternative way to address the problem and create new research opportunities for affective computing is the combination of smart mobile and wearable devices (e.g., Apple Watch, Google Glass, iPhone, Fitbit) with AC. We simply call it Mobile Affective Computing (MAC). Such devices have been widely adopted all over the world; for example, in the third quarter of 2012, one billion smartphones were in use worldwide [31]. The key feature of smart devices is their abundant sensors, which enable unobtrusive monitoring of various affect-related signals. Exploring the possibility of using smart mobile devices for AC will benefit at least three areas through the potential of long-term, unobtrusive monitoring of users' affective states:
• Influence the affect-related research literature through in-the-wild, natural, and unobtrusive studies
• Establish spontaneous affect databases efficiently to evaluate new AC methods, models, and systems more accurately
• Enhance user-centered human-computer interaction (HCI) design for future ubiquitous computing environments [1], [32]–[35]
It is important to note that computer scientists and engineers can also contribute to emotion research. Scientific research in the area of emotion stretches back to the 19th century, when Charles Darwin [2], [36] and William James [3] proposed theories of emotion. Despite this long history, Parkinson [37] contends that traditional emotion theory research has made three problematic assumptions while studying individual emotions: (1) emotions are instantaneous; (2) emotions are passive; (3) emotions are impermeable. These assumptions simplify a considerably more complex reality. There is considerable merit to the argument that some affective phenomena cannot be understood at the level of a small group of people and must instead be studied as a social process [37]–[42]. With the widespread usage and rich sensors of smart wearable devices, we are able to verify emotion theories by designing and implementing MAC systems that conduct in-the-wild, natural, and unobtrusive studies.
Establishing a large-scale and trustworthy labeled database of human affective expressions is a prerequisite for designing automatic affect recognizers. Although crowdsourcing provides a promising way to generate large-scale affect databases, it is either expensive (participants must be paid) or error prone. In addition, it cannot monitor human affective events in daily life, and therefore fails to establish spontaneous affect databases. Current spontaneous emotional expression databases used as ground truth are established by manually labeling movies, photos, and voice recordings, which is very time consuming and expensive. MAC research [43], [44] has designed software systems on smartphones that run continuously in the background to monitor the user's mood and emotional states. With properly designed mechanisms to collect feedback from users, we can establish large-scale spontaneous affect databases efficiently and at very low cost.
Furthermore, ubiquitous computing environments have shifted human-computer interaction (HCI) design from computer-centered to human-centered [1], [32]–[35]. Mobile HCI designs currently involve mobile interface devices such as smartphones, smart bands, and smart watches. These devices transmit explicit messages while ignoring implicit information about users, i.e., their affective states. As emotion theory has shown, the user's affective state is a fundamental component of human-human communication and motivates human actions. Consequently, mobile HCI that understands the user's affective state would be able to perceive the user's affective behaviour and initiate interactions intelligently, rather than simply responding to the user's commands.
This paper introduces and surveys recent advances in AC research. In contrast to previously published survey
papers and books in the field [33], [45]–[68], it focuses on the
approaches and systems for unobtrusive emotion recognition
using human-computer interfaces and AC using smart mobile
wearable devices. It discusses new research opportunities and
examines the state-of-the-art methods that have not been
reviewed in previous survey papers. Finally, we discuss the
remaining challenges and outline future research avenues of
MAC.
This paper is organized as follows. Section II describes the significance of AC. Section III reviews unobtrusive emotion recognition based on human-computer interfaces. Section IV introduces the sensing capabilities of mobile devices relevant to AC research and reviews in detail the related studies that exploit smart mobile devices. Section V outlines challenges and research opportunities in MAC. A summary and closing remarks conclude this paper.
II. AFFECTIVE COMPUTING
Affective computing allows computers to recognize, express, and "have" emotions [1]. It enables human-computer interaction in which a device can detect and appropriately respond to its user's emotions and other stimuli. Affective computing takes its name from psychology, in which "affect" is a synonym for "emotion". At first glance, giving machines emotions seems absurd. Marvin Minsky wrote in The Society of Mind: "The question is not whether intelligent machines can have any emotions, but whether machines can be intelligent without emotions" [69]. Scientists agree that emotion is not logic, and strong emotions can impair rational decision making. Acting emotionally implies acting irrationally, with poor judgment. Computers are supposed to be logical, rational, and predictable; these qualities have long been treated as the very foundations of intelligence and have been the focus of computer scientists seeking to build intelligent machines. At first blush, emotions seem like the last thing we would want in an intelligent machine.
However, this negative aspect of emotion is less than half of
the story. Today we have evidence that emotions are an active
part of intelligence, especially perception, rational thinking,
decision making, planning, creativity, and so on. They are
critical in social interactions, to the point where psychologists
and educators have redefined intelligence to include emotional
and social skills. Emotion has a critical role in cognition and in
human-computer interaction. In this section, we demonstrate
the importance of affect and how it can be incorporated in
computing.
A. Background and motivation
Emotions are important in human intelligence, rational
decision making, social interaction, perception, memory and
more. Emotion theories can be largely examined in terms of
two components: 1) emotions are cognitive, emphasizing their
mental component; 2) emotions are physical, emphasizing
their bodily component. The cognitive component focuses on
understanding the situations that give rise to emotions. For
example, “My brother is dead, therefore, I am sad.” The
physical component focuses on the physiological response that
occurs with an emotion. For example, your heart rate will
increase when you are nervous.
The physical component helps computers understand the mappings between emotional states and emotional expressions so that the former can be inferred from the latter. Physical aspects of emotion include facial expression, vocal intonation, gesture, posture, and other physiological responses (blood pressure, pulse, etc.). Nowadays, the abundant sensors equipped on computers, mobile devices, or worn by humans are able to see, hear, and sense other responses from people. The cognitive component helps computers perceive and reason about situations in terms of the emotions they elicit. By observing human behaviour and analyzing a situation, a computer can infer which emotions are likely to be present.
With the ability to recognize, express, and "have" emotions, computers can benefit at least entertainment, learning, and social development. In entertainment, a music application can select particular music to match a person's present mood or to modify that mood. In learning, an affective computer tutor [70] can teach more effectively by attending to the student's interest, pleasure, and distress. In social development, an affect-sensitive system can help people who struggle with social communication to interact in a more friendly way. More sophisticated applications can be developed by recognizing the user's affect, reasoning with emotional cues, and understanding how to respond intelligently given the user's situation. We next provide more depth on the key technologies of AC.
B. Existing technologies
An affective computing system consists of affective data collection, affect recognition, affect expression, and so on. Currently, researchers focus on affect detection and recognition. A computing device can gather cues to user emotion from a variety of sources.
This paper focuses on reviewing recent progress in Mobile Affective Computing (MAC), i.e., affective computing studies that use mobile devices. The biggest difference between traditional AC research and MAC is the data collection method. We therefore review typical works in terms of the different signals used for AC. From previous studies, the following signals can indicate a user's emotional state and can be detected and interpreted by computing devices: facial expressions, posture, gestures, speech, the force or rhythm of key strokes, and physiological signal changes (e.g., EEG). Table I summarizes the traditional data collection methods in terms of the different signals.
TABLE I
TRADITIONAL SIGNAL COLLECTION METHODS.

Signals            | Collection methods     | Typical references
Facial expression  | Camera                 | [17], [18], [71]–[75]
Voice & speech     | Microphone             | [7], [21], [76]–[81]
Gesture & posture  | Pressure sensors       | [82]–[85]
Key strokes        | Keyboard               | [86]–[91]
EEG [92]           | EEG sensors            | [93]–[96]
Text               | Human writing material | [97]–[100]
The majority of affect detection research has focused on detection from facial expressions, since this signal is easily captured by a camera. Microphones are commonly built into computers, smartphones, etc., to record speech and vocal expressions. Gesture and posture can offer information that is sometimes unavailable from conventional nonverbal measures such as facial expressions and speech. Traditionally, researchers install pressure sensors on pads, keyboards, mice, chairs, etc., to analyze the temporal transitions of posture patterns. The pressure-sensitive keyboard used in [86] is the one described by Dietz et al. [101]. For each keystroke, the keyboard provides readings from 0 (no contact) to 254 (maximum pressure). Implementing a custom-made keyboard logger allows researchers to gather the pressure readings for keystroke pattern analysis. EEG studies often use specialized hardware to record EEG signals. For example, AlZoubi et al. [93] used a wireless sensor headset developed by Advanced Brain Monitoring, Inc. (Carlsbad, CA). It is an integrated hardware and software solution for acquisition and real-time analysis of EEG, and it has demonstrated feasibility for acquiring high-quality EEG in real-world environments including workplace, classroom, and military operational settings.
With the data collected by these different methods and tools, affect detection studies choose appropriate feature selection techniques and classifiers, based on the model used (dimensional [102], [103] or categorical [61]), to detect affect. The performance of these systems and algorithms can be evaluated by subjective self-report, expert report, and standard databases. In subjective self-report, participants of the experiment are asked to report their affective states periodically, while expert report requires affect recognition experts or peers to report the affective states of participants from recordings (videos, photos, etc.). The third evaluation approach relies on established affect databases: researchers test algorithms and systems using data from these labeled databases to examine classification accuracy.
However, such databases were usually created from deliberately posed emotions that differ in visual appearance, audio profile, and timing from spontaneously occurring emotions. In addition, the experiments of traditional AC research are obtrusive, since they need to recruit participants to take part under controlled conditions. Two obvious limitations follow: 1) the data scale is limited, since it is not possible to collect affect-related data from a large population; 2) the experiment duration is limited, since participants cannot be involved in an experiment for too long.
With the advent of powerful smart devices, e.g., smartphones, smart glasses, smart watches, smart bands, and so on, AC will enter a new era that eliminates the above limitations, for three reasons:
• Widespread use, in that as of 2013, 65 percent of U.S. mobile consumers owned smartphones and the European mobile device market reached 860 million devices; in China, smartphones represented more than half of all handset shipments in the second quarter of 2012 [104];
• “All-the-time” usage, in that 3 in 4 smartphone owners
check them when they wake up in the morning, while up
to 40% check them every 10 minutes. When they go on
vacation, 8 in 10 owners say they take their devices along
and use them all the time, compared to slightly more than
1 in 10 who take them but rarely use them [105];
• Powerful sensors and networking, in that smart devices
boast an array of sensors, which include accelerometer,
light sensor, proximity sensor and more. The advantage of
using the smart device sensors is that the device itself can
collect the sensor output, store and process them locally
or communicate them to a remote server [106].
The next section reviews state-of-the-art unobtrusive emotion recognition research.
III. UNOBTRUSIVE EMOTION RECOGNITION
This section elaborates on related work on emotion recognition based on human-computer interfaces such as the keyboard and webcam. In everyday use, users often operate a device with an expressionless face or in silence, and they do not want to feel any burden associated with the recognition process. There have been attempts to satisfy these requirements, i.e., to recognize the user's emotions inconspicuously and at minimum cost.
Poh et al. [107] presented a simple, low-cost method for measuring multiple physiological parameters using a basic webcam. They applied independent component analysis (ICA) to the color channels of facial images captured by a webcam and extracted various vital signs such as heart rate (HR), respiratory rate, and HR variability. To demonstrate the low cost and applicability of their approach, they used a commonly available webcam as the sensing unit. Their approach shows significant potential for affective computing, because bio-signals such as HR correlate closely with emotion, and the measurement requires no attention from the user.
Independent component analysis (ICA) is a relatively new
technique for uncovering independent signals from a set of
observations that are composed of linear mixtures of the
underlying sources. The underlying source signal of interest in this study is the blood volume pulse (BVP) that propagates throughout the body.
During the cardiac cycle, volumetric changes in the facial
blood vessels modify the path length of the incident ambient
light such that the subsequent changes in amount of reflected
light indicate the timing of cardiovascular events. By recording
a video of the facial region with a webcam, the red, green,
and blue (RGB) colour sensors pick up a mixture of the
reflected plethysmographic signal along with other sources of
fluctuations in light due to artefacts. Given that hemoglobin
absorptivity differs across the visible and near-infrared spectral
range, each color sensor records a mixture of the original
source signals with slightly different weights. These observed
signals from the RGB color sensors are denoted by y1 (t),
y2 (t), y3 (t), respectively, which are the amplitudes of the
recorded signals at time point t. The three underlying source signals are represented by x1(t), x2(t), and x3(t). The ICA model assumes that the observed signals are linear mixtures of the sources, i.e.,

y(t) = Ax(t)

where the column vectors y(t) = [y1(t), y2(t), y3(t)]^T, x(t) = [x1(t), x2(t), x3(t)]^T, and the square 3 × 3 matrix A contains the mixture coefficients aij. The aim of ICA is to find a demixing matrix W that approximates the inverse of the original mixing matrix A, whose output

x̂(t) = Wy(t)
is an estimate of the vector x(t) containing the underlying
source signals. To uncover the independent sources, W must
maximize the non-Gaussianity of each source. In practice,
iterative methods are used to maximize or minimize a given
cost function that measures non-Gaussianity.
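As a concrete illustration of this pipeline, the following minimal sketch uses FastICA from scikit-learn to separate the color channels and a simple band-pass-and-peak-count step to estimate heart rate. It assumes that per-frame mean R, G, B values of the facial region are already available at a known frame rate; the function and variable names are illustrative, and the peak-counting step is a simplification of the spectral analysis used by Poh et al.

# Sketch: recovering a pulse-like source from webcam RGB traces with ICA.
# Assumes y is an (n_samples, 3) array of per-frame mean R, G, B values of the
# facial region sampled at fs Hz; the heart-rate estimate here is illustrative.
import numpy as np
from sklearn.decomposition import FastICA
from scipy.signal import butter, filtfilt, find_peaks

def estimate_heart_rate(y, fs=30.0):
    """Estimate heart rate (bpm) from raw RGB channel means y(t)."""
    # 1. Normalize each color channel (zero mean, unit variance).
    y = (y - y.mean(axis=0)) / y.std(axis=0)

    # 2. ICA: estimate x_hat(t) = W y(t), the underlying independent sources,
    #    one of which should be the blood-volume pulse.
    ica = FastICA(n_components=3, random_state=0)
    x_hat = ica.fit_transform(y)                  # shape: (n_samples, 3)

    # 3. Band-pass each component to the plausible HR band (0.75-4 Hz, i.e.
    #    45-240 bpm) and keep the component with the strongest spectral peak.
    b, a = butter(3, [0.75, 4.0], btype="band", fs=fs)
    best_comp, best_power = None, -np.inf
    for i in range(3):
        comp = filtfilt(b, a, x_hat[:, i])
        peak_power = np.max(np.abs(np.fft.rfft(comp)) ** 2)
        if peak_power > best_power:
            best_power, best_comp = peak_power, comp

    # 4. Count peaks of the selected component to estimate beats per minute.
    peaks, _ = find_peaks(best_comp, distance=fs * 60 / 240)  # >= 0.25 s apart
    duration_min = len(best_comp) / fs / 60.0
    return len(peaks) / duration_min

# Example: one minute of synthetic 30-fps RGB data with an embedded 72-bpm pulse.
if __name__ == "__main__":
    fs, t = 30.0, np.arange(0, 60, 1 / 30.0)
    pulse = np.sin(2 * np.pi * 1.2 * t)            # 1.2 Hz corresponds to 72 bpm
    noise = np.random.randn(len(t), 3) * 0.5
    y = np.outer(pulse, [0.3, 0.8, 0.4]) + noise   # pulse mixed into R, G, B
    print("Estimated HR: %.1f bpm" % estimate_heart_rate(y, fs))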
This study was approved by the Institutional Review Board of the Massachusetts Institute of Technology. Their sample featured 12 participants of both genders (four females), of different ages (18–31 years) and skin colours. All participants provided informed consent. The experiments were conducted indoors, with a varying amount of ambient sunlight entering through windows as the only source of illumination. Participants were seated at a table in front of a laptop at a distance of approximately 0.5 m from the built-in webcam (iSight camera). During the experiment, participants were asked to keep still, breathe spontaneously, and face the webcam while their video was recorded for one minute.
With the proper experimental design, they demonstrated the
feasibility of using a simple webcam to measure multiple
physiological parameters. This includes vital signs, such as
HR and RR, as well as correlates of cardiac autonomic
function through HRV. The data demonstrate that there is
a strong correlation between these parameters derived from
webcam recordings and standard reference sensors. Regarding
the choice of measurement epoch, a recording of 1–2 min
is needed to assess the spectral components of HRV and an
averaging period of 60 beats improves the confidence in the
single timing measurement from the BVP waveform [108].
The face detection algorithm is subject to head rotation limits.
Epp et al. [87] proposed a solution that determines the emotions of computer users by analyzing the rhythm of their typing patterns on a standard keyboard. This work highlights two problems with the approaches mentioned above: they can be invasive and can require costly equipment. Their solution is to detect users' emotional states through their typing rhythms on a common computer keyboard. Called keystroke dynamics, this is an approach from user authentication research that shows promise for emotion detection in human-computer interaction (HCI). Identifying emotional state through keystroke dynamics addresses the problems of previous methods, since it requires no costly equipment and is non-intrusive to the user. To investigate the efficacy of keystroke dynamics for determining emotional state, they conducted a field study that gathered keystrokes as users performed their daily computer tasks. Using an experience-sampling approach, users labeled the data according to their level of agreement with 15 emotional states and provided additional keystrokes by typing fixed pieces of text. This approach allowed users' emotions to emerge naturally, with minimal influence from the study or from emotion elicitation techniques.
From the raw keystroke data, they extracted a number of
features derived mainly from key duration (dwell time) and
key latency (flight time). They created decision-tree classifiers for 15 emotional states, using the derived feature set.
They successfully modelled six emotional states, including
confidence, hesitance, nervousness, relaxation, sadness, and
tiredness, with accuracies ranging from 77.4% to 87.8%. They
also identified two emotional states (anger and excitement) that
showed potential for future work in this area.
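To make the dwell-time/flight-time idea concrete, the sketch below derives a small feature vector from raw (key, press time, release time) events and fits a scikit-learn decision tree on toy data. The feature set, the data generator, and the two labels are illustrative assumptions, not Epp et al.'s actual 15-state setup or their exact feature list.

# Sketch: dwell/flight keystroke features plus a decision-tree classifier.
# The burst() generator and labels are synthetic stand-ins for real field data.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def keystroke_features(events):
    """events: list of (key, t_down, t_up) tuples in seconds, in typing order."""
    dwell = [up - down for _, down, up in events]              # how long a key is held
    flight = [events[i + 1][1] - events[i][2]                  # release -> next press
              for i in range(len(events) - 1)]
    return np.array([np.mean(dwell), np.std(dwell),
                     np.mean(flight), np.std(flight)])

def burst(base_dwell, base_flight, n=10, seed=0):
    """Generate one toy typing burst with given average dwell/flight times."""
    rng = np.random.default_rng(seed)
    t, events = 0.0, []
    for i in range(n):
        down = t + rng.normal(base_flight, 0.02)
        up = down + rng.normal(base_dwell, 0.01)
        events.append(("k%d" % i, down, up))
        t = up
    return events

# Toy labeled samples: relaxed typists pause longer and hold keys longer here.
X = np.array([keystroke_features(burst(0.12, 0.25, seed=s)) for s in range(20)] +
             [keystroke_features(burst(0.08, 0.15, seed=s)) for s in range(20)])
y = ["relaxed"] * 20 + ["nervous"] * 20

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([keystroke_features(burst(0.09, 0.16, seed=99))]))  # likely 'nervous'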
Zimmermann et al. [109] developed and evaluated a method
to measure affective states through motor-behavioural parameters from the standard input devices mouse and keyboard. The
experiment was designed to investigate the influence of induced affects on motor-behaviour parameters while completing
a computer task. Film clips were used as affect elicitors. The
task was an online-shopping task that required participants to shop for office supplies on an e-commerce website. Ninety-six
students (46 female, 50 male, aged between 17 and 38)
participated in the experiment. During the experiment, all
mouse and keyboard actions were recorded to a log-file by
software running in the background of the operating system,
invisible to the subject. Log-file entries contained exact time
and type of actions (e.g. mouse button down, mouse position
x and y coordinates, which key pressed).
After subjects arrived at the laboratory, sensors and respiratory measurement bands were attached and connected, and then the subjects were asked to complete subject and health questionnaires on the computer. All instructions during the experiment were given at the appropriate stages on the computer interface, and the procedure was automated (as shown in Fig. 1). The participants first familiarized themselves with the online-shopping task and then indicated their mood on graphical and verbal differentials. Afterwards, the first movie clip, expected to be affectively neutral, was presented. Participants then filled out the mood assessment questionnaires, completed the online-shopping task, and filled out the questionnaires again. The second movie clip was then randomly chosen, inducing one of the moods PVHA, PVLA, NVHA, NVLA, nVnA (P=positive, N=negative, H=high, L=low, n=neutral, V=valence, A=arousal). The film was followed by the graphical and verbal differential questionnaires, then the task and the two questionnaires again. The experiment ended after 1.5 to 2 hours.

Fig. 1. Procedure of the experiment (PH=positive valence/high arousal, PL=positive valence/low arousal, NH=negative valence/high arousal, NL=negative valence/low arousal).

They analyzed the mouse and keyboard actions from the log-files. The following parameters were tested for correlations with the 5 different affective states (the list of mood states may be extended): number of mouse clicks per minute, average duration of mouse clicks (from button-down until button-up event), total distance of mouse movements in pixels, average distance of a single mouse movement, number and length of pauses in mouse movement, number of "heavy mouse movement" events (more than 5 changes of direction in 2 seconds), maximum, minimum and average mouse speed, keystroke rate per second, average duration of keystrokes (from key-down until key-up event), and task performance.

Vizer et al. [91] explored the possibility of detecting cognitive and physical stress by monitoring keyboard interactions, with the eventual goal of detecting acute or gradual changes in cognitive and physical function. The research analyzed keystroke and linguistic features of spontaneously generated text. The study was designed with two experimental conditions (physical stress and cognitive stress) and one control condition (no stress). Data were analyzed with machine learning techniques using numerous features, including specific types of words, timing characteristics, and keystroke frequencies, to generate reference models and classify test samples. They achieved correct classifications of 62.5% for physical stress and 75% for cognitive stress (for 2 classes), which they state is comparable to other affective computing solutions. They also note that their solution should be tested with varying typing abilities and keyboards, with varying physical and cognitive abilities, and in real-world stressful situations.

Twenty-four participants with no documented cognitive or physical disabilities were recruited for the study. The data collected per keystroke consisted of the type of keyboard event (either key up or key down), the timestamp, and the key code. The timestamp was recorded at a resolution of 10 ms using the Ticks call to the system clock in Visual Basic. Key codes allowed for subsequent analysis of the keys pressed and words typed. All data were collected using a single desktop computer and a standard keyboard. At the conclusion of the experimental session, the participant completed a demographic survey and reported his or her subjective level of stress for each cognitive and physical stress task on an 11-point Likert scale.

The authors extracted features from the timing and keystroke data captured during the experiment. Data were normalized per participant per feature using z-scores. For each participant, the baseline samples were used to calculate means and standard deviations per feature; the control and experimental samples were then normalized using those values. Both the raw and normalized data were analyzed to determine whether accounting for individual differences by normalizing in this way would yield higher classification accuracy.

Machine learning methods were employed to classify stress conditions. The algorithms used in this analysis included Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (kNN), AdaBoost, and Artificial Neural Network (ANN). These techniques were selected primarily because they have been previously used to analyze keyboard behaviour (e.g., kNN) or have shown good performance across a variety of applications (e.g., SVM). The primary goal was to confirm whether the selected features and automated models were able to successfully detect stress, rather than to optimize how well they detect stress. Therefore, they adopted the most commonly used parameter settings when training and testing stress detection models and used the same set of features for all five machine learning methods. The parameter settings were chosen based on experience and knowledge of the problem at hand rather than through exhaustive testing of all possible combinations of parameter settings. For example, three-fold cross validation was chosen to strike a balance between the number of folds and the number of data points in each fold.
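The per-participant normalization and classifier comparison can be sketched as follows. This is a hypothetical reconstruction with synthetic features (the real study used keystroke timing and linguistic features), intended only to show the shape of the pipeline.

# Sketch: baseline z-score normalization per participant, then a comparison of
# the five classifier families mentioned above under three-fold cross validation.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

def normalize_per_participant(baseline, samples):
    """z-score samples using means/stds from that participant's baseline block."""
    mu, sigma = baseline.mean(axis=0), baseline.std(axis=0) + 1e-9
    return (samples - mu) / sigma

# Synthetic example: 30 participants, 4 features each, one baseline block and
# one test sample per condition (control vs. stress); labels 0 = control, 1 = stress.
rng = np.random.default_rng(0)
X, y = [], []
for p in range(30):
    baseline = rng.normal(0.0, 1.0, size=(20, 4))
    control = rng.normal(0.0, 1.0, size=(1, 4))
    stress = rng.normal(0.6, 1.0, size=(1, 4))        # features shift under stress
    X += [normalize_per_participant(baseline, control)[0],
          normalize_per_participant(baseline, stress)[0]]
    y += [0, 1]
X, y = np.array(X), np.array(y)

classifiers = {
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "kNN": KNeighborsClassifier(n_neighbors=3),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "ANN": MLPClassifier(max_iter=2000, random_state=0),
}
for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=3)          # three-fold CV, as in the study
    print("%-8s accuracy: %.2f +/- %.2f" % (name, scores.mean(), scores.std()))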
The primary contribution of this research is to highlight
the potential of leveraging everyday computer interactions to
detect the presence of cognitive and physical stress. The initial
empirical study described above (1) illustrated the use of a
unique combination of keystroke and linguistic features for
the analysis of free text; (2) demonstrated the use of machine
learning techniques to detect the presence of cognitive or
physical stress from free text data; (3) confirmed the potential
of monitoring keyboard interactions for an application other
than security; and (4) highlighted the opportunity to use
keyboard interaction data to detect the presence of cognitive
or physical stress. Rates for correct classification of cognitive
stress samples were consistent with those currently obtained
using affective computing methods. The rates for correct
classification of physical stress samples, though not as high
as those for cognitive stress, also warrant further research.
Hernandez et al. [86] explored the possibility of using
a pressure-sensitive keyboard and a capacitive mouse to
discriminate between stressful and relaxed conditions in a
laboratory study. During a 30-minute session, 24 participants
performed several computerized tasks consisting of expressive
writing, text transcription, and mouse clicking. Under the
stressful conditions, the large majority of the participants
showed significantly increased typing pressure (> 79% of the
participants) and more contact with the surface of the mouse
(75% of the participants).
The purpose of this work is to study whether a pressure-sensitive keyboard and a capacitive mouse can be used to sense the manifestation of stress. Thus, they devised a within-subjects laboratory experiment in which participants performed several computerized tasks under stressed and relaxed conditions.
In order to comfortably and unobtrusively monitor stress, this study examines gathering behavioural activity from the keyboard and the mouse. These devices are not only among the most common channels of communication for computer users, but also represent a unique opportunity to non-intrusively capture longitudinal information that can help capture behavioral changes associated with long-term conditions such as chronic stress. Instead of analyzing a variety of traditional keyboard and mouse dynamics based on the time or frequency of certain buttons, this work focuses on pressure. In particular, they used a pressure-sensitive keyboard and a capacitive mouse (see Fig. 2).
Fig. 2. Pressure-sensitive keyboard and capacitive mouse.

For each keystroke, the keyboard provides readings from 0 (no contact) to 254 (maximum pressure). They implemented a custom-made keyboard logger in C++ that allowed them to gather the pressure readings at a sampling rate of 50 Hz.
The capacitive mouse used in this work is the Touch Mouse
from Microsoft, based on the Cap Mouse described in [110].
This mouse has a grid of 13 × 15 capacitive pixels with
values that range from 0 (no capacitance) to 15 (maximum
capacitance). Higher capacitive readings while handling the
mouse are usually associated with an increase of hand contact
with the surface of the mouse. Taking a similar approach to the
one described by Gao et al. [111], they estimated the pressure
on the mouse from the capacitive readings. They made a
custom-made mouse logger in Java that allowed gathering
information for each capacitive pixel at a sampling rate of
120Hz.
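As a simple illustration of how such capacitive frames can be summarized, the sketch below aggregates one 13 × 15 frame into contact-area and capacitance features; the specific aggregates are illustrative assumptions, not the exact quantities computed in [86].

# Sketch: summarizing a 13 x 15 capacitive-mouse frame (values 0-15) into a few
# per-frame contact features that can serve as a rough "grip pressure" proxy.
import numpy as np

def mouse_contact_features(frame):
    """frame: 13 x 15 array of capacitance readings in [0, 15] for one sample."""
    frame = np.asarray(frame, dtype=float)
    active = frame > 0                              # pixels currently touched
    return {
        "contact_area": int(active.sum()),          # how many pixels are touched
        "total_capacitance": float(frame.sum()),    # rough proxy for grip pressure
        "mean_active_capacitance": float(frame[active].mean()) if active.any() else 0.0,
    }

# Example: a synthetic frame with a palm-sized blob of contact.
rng = np.random.default_rng(3)
frame = np.zeros((13, 15))
frame[4:9, 5:11] = rng.integers(5, 16, size=(5, 6))    # contact region
print(mouse_contact_features(frame))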
In order to examine whether people under stress use the
input devices differently, they designed several tasks that
required the use of the keyboard or mouse under two different
conditions: stressed and relaxed. The chosen tasks are: text
transcription, expressive writing, mouse clicking.
In order to see whether the two versions of the tasks elicited
the intended emotions (i.e., stressed or relaxed), participants
were requested to report their valence, arousal and stress levels
on a 7-point Likert scale after completion of each task. Although stress could be positive or negative, they expected that
high stress levels in the experiment would be associated with
higher arousal and negative valence. Additionally, throughout
the experiment participants wore the Affectiva Q wrist-band sensor, which continuously measured electrodermal activity, skin temperature, and 3-axis acceleration at a sampling rate of 8 Hz.
In order to examine the differences between relaxed and
stressed conditions, they performed a within-subjects laboratory study. Therefore, all participants performed all the tasks
and conditions during the experiment. After providing written
consent, participants were seated at an adjustable computer
workstation with an adjustable chair, and requested to provide
some demographic information. Next, they were asked to
wear the QTM biosensor on the left wrist and to adjust the
wristband so that it did not disturb them while typing on the
computer. All data were collected and synchronized using a
single desktop computer with a 30 inch monitor. They used
the same pressure-sensitive keyboard and capacitive mouse for
all participants. The room and lighting conditions were also
the same for all users. Fig. 3 shows the experimental setup.
The results of this study indicate that increased levels of
stress significantly influence typing pressure (> 83% of the
participants) and amount of mouse contact (100% of the
participants) of computer users. While > 79% of the participants consistently showed more forceful typing pressure, 75%
showed greater amount of mouse contact. Furthermore, they
determined that considerably smaller subsets of the collected
data (e.g., less than 4 seconds for the mouse clicking task)
sufficed to obtain similar results, which could potentially lead
to quicker and timelier stress assessments. This work is the first
to demonstrate that stress influences keystroke pressure in a
controlled laboratory setting, and the first to show the benefit
of a capacitive mouse in the context of stress measurement.
The findings of this study are very promising and pave the way for the creation of less invasive systems that can continuously monitor stress in real-life settings.

Fig. 3. The experimental setup.
Table II summarizes the main features of the presented solutions. Some valuable conclusions might be drawn from them when building a new system. With the rapid development of smartphones and the abundance of sensors they carry, as mentioned in Sec. II, research on smartphone-based affective computing is becoming more and more popular. Next we discuss affect detection using smartphones.

IV. SMARTPHONE-BASED AFFECTIVE COMPUTING
This section will discuss the state-of-the-art affective computing using mobile or wearable devices. First, we will briefly
introduce the sensors equipped on modern wearable devices.
Powerful sensors along with the long-term usage of such devices enable the unobtrusive affective data collection, which is
promising to improve traditional affective computing research.
Moreover, the widespread use of smartphones has changed human lives; modern technologies turn what was once impossible into the everyday [112]. As a result, researchers are seeking to explore
new features with respect to interactions (touch behaviors,
usage events, etc.) with mobile wearable devices to infer
affective states. We will summarize the related works in this
aspect.
Fig. 4. Accelerometers: (a) smartphone accelerometer; (b) space-flight accelerometer.
A. Affect sensing
As we have mentioned above, traditional affective studies rely on camera, microphone, sensors attached to human
bodies and some other human-computer interaction interfaces
to collect raw data for affect detection and analysis. The
widespread use and rich sensors of modern smart wearable
devices overcome the limitations. We briefly introduce the
affect-related sensors and tools equipped on smart devices.
Accelerometer: It measures proper acceleration (the acceleration it experiences relative to free fall) felt by people or objects, in units of m/s^2 or g (1 g when standing on Earth at sea level). Most smartphone accelerometers trade a large value range for high precision; the iPhone 4, for example, has a range of 2 g and a precision of 0.018 g. Fig. 4 shows a smartphone accelerometer and a space-flight accelerometer. This sensor can be used to detect body movements and gestures. Traditional approaches [82]–[85] have to attach pressure sensors to chairs, keyboards, mice, etc. The accelerometer equipped on smartphones will greatly benefit unobtrusive and long-term data collection with respect to body movements, gestures, and other human behaviors.

Gyroscope: A gyroscope can be a very useful tool because it moves in peculiar ways and appears to defy gravity. Gyroscopes have been around for a century, and they are now used everywhere from airplanes and toy helicopters to smartphones. A gyroscope allows a smartphone to measure and maintain orientation. Gyroscopic sensors can monitor and control device position, orientation, direction, angular motion, and rotation. In a smartphone, a gyroscopic sensor commonly supports gesture recognition; additionally, gyroscopes in smartphones help to determine the position and orientation of the phone. The on-board gyroscope and accelerometer work in combination with the smartphone's operating system or specific software applications to perform these and other functions. Fig. 5 gives a view of the digital gyroscope embedded in a smartphone. Some works [113] take advantage of it to record users' movement and position for affect detection.

Fig. 5. iPhone's digital gyroscope.
GPS: Location sensors detect the location of the smartphone using 1) GPS; 2) lateration/triangulation of cell towers or WiFi networks (with a database of known locations for towers and networks); or 3) the location of the associated cell tower or WiFi network. For localization, a connection to 3 satellites is required for a 2D fix (latitude/longitude) and 4 satellites for a 3D fix (altitude), as shown in Fig. 6. More visible satellites increase the precision of positioning. Typical precision is 20-50 m, and the maximum precision is 10 m.
TABLE II
UNOBTRUSIVE EMOTION RECOGNITION.

Reference               | Sources                                                    | Labelling method                                                        | Method and classifier                                                   | Results
Poh et al. [107]        | webcam                                                     | none                                                                    | statistical analysis                                                    | strong correlation between the parameters derived from webcam recordings and standard reference sensors
Epp et al. [87]         | keystrokes                                                 | questionnaire (5-point Likert scale)                                    | C4.5 decision tree                                                      | accuracy 77.4-87.8% for confidence, hesitance, nervousness, relaxation, sadness, tiredness
Zimmermann et al. [109] | keystrokes and mouse movements                             | questionnaire (self-assessment)                                         | statistical analysis                                                    | possible discrimination between the neutral category and the four others
Vizer et al. [91]       | keystrokes, language parameters                            | questionnaire (11-point Likert scale)                                   | decision trees, support vector machine, k-NN, AdaBoost, neural networks | accuracy 75.0% for cognitive stress (k-NN), 62.5% for physical stress (AdaBoost, SVM, neural networks); the number of mistakes made while typing decreases under stress
Hernandez et al. [86]   | keystrokes, typing pressures, biosensors, mouse movements  | labels assigned according to the tasks eliciting the intended emotions  | statistical analysis                                                    | increased levels of stress significantly influence the typing pressure (>83% of participants) and the amount of mouse contact (100% of participants) of computer users
However, GPS does not work indoors and can quickly drain the battery. Smartphones can try to automatically select the best-suited location provider (GPS, cell towers, WiFi), mostly based on the desired precision. With location information, we can study the relationship between life patterns and affective states; for example, most people in a playground feel happy while most people in a cemetery feel sad.
Fig. 6.
How GPS works.
Location provides additional information to verify the subjective reports from participants of affective studies. It may also help to build a confidence mechanism [114] for subjective reports: attaching the location to a subjective report would produce confidence weights that measure the significance of the collected reports. For example, suppose a participant reported being happy in a cemetery. Since, by common sense, people in a cemetery would tend to be sad, we could assign a low weight (e.g., 0.2) as the confidence value of that report.
Microphone: Traditional affect studies need participants
to sit in a lab and speak into a microphone. Now the built-in
microphone of smartphones is able to record voice in the wild.
If you are looking to use your phone as a voice recorder,
for recording personal notes, meetings, or impromptu sounds
around you, then all you need is a recording application.
Important things to look for in an application are: 1) the
ability to adjust gain levels; 2) change sampling rates; 3)
display the recording levels on the screen, so you can make
any adjustments necessary; and, perhaps not as important, 4)
save the files to multiple formats (at least .wav and .mp3).
Also very handy is the ability to email the recording, or
save it to cloud storage, such as Dropbox. Some of the
most highly recommended applications for Android include
Easy Voice Recorder Pro, RecForge Pro, Hi-Q mp3 Voice
Recorder, Smart Voice Recorder, and Voice Pro. For iOS,
Audio Memos, Recorder Plus and Quick Record appear to
be good applications. We can choose some of these great
applications to collect the voice for affect analysis.
Camera: The principal advantages of camera phones are
cost and compactness; indeed for a user who carries a mobile
phone anyway, the additional size and cost are negligible.
Smartphones that are camera phones (Fig. 7) may run mobile
applications to add capabilities such as geotagging and image
stitching. A few high end phones can use their touch screen
to direct their camera to focus on a particular object in the
field of view, giving even an inexperienced user a degree of
focus control exceeded only by seasoned photographers using
manual focus. These properties clearly inspire spontaneous
face expression capture for affective studies (e.g., building
spontaneous face expression databases).
Fig. 7.
Samsung Galaxy K Zoom.
Touch screen: The increasing number of people using touch-screen mobile phones (as shown in Fig. 8) raises the question
of whether touch behaviors reflect players’ emotional states.
Recent studies in psychology literature [115] have shown
that touch behavior may convey not only the valence of an
emotion but also the type of emotion (e.g. happy vs. upset).
Unfortunately, these studies have been carried out mainly
in a social context (person-person communication) and only
through acted scenarios.
Fig. 8.
Touch-screen mobile phone.
[111] explored the possibility of using this modality to
capture a player’s emotional state in a naturalistic setting of
touch-based computer games. The findings could be extended
to other application areas where touch-based devices are used.
Using stroke behavior to discriminate between a set of affective states in the context of touch-based gaming devices is very promising, and it paves the way for further explorations with respect to not only different contexts but also different types of tactile behavior. Further studies are needed in a
variety of contexts to establish a better understanding of this
relationship and identify if, and how, these models could be
generalized over different types of tactile behavior, activity,
context and personality traits.
Smartphone usage log: Since smartphone usage has become a natural part of human life, some works [43], [116]
have shown the power of application logs for affect detection.
Smartphone usage data can indicate personality traits using
machine learning approaches to extract features. It is also
promising to study relationships between users and personalities, by building social networks with the rich contextual
information available in applications usage, call and SMS logs.
In addition, analyzing other modalities such as accelerometers
and GPS logs remains a topic of mobile affective computing.
B. Mood and emotion detection
After introducing the powerful affect sensing ability of
smart devices, we summarize the state-of-the-art affective
computing research using smart devices. This part will focus
on discussing emotion and mood detection.
First, we need to clarify three terms that are closely intertwined: affect, emotions, and moods. Affect is a generic term
that covers a broad range of feelings that people experience.
It’s an umbrella concept that encompasses both emotions and
moods [117]. Emotions are intense feelings that are directed
at someone or something [118]. Moods are feelings that tend
to be less intense than emotions and that often (though not
always) lack a contextual stimulus. Most experts believe that
emotions are more fleeting than moods [119]. For example,
if someone is rude to you, you will feel angry. That intense
feeling of anger probably goes fairly quickly. When you are
in a bad mood, though, you can feel bad for several hours.
LiKamWa et al. [43] designed a smartphone software system for mood detection, namely MoodScope, which infers the mood of its user based on how the smartphone is used. Unlike smartphone sensors that measure acceleration, light, and other physical properties, MoodScope is a "sensor" that measures the mental state of the user and provides mood as an important input to context-aware computing. They
found that smartphone usage correlates well with the users’
moods. Users use different applications and communicate with
different people depending on their moods. Using only six
pieces of usage information, namely, SMS, email, phone call,
application usage, web browsing, and location, they could
build statistical usage models to estimate moods.
In this work, they developed an iOS application that allowed
users to report their moods conveniently. Fig. 9(a) shows
the primary GUI of the MoodScope Application. They also
allowed users to see their previous inputs through a calendar
mode, shown in Fig. 9(b), and also through a chart mode.
They created a logger to collect a participant’s smartphone interaction to link with the collected moods. The logger captured
user behaviour by using daemons operating in the background,
requiring no user interaction. The data were archived nightly
to a server over a cell data or Wi-Fi connection. The authors
divided the collected data into two categories: 1) social interaction records, in that phone calls, text messages (SMS) and
emails signal changes in social interactions; 2) routine activity
records, in that patterns in browser history, application usage
and location history as coarse indicators of routine activity.
Then they used a least-squares multiple linear regression to
perform the modeling, simply applying the regression to the
usage feature table, labeled by the daily averages of mood.
Sequential Forward Selection (SFS) [120] was used to choose
a subset of relevant features to accelerate the learning process.
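A minimal sketch of this modeling step is shown below, assuming a table of daily usage features and daily average mood scores; the feature names and synthetic data are illustrative, not MoodScope's actual usage histograms.

# Sketch: least-squares multiple linear regression over daily usage features,
# with sequential forward selection (SFS) to pick a small relevant subset.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(1)
feature_names = ["sms", "email", "calls", "app_usage", "browsing", "location"]
X = rng.random((60, len(feature_names)))            # 60 days of usage features
true_w = np.array([0.8, 0.0, 0.5, 0.0, 0.0, -0.6])  # only some features matter here
y = X @ true_w + rng.normal(0, 0.05, 60)            # daily average mood score

# Forward selection of 3 features, evaluated by cross-validated regression score.
sfs = SequentialFeatureSelector(LinearRegression(), n_features_to_select=3,
                                direction="forward", cv=5)
sfs.fit(X, y)
selected = [n for n, keep in zip(feature_names, sfs.get_support()) if keep]

model = LinearRegression().fit(X[:, sfs.get_support()], y)
print("selected features:", selected)
print("R^2 on training data: %.3f" % model.score(X[:, sfs.get_support()], y))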
Fig. 9. The MoodScope application: (a) primary GUI; (b) mood calendar view.
Rachuri et al. [44] proposed EmotionSense, a mobile sensing platform for social psychology studies based on mobile
phones. Key characteristics include the ability of sensing
individual emotions as well as activities, verbal and proximity
interactions among members of social groups. EmotionSense
gathers participants’ emotions as well as proximity and patterns of conversation by processing the outputs from the sensors of smartphones. EmotionSense system consists of several
sensor monitors, a programmable adaptive framework based
on a logic inference engine, and two declarative databases
(Knowledge Base and Action Base). Each monitor is a thread
that logs events to the Knowledge Base, a repository of all
the information extracted from the on-board sensors of the
phones. The system is based on a declarative specification
(using first order logic predicates) of facts, i.e., the knowledge
extracted by the sensors about user behaviour and his/her
environment (such as the identity of the people involved in
a conversation with him/her); actions, i.e., the set of sensing
activities that the sensors have to perform with different duty
cycles, such as recording voices (if any) for 10 seconds each
minute or extracting the current activity every 2 minutes.
By means of the inference engine and a user-defined set of
rules (a default set is provided), the sensing actions are
periodically generated. The actions that have to be executed
by the system are stored in the Action Base. The Action Base
is periodically accessed by the EmotionSense Manager that
invokes the corresponding monitors according to the actions
scheduled in the Action Base. Users can define sensing tasks
and rules that are interpreted by the inference engine in order
to adapt dynamically the sensing actions performed by the
system.
The authors have presented the design of two novel subsystems for emotion detection and speaker recognition which are
built on a mobile phone platform. These are based on Gaussian
Mixture methods for the detection of emotions and speaker
identities. EmotionSense automatically recognizes speakers
and emotions by means of classifiers running locally on off-the-shelf mobile phones. A programmable adaptive system
is proposed with declarative rules. The rules are expressed
using first order logic predicates and are interpreted by means
of a logic engine. The rules can trigger new sensing actions
(such as starting the sampling of a sensor) or modify existing
ones (such as the sampling interval of a sensor). The results
of a real deployment designed in collaboration with social
psychologists are presented. It is found that the distribution
of the emotions detected through EmotionSense generally
reflected the self-reports by the participants.
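The Gaussian-mixture classification idea can be sketched as follows: one mixture model per class, with the label of a new utterance chosen by the highest average log-likelihood. The 13-dimensional synthetic "MFCC" frames and class names are illustrative assumptions; EmotionSense's actual models are trained on an emotional speech corpus.

# Sketch: one GMM per emotion (or speaker) class, classification by likelihood.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

def fake_mfcc(center, n_frames=200):
    """Stand-in for per-frame MFCC vectors of one class (13 coefficients per frame)."""
    return rng.normal(center, 1.0, size=(n_frames, 13))

train = {"happy": fake_mfcc(0.0), "sad": fake_mfcc(1.5), "neutral": fake_mfcc(-1.5)}

# One GMM per class, fitted on that class's frames.
models = {label: GaussianMixture(n_components=4, covariance_type="diag",
                                 random_state=0).fit(frames)
          for label, frames in train.items()}

def classify(utterance_frames):
    # Average per-frame log-likelihood under each class model; pick the best.
    scores = {label: gmm.score(utterance_frames) for label, gmm in models.items()}
    return max(scores, key=scores.get)

test_utterance = fake_mfcc(1.5, n_frames=80)          # drawn near the "sad" cluster
print("predicted label:", classify(test_utterance))   # expected: sad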
Gao et al. [111] addressed the question of whether touch behaviours reflect players' emotional states. Since an increasing number of people play games on touch-screen mobile phones, the answer would not only be a valuable evaluation indicator for game designers, but could also enable real-time personalization of the game experience. Psychology studies on acted touch
behaviour show the existence of discriminative affective profiles. In this paper, finger-stroke features during gameplay
on an iPod were extracted and their discriminative powers
were analyzed. Machine learning algorithms were used to
build systems for automatically discriminating between four
emotional states (Excited, Relaxed, Frustrated, Bored), two
levels of arousal and two levels of valence. Accuracy reached
between 69% and 77% for the four emotional states, and
higher results (89%) were obtained for discriminating between
two levels of arousal and two levels of valence.
The iPhone game Fruit Ninja was selected for this study. Fruit Ninja is an action game in which players swipe their fingers across the screen to slash and splatter fruit, simulating a ninja warrior. Fruit Ninja has been a phenomenon in the app store with large sales, and has inspired the creation of many similar drawing-based games. As Fruit Ninja is not an open-source app, an open-source variation of it (called Samurai Fruit), developed by Ansca Mobile with the Corona SDK, was used instead. In order to elicit different kinds of emotions from players and to be able to capture players' touch behaviour, a few modifications to Ansca's game were implemented.
The developed New Samurai Fruit game software captures
and stores players’ finger-stroke behaviour during gameplay:
the coordinates of each point of a stroke, the contact area of the
finger at each point and the time duration of a stroke, i.e. the
time occurring from the moment the finger touches the display
to the moment the finger is lifted. This information is directly
gathered by using tracking functions of the Corona SDK. The
contact area is used here as a measure of the pressure exerted by the participants, as the device does not allow pressure to be measured directly (this is referred to as the pressure feature). In order to take
into consideration physical differences between individuals,
i.e. the dimension of the tip of a finger, a baseline pressure
area is computed before the game starts. Each participant is
asked to perform a stroke. The average of the contact area of
each point of this stroke is used as a pressure baseline for that
participant. The pressure values collected during the game are
hence normalized by subtracting from them the participant’s
baseline value.
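A minimal sketch of this baseline normalization, under our reading of the procedure, is given below; the contact-area values are invented.

# Sketch of the pressure-baseline normalization described above
# (our reading of the procedure; the values are illustrative).
import numpy as np

def pressure_baseline(calibration_stroke_areas: np.ndarray) -> float:
    """Average finger contact area over the single calibration stroke."""
    return float(np.mean(calibration_stroke_areas))

def normalize_pressure(stroke_areas: np.ndarray, baseline: float) -> np.ndarray:
    """Subtract the participant's baseline from in-game contact areas."""
    return stroke_areas - baseline

baseline = pressure_baseline(np.array([410.0, 425.0, 418.0]))  # calibration stroke
in_game = np.array([450.0, 470.0, 430.0])                      # areas during play
print(normalize_pressure(in_game, baseline))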
The authors introduced and explained the experiment process, the iPod touch device and the game rules of New
Samurai Fruit to participants and answered their questions.
The participants were then asked to play the game and get
familiar with it. After the training session, the participants
were asked to play and try to complete 20 consecutive levels
of the game. As a form of motivation, they were informed that the participant with the highest final score would be rewarded with an iTunes gift card worth 25 pounds. The gameplay was video recorded. In order to decide the ground truth (i.e., the kind of emotional state associated with each level), participants were first asked to fill out a self-assessment questionnaire after every game level. To reduce the effect of the game results on the labelling, the Cued Recall Brief approach [121] was used. At the end of the 20 levels, the player was shown a recorded video of the gameplay s/he had just carried out. While the video was playing, s/he was encouraged to recall his/her thoughts and emotional states and, if necessary, relabel the levels.
The results of visual inspection and feature extraction showed that finger-stroke behaviour allows for the discrimination of the four affective states they investigated, as well as for the discrimination between two levels of arousal and between two levels of valence. The results showed very high performance on both cross validations and on unseen data. To investigate whether person-independent emotion recognition models could reach similar results, three modelling algorithms were selected: Discriminant Analysis (DA), Artificial Neural Network (ANN) with Back Propagation and Support Vector Machine (SVM) classifiers. The last two learning algorithms were selected for their popularity and their ability to generalize. The models for arousal produced the best correct recognition results, ranging from 86.7% (kernel SVM) to 88.7% (linear SVM). In the case of valence, the performance ranged from 82% (linear SVM) to 86% (kernel SVM). The results for the discrimination of the four emotion labels were also well above chance level, ranging from 69% (kernel SVM) to 77% (linear SVM).
Overall, this work investigated the possibility of using stroke behaviour to discriminate between a set of affective states in the context of touch-based game devices. The results are very promising and pave the way for further explorations with respect to not only different contexts but also different types of tactile behaviour. Further studies are needed in a variety of contexts to establish a better understanding of this relationship and to identify if, and how, these models could be generalized over different types of tactile behaviour, activity, context and personality traits.
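As an illustration of the kind of evaluation summarized above, the sketch below trains linear and RBF-kernel SVMs on synthetic finger-stroke features and scores them with cross validation; it is not the authors' code, and the feature set and labels are invented.

# Illustrative sketch of the evaluation style summarized above: linear vs. RBF
# SVMs on finger-stroke features, scored with cross-validation (data invented).
import numpy as np
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
X = rng.random((120, 8))                 # stroke length, speed, pressure, duration, ...
y_arousal = rng.integers(0, 2, size=120) # two levels of arousal (hypothetical labels)

for name, clf in [("linear SVM", SVC(kernel="linear")),
                  ("kernel SVM", SVC(kernel="rbf"))]:
    pipe = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(pipe, X, y_arousal, cv=5)
    print(name, "mean accuracy:", scores.mean())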
Lee et al. [122] proposed an approach to recognize emotions of the user by inconspicuously collecting and analyzing user-generated data from different types of sensors on the smartphone. To achieve this, they adopted a machine learning approach to gather, analyze and classify device usage patterns, and developed a social network service client for Android smartphones which unobtrusively finds various behavioural patterns and the current context of users.
A social networking service (SNS) like Twitter, Facebook or Google+ is an online service platform which aims at building and managing social relationships between people. On an SNS, users represent themselves in various ways, such as profiles, pictures, or text messages, and interact with other people including their acquaintances. By using various SNS applications on mobile devices, users can furthermore share ideas, interests or activities at any time and anywhere with their friends. An SNS user often expresses her/his feeling or emotional state directly with emoticons or indirectly with written text, whereby their followers respond to her/his message more actively or even empathize with them; in psychology, this kind of phenomenon is called emotional contagion. That is, communication between users is newly induced or enriched by the sharing of emotion. The authors defined these kinds of communicational activities as affective social communication, as depicted in Fig. 10.
Fig. 10. Diagram of Affective Social Communication.
However, some SNS users hardly show what they feel; therefore their friends cannot sense or react to their emotional states appropriately. This is probably because they are unfamiliar with the expression of emotion or are not sufficiently aware of their own emotions. A possible solution to this problem is adopting emotion recognition technologies, which have been studied extensively by the affective computing research community. Existing emotion recognition technologies can be divided into three major categories depending on what kind of data is analyzed to recognize human emotion: physiological signals, facial expressions, or voice. Physiological emotion recognition shows acceptable performance but has some critical weaknesses that prevent its widespread use: such methods are obtrusive to users and need special equipment or devices (e.g., a skin conductance sensor).
To perceive the current emotion of the user, they gathered various types of sensor data from the smartphone. These sensor data can be categorized into two types, user behaviour and user context, and were collected while the user used a certain application on her/his smartphone. As a proof of concept, they developed an Android Twitter client (AT Client) which collects sensor data whenever the user sends a text message (i.e., a tweet) to Twitter. Through a pilot study over two weeks, they collected a dataset of 314 records including self-reported emotional states of the user, and utilized it to build a Bayesian Network classifier, a powerful probabilistic classification algorithm based on Bayes' theorem. Through repetitive experiments including preprocessing, feature selection, and inference (i.e., supervised learning), the classifier can classify each tweet written by the user into 7 types with a satisfactory accuracy of 67.52% on average. The classifiable types of emotions include Ekman's six basic emotions [123] and one neutral state.
To determine the current emotion of users, they adopted a machine learning approach consisting of data collection, data analysis, and classification processes. For these tasks, they used a popular machine learning software toolkit named Weka [124]. The data collection process gathered various sensor data from the smartphone when a participant used AT Client installed on their device. In more detail, if a participant felt a specific emotion at a certain moment in their everyday life, they would write a short text message and were asked to report their current emotion via AT Client. Meanwhile, AT Client collected various sensor data during this period.
It is also possible that some users write a tweet without any emotion; this emotional state might be neutral, and they also took such cases as training data, because the neutral emotion itself is a frequently observed state in emotion recognition [125]. From the gathered sensor data, they extracted 14 features which seem to have potential correlations with the emotion. All of these features are formalized in the attribute-relation file format (ARFF) for Weka.
They then analyzed the collected training data using Weka. As a preprocessing step, features with continuous numerical values, such as typing speed, were discretized, and all features were ranked by measuring the information gain with respect to each class. Through repetitive experiments using 10-fold cross validation, they chose the Bayesian Network classifier as the inference model for their system, because it showed the highest classification accuracy among machine learning classifiers such as Naive Bayes, decision trees, or neural networks. Fig. 11 shows a sample Bayesian Network.
Fig. 11. Sample Bayesian Network.
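A rough sketch of this kind of Weka-style pipeline is shown below: continuous features are discretized, ranked by mutual information (closely related to the information gain ranking used in the study), the ten most informative features are kept, and a classifier is evaluated with 10-fold cross validation. A categorical naive Bayes model stands in for the Bayesian Network classifier here, and all data are synthetic.

# Sketch of a Weka-style pipeline: discretize, rank by information gain
# (approximated with mutual information), select top features, 10-fold CV.
# A naive Bayes classifier stands in for the Bayesian Network used in the paper.
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.feature_selection import mutual_info_classif
from sklearn.naive_bayes import CategoricalNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.random((314, 14))              # 14 hypothetical per-tweet features
y = rng.integers(0, 7, size=314)       # 7 emotion classes (6 basic + neutral)

# Preprocess: discretize continuous features into 5 ordinal bins.
X_disc = KBinsDiscretizer(n_bins=5, encode="ordinal",
                          strategy="uniform").fit_transform(X).astype(int)

# Rank features and keep the 10 most informative ones.
gain = mutual_info_classif(X_disc, y, discrete_features=True, random_state=0)
top10 = np.argsort(gain)[::-1][:10]

# Categorical naive Bayes as a stand-in for the Bayesian Network classifier.
clf = CategoricalNB(min_categories=5)
scores = cross_val_score(clf, X_disc[:, top10], y, cv=10)
print("10-fold cross-validation accuracy:", scores.mean())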
Finally, their inference model was continuously updated with unknown real-world data, which it classified into the 7 emotional states (i.e., the learning and inference step).
Through a pilot study over two weeks, they gathered a real-world dataset of 314 records from a participant and utilized it to validate the emotion recognition approach by measuring classification accuracy. One subject from the research group (a male in his 30s) was recruited for the pilot study; he wrote tweets and indicated his emotional state whenever he felt a certain emotion in his daily life.
By evaluating all features with the information gain attribute evaluation algorithm, they selected 10 features as the primary attributes of the training data for the Bayesian Network classifier. The feature with the highest correlation to emotions was typing speed; in addition, the length of the inputted text, shaking of the device, and user location were also important features for emotion recognition.
The average classification accuracy is 67.52% for 7 emotions. For emotions such as happiness, surprise, and neutral, the approach performed appreciably better than chance (i.e., 50%), and in the case of anger and disgust, it also outperformed random classification. Classification accuracy is irregular across emotion types, but they found a general correlation between the number of observation cases and classification accuracy. This finding means that the low classification accuracy for sadness and fear may be improved with additional data collection.
Petersen et al. [126] rendered the activations in a 3D brain model on a smartphone. They combined a wireless EEG headset with a smartphone to capture brain imaging data reflecting everyday social behaviour in a mobile context. They applied a Bayesian approach to reconstruct the neural sources, demonstrating the ability to distinguish among emotional responses reflected in different scalp potentials when viewing pleasant and unpleasant pictures compared to neutral content. This work may not only facilitate differentiation of emotional responses but also provide an intuitive interface for touch-based interaction, allowing for both modelling the mental state of users and providing a basis for novel bio-feedback applications.
The setup was based on a portable wireless Emotiv Research Edition neuroheadset (http://emotiv.com) which transmits the EEG and control data to a receiver USB dongle, originally intended for a Windows PC version of the Emotiv research edition SDK. They connected the wireless receiver dongle to a USB port on a Nokia N900 smartphone with Maemo 5 OS. Running in USB host mode, they decrypted the raw binary EEG data transmitted from the wireless headset, and in order to synchronize the stimuli with the data they timestamped the first and last packets arriving at the beginning and end of the EEG recording. Eight male volunteers from the Technical University of Denmark, between the ages of 26 and 53, participated in the experiment.
Fig. 12. The wireless neuroheadset transmits the brain imaging data via a
receiver USB dongle connected directly to a Nokia N900 smartphone.
Fig. 12 shows the smartphone-based interface. Applying
a Bayesian approach to reconstruct the underlying neural
sources may thus provide a differentiation of emotional responses based on the raw EEG data captured online in a
mobile context, as the current implementation is able to
visualize the activations on a smartphone with a latency of
150ms. The early and late ERP components are not limited to
the stark emotional contrasts characterizing images selected
from the IAPS collection. Whether they read a word with
affective connotations, came across something similar in an
image or recognized from the facial expression that somebody
looks sad, the electrophysiological patterns in the brain seem to
suggest that the underlying emotional processes might be the
same. The ability to continuously capture these patterns by
integrating wireless EEG sets with smartphones for online
processing of brain imaging data may offer completely new
opportunities for modelling the mental state of users in real life
scenarios as well as providing a basis for novel bio-feedback
applications.
TABLE III
MOBILE AFFECT DETECTION.
References | Data | Sensor monitors | Modeling technique and classifier | Participants
LiKamWa et al. [43] | SMS, email, phone call, application usage, web browsing, and location | GPS | Multiple linear regression, sequential forward selection | 32
Rachuri et al. [44] | Location, voice records, sensing data | GPS, accelerometer, Bluetooth, microphone | Gaussian Mixture Model classifier, Hidden Markov Models, Maximum A Posteriori estimation | 18
Gao et al. [111] | Touch pressure, duration, etc. | Touch panel | Discriminant Analysis, Artificial Neural Network with Back Propagation, Support Vector Machine | 15
Lee et al. [122] | Touch behaviour, shake, illuminance, text, location, time, weather | Touch panel, GPS, accelerometer | Bayesian Network classifier | 314-record real-world dataset from 1 participant
Petersen et al. [126] | EEG data | Wireless EEG headset, touch panel | Bayesian EM approach | 8
Kim et al. [113] | Movement, position and application usage | Accelerometer, touch panel, gyroscope, application logs | Naive Bayesian, Bayesian network, simple logistic, best-first decision tree, C4.5 decision tree and naive Bayesian decision tree | N/A

Kim et al. [113] investigated human behaviours related to the touch interface on a smartphone as a way to understand users' emotional states. As modern smartphones have various embedded sensors such as the accelerometer and gyroscope,
they aimed to utilize data from these embedded sensors for
recognizing human emotion and further finding emotional
preferences for smartphone applications. They collected 12
attributes from 3 sensors during users’ touch behaviours, and
recognized seven basic emotions with off-line analysis. Finally,
they generated a preference matrix of applications by calculating the difference between the prior and posterior emotional states around application usage. The pilot study showed an average F1 score of 0.57 (and 0.82 with decision-tree-based methods) over 455 cases of touch behaviour. They
discovered that a simple touching behaviour has a potential
for recognizing users’ emotional states.
This paper proposed an emotion recognition framework for
smartphone applications and the analysis of an experimental
study. The framework consists of an emotion recognition
process and an emotional preference learning algorithm which learns the difference between two emotional states: the prior and the posterior emotional state for each entity in a mobile device, such as downloaded applications, media contents and contacts of people. Once the entities' emotional preferences have been learned, they can recommend smartphone applications, media contents and mobile services that fit the user's current emotional state. From the emotion recognition process, they tracked the temporal changes of the user's emotional states.
The proposed emotion based framework has a potential to
help users to experience emotional closeness and personalized
recommendations.
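To make the preference-learning step concrete, the toy sketch below (ours, not the authors' implementation) scores each application by the average change from the prior to the posterior emotional state; the valence mapping and usage log are invented for illustration.

# Minimal sketch of the preference-learning idea: score each application by the
# average change from prior to posterior emotional state (values are invented;
# here emotions are mapped onto a single valence score for simplicity).
from collections import defaultdict

valence = {"happiness": 1.0, "neutral": 0.0, "sadness": -1.0, "anger": -1.0}

# (app, emotion before use, emotion after use) -- hypothetical usage log
log = [
    ("music_player", "neutral", "happiness"),
    ("email", "neutral", "anger"),
    ("music_player", "sadness", "neutral"),
]

deltas = defaultdict(list)
for app, prior, posterior in log:
    deltas[app].append(valence[posterior] - valence[prior])

preference = {app: sum(d) / len(d) for app, d in deltas.items()}
print(preference)   # apps with positive scores would be recommended first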
Fig. 13 shows the overall process, which consists of monitoring, recognition and emotional preference learning modules. The monitoring module collects the device's sensory data, such as touch and accelerometer readings, together with their occurrence times. The emotion recognition module processes the collected data to infer the human's emotional state. When necessary, the emotion recognition module can use additional attributes such as memory, interaction and moods to enhance its accuracy and performance. As shown in the bottom part of Fig. 13, the emotional preference learning module is responsible for building an emotional preference matrix for the smartphone applications (and/or their embedded objects) by learning the difference of the quantified emotional state between prior and posterior behaviours.
Fig. 13. The Conceptual Process of Emotion Recognition, Generation and Emotional Preference Learning.
People have their own preferences. To discover them, many technologies depend on user- or item-based stochastic approaches, but most of them ignore the users' emotional factor. Using the emotion recognition module, they gave each application and object an emotional preference by learning the differences of the users' emotional state during the use of the application. Smartphone users interact with their devices through sequential or non-sequential activities such as dragging, flicking, tapping, executing applications, viewing contents and so on.
For training the model of the user's emotional state, they periodically collected emotion feedback. They tried to minimize noise during the experiment: because the user's movement and position can influence the accelerometer or gyroscope, they asked the subject to sit and keep the same position whenever he performed an action with the smartphone. With the emotion recognition module, they found each application's emotional preference by learning the differences of the emotion attributes.
Although the experimental results showed comparatively low performance, they demonstrated the possibility of a touch-based emotion recognition method with wide applicability, since modern smart devices depend on touch interaction. They argued that the recognition performance could be enhanced by additional data collection, and left remedies for practical issues to future work. Collecting self-reported emotion data is one of the biggest challenges for this research: some emotions such as anger, fear and surprise do not occur frequently in real-life situations and are hard to capture during experiments.
In summary, this work proposed emotion-recognition-based smartphone application preference learning. From the
experiment, they discovered that a simple touching behaviour
can be used for recognizing smartphone users’ emotional
state, and reducing the number of emotions can improve the
recognition accuracy. They also found that the window size
for the recognition depended on the type of emotions and
classification methods.
V. CHALLENGES AND OPPORTUNITIES
Unobtrusive affect detection requires deploying sensors on commonly used devices, and mobile affective computing takes advantage of the sensors embedded in smart devices. Even though researchers have put in a great deal of effort, we have only touched the tip of the iceberg. Although the accuracy of some works has reached a relatively high level, we are still very far from practical usefulness. This section highlights the challenges of MAC and discusses some research potentials.
A. Challenges
The existing spontaneous affective data were collected in
the following scenarios: human-human conversation, HCI, and
use of a video kiosk. Human-human conversation scenarios
include face-to-face interviews [17], [127]. HCI scenarios include computer-based dialogue systems [21], [79]. In the video
kiosk settings, the subjects’ affective reactions are recorded
while the subjects are watching emotion-inducing videos [47],
[128], [129]. In most of the existing databases, discrete
emotion categories are used as the emotion descriptors. The
labels of prototypical emotions are often used, especially in
the databases of deliberate affective behaviour. In databases
of spontaneous affective behaviour, coarse affective states
like positive versus negative, dimensional descriptions in the
evaluation-activation space and some application-dependent
affective states are usually used as the data labels. Currently,
several large audio, visual, and audiovisual sets of human
spontaneous affective behaviour have been collected, some of
which are released for public use.
Friendly application design is also challenging for mobile
emotion detection. Some works (e.g., [130]) have designed
some interesting applications (Fig. 9). The friendly design
can make users feel more comfortable with affective feedback collection. The arising problem is self-interference: such a design could itself have some effect on the user's affect. It is non-trivial to evaluate this self-interference and to compensate for it in the results. Besides, the widespread use of smart devices may change users' social behaviour, which affects traditional emotional social relationships.
The lack of a coordination mechanism for affective parameters under multi-modal conditions considerably limits affective understanding and affect prompts. The amalgamation of different channels is not just their combination, but requires finding the mutual relations among all channel information. These mutual relations could enable better integration of the different channels during interaction phases, for both recognition/understanding and information generation.
So far, research has only scratched the surface of affective computing problems. The conclusions either propose a high correlation between the collected data and affective states or show reasonable affect detection accuracy in studies with very few participants. All of this gives us hope and inspires more and more talent to be devoted to solving the remaining problems.
The primary challenge is ground truth establishment. To our knowledge, researchers recruit participants to fill in questionnaires to obtain ground truth. Alternatively, some appropriate databases (audio, visual expressions) can be used to train the classifiers. However, most databases are built by recording the emotional expressions of actors, and psychological research suggests that deliberate emotional expressions differ from spontaneously occurring ones [15], [16]. Establishing spontaneous emotion expression databases is quite expensive and inefficient, in that the raw data have to be labelled manually. Nevertheless, researchers have overcome these difficulties to build large spontaneous affective data sets.
B. Potential research
Smart devices have enough power and popularity to offer ubiquitous affect detection, in that they track people's everyday lives and can constantly collect affective data. Through the mobile application markets (e.g., Apple Store, Google Play), we could deploy affect detection applications to an almost unlimited number of users. This enables individual affective services, in which an application learns an individual's mood and emotion patterns and provides customized affect detection. In this way, we can keep collecting people's spontaneous affective data in their daily lives. This would eventually build scalable, customized spontaneous affective databases efficiently and effectively, which are helpful for extracting common affective features of humans and for finding distinguishable fingerprints among different users.
In addition, lots of new HCI and smart wearable devices
have gained popularity, so the affective fingerprints
(e.g., voice, facial expression, gesture, heart rate and so on)
could be fully investigated by combining multiple wearable
devices. Therefore, designing an affective eco-system that
coordinates all devices for data fusion is very promising
to boost the affect detection performance and explore the
possibility of further affective computing techniques.
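As a purely illustrative sketch of what such an eco-system might do at the data level (none of this comes from a concrete system; the device names are hypothetical), the snippet below resamples streams from different wearables onto a common clock and concatenates them into fused feature vectors for an affect model.

# Hedged sketch of feature-level fusion across wearables: resample each device's
# stream onto a common clock and concatenate features (device names hypothetical).
import numpy as np

def resample(timestamps: np.ndarray, values: np.ndarray,
             grid: np.ndarray) -> np.ndarray:
    """Linear interpolation of a sensor stream onto a shared time grid."""
    return np.interp(grid, timestamps, values)

grid = np.arange(0.0, 10.0, 1.0)                       # one fused sample per second
watch_hr = resample(np.array([0, 4, 9.0]), np.array([70, 80, 75.0]), grid)
glass_blink = resample(np.array([0, 5, 9.0]), np.array([0.2, 0.5, 0.3]), grid)
phone_motion = resample(np.array([0, 3, 9.0]), np.array([0.1, 0.9, 0.4]), grid)

fused = np.column_stack([watch_hr, glass_blink, phone_motion])
print(fused.shape)   # each row is a multi-device feature vector for affect models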
Moreover, the emergence of the World Wide Web, the Internet and new technologies (e.g., mobile augmented reality [131], [132]) has changed human life styles. It is necessary to discover new emotional features, which may exist in
application logs, smart device usage patterns, locations, order
histories, etc. There is a great need to thoroughly monitor
and investigate the new emotional expression fingerprints. In
other words, establishing new affective databases in terms of
new affective fingerprints should be a very significant research
topic.
VI. CONCLUSION
Research on human affect detection has witnessed a good deal of progress. For a long time, affective computing research relied on well-controlled experiments in the lab. The available data sets are either deliberate emotional expressions from actors or small-scale spontaneous expressions established at substantial cost. Obtrusive data collection methods in the lab also restrict the effectiveness of the research.
Today, new technologies bring new opportunities for affect detection in an unobtrusive and smart way. A number of promising methods for unobtrusive tool-based (webcam, keystroke, mouse) and smartphone-based detection of human affect have been proposed. This paper has focused on summarising and discussing these novel approaches to the analysis of human affect, and the most important issues yet to be addressed in the field include the following:
• Building scalable, customized databases, from which common affect features and distinguishable affect fingerprints of humans could be extracted,
• Devising an affective ecosystem that coordinates different
smart wearable devices, which takes into consideration
the most comprehensive affective data for boosting affect
detection accuracy,
• Investigating potential new affect-related features due to
the changing life styles.
Since the complexity of these issues, concerned with the interpretation of human behaviour at a very deep level, is tremendous and requires highly interdisciplinary collaboration, we believe that true breakthroughs in this field can be achieved by establishing an interdisciplinary international program directed toward computer understanding of human behaviours. Research on unobtrusive and mobile affective computing can help advance research in multiple related fields including education, psychology and social science.
REFERENCES
[1] R. W. Picard, Affective Computing. Cambridge, MA, USA: MIT Press,
1997.
[2] C. Darwin, The expression of the emotions in man and animals. Oxford
University Press, 1998.
[3] W. James, What is an Emotion? Wilder Publications, 2007.
[4] W. D. Taylor and S. K. Damarin, “The second self: Computers and
the human spirit,” Educational Technology Research and Development,
vol. 33, no. 3, pp. 228–230, 1985.
[5] J. C. Hager, P. Ekman, and W. V. Friesen, “Facial action coding
system,” Salt Lake City, UT: A Human Face, 2002.
[6] R. Cowie, E. Douglas-Cowie, and C. Cox, “Beyond emotion
archetypes: Databases for emotion modelling using neural networks,”
Neural networks, vol. 18, no. 4, pp. 371–388, 2005.
[7] R. Banse and K. R. Scherer, “Acoustic profiles in vocal emotion
expression.” Journal of personality and social psychology, vol. 70,
no. 3, p. 614, 1996.
[8] L. S.-H. Chen, “Joint processing of audio-visual information for the
recognition of emotional expressions in human-computer interaction,”
Ph.D. dissertation, Citeseer, 2000.
[9] R. Gross, “Face databases,” in Handbook of Face Recognition.
Springer, 2005, pp. 301–327.
[10] T. Kanade, J. F. Cohn, and Y. Tian, “Comprehensive database for facial
expression analysis,” in Automatic Face and Gesture Recognition, 2000.
Proceedings. Fourth IEEE International Conference on. IEEE, 2000,
pp. 46–53.
[11] H. Gunes and M. Piccardi, “A bimodal face and body gesture database
for automatic analysis of human nonverbal affective behavior,” in
Pattern Recognition, 2006. ICPR 2006. 18th International Conference
on, vol. 1. IEEE, 2006, pp. 1148–1153.
[12] M. Pantic, M. Valstar, R. Rademaker, and L. Maat, “Web-based
database for facial expression analysis,” in Multimedia and Expo, 2005.
ICME 2005. IEEE International Conference on. IEEE, 2005, pp. 5–pp.
[13] L. Yin, X. Wei, Y. Sun, J. Wang, and M. J. Rosato, “A 3d facial
expression database for facial behavior research,” in Automatic face
and gesture recognition, 2006. FGR 2006. 7th international conference
on. IEEE, 2006, pp. 211–216.
[14] J. J. Gross and R. W. Levenson, “Emotion elicitation using films,”
Cognition & Emotion, vol. 9, no. 1, pp. 87–108, 1995.
[15] C. Whissell, “The dictionary of affect in language.” Emotion: Theory,
Research, and Experience. The Measurement of Emotions, vol. 4, pp.
113–131, 1989.
[16] P. Ekman and E. L. Rosenberg, What the face reveals: Basic and
applied studies of spontaneous expression using the Facial Action
Coding System (FACS). Oxford University Press, 1997.
[17] M. S. Bartlett, G. Littlewort, M. Frank, C. Lainscsek, I. Fasel, and
J. Movellan, “Recognizing facial expression: machine learning and
application to spontaneous behavior,” in Computer Vision and Pattern
Recognition, 2005. CVPR 2005. IEEE Computer Society Conference
on, vol. 2. IEEE, 2005, pp. 568–573.
[18] A. B. Ashraf, S. Lucey, J. F. Cohn, T. Chen, Z. Ambadar, K. M.
Prkachin, and P. E. Solomon, “The painful face–pain expression recognition using active appearance models,” Image and Vision Computing,
vol. 27, no. 12, pp. 1788–1796, 2009.
[19] J. F. Cohn and K. L. Schmidt, “The timing of facial motion in posed and
spontaneous smiles,” International Journal of Wavelets, Multiresolution
and Information Processing, vol. 2, no. 02, pp. 121–132, 2004.
[20] M. F. Valstar, M. Pantic, Z. Ambadar, and J. F. Cohn, “Spontaneous
vs. posed facial behavior: automatic analysis of brow actions,” in Proceedings of the 8th international conference on Multimodal interfaces.
ACM, 2006, pp. 162–170.
[21] C. M. Lee and S. S. Narayanan, “Toward detecting emotions in spoken
dialogs,” Speech and Audio Processing, IEEE Transactions on, vol. 13,
no. 2, pp. 293–303, 2005.
[22] A. Batliner, K. Fischer, R. Huber, J. Spilker, and E. Nöth, “How to
find trouble in communication,” Speech communication, vol. 40, no. 1,
pp. 117–143, 2003.
[23] S. Greengard, “Following the crowd,” Communications of the ACM,
vol. 54, no. 2, pp. 20–22, 2011.
[24] R. Morris, “Crowdsourcing workshop: the emergence of affective
crowdsourcing,” in Proceedings of the 2011 annual conference extended abstracts on Human factors in computing systems. ACM,
2011.
[25] M. Soleymani and M. Larson, “Crowdsourcing for affective annotation
of video: Development of a viewer-reported boredom corpus,” in
Proceedings of the ACM SIGIR 2010 workshop on crowdsourcing for
search evaluation (CSE 2010), 2010, pp. 4–8.
[26] S. Klettner, H. Huang, M. Schmidt, and G. Gartner, “Crowdsourcing
affective responses to space,” Kartographische Nachrichten-Journal of
Cartography and Geographic Information, vol. 2, pp. 66–73, 2013.
[27] G. Tavares, A. Mourão, and J. Magalhaes, “Crowdsourcing for
affective-interaction in computer games,” in Proceedings of the 2nd
ACM international workshop on Crowdsourcing for multimedia.
ACM, 2013, pp. 7–12.
[28] B. G. Morton, J. A. Speck, E. M. Schmidt, and Y. E. Kim, “Improving
music emotion labeling using human computation,” in Proceedings of
the acm sigkdd workshop on human computation. ACM, 2010, pp.
45–48.
[29] S. D. Kamvar and J. Harris, “We feel fine and searching the emotional
web,” in Proceedings of the fourth ACM international conference on
Web search and data mining. ACM, 2011, pp. 117–126.
[30] “Amazon mechanical turk,” https://www.mturk.com/mturk/welcome.
[31] “Worldwide smartphone user base hits 1 billion,” http://www.cnet.com/
news/worldwide-smartphone-user-base-hits-1-billion/.
[32] J. F. Cohn, “Foundations of human computing: facial expression
and emotion,” in Proceedings of the 8th international conference on
Multimodal interfaces. ACM, 2006, pp. 233–238.
[33] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias,
W. Fellenz, and J. G. Taylor, “Emotion recognition in human-computer
interaction,” Signal Processing Magazine, IEEE, vol. 18, no. 1, pp. 32–
80, 2001.
[34] M. Pantic, A. Pentland, A. Nijholt, and T. S. Huang, “Human computing and machine understanding of human behavior: a survey,” in
Artifical Intelligence for Human Computing. Springer, 2007, pp. 47–
71.
[35] A. Pentland, “Socially aware, computation and communication,” Computer, vol. 38, no. 3, pp. 33–40, 2005.
[36] C. Darwin, The expression of the emotions in man and animals.
University of Chicago press, 1965, vol. 526.
[37] B. Parkinson, Ideas and realities of emotion. Psychology Press, 1995.
[38] J. J. Gross, “Emotion regulation: Affective, cognitive, and social
consequences,” Psychophysiology, vol. 39, no. 3, pp. 281–291, 2002.
[39] ——, “The emerging field of emotion regulation: An integrative
review.” Review of general psychology, vol. 2, no. 3, p. 271, 1998.
[40] A. Öhman and J. J. Soares, ““Unconscious anxiety”: phobic responses
to masked stimuli.” Journal of abnormal psychology, vol. 103, no. 2,
p. 231, 1994.
[41] ——, “On the automatic nature of phobic fear: conditioned electrodermal responses to masked fear-relevant stimuli.” Journal of abnormal
psychology, vol. 102, no. 1, p. 121, 1993.
[42] M. S. Clark and I. Brissette, “Two types of relationship closeness and
their influence on people’s emotional lives.” 2003.
[43] R. LiKamWa, Y. Liu, N. D. Lane, and L. Zhong, “Moodscope: Building
a mood sensor from smartphone usage patterns,” in Proceeding of the
11th annual international conference on Mobile systems, applications,
and services. ACM, 2013, pp. 389–402.
[44] K. K. Rachuri, M. Musolesi, C. Mascolo, P. J. Rentfrow, C. Longworth,
and A. Aucinas, “Emotionsense: a mobile phones based adaptive
platform for experimental social psychology research,” in Proceedings
of the 12th ACM international conference on Ubiquitous computing.
ACM, 2010, pp. 281–290.
[45] B. Fasel and J. Luettin, “Automatic facial expression analysis: a
survey,” Pattern Recognition, vol. 36, no. 1, pp. 259–275, 2003.
[46] O. Pierre-Yves, “The production and recognition of emotions in speech:
features and algorithms,” International Journal of Human-Computer
Studies, vol. 59, no. 1, pp. 157–183, 2003.
[47] M. Pantic and M. S. Bartlett, “Machine analysis of facial expressions,”
2007.
[48] M. Pantic and L. J. Rothkrantz, “Toward an affect-sensitive multimodal
human-computer interaction,” Proceedings of the IEEE, vol. 91, no. 9,
pp. 1370–1390, 2003.
[49] M. Pantic, N. Sebe, J. F. Cohn, and T. Huang, “Affective multimodal
human-computer interaction,” in Proceedings of the 13th annual ACM
international conference on Multimedia. ACM, 2005, pp. 669–676.
[50] N. Sebe, I. Cohen, and T. S. Huang, “Multimodal emotion recognition,”
Handbook of Pattern Recognition and Computer Vision, vol. 4, pp.
387–419, 2005.
[51] A. Samal and P. A. Iyengar, “Automatic recognition and analysis of
human faces and facial expressions: A survey,” Pattern recognition,
vol. 25, no. 1, pp. 65–77, 1992.
[52] Y.-L. Tian, T. Kanade, and J. F. Cohn, “Facial expression analysis,” in
Handbook of face recognition. Springer, 2005, pp. 247–275.
[53] A. Kleinsmith and N. Bianchi-Berthouze, “Affective body expression
perception and recognition: A survey,” Affective Computing, IEEE
Transactions on, vol. 4, no. 1, pp. 15–33, 2013.
[54] J. Tao and T. Tan, “Affective computing: A review,” in Affective
computing and intelligent interaction. Springer, 2005, pp. 981–995.
[55] Z. Zeng, M. Pantic, G. I. Roisman, and T. S. Huang, “A survey of affect
recognition methods: Audio, visual, and spontaneous expressions,”
Pattern Analysis and Machine Intelligence, IEEE Transactions on,
vol. 31, no. 1, pp. 39–58, 2009.
[56] B. Pang and L. Lee, “Opinion mining and sentiment analysis,” Foundations and trends in information retrieval, vol. 2, no. 1-2, pp. 1–135,
2008.
[57] S. Brave and C. Nass, “Emotion in human-computer interaction,”
The human-computer interaction handbook: fundamentals, evolving
technologies and emerging applications, pp. 81–96, 2003.
[58] K. R. Scherer, T. Bänziger, and E. Roesch, A Blueprint for Affective
Computing: A sourcebook and manual. Oxford University Press, 2010.
[59] D. Gokcay, G. Yildirim, and I. Global, Affective computing and
interaction: Psychological, cognitive, and neuroscientific perspectives.
Information Science Reference, 2011.
[60] R. A. Calvo and S. K. D’Mello, New perspectives on affect and
learning technologies. Springer, 2011, vol. 3.
[61] J. A. Russell, “Core affect and the psychological construction of
emotion.” Psychological review, vol. 110, no. 1, p. 145, 2003.
[62] J. A. Russell, J.-A. Bachorowski, and J.-M. Fernández-Dols, “Facial
and vocal expressions of emotion,” Annual review of psychology,
vol. 54, no. 1, pp. 329–349, 2003.
[63] L. F. Barrett, “Are emotions natural kinds?” Perspectives on psychological science, vol. 1, no. 1, pp. 28–58, 2006.
[64] L. F. Barrett, B. Mesquita, K. N. Ochsner, and J. J. Gross, “The
experience of emotion,” Annual review of psychology, vol. 58, p. 373,
2007.
[65] T. Dalgleish, B. D. Dunn, and D. Mobbs, “Affective neuroscience: Past,
present, and future,” Emotion Review, vol. 1, no. 4, pp. 355–368, 2009.
[66] T. Dalgleish, M. J. Power, and J. Wiley, Handbook of cognition and
emotion. Wiley Online Library, 1999.
[67] M. Lewis, J. M. Haviland-Jones, and L. F. Barrett, Handbook of
emotions. Guilford Press, 2010.
[68] R. J. Davidson, K. R. Scherer, and H. Goldsmith, Handbook of affective
sciences. Oxford University Press, 2003.
[69] M. Minsky, The Society of Mind. New York, NY, USA: Simon &
Schuster, Inc., 1986.
[70] S. D'Mello, R. Picard, and A. Graesser, “Towards an affect-sensitive
autotutor,” IEEE Intelligent Systems, vol. 22, no. 4, pp. 53–61, 2007.
[71] M. S. Bartlett, G. Littlewort, B. Braathen, T. J. Sejnowski, and J. R.
Movellan, “A prototype for automatic recognition of spontaneous facial
actions,” in Advances in neural information processing systems. MIT;
1998, 2003, pp. 1295–1302.
[72] J. F. Cohn, L. I. Reed, Z. Ambadar, J. Xiao, and T. Moriyama,
“Automatic analysis and recognition of brow actions and head motion
in spontaneous facial behavior,” in Systems, Man and Cybernetics, 2004
IEEE International Conference on, vol. 1. IEEE, 2004, pp. 610–616.
[73] A. Kapoor, W. Burleson, and R. W. Picard, “Automatic prediction
of frustration,” International Journal of Human-Computer Studies,
vol. 65, no. 8, pp. 724–736, 2007.
[74] G. C. Littlewort, M. S. Bartlett, and K. Lee, “Faces of pain: automated
measurement of spontaneous facial expressions of genuine and posed
pain,” in Proceedings of the 9th international conference on Multimodal
interfaces. ACM, 2007, pp. 15–21.
[75] I. Arroyo, D. G. Cooper, W. Burleson, B. P. Woolf, K. Muldner, and
R. Christopherson, “Emotion sensors go to school.” in AIED, vol. 200,
2009, pp. 17–24.
[76] L. Devillers, L. Vidrascu, and L. Lamel, “Challenges in real-life
emotion annotation and machine learning based detection,” Neural
Networks, vol. 18, no. 4, pp. 407–422, 2005.
[77] L. Devillers and L. Vidrascu, “Real-life emotions detection with
lexical and paralinguistic cues on human-human call center dialogs.”
in Interspeech, 2006.
[78] B. Schuller, J. Stadermann, and G. Rigoll, “Affect-robust speech
recognition by dynamic emotional adaptation,” in Proc. speech prosody,
2006.
[79] D. J. Litman and K. Forbes-Riley, “Predicting student emotions in
computer-human tutoring dialogues,” in Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics. Association for Computational Linguistics, 2004, p. 351.
[80] B. Schuller, R. J. Villar, G. Rigoll, M. Lang et al., “Meta-classifiers
in acoustic and linguistic feature fusion-based affect recognition,” in
Proc. ICASSP, vol. 5, 2005, pp. 325–328.
[81] R. Fernandez and R. W. Picard, “Modeling drivers speech under stress,”
Speech Communication, vol. 40, no. 1, pp. 145–159, 2003.
[82] S. Mota and R. W. Picard, “Automated posture analysis for detecting
learner’s interest level,” in Computer Vision and Pattern Recognition
Workshop, 2003. CVPRW’03. Conference on, vol. 5. IEEE, 2003, pp.
49–49.
[83] S. D’Mello and A. Graesser, “Automatic detection of learner’s affect
from gross body language,” Applied Artificial Intelligence, vol. 23,
no. 2, pp. 123–150, 2009.
[84] N. Bianchi-Berthouze and C. L. Lisetti, “Modeling multimodal expression of users affective subjective experience,” User Modeling and
User-Adapted Interaction, vol. 12, no. 1, pp. 49–84, 2002.
[85] G. Castellano, M. Mortillaro, A. Camurri, G. Volpe, and K. Scherer,
“Automated analysis of body movement in emotionally expressive
piano performances,” 2008.
[86] J. Hernandez, P. Paredes, A. Roseway, and M. Czerwinski, “Under
pressure: sensing stress of computer users,” in Proceedings of the
32nd annual ACM conference on Human factors in computing systems.
ACM, 2014, pp. 51–60.
[87] C. Epp, M. Lippold, and R. L. Mandryk, “Identifying emotional states
using keystroke dynamics,” in Proceedings of the SIGCHI Conference
on Human Factors in Computing Systems. ACM, 2011, pp. 715–724.
[88] I. Karunaratne, A. S. Atukorale, and H. Perera, “Surveillance of humancomputer interactions: A way forward to detection of users’ psychological distress,” in Humanities, Science and Engineering (CHUSER),
2011 IEEE Colloquium on. IEEE, 2011, pp. 491–496.
[89] P. Khanna and M. Sasikumar, “Recognising emotions from keyboard stroke pattern,” International Journal of Computer Applications,
vol. 11, no. 9, pp. 8975–8887, 2010.
[90] F. Monrose and A. D. Rubin, “Keystroke dynamics as a biometric for
authentication,” Future Generation computer systems, vol. 16, no. 4,
pp. 351–359, 2000.
[91] L. M. Vizer, L. Zhou, and A. Sears, “Automated stress detection using
keystroke and linguistic features: An exploratory study,” International
Journal of Human-Computer Studies, vol. 67, no. 10, pp. 870–886,
2009.
[92] E. Niedermeyer and F. L. da Silva, Electroencephalography: basic principles, clinical applications, and related fields. Lippincott Williams
& Wilkins, 2005.
[93] O. AlZoubi, R. A. Calvo, and R. H. Stevens, “Classification of eeg
for affect recognition: an adaptive approach,” in AI 2009: Advances in
Artificial Intelligence. Springer, 2009, pp. 52–61.
[94] A. Heraz and C. Frasson, “Predicting the three major dimensions of
the learner's emotions from brainwaves.”
[95] A. Bashashati, M. Fatourechi, R. K. Ward, and G. E. Birch, “A survey
of signal processing algorithms in brain–computer interfaces based on
electrical brain signals,” Journal of Neural engineering, vol. 4, no. 2,
p. R32, 2007.
[96] F. Lotte, M. Congedo, A. Lécuyer, F. Lamarche, B. Arnaldi et al.,
“A review of classification algorithms for eeg-based brain–computer
interfaces,” Journal of neural engineering, vol. 4, 2007.
[97] C. O. Alm, D. Roth, and R. Sproat, “Emotions from text: machine
learning for text-based emotion prediction,” in Proceedings of the
conference on Human Language Technology and Empirical Methods
in Natural Language Processing.
Association for Computational
Linguistics, 2005, pp. 579–586.
[98] T. Danisman and A. Alpkocak, “Feeler: Emotion classification of text
using vector space model,” in AISB 2008 Convention Communication,
Interaction and Social Intelligence, vol. 1, 2008, p. 53.
[99] C. Strapparava and R. Mihalcea, “Learning to identify emotions in
text,” in Proceedings of the 2008 ACM symposium on Applied computing. ACM, 2008, pp. 1556–1560.
[100] C. Ma, H. Prendinger, and M. Ishizuka, “A chat system based on emotion estimation from text and embodied conversational messengers,” in
Entertainment Computing-ICEC 2005. Springer, 2005, pp. 535–538.
[101] P. H. Dietz, B. Eidelson, J. Westhues, and S. Bathiche, “A practical
pressure sensitive computer keyboard,” in Proceedings of the 22nd
annual ACM symposium on User interface software and technology.
ACM, 2009, pp. 55–58.
[102] I. J. Roseman, M. S. Spindel, and P. E. Jose, “Appraisals of emotioneliciting events: Testing a theory of discrete emotions.” Journal of
Personality and Social Psychology, vol. 59, no. 5, p. 899, 1990.
[103] I. J. Roseman, “Cognitive determinants of emotion: A structural
theory.” Review of Personality & Social Psychology, 1984.
[104] “Smartphone,” http://en.wikipedia.org/wiki/Smartphone.
[105] “Natural usage of smartphone,” http://www.marketingcharts.com/wp/online/8-in-10-smart-device-owners-use-them-all-the-time-on-vacation-28979/.
[106] K. Dolui, S. Mukherjee, and S. K. Datta, “Smart device sensing
architectures and applications,” in Computer Science and Engineering
Conference (ICSEC), 2013 International. IEEE, 2013, pp. 91–96.
[107] M.-Z. Poh, D. J. McDuff, and R. W. Picard, “Advancements in noncontact, multiparameter physiological measurements using a webcam,”
Biomedical Engineering, IEEE Transactions on, vol. 58, no. 1, pp. 7–
11, 2011.
[108] J. Allen, “Photoplethysmography and its application in clinical physiological measurement,” Physiological measurement, vol. 28, no. 3,
p. R1, 2007.
[109] P. Zimmermann, S. Guttormsen, B. Danuser, and P. Gomez, “Affective
computing–a rationale for measuring mood with mouse and keyboard,”
International journal of occupational safety and ergonomics, vol. 9,
no. 4, pp. 539–551, 2003.
[110] N. Villar, S. Izadi, D. Rosenfeld, H. Benko, J. Helmes, J. Westhues,
S. Hodges, E. Ofek, A. Butler, X. Cao et al., “Mouse 2.0: multi-touch
meets the mouse,” in Proceedings of the 22nd annual ACM symposium
on User interface software and technology. ACM, 2009, pp. 33–42.
[111] Y. Gao, N. Bianchi-Berthouze, and H. Meng, “What does touch tell us
about emotions in touchscreen-based gameplay?” ACM Transactions
on Computer-Human Interaction (TOCHI), vol. 19, no. 4, p. 31, 2012.
[112] “Transforming the impossible to the natural,” https://www.youtube.
com/watch?v=3jmmIc9GvdY.
[113] H.-J. Kim and Y. S. Choi, “Exploring emotional preference for smartphone applications,” in Consumer Communications and Networking
Conference (CCNC), 2012 IEEE. IEEE, 2012, pp. 245–249.
[114] G. Tan, H. Jiang, S. Zhang, Z. Yin, and A.-M. Kermarrec,
“Connectivity-based and anchor-free localization in large-scale 2d/3d
sensor networks,” ACM Transactions on Sensor Networks (TOSN),
vol. 10, no. 1, p. 6, 2013.
[115] M. J. Hertenstein, R. Holmes, M. McCullough, and D. Keltner, “The
communication of emotion via touch.” Emotion, vol. 9, no. 4, p. 566,
2009.
[116] G. Chittaranjan, J. Blom, and D. Gatica-Perez, “Who’s who with bigfive: Analyzing and classifying personality traits with smartphones,” in
Wearable Computers (ISWC), 2011 15th Annual International Symposium on. IEEE, 2011, pp. 29–36.
[117] J. M. George, “State or trait: Effects of positive mood on prosocial
behaviors at work.” Journal of Applied Psychology, vol. 76, no. 2, p.
299, 1991.
[118] N. H. Frijda, “The place of appraisal in emotion,” Cognition &
Emotion, vol. 7, no. 3-4, pp. 357–387, 1993.
[119] P. E. Ekman and R. J. Davidson, The nature of emotion: Fundamental
questions. Oxford University Press, 1994.
[120] A. Jain and D. Zongker, “Feature selection: Evaluation, application, and
small sample performance,” Pattern Analysis and Machine Intelligence,
IEEE Transactions on, vol. 19, no. 2, pp. 153–158, 1997.
[121] T. Bentley, L. Johnston, and K. von Baggo, “Evaluation using cuedrecall debrief to elicit information about a user’s affective experiences,”
in Proceedings of the 17th Australia conference on Computer-Human
Interaction: Citizens Online: Considerations for Today and the Future.
Computer-Human Interaction Special Interest Group (CHISIG) of
Australia, 2005, pp. 1–10.
[122] H. Lee, Y. S. Choi, S. Lee, and I. Park, “Towards unobtrusive
emotion recognition for affective social communication,” in Consumer
Communications and Networking Conference (CCNC), 2012 IEEE.
IEEE, 2012, pp. 260–264.
[123] P. Ekman, “Universals and cultural differences in facial expressions
of emotion.” in Nebraska symposium on motivation. University of
Nebraska Press, 1971.
[124] “Weka 3,” http://www.cs.waikato.ac.nz/ml/weka/.
[125] R. Tato, R. Santos, R. Kompe, and J. M. Pardo, “Emotional space
improves emotion recognition.” in INTERSPEECH, 2002.
[126] M. K. Petersen, C. Stahlhut, A. Stopczynski, J. E. Larsen, and L. K.
Hansen, “Smartphones get emotional: mind reading images and reconstructing the neural sources,” in Affective Computing and Intelligent
Interaction. Springer, 2011, pp. 578–587.
[127] E. Douglas-Cowie, N. Campbell, R. Cowie, and P. Roach, “Emotional
speech: Towards a new generation of databases,” Speech communication, vol. 40, no. 1, pp. 33–60, 2003.
[128] A. J. O’Toole, J. Harms, S. L. Snow, D. R. Hurst, M. R. Pappas,
J. H. Ayyad, and H. Abdi, “A video database of moving faces and
people,” Pattern Analysis and Machine Intelligence, IEEE Transactions
on, vol. 27, no. 5, pp. 812–816, 2005.
[129] N. Sebe, M. S. Lew, Y. Sun, I. Cohen, T. Gevers, and T. S. Huang,
“Authentic facial expression analysis,” Image and Vision Computing,
vol. 25, no. 12, pp. 1856–1863, 2007.
[130] R. LiKamWa, Y. Liu, N. D. Lane, and L. Zhong, “Can your smartphone
infer your mood?” in Proc. PhoneSense Workshop, 2011.
[131] T. Höllerer and S. Feiner, “Mobile augmented reality,” Telegeoinformatics: Location-Based Computing and Services. Taylor and Francis
Books Ltd., London, UK, vol. 21, 2004.
[132] Z. Huang, P. Hui, C. Peylo, and D. Chatzopoulos, “Mobile
augmented reality survey: a bottom-up approach.” CoRR, vol.
abs/1309.4413, 2013. [Online]. Available: http://dblp.uni-trier.de/db/
journals/corr/corr1309.html#HuangHPC13