
Gianluca Stringhini, Christopher Kruegel, Giovanni Vigna
University of California, Santa Barbara
26th ACSAC (December 2010)
Outline
Introduction
 The Popular Social Networks
 Data Collection
 Analysis of Collected Data
 Spam Profile Detection

Introduction

Sites such as Facebook, MySpace, and Twitter are consistently among the top 20 most-viewed web sites on the Internet.

In 2008, 83% of social network users had received at least one unwanted friend request or message.
Introduction

A previous study showed that 45% of
users on a social networking site readily
click on links posted by their “friend”
accounts, even if they do not know that
person in real life.
Security Threat [link]
The Popular Social Networks

Facebook
 the Facebook administrators claim to have
more than 400 million active users all over
the world.
 The default privacy setting was to allow all
people in the same network to view each
other’s profiles.
○ Facebook deprecated geographic networks in
October 2009.
The Popular Social Networks

MySpace
 MySpace pages are public by default.

Twitter
 It is designed as a microblogging platform,
where users send short text messages (i.e.,
tweets) that appear on their friends’ pages.
 Users are identified only by a username and,
optionally, by a real name.
Data Collection

We created 900 profiles on Facebook,
MySpace, and Twitter, 300 on each
platform.

Due to the similarity of these profiles to
honeypots, we call these accounts
honey-profiles.
Honey-Profiles

Facebook
 we joined 16 geographic networks, using a
small number of manually-created accounts.
 For each network, we crawled 2,000
accounts at random, logging names, ages,
and gender.
Honey-Profiles

MySpace
 we crawled 4,000 accounts in total.

Twitter
 the only information required for signing up
is a full name and a profile name.
Collection of Data

We did not send any friend requests, but
accepted all those that were received.

Our scripts ran continuously for 12
months for Facebook (from June 6,
2009 to June 6, 2010), and for 11
months for MySpace and Twitter (from
June 24, 2009 to June 6, 2010).
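
The sketch below illustrates how such a collection script could be structured; the SocialNetworkClient wrapper and its methods are hypothetical placeholders, not the paper's actual code.

```python
# A minimal sketch of the collection loop, assuming a hypothetical
# SocialNetworkClient wrapper; the paper's actual scripts drove each
# site's own interface and are not reproduced here.
import time

class SocialNetworkClient:
    """Hypothetical session wrapper for one honey-profile."""

    def pending_friend_requests(self):
        return []   # placeholder: incoming friend requests

    def accept(self, request):
        pass        # placeholder: accept the request

    def new_messages(self):
        return []   # placeholder: wall posts, tweets, direct messages, ...

def run_honey_profile(client, log, poll_interval=3600):
    """Accept every incoming friend request and log all received activity."""
    while True:
        for request in client.pending_friend_requests():
            client.accept(request)            # never send requests ourselves
            log.append(("friend_request", request))
        for message in client.new_messages():
            log.append(("message", message))  # keep everything for later analysis
        time.sleep(poll_interval)
```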
Analysis of Collected Data
[Figure annotation: looking for "local" friends]
Analysis of Collected Data: Facebook
Analysis of Collected Data: Twitter
Spam Bot Analysis

Displayer
 Bots that only display some spam content on their own profile pages.

Bragger
 Bots that post messages to their own feed.

Poster
 Bots that post spam messages directly on their victims' profiles (e.g., a Facebook wall post).

Whisperer
 Bots that send private spam messages to their victims (e.g., Twitter direct messages).
Spam Bot Analysis
Bot type     Facebook   MySpace   Twitter
Displayer           2         8         0
Bragger           163         0       341
Poster              8         0         0
Whisperer           0         0        20
Spam Bot Analysis

On Facebook, the average lifetime of a
spam account was four days, while on
Twitter, it was 31 days.

During our observation, we noticed that some bots showed higher activity around midnight.
Spam Bot Analysis

We identified two kinds of bot behavior:
 Greedy: 416 bots (include spam content in every message)
 Stealthy: 98 bots (send mostly legitimate-looking messages, with spam only occasionally)

Most spam profiles we observed, both on Facebook and Twitter, sent fewer than 20 messages during their life span.
Spam Bot Analysis

While observing Facebook spammers,
we also noticed that many of them did
not seem to pick victims randomly.

80% of bots we detected on Facebook
used the mobile interface of the site to
send their spam messages.
Spam Profile Detection

FF ratio (R)
 This feature compares the number of friend requests that a user sent to the number of friends she has.
○ Unfortunately, the number of friend requests sent is not public on Facebook and MySpace.
○ On Twitter, R = following / followers
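
A minimal sketch of this feature as it applies to Twitter; the zero-followers guard is an added assumption, not from the slides.

```python
def ff_ratio(following, followers):
    # R = following / followers (the Twitter variant of the FF ratio).
    # The guard for zero followers is an assumption, not from the slides.
    return following / followers if followers else float("inf")
```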
Spam Profile Detection

URL ratio (U)
 This feature detects bots through the presence of URLs in the logged messages.
 U = messages containing URLs / total messages
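
A small sketch of the URL ratio; the slides do not say how URLs were matched, so the regular expression below is an illustrative assumption.

```python
import re

# Simple URL matcher -- an illustrative assumption, not the paper's method.
URL_RE = re.compile(r"https?://\S+|www\.\S+")

def url_ratio(messages):
    """U = messages containing URLs / total messages logged for a profile."""
    if not messages:
        return 0.0
    return sum(1 for m in messages if URL_RE.search(m)) / len(messages)
```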
Spam Profile Detection

Message Similarity (S)
 This feature measures how similar the messages sent by a profile are to each other; spam bots tend to send many near-identical messages.

Friend Choice (F)
 This feature detects whether a profile likely picked its victims from a list of names, by comparing the total number of first names among its friends to the number of distinct first names.
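
A rough sketch of what these two features capture; difflib's SequenceMatcher stands in for the paper's similarity measure (not shown on the slide), and the friend-choice formula is an approximation of the idea described above.

```python
from difflib import SequenceMatcher
from itertools import combinations

def message_similarity(messages):
    """Average pairwise similarity of a profile's messages (stand-in for S)."""
    pairs = list(combinations(messages, 2))
    if not pairs:
        return 0.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

def friend_choice(friend_first_names):
    """F: total number of friends' first names over the number of distinct names."""
    if not friend_first_names:
        return 1.0
    return len(friend_first_names) / len(set(friend_first_names))
```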

Spam Profile Detection

Messages Sent (M)
 The number of messages a profile has sent.

Friend Number (FN)
 The number of friends a profile has.

We used the Weka framework with a Random Forest algorithm [link] for our classifier.
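
A sketch of the classification step; scikit-learn's RandomForestClassifier is substituted here for Weka's Random Forest, and the feature values are invented examples, not data from the study.

```python
# Sketch of the classification step. The paper used Weka's Random Forest;
# scikit-learn's RandomForestClassifier is substituted here as a stand-in,
# and the feature values below are invented examples, not data from the study.
from sklearn.ensemble import RandomForestClassifier

# Each row: [R, U, S, F, M, FN]
X_train = [
    [4.2, 0.95, 0.90, 3.1, 18, 250],   # hypothetical spam profile
    [0.8, 0.05, 0.10, 1.2, 40, 120],   # hypothetical legitimate profile
]
y_train = [1, 0]                        # 1 = spammer, 0 = legitimate

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Classify a new profile's feature vector (toy example).
print(clf.predict([[3.5, 0.9, 0.85, 2.8, 15, 300]]))
```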
Detection on Facebook

Using 1,000 profiles for training
 173 spam bots that contacted our honey-profiles
 827 manually checked legitimate profiles

From 790,951 crawled profiles
 Detected: 130 spammers
 False positives: 7

From 100 manually checked profiles
 False negatives: 0
Detection on Twitter
We picked 500 spam profiles.
 We also picked 500 legitimate profiles.


Twitter limited our machine to 20,000 API calls per hour.
 Therefore, we executed Google searches for the most common words in tweets sent by already-detected spammers, to find additional candidate profiles.
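
A sketch of the word-frequency part of this workaround, using invented example tweets; the resulting terms would then serve as Google search queries.

```python
# Sketch of the rate-limit workaround: find the most common words in tweets
# sent by already-detected spammers, to use as search queries later. The
# example tweets are invented placeholders.
from collections import Counter

detected_spam_tweets = [
    "win a free iphone now http://example.com/offer",
    "free gift cards here http://example.com/cards",
]

word_counts = Counter(
    word
    for tweet in detected_spam_tweets
    for word in tweet.lower().split()
    if not word.startswith("http")      # skip the URLs themselves
)

search_terms = [word for word, _ in word_counts.most_common(10)]
print(search_terms)   # these terms would then be fed to Google searches
```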
Detection on Twitter

From March 6, 2010 to June 6, 2010, we crawled 135,834 profiles, detecting 15,932 of them as spammers.
 False positives: 75
Identification of Spam Campaigns

We consider two bots posting messages
with URLs pointing to the same site as
being part of the same campaign.
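
A minimal sketch of this grouping rule, keyed on each URL's host; the sample data is invented.

```python
# Minimal sketch of the grouping rule above: bots whose spam URLs point to
# the same site fall into the same campaign. The sample data is invented,
# and resolving shortened URLs is omitted for brevity.
from collections import defaultdict
from urllib.parse import urlparse

bot_urls = {
    "bot_a": ["http://pharma.example.com/deal", "http://pharma.example.com/buy"],
    "bot_b": ["http://pharma.example.com/offer"],
    "bot_c": ["http://dating.example.org/join"],
}

campaigns = defaultdict(set)
for bot, urls in bot_urls.items():
    for url in urls:
        campaigns[urlparse(url).netloc].add(bot)   # key each campaign by target site

for site, bots in campaigns.items():
    print(site, sorted(bots))
```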
Spam Campaigns Observed