How to train observe your bot!
Affan Syed
Associate Professor,
Systems and Networking (SysNet) Lab,
National University of Computer & Emerging Sciences,
Islamabad, Pakistan
What happens at the SysNet …. Should not stay at SysNet!
Broad spectrum of systems research
Funded by
&
Computer Science
Electrical Engineering
*e-Energy
SDN for ISP networks
Sigcomm CCR
Botnets
DSN, *S&P, *CCS
Underwater networks
WUWNet
Energy harvesting and transference
Smart-home &
Smart-Grid
IPSN
Distributed Systems
*= poster or demo
What is a Bot/Zombie without connectivity?
"Zombie / Botnet network" by
Sophos Presseinfo
http://www.youthedesigner.com/wpcontent/uploads/2011/10/Zombie-Photo-01.jpg
http://howto-get-rid-of.com/wp-content/uploads/2014/03/troian-virusworm.jpg
Credit: Dreamworks “Shrek 2”
Network activity is the defining feature of a botnet
Motivation: Boot-strapping Botnet Research
Credit: www.sparkfun.com
Credit: http://ok.gov/osbi/Forensic_Laboratory/Forensic_Services/Latent_Prints
How can hobbyists and researchers easily observe live bot behavior where it shows its true colors?
Industrial solutions and disclosures
Proprietary and expensive
Limited release of bot behavior as
blogs or papers (IP reasons?)
Need a non-textual, technical, report for *our fav* bot!
Other academic tools?
Deployment and testbed details?
High deployment cost and management/operational overhead
Need a system with low deployment overhead and cost!
Online services?
Let's look at some online systems
• Anubis
• Comodo
• Malwr (Cuckoo Sandbox)
Limited and sketchy network details;
Also can’t change the execution environment
Need to trigger all responses of a bot!
Goals of our work
 Faithfully capture all network behavior
 Do so at low-cost and operational overhead
 Extensibility for evolving botnet landscape
Key contributions
 Operational contexts
 Implementation of a system that
– recreates these contexts
– has low cost and low operational overhead
• to bootstrap effective academic research
Generating a faithful fingerprint
OPERATIONAL CONTEXTS & BOTS
Operational contexts*: In-the-wild
 Network Configuration
– Public vs Private
 Machine Type
– VM vs Bare-metal
 User Activity
– No activity vs Specific visits
*Operational Context = Environmental conditions that can impact a bot’s network fingerprint
Operational contexts: Under Observation
 Observation Duration (time consideration)
– Boot-strapping phase vs Quiescent
 Containment Policy (ethical consideration)
– Conservative vs liberal (or ethical vs useful!)
Faithful Network Fingerprint: Operational Contexts
[Figure: Sample Fingerprint. Branches: Attack, C&C, Context difference, Metadata. Other labels: Anti Analysis, Stepping stone, Network connections, Ports opened, Information stealing, Banking, Social media]
Generating faithful fingerprints, varying operational contexts
TITAN ARCHITECTURE
Titan: Design decisions
GOALS and the DECISIONS taken for each
 Faithful fingerprints
– Multiple executions for intra-context feature extraction
 Ease of management
– Semi-automated containment
 Low-cost design and deployment
– Few, low-end machines for the testbed
– Serial fingerprint generation
 Flexibility to extend its capabilities
– Modularity built into its design
– Open sourced, with community involvement
Web Frontend
 Binary submission
 Operational context selection
–execution machine
–network configuration
–user activity
 Configuration forwarding to system manager
System Manager and Execution Network
 Chooses the machine to infect
 Converts configurations into commands
 Starts and stops the experiment
 Facilitates automation of containment
 Sets up the host machines
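To make the "converts configurations into commands" step concrete, here is a minimal Python sketch. It is an illustration only: the `OperationalContext` fields, their allowed values, and every script name (`clone_vm.sh`, `boot_bare_metal.sh`, `setup_nat.sh`, `run_autoit.sh`, `infect.sh`) are hypothetical placeholders, not Titan's actual interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OperationalContext:
    machine: str        # "vm" or "bare-metal" (assumed values)
    network: str        # "public" or "private" (assumed values)
    user_activity: str  # "none" or the name of a user-emulation script


def build_commands(ctx: OperationalContext, binary_path: str) -> List[List[str]]:
    """Translate one operational context into an ordered list of commands.

    Every script name below is a hypothetical placeholder.
    """
    if ctx.machine not in ("vm", "bare-metal"):
        raise ValueError(f"unknown machine type: {ctx.machine}")
    if ctx.network not in ("public", "private"):
        raise ValueError(f"unknown network configuration: {ctx.network}")

    commands = []
    # 1. Prepare the machine that will be infected.
    if ctx.machine == "vm":
        commands.append(["./clone_vm.sh", "base-vm"])
    else:
        commands.append(["./boot_bare_metal.sh"])
    # 2. Configure the network context.
    commands.append(["./setup_nat.sh", ctx.network])
    # 3. Optionally start user emulation.
    if ctx.user_activity != "none":
        commands.append(["./run_autoit.sh", ctx.user_activity])
    # 4. Finally, run the submitted binary.
    commands.append(["./infect.sh", binary_path])
    return commands
```

Returning argument lists rather than shell strings keeps the sketch easy to hand to `subprocess.run` without quoting problems.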
Containment and Logging Modules
 Semi-automation of traffic filtering rules
 Implement containment policy
 Log containment decisions and user traffic
Fingerprint Generation Engine
 XML fingerprint schema
 Performs various operations on collected logs
 Provides the fingerprint to the user for display
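For a feel of what an XML fingerprint could look like, here is a small sketch using Python's standard `xml.etree.ElementTree`. The element and attribute names (`fingerprint`, `metadata`, `cnc`, `server`, `attack`, `detected`) are assumptions made for illustration and are not Titan's published schema; only the category names come from the slides.

```python
import xml.etree.ElementTree as ET


def build_fingerprint(md5, bot_family, cnc_endpoints, attacks):
    """Return an XML string with one section per fingerprint category.

    Element names are illustrative assumptions, not the real schema.
    """
    root = ET.Element("fingerprint")

    # Experiment metadata.
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "binary_md5").text = md5
    ET.SubElement(meta, "bot_family").text = bot_family

    # One <server> element per observed C&C endpoint.
    cnc = ET.SubElement(root, "cnc")
    for ip, port, kind in cnc_endpoints:
        ET.SubElement(cnc, "server", ip=ip, port=str(port), type=kind)

    # One <detected> element per attack sensor that fired.
    attack = ET.SubElement(root, "attack")
    for name in attacks:
        ET.SubElement(attack, "detected", kind=name)

    return ET.tostring(root, encoding="unicode")
```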
Titan Architecture: Workflow
Multiple iterations for a single operational context;
Operational contexts to execute selected by user!
[Figure: Titan workflow. Components: Web Front End, System Manager, Containment Module, Logging, Fingerprint Generation Engine, Internet. Labelled steps (numbered 1 to 10, with 7a and 7b): selecting context; forwarding configuration; configuration of system & network; containing and forwarding traffic; logging traffic; logging decisions; resetting the network; updating rule set; generation of fingerprints; representation of results; selecting rule set (optional). Legend: data traffic; control & management traffic.]
How we achieve low-cost, and reduce deployment complexity
IMPLEMENTATION
Titan Implementation
 Minimum of 2 desktop-class machines
– Three, now, with support for bare-metal execution
– Can increase machines (virtual and physical) to generate LAN connectivity and
scanning observation (TBD)
 Connected via a L2 switch
Now: Bare Metal Machine bootup and multiple OS
System Manager and Execution Platform
 Execution Platform: XEN hypervisor
– Clones VM from a base VM
 System manager: Python & shell scripts
– responsible for creating operational context
Recreating Operational contexts
 User emulation → AutoIt scripts
 Network context → using two-stage NAT
– Fool bot into thinking it has a public IP
[Figure: two-stage NAT. Labels: Public IP 208.x.x.x; Private IP 10.1.x.x; Public IP 208.x.x.x; AutoIt scripts]
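One possible reading of the two-stage NAT idea, modelled in plain Python: the bot's host is configured with a public-looking address, a first translation maps it to a private testbed address, and a second translation maps that to the address actually used on the Internet. This is a toy model of address rewriting, not the testbed's real NAT configuration, and all addresses below are stand-ins (documentation ranges and an arbitrary 10.1.0.10), not the elided ones on the slide.

```python
import ipaddress


def make_nat(mapping):
    """Return a function that rewrites a source address using `mapping`."""
    table = {
        ipaddress.ip_address(old): ipaddress.ip_address(new)
        for old, new in mapping.items()
    }

    def translate(src):
        addr = ipaddress.ip_address(src)
        # Addresses without a rule pass through unchanged.
        return str(table.get(addr, addr))

    return translate


# Stage 1 (assumed): public-looking address on the bot host -> private address.
stage1 = make_nat({"198.51.100.10": "10.1.0.10"})
# Stage 2 (assumed): private address -> the gateway's real public address.
stage2 = make_nat({"10.1.0.10": "203.0.113.1"})


def outbound_source(src):
    """Source address seen on the Internet after both NAT stages."""
    return stage2(stage1(src))
```

The point of the model is only that the bot sees a public address on its own interface while the testbed still controls what leaves the network.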
Containment Module
 POX controller with Open vSwitch on the Gateway
 All traffic also passes through attack sensors
– exe_detect
– DoS_detect
– netscan_detect
– spam_detect
– inject_detect
– info_detect
Possible throughput hit; but who cares!
 All traffic is malicious
– More reliable attack sensors
Now: custom sensors can be added
organically to the containment module
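A sketch of how pluggable sensors might be organised, assuming each sensor is a function that takes a packet summary (a plain dict here) and returns True when it flags the traffic. The registry, the dict keys, and both detection rules are toy assumptions; only the sensor names come from the slide.

```python
SENSORS = {}


def sensor(name):
    """Decorator that registers a detection function under a sensor name."""
    def register(func):
        SENSORS[name] = func
        return func
    return register


@sensor("spam_detect")
def spam_detect(packet):
    # Toy rule: any outbound SMTP connection is treated as spam.
    return packet.get("proto") == "tcp" and packet.get("dst_port") == 25


@sensor("netscan_detect")
def netscan_detect(packet):
    # Toy rule: a source that contacted many distinct hosts looks like a scan.
    return packet.get("distinct_dsts", 0) > 100


def triggered_sensors(packet):
    """Return the sorted names of all sensors that flag this packet."""
    return sorted(name for name, check in SENSORS.items() if check(packet))
```

Adding a custom sensor then amounts to writing one more decorated function; nothing else in the containment path has to change.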
Containment Methodology: First Iteration
 Allow only TCP handshakes
― Fools the bot into thinking it has access but the C&C is not responding in the “right” way
― Detects C&C contact mechanisms
[Figure: first iteration. Labels: Bot infected machine, Network Traffic, Containment Module, TCP sensor, Attack sensor, C&C 1, C&C 2. Drop attack traffic, add to blacklist.]
Containment Methodology: Second Iteration
 C&C communication allowed
 Only TCP handshake for new traffic
[Figure: second iteration. Labels: Bot Infected machine, Network Traffic, Containment Module, TCP sensor, TCP handshake, Attack sensor, C&C 1, C&C 2, Other m/c. Drop attack traffic, add to blacklist.]
Containment Methodology: Third Iteration
 Observe bot behavior over a long interval
 Detect any network attacks launched
[Figure: third iteration. Labels: Bot Infected machine, Network Traffic, Containment Module, TCP sensor, Connection Establishment, Attack sensor, C&C 1, C&C 2, Other m/c. Drop attack traffic, add to blacklist.]
Advanced users: go for 4th and beyond
If the attack sensors are inaccurate, malicious traffic might leak!
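The three iterations can be summarised as a per-flow decision function. This is a simplified sketch: the return values, the `known_cnc` set (C&C servers assumed to have been identified in the first iteration), and the choice to fully allow non-attack traffic in the third iteration are assumptions, not the exact rules Titan installs.

```python
def decide(iteration, dst, is_attack, known_cnc, blacklist):
    """Return "drop", "handshake_only" or "allow" for one flow.

    `is_attack` stands for the verdict of the attack sensors.
    """
    # In every iteration, attack traffic is dropped and its target blacklisted.
    if is_attack or dst in blacklist:
        blacklist.add(dst)
        return "drop"
    if iteration == 1:
        # First iteration: only TCP handshakes are allowed.
        return "handshake_only"
    if iteration == 2:
        # Second iteration: C&C communication allowed, handshake for new traffic.
        return "allow" if dst in known_cnc else "handshake_only"
    # Third iteration and beyond (assumed): observe non-attack traffic fully.
    return "allow"
```

The leak risk noted above shows up directly here: if `is_attack` is wrongly False, the third branch lets the flow through.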
Titan Implementation: Fingerprint Generation
 Process raw traffic to generate context-specific features
― attack features from logs
― C&C features from Bro, Nmap, the Google Safe Browsing API, the MaxMind GeoIP database, and the Python Requests library
 Looks at differences between fingerprints to extract the context-specific
behavior.
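A minimal sketch of the differencing idea, assuming each run of a bot is reduced to a set of feature strings. Treating "features seen in every run of a context" as the stable intra-context behavior is an assumption made for this example; the feature strings are made up.

```python
def intra_context_features(runs):
    """Features present in every run of one context (assumed to be stable)."""
    runs = [set(r) for r in runs]
    return set.intersection(*runs) if runs else set()


def context_difference(context_a_runs, context_b_runs):
    """Features that are stable in one context but absent from the other."""
    a = intra_context_features(context_a_runs)
    b = intra_context_features(context_b_runs)
    return {"only_a": a - b, "only_b": b - a}


# Usage with made-up feature strings:
public_runs = [{"cnc:http", "port:8080"}, {"cnc:http", "port:8080", "noise"}]
private_runs = [{"cnc:http"}, {"cnc:http"}]
difference = context_difference(public_runs, private_runs)
```

Here `difference["only_a"]` holds the behavior that appears only in the public context, and one-off features such as `"noise"` are filtered out by the intersection.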
EVALUATION
Validating Effective Fingerprint Generation
 Deployed a Zeus botnet infrastructure
-- our Zeus bot reports its network configuration
-- performed the experiment in both Public & Private settings
[Figure: fingerprints for the Private Context and the Public Context]
CryptoLocker: Then
 Fingerprint of CryptoLocker
-- analyzed a binary of the CryptoLocker botnet
We detected a DGA feature
CryptoLocker: Now
So what happened?
Ans: The DGA was updated to avoid detection, which is exactly the information we wanted to observe ourselves!
Conclusion
 A tool that can recreate operational contexts that impact a bot’s network behavior
 Is open sourced and extensible
– Some of the attack and C&C features are raw and need updating
 Lots of (continuous) updates are needed to keep the tool relevant
– Invite people to consider hosting this and carrying the idea forward!
– The project funding (and any funneling) ends in Aug!
Acknowledgements
 Osama Haq
 Waqar Ahmed
 Mujtaba Ahmed
 Ashad
For more information and source code visit
http://sysnet.org.pk/w/Titan
Contact person: osama.haq@sysnet.org.pk
Live demo at: titan.bot.nu:8888
Questions
Fingerprint: A Taxonomy
[Figure: fingerprint taxonomy. Branches: C&C, Attack, Context difference, Metadata. Feature kinds: Single Experiment features, Intra-context features]
Command & Control Features
 Type (HTTP, IRC)
 IP
 Location
 Connection frequency
 URL
 Redirection
 Safe browsing status
 Service
 Port
 Evasion
 Volume
 IP flux
 Domain flux
Attack Features
 DoS
 Scan
 Spam
 Info steal
 SQL Inject
 Exe transfer
Experiment Metadata
 Binary MD5
 Experiment Info
 Bot Family
 Network connections
 Start time
 Duration