Universitätsbibliothek
Using your library software – what third parties get to know about our library patrons
Dr. Andreas Sabisch
FU Berlin Universitätsbibliothek
Garystr. 39
13469 Berlin
andreas.sabisch@fu-berlin.de
Agenda
Motivation for this investigation
Web communication for dummies
Examples of third-party communication
What to do
Why we must deal with this
• We must protect the digital privacy of our patrons
  - EU laws, national laws, university rules
  - Questions from patrons, university boards, secure research, …
• We (especially in Germany) have to describe how we handle patron data
  - Data protection statements (Datenschutzerklärungen)
  - Avoid producing, storing and propagating data
  - Right of informational self-determination (BVerfG; Recht auf informationelle Selbstbestimmung)
• We have a monopoly with our library systems
  - Loans, EZproxy access, course material, …
• How we can do this
  - Analyse
  - Describe
  - Avoid
HTTP communication
Web server logs and cookies
• What is in a web server log: the Apache log file
130.133.152.192 - - [10/Apr/2014:09:16:44 +0200] "GET /docs/images/poweredby.gif HTTP/1.1" 200 2376 "http://160.45.152.195/docs/content/below/index.xml" "Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0"
  - IP of the requesting host: 130.133.152.192
  - When: 10/Apr/2014:09:16:44 +0200
  - What (request): /docs/images/poweredby.gif
  - Technical information (status code and transferred volume): 200 2376
  - Where the request comes from (referrer): http://160.45.152.195/docs/content/below/index.xml
  - Browser information: "Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0"
• Recognition by the web server: the cookie file
  - Cookie text file
      Name: JSESSIONID
      Value: 7AE6B0776E8F4D75BAC8B46189F419FB
      Host: primo.kobv.de
      Path: /primo_library/libweb
      Sent for: every connection type
      Valid until: end of session
  - Only the web server that sent the cookie can read it.
  - But each third party involved in the request can set a cookie of its own.
• Flash cookies: hard to detect; no example found yet in a library environment
• Scripts that send additional information
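To make the log fields and cookie attributes above concrete, here is a minimal Python sketch that splits the slide's Apache log line into its parts and reads a cookie header. It assumes the server writes Apache's "combined" log format. The `Set-Cookie` header string is an assumption reconstructed from the cookie values on this slide, and the function names are illustrative only.

```python
import re
from http.cookies import SimpleCookie

# Apache "combined" format: client, identity, user, time, request line,
# status, bytes sent, referrer, user agent.
COMBINED = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one combined-format log line, or None."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


def parse_set_cookie(header_value):
    """Return name, value, path and expiry of a Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    name, morsel = next(iter(jar.items()))
    return {
        "name": name,
        "value": morsel.value,
        "path": morsel["path"],
        # An empty expiry means the cookie lasts until the session ends.
        "expires": morsel["expires"],
    }


# The log line shown on the slide.
LOG_LINE = (
    '130.133.152.192 - - [10/Apr/2014:09:16:44 +0200] '
    '"GET /docs/images/poweredby.gif HTTP/1.1" 200 2376 '
    '"http://160.45.152.195/docs/content/below/index.xml" '
    '"Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0"'
)

# Assumed header; only the name, value and path come from the slide.
SET_COOKIE = (
    "JSESSIONID=7AE6B0776E8F4D75BAC8B46189F419FB; Path=/primo_library/libweb"
)

if __name__ == "__main__":
    for key, value in parse_log_line(LOG_LINE).items():
        print(f"{key:9} {value}")
    print(parse_set_cookie(SET_COOKIE))
```

Every server that answers a request, including a third-party one, can write such a log line and set such a cookie.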
A picture in pieces
• Logging one request yields one piece of information
• Logging many requests yields a storyline
• Logging many requests from different servers yields a whole life
• That is what Google and Co. do
  - To X-ray one person (e.g. to give you personalized services and advertising)
  - To gain statistical evidence about a whole group (e.g. people who are interested in this are also interested in that)
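How single log lines turn into a storyline can be sketched in a few lines of Python. This is only an illustration under simple assumptions: all entries, host names and addresses below are hypothetical, and a visitor is recognised naively by the pair of IP address and browser string (real trackers also use cookies).

```python
from collections import defaultdict

# Hypothetical log entries collected from three different servers:
# (server, client IP, user agent, time, request)
ENTRIES = [
    ("catalogue.example", "192.0.2.7", "Firefox/28.0", "09:16", "/search?q=privacy+law"),
    ("tracker.example", "192.0.2.7", "Firefox/28.0", "09:16", "/pixel.gif"),
    ("journal.example", "192.0.2.7", "Firefox/28.0", "09:21", "/article/123"),
    ("catalogue.example", "192.0.2.99", "Chrome/34.0", "09:18", "/search?q=botany"),
]


def build_timelines(entries):
    """Group entries from all servers by visitor and sort them by time."""
    timelines = defaultdict(list)
    for server, ip, agent, time, request in entries:
        timelines[(ip, agent)].append((time, server, request))
    for visits in timelines.values():
        visits.sort()
    return dict(timelines)


if __name__ == "__main__":
    for visitor, visits in build_timelines(ENTRIES).items():
        print(visitor)
        for time, server, request in visits:
            print(f"  {time}  {server}{request}")
```

Whoever receives requests from many sites can do this join alone, without any cooperation from the individual sites.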
How to analyse data traffic (sniffing)
• Professional tools
  - tcpdump for automatic processing
  - Wireshark with a graphical interface
• Analysis with Wireshark (suggestion for professionals)
  - Create a filter (broadcast/own IP; just TCP or HTTP, ...)
  - Perform one action in the browser, then start the analysis. Repeat if necessary.
  - Analysing a whole session is hard work. It works best if you check for specific issues in the session, e.g. which hosts participate in it.
• Browser tools (for a quick glimpse)
  - e.g. Firefox => Extras -> Webtools -> Network; limited to HTTP, no TCP and TLS connections
  - I will use these browser tools for some examples
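The question "which hosts participate in this session" can also be answered with a short script. The sketch below assumes the session was exported from the browser's network tool as a HAR file (a JSON format that most current browsers can write). The function names and the sample entries are illustrative; only the host names are taken from these slides.

```python
from collections import Counter
from urllib.parse import urlsplit


def hosts_in_har(har):
    """Count the requests per host in a parsed HAR document."""
    hosts = Counter()
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        if host:  # data: URLs and similar have no host
            hosts[host] += 1
    return hosts


def third_party_hosts(har, first_party):
    """Return all hosts that are neither first_party nor a subdomain of it."""
    return sorted(
        host for host in hosts_in_har(har)
        if host != first_party and not host.endswith("." + first_party)
    )


# Hypothetical excerpt of an exported session; the paths are invented.
SAMPLE_HAR = {"log": {"entries": [
    {"request": {"url": "http://primo.kobv.de/primo_library/libweb/search"}},
    {"request": {"url": "http://primo.kobv.de/primo_library/libweb/style.css"}},
    {"request": {"url": "http://books.google.com/cover?id=1"}},
    {"request": {"url": "http://images.amazon.com/cover/1.jpg"}},
    {"request": {"url": "data:image/gif;base64,R0lGODlhAQABAAAAACw="}},
]}}

if __name__ == "__main__":
    # For a real export: import json; har = json.load(open("session.har"))
    print(third_party_hosts(SAMPLE_HAR, "primo.kobv.de"))
```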
Aleph catalogue with tracking bugs
• dbs.pixel.hbz-nrw.de: DBS tracking bug (legal, described)
• Recommander.bibtex.de: BibTip recommender system (legal, but not described)
Primo including a second source (library blog)
• RSS feed from our library blog
• ajax.googleapis.com: converts the feed from RSS to JSON
… and without Google: no Biblioblog entry
Blocking Google: the blog information is no longer displayed
Primo results page
books.google.com
exlibris-pub.s3.amazonaws.com
images.amazon.com
bX in Primo
• recommande-bx.hosted.exlibrisgroup.: bX service, integrated in Primo
• beacon01.alma.exlibrisgroup.com: a tracking bug from ExL, no description available
A licensed journal website
Imagic17.247realmedia.com
metric.sciencemag.org
now.eloqua.com
www.google-analytics.com
Short-term work in the library
• Check with tools for third-party requests
• Test the functionality of your site with these requests blocked
• Remove the third-party requests
  - by replacing them with other/own functions
  - by commenting them out in code or web pages
  - with help from your provider (e.g. ExL)
• Describe the necessary third-party requests for your patrons, including the data protection policy of the third party
• Describe what users can do to protect their data
• Help users with a proxy server (e.g. via the university computing department)
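The first step, checking a page for third-party requests, can be roughed out with Python's standard library. The sketch below is not a complete audit: it only sees resources named directly in the HTML and misses anything that scripts load later. The class and function names, the page URL and the paths are illustrative assumptions. The two third-party hosts appear in the earlier examples.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect URLs that the browser may fetch while loading the page."""

    # Tag -> attribute that names the resource to load.
    LOADING_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = self.LOADING_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative and protocol-relative references.
                self.urls.append(urljoin(self.page_url, value))


def third_party_resources(html, page_url):
    """Return the resource URLs whose host differs from the page's host."""
    collector = ResourceCollector(page_url)
    collector.feed(html)
    own_host = urlsplit(page_url).hostname
    return [url for url in collector.urls if urlsplit(url).hostname != own_host]


# Hypothetical catalogue page.
PAGE_URL = "http://catalogue.example.org/search"
PAGE_HTML = """
<html><head>
<script src="//ajax.googleapis.com/ajax/libs/example.js"></script>
<link rel="stylesheet" href="/css/site.css">
</head><body>
<img src="images/logo.gif">
<img src="http://dbs.pixel.hbz-nrw.de/count.gif">
</body></html>
"""

if __name__ == "__main__":
    for url in third_party_resources(PAGE_HTML, PAGE_URL):
        print(url)
```

Each URL reported this way is a candidate for the remaining steps: block it, test the site, then remove or describe it.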
Patron options at the moment
• Blocking programs like Adblocker or Ghostery
  - Pro: blocks selected third-party requests
  - Contra: loss of functionality
• Using a proxy server
• Opt-out option: conforms to data protection law (datenschutzkonforme Herangehensweise) but takes much effort
• Tor: anonymous surfing
Long-term issues in libraries
• We must establish an 'opt-in' culture
  - Core functions must run in data-safe structures
  - Add-ons must be chosen by the patrons with knowledge of the third parties involved (opt-in process)
• The library infrastructure and systems must support this strategy
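What an opt-in process could mean for the software is sketched below. This is a design illustration, not an existing feature of any library system: the classes, add-on names and purposes are hypothetical, and the two hosts are borrowed from the earlier examples. The point is that the default is an empty set, so no third party is contacted until the patron has chosen an add-on.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AddOn:
    name: str
    third_party_host: str  # shown to the patron before they decide
    purpose: str


@dataclass
class PatronConsent:
    opted_in: set = field(default_factory=set)  # empty by default

    def allow(self, addon_name):
        self.opted_in.add(addon_name)


def addons_to_embed(addons, consent):
    """Return only the add-ons this patron has explicitly opted in to."""
    return [addon for addon in addons if addon.name in consent.opted_in]


# Hypothetical add-on catalogue.
ADDONS = [
    AddOn("covers", "books.google.com", "book cover images"),
    AddOn("blog feed", "ajax.googleapis.com", "library blog entries"),
]

if __name__ == "__main__":
    consent = PatronConsent()
    print(addons_to_embed(ADDONS, consent))   # nothing is embedded yet
    consent.allow("covers")
    for addon in addons_to_embed(ADDONS, consent):
        print(f"embed {addon.name}: contacts {addon.third_party_host}")
```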
Summary
• Modern library software often includes third-party requests
• Third parties get information about your patrons via referrer information
• This violates the patrons' right of informational self-determination
• Analyse your software environment
• Try to conform to the law: avoid or describe
• Long term: establish an 'opt-in' culture
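What "information via referrer" means in practice can be shown with a small sketch. It assumes the browser sends the full page URL as referrer (referrer policies can shorten it), and every URL, address and name in it is hypothetical. The function lists what ends up in the third party's log when a catalogue page embeds one of its resources.

```python
from urllib.parse import parse_qs, urlsplit


def what_third_party_sees(client_ip, page_url, resource_url, user_agent):
    """Return the data a third party receives for one embedded resource."""
    return {
        "logged_by": urlsplit(resource_url).hostname,
        "client_ip": client_ip,
        "referrer": page_url,
        "user_agent": user_agent,
        # Search terms travel inside the referrer's query string.
        "query_in_referrer": parse_qs(urlsplit(page_url).query),
    }


if __name__ == "__main__":
    seen = what_third_party_sees(
        client_ip="192.0.2.7",
        page_url="http://catalogue.example.org/search?query=informational+self-determination",
        resource_url="http://tracker.example.net/pixel.gif",
        user_agent="Mozilla/5.0 (X11; Linux x86_64; rv:28.0) Gecko/20100101 Firefox/28.0",
    )
    for key, value in seen.items():
        print(f"{key:18} {value}")
```

In this example the third party learns not only that someone used the catalogue, but also what they searched for.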
Highlights
• Each HTTP request gives information such as IP address and referrer to the web server it is sent to
• A website very often includes requests to third parties. These requests send the same information to the third-party servers and are nearly invisible to the user
• We, as the providers of the library systems, are responsible for the data privacy policy for the users of our systems
• We must take care about sending user data to third parties and should always choose the options that keep privacy safe
• Doing this is important to give our users the rights to their private data back (in German: 'Bewahrt das Recht auf informationelle Selbstbestimmung', i.e. preserve the right of informational self-determination)
• Thanks to Dr. Voss, HU and Uwe TU, who found the back tracks of hosted.exlibris.com and gave the impulse for this investigation