Identifying Potentially Privileged Documents

A Language-Based Analytics Case Study
Challenge
An international insurance client involved in a reinsurance matter needed to identify all potentially privileged material
as quickly and cost-effectively as possible from a collection of millions of documents. The client’s traditional approach,
which relied only on simple keywords and metadata, had historically resulted in the unnecessary and costly review of
too many non-privileged documents. This was especially the case when the client
included words like “privileged” and “confidential” in its keyword list.
In addition, many potentially privileged documents had previously been overlooked during review because they were not,
per se, communications between the company and its attorneys even though they contained complex privileged content.
Approach
The client had developed a list of metadata criteria and a list of keywords that it believed would identify potentially
privileged documents, but neither list could be used on its own. The keyword list included words that
typically appeared in the footer of every email the company sent or received. The challenge was to limit the keyword
search to just the text contained in the body of the document. To accomplish that, RenewData made an inventory of the
footers and treated the language used in each footer as a Logical Expression (LEX). RenewData then indexed the documents
in the collection, but excluded these LEXs from the index. As a result, the search engine couldn’t “see” the
language of the LEX. Thus, a search for the word “privilege” considered only the language contained in the body of the
document and not the language contained in its disclaimer footer.
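The footer exclusion described above can be illustrated with a minimal Python sketch. It is not RenewData's implementation: the footer text, the sample documents, and the helper names (`strip_footers`, `build_index`, `search`) are all hypothetical, and a simple in-memory inverted index stands in for the search engine. The idea shown is only that known footer language is removed before indexing, so a keyword search sees the body text alone.

```python
import re
from collections import defaultdict

# Hypothetical footer inventory; a real one would be collected from the corpus.
FOOTERS = [
    "This email is privileged and confidential. "
    "If you are not the intended recipient, delete it.",
]

def _footer_pattern(footer):
    # Match the footer's words with any whitespace between them, ignoring case,
    # so re-wrapped copies of the same disclaimer are still recognized.
    words = [re.escape(word) for word in footer.split()]
    return re.compile(r"\s+".join(words), re.IGNORECASE)

FOOTER_PATTERNS = [_footer_pattern(f) for f in FOOTERS]

def strip_footers(text):
    """Remove every inventoried footer from the text before it is indexed."""
    for pattern in FOOTER_PATTERNS:
        text = pattern.sub(" ", text)
    return text

def build_index(docs):
    """Build a term -> document-id index over body text only."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", strip_footers(text).lower()):
            index[token].add(doc_id)
    return index

def search(index, term):
    """Return the sorted ids of documents whose body contains the term."""
    return sorted(index.get(term.lower(), set()))

# Hypothetical sample documents: both carry the footer, only "b" uses the
# keyword in its body.
docs = {
    "a": "Lunch at noon?\n\nThis email is privileged and\nconfidential. "
         "If you are not the intended recipient, delete it.",
    "b": "Counsel says this advice is privileged.\n\nThis email is privileged "
         "and confidential. If you are not the intended recipient, delete it.",
}
index = build_index(docs)
hits = search(index, "privileged")
```

In this sketch, document "a" mentions “privileged” only in its disclaimer, so it drops out of the results, while document "b" is still found because the word also appears in its body.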
The second half of the problem was more complicated. Occasionally the company’s attorneys would send a privileged
communication via email to one of the company’s executives. The executive would then write a new email to staff that
effectively included the privileged language of the first email, but without referencing the attorneys. The privileged
language was often complex, and the emails to staff were not picked up by the standard privilege metadata criteria.
Capturing these privileged emails had proved an insurmountable problem for the client.
To solve the problem, RenewData put all documents deemed potentially privileged based on metadata into its
Vestigate review platform. As these documents were reviewed, the attorneys highlighted the exact language in
each document that created the privilege. Vestigate’s Automatic Query Builder captured this highlighted language
and transformed it into Boolean queries that were run against the entire population of documents. In this manner,
documents and emails that contained the attorneys’ privileged language, but lacked the attorney metadata, were
identified and held back from production.
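The step from highlighted language to Boolean queries can be sketched in Python as follows. How Vestigate's Automatic Query Builder actually constructs its queries is not described here, so this is an assumption-laden illustration only: each highlighted passage becomes an AND query over its non-stopword terms, and a document is flagged if it satisfies any one query. The stopword list, the sample highlight, the sample documents, and all function names are hypothetical.

```python
import re

# Illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {
    "a", "an", "and", "the", "of", "to", "in", "on", "for",
    "that", "is", "we", "our", "with", "be", "should",
}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_query(highlight):
    """Turn one highlighted passage into a list of distinct AND terms."""
    terms = []
    for token in tokenize(highlight):
        if token not in STOPWORDS and token not in terms:
            terms.append(token)
    return terms

def query_string(terms):
    """Render the terms as a human-readable Boolean query."""
    return " AND ".join(terms)

def matches(terms, text):
    """A document matches when it contains every term of the query."""
    tokens = set(tokenize(text))
    return bool(terms) and all(term in tokens for term in terms)

def run_queries(highlights, docs):
    """Flag documents that satisfy the query built from any highlight."""
    queries = [build_query(h) for h in highlights]
    return sorted(
        doc_id for doc_id, text in docs.items()
        if any(matches(query, text) for query in queries)
    )

# Hypothetical highlight taken from a reviewed attorney email.
highlights = ["the treaty exclusion should bar recovery of the commutation payment"]

# Hypothetical population: "exec" restates the advice without naming counsel.
docs = {
    "exec": "Team: our position is that the treaty exclusion will bar any "
            "recovery of the commutation payment.",
    "other": "Please send the commutation payment schedule.",
}
flagged = run_queries(highlights, docs)
query = query_string(build_query(highlights[0]))
```

Here the executive's email is flagged even though it carries no attorney metadata, because it contains every distinctive term of the highlighted passage. Requiring all terms is a deliberately strict choice for the sketch; a production query builder would likely also need phrase matching, proximity operators, or stemming to handle reworded language.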
Outcome
By employing these two techniques, the client was able to rapidly identify documents in which privilege-related
terms appeared in the body of the email rather than only in its footer, thereby vastly reducing the volume of false
positives. In addition, the client was able to identify documents containing complex privileged language that had not
been sent to or from the company’s counsel. The entire privilege review was completed by one attorney over the course
of two days, a fraction of the time otherwise required with a traditional approach.
www.renewdata.com
888-811-3789
info@renewdata.com
Copyright © 2013 Renew Data Corp. All Rights Reserved.