Is the Bates Stamp Really Dead?


In the course of talking about the docNative Paradigm, I’ve done several webinars for Anacomp that were rather facetiously titled with some variation of the above heading. Of course we didn’t take that position but held, rather, that there is no need to Bates stamp every document at the beginning of the discovery process.
For years, litigators have relied on a system of scanning and sequentially numbering individual document pages, extracting the text electronically and producing single-page TIFF files as the standard method of working with electronic documents in the discovery process.


I’ve called this process of uniquely identifying each page the “page-centric” approach, but it is commonly called Bates numbering, in reference to Edwin G. Bates, who patented the original document numbering machine. The Bates Manufacturing Co. was eventually acquired by Edison Phonograph Works, and its machine so dominated the market that numbers imprinted on multi-page documents became generally referred to as Bates numbers.


Used sporadically for years by lawyers, Bates numbering became more widespread in the 1980s with the rapid increase in lawsuits involving large numbers of documents, an increase fueled by the growth of word processors and personal computers. Marking each document with a unique alphanumeric identifier helped lawyers track discovery responses, refer to specific pages of a document during depositions and keep better track of which documents were withheld on the basis of privilege or work product.


The routine method of handling documents in large cases became the following: 1) copy the documents, 2) manually stamp Bates numbers on the copies and then 3) copy the Bates-numbered pages. Indeed, The Manual for Complex Litigation published by the Federal Judicial Center (The Manual) specifically states at Section 11.441, Identification Systems:


Counsel should be informed that consecutive numbering is usually the practicable method; blocks of numbers are assigned to each party in advance to make the source of each document immediately apparent. Every page of every document is Bates-stamped consecutively. The document’s number may be later used to designate it; if the document is identified differently in the course of a deposition or on an exhibit list, the stamped number should be included as a cross-reference. If other means of designation are used, no designation should be assigned to more than one document, and the same document should not receive more than one designation unless counsel have reason to refer to different copies of the same document. In multitrack depositions, a block of numbers should be assigned to each deposition in advance. To avoid later disputes, a log should record each document produced and should indicate by, to whom, and on what date production was made. A record of the documents produced by a party and copied by an opposing party may also be useful.


As computer systems came to routinely use electronic images of documents which were indexed in computerized databases, electronic Bates numbering was instituted and the page-centric paradigm continued. Each separate page was scanned into a single page TIFF format, electronic Bates numbers were added during scanning and that electronic Bates number was used to link each image to a specific record in the database.


If a record referred to a multi-page document captured in multiple TIFF images, the software creating the images generated a “load file” which specified the range of images by beginning and ending Bates number for the document. The advantage to this process was that one could immediately locate any page of a document that had thousands of pages by referring to the Bates number and thus introduce only those pages needed as exhibits at a deposition or trial, rather than introducing the entire document and making the witness page through it to the needed pages.
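
To make the load file idea concrete, here is a minimal sketch in Python of how consecutive Bates numbers can be assigned to documents and how a single page number can then be traced back to its document. Everything in it is an assumption made for illustration: the "ABC" prefix, the seven-digit padding, the file names and the function names are hypothetical and do not reflect any particular vendor's load file format.

```python
from bisect import bisect_right

def bates_label(prefix, number, width=7):
    """Format a Bates number, e.g. prefix 'ABC' and 4 -> 'ABC0000004'."""
    return f"{prefix}{number:0{width}d}"

def build_load_file(docs, start=1):
    """Assign consecutive page numbers to each document.

    docs is a list of (doc_id, page_count) pairs. The result is a list of
    (doc_id, begin_number, end_number) records, one per document, which is
    the begin/end range a load file records for each multi-page document.
    """
    records = []
    next_number = start
    for doc_id, page_count in docs:
        begin = next_number
        end = next_number + page_count - 1
        records.append((doc_id, begin, end))
        next_number = end + 1
    return records

def find_document(records, number):
    """Return the doc_id whose range contains the page number, or None."""
    begins = [begin for _, begin, _ in records]
    i = bisect_right(begins, number) - 1
    if i >= 0 and records[i][1] <= number <= records[i][2]:
        return records[i][0]
    return None

# Hypothetical three-document production.
docs = [("contract.tif", 3), ("memo.tif", 1), ("report.tif", 250)]
records = build_load_file(docs)

for doc_id, begin, end in records:
    print(doc_id, bates_label("ABC", begin), bates_label("ABC", end))

print(find_document(records, 4))
```

The lookup in find_document is the advantage described above: given only a Bates number, it locates the document and, by subtraction from the beginning number, the page within it.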


This worked fairly well at first but was most efficient for relatively small numbers of pages of traditional legal documents. With the advent of electronic documents and new document types such as multi-page TIFFs and PDFs, emails, Excel spreadsheets and audio files, the page-centric approach quickly falters. Indeed, even The Manual states in Section 11.441:


However, databases containing millions of data elements, none of which are meaningful alone, can be difficult or impossible to break down and organize in a way directly analogous to conventional document collections. Special consideration should be given to their identification and handling.


And the growing preference for using native files in productions makes Bates numbering problematic, since native files cannot be Bates numbered. The alternative employed by many e-discovery vendors is to generate TIFF images from the native files and Bates number those images. But this process complicates native file review and, at anywhere from $0.08 to $0.20 per TIFF, also adds considerable cost to the process.
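
To put the per-TIFF price in perspective, the short Python sketch below multiplies the quoted range by a page count. The one-million-page collection is a hypothetical figure chosen only for illustration, and the function name is made up for this example; the $0.08 and $0.20 rates are the ones quoted above.

```python
def tiff_cost_range_usd(page_count, low_cents=8, high_cents=20):
    """Return the (low, high) TIFF conversion cost in dollars.

    Rates are kept in whole cents so the multiplication is exact
    before the final conversion to dollars.
    """
    low = page_count * low_cents / 100
    high = page_count * high_cents / 100
    return low, high

# Hypothetical collection of one million single-page TIFFs.
low, high = tiff_cost_range_usd(1_000_000)
print(f"${low:,.0f} to ${high:,.0f}")
```

Under that assumption, the conversion alone runs from $80,000 to $200,000 before any review has begun.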


Why is this true? I asked John Turner, Senior Vice President and Chief Technology Officer of Anacomp, who told me, “To fully understand why, I need to go back to the start of the discovery process at what the EDRM project calls the ‘Processing’ stage. At a high level this stage can be broken down into the following steps: (1) Data is received from the customer; (2) This data is culled and deduplicated; (3) The metadata is extracted; (4) The text is extracted; and (5) The document is TIFFed. (This can be done as single page or multi page, but is usually single page.)”


He continued, “One consequence of this is that the relationship of the pages to themselves and to the document is artificially broken.  It also breaks the relationship of an email with its attachment or of a document with an embedded file, or ZIP file. All of them then have to be recreated in the review platform.”

Which means that if the Bates number isn’t actually dead, the cost of keeping it alive at the front end of the EDRM model makes it too expensive to live there much longer.


Welcome to the initial posting of the CaseLogistix docNative Paradigm

Welcome to the initial posting of the CaseLogistix docNative Paradigm. The purpose of this blog is to generate discussion around the concept of the new paradigm and find out what people think about this approach to handling litigation support documents. If you’ve heard any of the webinars that I’ve done over the past few months with my old friend Browning Marean of DLA Piper, you’ll know what I’m talking about. But in case you haven’t been able to attend any of the webinars (which, by the way, are all available on the CaseLogistix website), here’s a recap of what we have been saying:


