TAR + Advanced AI: The Future is Now

Between 2006 and 2010, ediscovery technology advanced into the first form of TAR, now referred to as TAR 1.0, with TAR 2.0 following shortly thereafter. Its introduction created a buzz in the industry, as it gave litigation teams the ability to handle growing ESI volumes with far greater efficiency and at a fraction of the cost of manual review.

TAR 1.0

TAR 1.0 uses supervised machine learning, in which a small number of highly trained subject matter experts review and code a randomly selected group of documents called a control set. The control set provides an initial estimate of overall richness and establishes the baseline against which the iterative training rounds are measured. Through the training rounds, the machine develops a classification model. Once further training rounds no longer improve the classification, the system is considered to have reached stability. At this point, the computer applies scores to all the documents in the dataset, with lower scores indicating documents less likely to meet the criteria set out by the experts in the training sessions. Using statistical measures, a cutoff score is determined and validated, above which the desired proportion of relevant documents will be included. The remaining documents below that score are deemed not relevant and therefore do not require any additional review. [4]

As previously noted, it was during the late 2000s that TAR 1.0 began to be used in a limited number of larger document reviews, in part because influential bodies such as the Text Retrieval Conference (TREC) and the Sedona Conference issued papers and studies on discovery search and text retrieval methods in the electronic discovery space. [5]

In 2011, Maura Grossman and Gordon Cormack published a research paper titled "Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review." [6] This paper was important in the history of TAR, not only because it evaluated the efficacy and efficiency of the TAR 1.0 technology and methodology as we know it today, but also because it found that TAR 1.0 could actually "yield results superior to those of exhaustive manual review." [7]

In 2012, Judge Peck, then a Magistrate Judge in the Southern District of New York, issued his seminal opinion in Da Silva Moore v. Publicis Groupe, approving the use of "computer-assisted coding" in federal court where both parties agreed to its use. [8] Importantly, in his decision, Judge Peck quoted an article he had written on the subject, in which he noted that many attorneys were waiting to use TAR until it was approved by a federal court. This decision, he stated, would be that long-awaited approval. [9]

That same year, the RAND Corporation issued a groundbreaking research brief titled "The Cost of Producing Electronic Documents in Civil Lawsuits," which called out the inefficiency, inaccuracy, and expense of conducting ediscovery the old way (i.e., via manual review). The authors of that brief looked to the "nascent technology" of predictive coding as a possible solution, lauding its consistency and efficiency. [10]

The Da Silva Moore opinion and the surrounding industry buzz helped open the floodgates for attorneys and their clients to start learning about and using TAR 1.0 in more cases. As more cases used it, more court decisions followed that approved its use in a variety of scenarios.

TAR 2.0

As TAR 1.0 took hold, it became clear that it was useful only for specific types of datasets. Because it used simple machine learning, TAR 1.0 was limited to matters where all documents that needed to be reviewed were available at the outset. It was found to be less beneficial in cases where document collection is ongoing or "rolling," because the system must be retrained with each addition.
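As a rough illustration of the TAR 1.0 workflow described earlier, the sketch below estimates richness from a coded control set and picks a score cutoff that captures a target share of the relevant control documents. The function names, document IDs, scores, and the 75% recall target are all hypothetical and chosen only for illustration; real TAR platforms validate the cutoff with statistical sampling and confidence intervals, which this sketch omits.

```python
from math import ceil


def estimate_richness(control_labels):
    """Fraction of control-set documents the experts coded as relevant."""
    return sum(control_labels) / len(control_labels)


def cutoff_for_recall(control_scores, control_labels, target_recall):
    """Highest score cutoff such that documents scoring at or above it
    include at least `target_recall` of the relevant control documents."""
    relevant_scores = sorted(
        (s for s, y in zip(control_scores, control_labels) if y),
        reverse=True,
    )
    if not relevant_scores:
        raise ValueError("control set contains no relevant documents")
    needed = max(1, ceil(target_recall * len(relevant_scores)))
    return relevant_scores[needed - 1]


# Hypothetical control set: model scores and expert coding (1 = relevant).
control_scores = [0.95, 0.90, 0.70, 0.60, 0.40, 0.35, 0.20, 0.10]
control_labels = [1, 1, 0, 1, 0, 1, 0, 0]

richness = estimate_richness(control_labels)
cutoff = cutoff_for_recall(control_scores, control_labels, target_recall=0.75)

# Apply the cutoff to the (hypothetical) scored dataset: documents below
# the cutoff are set aside without further review.
dataset_scores = {"DOC-001": 0.88, "DOC-002": 0.61, "DOC-003": 0.45, "DOC-004": 0.05}
review_set = sorted(d for d, s in dataset_scores.items() if s >= cutoff)

print(f"richness={richness:.2f} cutoff={cutoff:.2f} review_set={review_set}")
```

Lowering the recall target raises the cutoff and shrinks the review set, while raising it does the opposite; that trade-off is what the validation step is meant to settle.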
TAR 1.0's requirement for highly trained subject matter experts to code the control and training sets also meant that its cost was justified only for large, static data volumes, because those experts' review work was more expensive than regular attorney review. [11]

The second iteration of TAR, TAR 2.0, uses the same supervised machine learning technology as TAR 1.0, but rather than the simple learning of TAR 1.0, TAR 2.0 uses active learning. With active learning, the system continuously learns from reviewer decisions. This means that TAR 2.0 does not require a one-time set of decisions by the subject matter expert team
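The idea of a system that learns continuously from reviewer decisions can be sketched as a simple loop. This is a toy illustration under stated assumptions, not how any particular TAR 2.0 product works: the term-weight model, the `ContinuousReviewModel` and `review_loop` names, the sample documents, and the choice to serve the highest-scoring unreviewed document next are all hypothetical.

```python
class ContinuousReviewModel:
    """Toy term-weight model updated after every reviewer decision."""

    def __init__(self):
        self.weights = {}

    def score(self, terms):
        # Sum of learned weights for the terms in a document.
        return sum(self.weights.get(t, 0.0) for t in terms)

    def learn(self, terms, is_relevant):
        # Nudge each term's weight up or down based on the decision.
        delta = 1.0 if is_relevant else -1.0
        for t in terms:
            self.weights[t] = self.weights.get(t, 0.0) + delta


def review_loop(documents, reviewer, seed_id, budget):
    """Review up to `budget` documents, updating the model after each
    decision and then picking the highest-scoring unreviewed document."""
    model = ContinuousReviewModel()
    unreviewed = dict(documents)
    decisions = {}
    next_id = seed_id
    for _ in range(budget):
        if not unreviewed:
            break
        terms = unreviewed.pop(next_id)
        label = reviewer(next_id)
        decisions[next_id] = label
        model.learn(terms, label)  # the model changes with every decision
        if unreviewed:
            next_id = max(unreviewed, key=lambda d: model.score(unreviewed[d]))
    return decisions


# Hypothetical documents, represented as sets of terms.
documents = {
    "A": {"merger", "pricing"},
    "B": {"lunch", "menu"},
    "C": {"merger", "board"},
    "D": {"pricing", "contract"},
    "E": {"menu", "parking"},
}
# Stand-in for a human reviewer's coding decisions.
truth = {"A": True, "B": False, "C": True, "D": True, "E": False}

decisions = review_loop(documents, reviewer=truth.get, seed_id="A", budget=3)
print(decisions)
```

Because the model is updated inside the loop rather than trained once up front, every coding decision influences which document is served next.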
