TAR + Advanced AI: The Future is Now


on a control set to train the system at the outset of the matter in the way that the TAR 1.0 process requires. Instead, regular attorney review teams can immediately start reviewing documents in a TAR 2.0 model. This also means that the review set does not need to remain static in TAR 2.0. The software continues to learn as attorneys review and code documents (no matter when those documents are added to the dataset) and continues to score those documents. In turn, this allows case teams to prioritize their workflow so that the most responsive documents are always reviewed next. [12]

TAR 1.0 vs. TAR 2.0

Both TAR 1.0 and TAR 2.0 are still widely used in ediscovery today, and both the technology and the processes remain relatively unchanged despite more than a decade of use. Each has its own advantages and disadvantages depending on the matter parameters. As previously noted, a TAR 2.0 model can be more beneficial than a TAR 1.0 model in matters where review needs to start immediately, or when attorneys want to introduce TAR to a review that is already underway. [13]

TAR 2.0 is also more flexible in implementation and can be used for a variety of use cases. For example, TAR 2.0's prioritization review workflow can be helpful for matters that have short, rolling production deadlines. In that workflow, the highest-scoring documents are continuously pushed to the front of the review as the technology learns from reviewers, creating a loop in which the most up-to-date model identifies which documents should be reviewed next. [14]

TAR 1.0 still has its advantages over TAR 2.0, especially its ability to stabilize a model while requiring fewer documents to be reviewed overall than a TAR 2.0 model. For example, Lighthouse has seen cases where as few as 5,000 documents were required to stabilize a TAR 1.0 model for a total population of 1 million documents.
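The TAR 2.0 prioritization loop described above (score the unreviewed documents, review the highest-scoring ones, retrain, repeat) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the smoothed token-weight scorer stands in for whatever supervised model a real TAR platform uses, and the names `train`, `score`, `prioritized_review`, the `reviewer` callback, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def train(reviewed):
    """Fit per-token log-odds weights from (text, is_responsive) pairs."""
    pos, neg = Counter(), Counter()
    for text, responsive in reviewed:
        (pos if responsive else neg).update(set(text.lower().split()))
    n_pos = sum(1 for _, responsive in reviewed if responsive)
    n_neg = len(reviewed) - n_pos
    # Add-one smoothing keeps weights finite for tokens seen in only one class.
    return {
        tok: math.log((pos[tok] + 1) / (n_pos + 2))
        - math.log((neg[tok] + 1) / (n_neg + 2))
        for tok in set(pos) | set(neg)
    }


def score(weights, text):
    """Sum the weights of a document's distinct tokens; unseen tokens add 0."""
    return sum(weights.get(tok, 0.0) for tok in set(text.lower().split()))


def prioritized_review(documents, reviewer, batch_size=2):
    """Review every document in batches, re-ranking after each batch.

    Returns the document ids in the order they were reviewed.
    """
    unreviewed = dict(documents)  # id -> text
    reviewed = []                 # (text, label) pairs coded so far
    order = []
    weights = {}                  # empty model: all scores are 0 at the start
    while unreviewed:
        # Highest score first; ties are broken by document id.
        ranked = sorted(
            unreviewed, key=lambda d: (-score(weights, unreviewed[d]), d)
        )
        for doc_id in ranked[:batch_size]:
            text = unreviewed.pop(doc_id)
            reviewed.append((text, reviewer(doc_id)))
            order.append(doc_id)
        weights = train(reviewed)  # the model learns from the new coding
    return order


# Toy data: the hidden labels simulate attorney coding decisions.
docs = {
    1: "merger pricing discussion with competitor",
    2: "lunch menu for friday",
    3: "competitor pricing strategy memo",
    4: "office party lunch plans",
    5: "merger timeline and pricing",
    6: "friday parking notice",
}
truth = {1: True, 2: False, 3: True, 4: False, 5: True, 6: False}

order = prioritized_review(docs, reviewer=lambda doc_id: truth[doc_id])
print(order)  # [1, 2, 3, 5, 4, 6]
```

After the first batch is coded, the two remaining responsive documents (3 and 5) move ahead of the non-responsive ones, which is the behavior the prioritization workflow relies on. No control set is built up front in this sketch; the model simply starts from whatever the reviewers code first.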
TAR 1.0 can also lead to a smaller overall review set in a shorter amount of time. TAR 1.0 is also still the primary workflow used in regulatory investigations, such as Hart-Scott-Rodino Act (HSR) Second Requests (the discovery process by which the federal government investigates potential mergers and acquisitions for anticompetitive behavior), due to several of its advantages over TAR 2.0. These investigatory discovery matters involve a very short discovery period, large data volumes, and rigorous production requirements that carry harsh penalties and risks if not met. This means that quickly culling data from a dataset, in a way that is acceptable to regulatory bodies like the Department of Justice (DOJ) or the Federal Trade Commission (FTC), is critical.

TAR 1.0 is particularly effective for this task because it makes culling data much easier to negotiate. With a TAR 1.0 workflow, case teams can negotiate a statistically validated cut-off score with the regulatory body at the outset of the investigation, below which they will not have to produce any documents. Then, once the model stabilizes, the case team can simply produce everything above the negotiated cut-off without further responsiveness review. This contrasts with a TAR 2.0 workflow, which involves neither a control set nor the standard recall and precision metrics that can validate, to a regulatory body, the decision to stop reviewing after a certain point. TAR 1.0's longer history also helps attorneys feel more comfortable with that workflow than with TAR 2.0, making its use more prevalent.

However, because the "machine learning" technology used in both TAR 1.0 and TAR 2.0 is older and not built for big data, both processes are quickly becoming limited in their applications with modern datasets.

The future of TAR

At Lighthouse, we believe that TAR is finally on the verge of a sea change.
As data volumes continue to explode and become more varied and complex, the supervised machine learning technology behind TAR is becoming inadequate to manage modern data. A decade ago, TAR was generally able to handle that era's data volumes and limited variety of data sources. Email was the standard form of communication, and data had much less volume and diversity. As a result, a TAR 1.0 or TAR 2.0 workflow relying solely on supervised machine learning could be effective enough to accurately classify the most common forms of data at the time, and thereby greatly reduce the amount of time spent on eyes-on review.

But today's datasets are vastly different from those of a decade ago. Today's employees use myriad applications to communicate and work (think: chat systems, smartphones, cloud-based collaboration tools that incorporate a dozen different
