Lex Machina’s Karl Harris On Success With AI [Sponsored]


Given the recent surge of fascination with AI, it can be tempting for enthusiasts to get carried away speculating about what AI might one day accomplish, rather than focusing on its current capabilities and their more immediate implications.

Karl Harris, CEO of Legal Analytics giant Lex Machina, takes an optimistic view of AI’s potential while maintaining a degree of pragmatism. As the leader of a legal tech company that has always strived for groundbreaking ways to bring analytics to all areas of the law, Harris knows that all technology comes with benefits and risks, but that with considered application it can also be harnessed to achieve creative new solutions to longstanding problems.

Rather than getting tangled in the current hype that overestimates AI’s capabilities, or becoming susceptible to the inevitable disillusionment that follows, Harris believes it’s crucial to rise above the noise and invest in the point at which public expectations level off and intersect with technological reality. “At Lex Machina, that’s what we aim to do — invest in the point at which what people think AI can do actually matches what it can do.”

Above the Law recently caught up with Harris to get his take on the future of AI in legal tech, as well as his unique analysis of which factors prime an organization for greater success in the increasingly competitive AI space.

Using Machine Learning (Before It Was Cool)

Artificial intelligence — or machine learning — has been at the core of Lex Machina’s offerings since the company expanded from its roots at Stanford University’s law school.

The company pioneered Legal Analytics, utilizing extractive machine learning to process data from federal district courts, appellate courts, bankruptcy courts, and other legal data sources. Attorneys use Lex Machina’s analysis of that data to create better arguments, more effectively advise their clients, craft competitive pitches for new business, and gain an edge in the courtroom and the boardroom.

Most recently, Lex Machina has engaged in a mission to process the mountains of data at the state court level, successfully launching Legal Analytics for an expanding number of state courts in major population centers including Los Angeles, Houston, Atlanta, and New York City.

To clean and enhance all of that non-uniform data and turn it into something useful for legal professionals, the company consistently evolves and refines its utilization of large language models (LLMs) and machine learning.

“We’ve been achieving really amazing things with advanced technology, like our large language models, for a very long time,” Harris says. “Until recently, our customers primarily cared about the value we brought to them, not all the work we were doing behind the scenes to achieve that value. It’s only recently that we’ve started fielding questions about whether we use AI. It’s kind of fun to be able to reply, ‘Of course we are — we’ve been using it all along.’”

AI: What Is It Good For?

“At Lex Machina, the internal discussion is really all about the big picture: what’s going on with generative AI, what are its capabilities, and what is the difference between what it can actually do and what people think it can do,” Harris says.

As Harris likes to clarify, programs like ChatGPT and Google Bard are really large-scale generative AI programs trained on the entire internet to do one thing: guess the most likely next word in a sentence based on their training set.

However, Harris notes, when the training set is a diverse range of internet text, the sample runs the gamut from excellent persuasive writing to wildly incorrect facts. If not carefully monitored, a generative AI program trained on that mix can end up stating those incorrect facts in a persuasive, well-written manner.
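The next-word guessing Harris describes can be illustrated with a toy sketch. This is not Lex Machina’s technology, just a minimal bigram model over an invented mini-corpus: it counts which word follows each word, then always guesses the most frequent successor — confidently, whether or not the frequent answer happens to be right.

```python
from collections import Counter, defaultdict

# Hypothetical mini-corpus for illustration only. Real LLMs learn far
# richer statistics over vastly more text, but the core training
# objective -- predict the likely next word -- is the same.
corpus = (
    "the court granted the motion . "
    "the court denied the motion . "
    "the court granted the request ."
).split()

# Count, for each word, which words followed it and how often.
successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def guess_next(word):
    """Return the most frequent next word seen after `word`, or None."""
    counts = successors[word]
    return counts.most_common(1)[0][0] if counts else None

print(guess_next("court"))  # "granted" outnumbers "denied" in this corpus
```

The sketch also hints at Harris’s warning: the model outputs whatever was most common in its training data, with no notion of whether that answer is factually correct.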

This means if a company wants to use a large language model to help reliably solve its customers’ problems, it will need to ensure two things, Harris notes.

“First, make sure the model is trained on the most accurate, relevant dataset possible to ensure the output will provide the maximum value to the customer,” he says.

“Second, in addition to ensuring the training dataset is of the highest quality, also make sure the model’s collective output is accurate and true. At Lex Machina, this includes keeping our legally trained analysts in the loop for any necessary human review.”

“There’s another way to think about this using two terms of art,” Harris explains. “One is specialization, as in training a specific model specialized in a particular topic. For Lex Machina, this is when we train our model on caselaw, legal writing, and legal texts. For a medical model, it might be trained on medical journals. For ChatGPT, it’s specialized to feel like a human, so it’s trained on the vast amount of writing poured onto the internet by humans.”

“Then there’s augmentation, which is when you interact with the existing model and augment it with another data source to verify whether it’s factually correct,” explains Harris. “As part of LexisNexis, we see this demonstrated effectively through Lexis+ AI. For example, if Lexis+ AI is asked to cite the controlling law in a case, the augmentation stage is when the output — the cited law — is run through its proprietary search technology to determine, is this a real case? Is it good law? Is it relevant? ChatGPT skips this crucial second step.”
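The augmentation step Harris describes — running a model’s cited authority through a trusted source before it reaches the user — can be sketched as follows. The case names, the lookup table, and the `verify_citation` function are all invented for illustration; this is not Lexis+ AI’s actual pipeline.

```python
# Hypothetical database of verified authorities, standing in for a real
# legal search service. Entries record whether a case exists and whether
# it remains good law.
KNOWN_GOOD_LAW = {
    "Example v. Sample": {"real": True, "good_law": True},
    "Overruled v. Case": {"real": True, "good_law": False},
}

def verify_citation(case_name):
    """Augmentation: is this a real case, and is it still good law?"""
    record = KNOWN_GOOD_LAW.get(case_name)
    if record is None:
        # A citation the database has never seen is likely a hallucination.
        return "rejected: no such case found"
    if not record["good_law"]:
        return "flagged: no longer good law"
    return "verified"

print(verify_citation("Example v. Sample"))
print(verify_citation("Made-Up v. Citation"))
```

The point of the design is that the generative model’s output is never trusted on its own: every citation passes through an independent check, which is the “crucial second step” Harris says ChatGPT skips.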

According to Harris, legal writing tends to be well suited for training LLMs, thanks to its standardized nature and the rules of civil procedure that act as guardrails for the content and format of different documents. Put another way, this means that a legal-specific context makes it easier for the model to guess the correct next word in a sentence because there are limits imposed on what that word might be.

Lex Machina has taken advantage of this phenomenon to make its model significantly more specialized and helpful in a legal context than, for example, ChatGPT, which is trained on a huge range of data from across the internet.

“Our large language model is smaller than ChatGPT, of course, and therefore can perform a smaller variety of tasks,” Harris observes, “but the actual task we ask it to do, it does remarkably well.”

Now, as Lex Machina explores the possibilities of generative AI, Harris observes that the use cases with the most promise for improving legal research and writing workflows fall into two categories: efficiency-maximizers and possibility-expanders.

“I think of generative AI as providing two major advantages, when applied correctly,” he says.

“First, it can enable you to accomplish what you normally would, but faster and more efficiently. For example, let’s say you’re writing a motion for summary judgment in the Southern District of California. You could ask the AI to provide a template for a summary judgment motion in the Southern District of California that has all of the boilerplate language every such motion is required to have. Once you finish the brief, you end up with the same output, but you get there faster.”

The attorney working on this example could then focus on leveraging their own knowledge and creativity to draft the best possible argument in favor of their motion rather than spending time manually filling in the basics, Harris suggests.

“The second potential advantage of generative AI is that it can enable you to accomplish tasks you otherwise couldn’t because they’d be too time consuming or logistically impossible. For example, once you’ve written your motion for summary judgment, you could ask the AI to write a reply that sounds like one written by your opposing counsel, in order to anticipate the next moves in the litigation. Without generative AI, this task would be impossible, or at least highly impractical or excessively expensive.”

Which AI Companies Will Succeed?

There are two types of technology providers Harris believes will be successful in this AI-prevalent world: “Those who benefit from the dramatically increased demand for the cloud and computing infrastructure that powers AI, such as the Nvidias of the world, and those who possess unique content that nobody else has in order to do things with these large language models that nobody else can do.”

Lex Machina is well positioned to compete on that second front. Like other companies and products in the LexisNexis portfolio, it possesses valuable legal content and analytics that others simply don’t have, and it harnesses that unique Legal Analytics content to create and train the types of LLMs that other companies can’t produce.

“To be truly successful in leveraging AI, it’s not enough for a company to simply know how to use large language models and train them to do things,” Harris says.

“A lot of companies are capable of this. What sets apart certain companies is if they have a unique angle or extraordinary content that no one else has, and they use that to train their large language models and augment the output of those models. This is the situation with Lex Machina and LexisNexis — we have the unmatched and verified content necessary to build and deploy our AI in ways that others cannot.”

Harris adds that “in addition to the technology providers themselves, there is a third category of companies that will succeed in a new AI-dominant world — those companies who continue to focus on their existing core competencies, but now can do it better because they leverage generative AI.”

Harris notes that this subset of companies will include law firms: “I firmly believe that although generative AI won’t replace lawyers, lawyers who use generative AI will eventually replace those who don’t.”
