Using AI to Extract, Read, and Summarize Documents
Oakleaf’s growing use of AI and large language models in its internal projects has produced tools that can help any organization organize, consolidate, and search large volumes of documents. These solutions reflect Oakleaf’s commitment to applying its AI capabilities to help clients meet their objectives in a time- and cost-effective way.
Oakleaf is leading the way in document analysis. Most models and AI-backed databases handle large amounts of data efficiently. Oakleaf sets itself apart from other mortgage analytics firms by applying AI to the documents themselves: organizing, analyzing, and interpreting them. Oakleaf has honed its skills in document retrieval, text extraction, and summarization. Users can quickly identify key provisions in documents and conduct comparative analysis without having to read and copy an entire document. Plus, we can do this at scale across large sets of lengthy documents.
Case in Point
A key project underway at Oakleaf is analyzing pooling and servicing agreements (PSAs) for residential mortgage-backed securities (RMBS) to help clients identify potential termination payment shortfalls, as described in prior commentaries (see Oakleaf’s Oct 27, 2023 commentary, “NY Court Follows CA Ruling Finding Termination Price Should Include Deferred Principal”).
Oakleaf retrieved almost 4,000 pre-2008 public RMBS PSAs from the EDGAR database, parsed each document, and extracted relevant sections and defined terms using GPT and other AI tools. Oakleaf asked GPT to read a particular section, identify the term we wanted to analyze (e.g., Termination Price), and find other definitions referenced in that term. Taking it a step further, we used GPT to map these definitions to a separate part of the PSA. All the documents are stored in a database that allows anyone in the company with assigned permissions to run similar queries for other provisions.
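To make the "find other definitions referenced in that term" step concrete, here is a minimal sketch in Python. It is not Oakleaf's pipeline: it swaps the GPT call for plain regular expressions and substring matching, and the sample text, the one-definition-per-line layout, and all function names are assumptions invented for illustration.

```python
import re

# Hypothetical excerpt of a PSA definitions section (invented for illustration).
DEFINITIONS_TEXT = '''
"Termination Price": The sum of (i) the Stated Principal Balance of each Mortgage Loan and (ii) accrued interest at the Mortgage Rate.
"Stated Principal Balance": The unpaid principal balance of a Mortgage Loan as of the Cut-off Date.
"Mortgage Rate": The annual rate of interest borne by a Mortgage Note.
"Cut-off Date": The date specified in the related agreement.
'''

# Assumed layout: each definition sits on one line in the form "Term": text
DEFINITION_PATTERN = re.compile(r'^"(?P<term>[^"]+)":\s*(?P<body>.+)$', re.MULTILINE)


def parse_definitions(text):
    """Return a dict mapping each defined term to its definition text."""
    return {
        m.group("term"): m.group("body").strip()
        for m in DEFINITION_PATTERN.finditer(text)
    }


def referenced_terms(term, definitions):
    """List other defined terms that appear verbatim in the definition of `term`."""
    body = definitions[term]
    return sorted(
        other for other in definitions if other != term and other in body
    )


def resolve(term, definitions, seen=None):
    """Collect every defined term reachable from `term`, following references transitively."""
    seen = set() if seen is None else seen
    for ref in referenced_terms(term, definitions):
        if ref not in seen:
            seen.add(ref)
            resolve(ref, definitions, seen)
    return seen


definitions = parse_definitions(DEFINITIONS_TEXT)
print(referenced_terms("Termination Price", definitions))
print(sorted(resolve("Termination Price", definitions)))
```

Verbatim substring matching is the weak point of this sketch: a term such as "Mortgage Rate" would also match inside a longer term such as "Net Mortgage Rate", and real PSAs rarely keep a definition on a single line. Handling that kind of variation is where a language model can plausibly help more than fixed patterns.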
While text documents are a typical unstructured data format where GPT can be effective, Oakleaf employed additional AI tools to significantly reduce the time and manual input often needed to parse a large number of unstructured text documents, each of which can be 200 to 300 pages long. These tools saved substantial time, cutting the time our developers spent manually debugging document-parsing code from 30 to 60 minutes per iteration to 10 to 20 seconds. The tools also reduced the cost of using GPT, saving thousands of dollars, since the total volume of PSA text was over 400 million tokens.
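The link between token volume and cost can be sketched with simple arithmetic. The example below assumes usage is billed per million tokens and that savings come from sending only a fraction of the text to the model. The source does not state how the costs were reduced, and the price and fraction are placeholder inputs, not actual rates or Oakleaf figures.

```python
def estimate_cost(tokens, price_per_million_tokens):
    """Cost of processing `tokens` at a given price per one million tokens."""
    return tokens / 1_000_000 * price_per_million_tokens


def savings_from_filtering(total_tokens, fraction_sent, price_per_million_tokens):
    """Cost avoided when only `fraction_sent` of the corpus is sent to the model."""
    full = estimate_cost(total_tokens, price_per_million_tokens)
    filtered = estimate_cost(int(total_tokens * fraction_sent), price_per_million_tokens)
    return full - filtered


# Placeholder inputs: the 400 million token volume is from the text above;
# the price and the fraction sent are hypothetical.
TOTAL_TOKENS = 400_000_000
HYPOTHETICAL_PRICE = 2.0      # assumed currency units per million tokens
HYPOTHETICAL_FRACTION = 0.25  # assumed share of text that reaches the model

print(estimate_cost(TOTAL_TOKENS, HYPOTHETICAL_PRICE))
print(savings_from_filtering(TOTAL_TOKENS, HYPOTHETICAL_FRACTION, HYPOTHETICAL_PRICE))
```

At this scale, even a small per-token price adds up quickly, which is why trimming the text before it reaches the model can matter.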
Oakleaf can employ this technology for other clients looking to retrieve, compare, and analyze information from external sources (e.g., EDGAR, PACER, regulators’ databases and websites) or from internal databases in a timely and cost-efficient manner, and to build a database to house those documents for future analyses and reporting. Organizations can query a specific provision and compare it across all documents in a time- and cost-effective way.
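As an illustration of querying one provision across many documents, the sketch below stores extracted definitions in an in-memory SQLite table and compares a single term across deals. The schema, deal names, and definition text are hypothetical, and the source does not say which database Oakleaf uses.

```python
import sqlite3

# Hypothetical extracted rows: (deal, defined term, definition text).
SAMPLE_ROWS = [
    ("Deal A", "Termination Price", "Unpaid principal balance plus accrued interest."),
    ("Deal B", "Termination Price", "Stated principal balance, including deferred principal, plus accrued interest."),
    ("Deal B", "Cut-off Date", "The date specified in the related agreement."),
]


def build_store(rows):
    """Create an in-memory table of definitions keyed by deal and term."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE definitions ("
        "deal TEXT, term TEXT, body TEXT, PRIMARY KEY (deal, term))"
    )
    conn.executemany("INSERT INTO definitions VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def compare_provision(conn, term):
    """Return (deal, definition) pairs for one term across all stored deals."""
    cur = conn.execute(
        "SELECT deal, body FROM definitions WHERE term = ? ORDER BY deal",
        (term,),
    )
    return cur.fetchall()


def deals_mentioning(conn, term, phrase):
    """Return deals whose definition of `term` contains `phrase`."""
    cur = conn.execute(
        "SELECT deal FROM definitions WHERE term = ? AND body LIKE ? ORDER BY deal",
        (term, f"%{phrase}%"),
    )
    return [row[0] for row in cur.fetchall()]


conn = build_store(SAMPLE_ROWS)
for deal, body in compare_provision(conn, "Termination Price"):
    print(deal, "->", body)
print(deals_mentioning(conn, "Termination Price", "deferred principal"))
```

Keying rows by deal and term keeps each comparison to a single query, and the same table could serve other provisions without changes to the schema.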
Connect with us to explore how Oakleaf’s AI-driven solutions can transform your data management. Contact Suzanne Mistretta at [email protected] or Skylar Deutsch at [email protected].