Novel Methods of Guideline Reconstruction in Forensic Mortgage Underwriting Due Diligence

Often overlooked, digital file receipt is an integral and increasingly sophisticated part of Mortgage Underwriting Due Diligence. It is the gateway through which every document must enter the platform, and it therefore has a tremendous impact on workflow. The formats in which files arrive vary, however, and ad hoc development for non-standard productions is common. One such production received by Oakleaf Group consisted of several thousand pages, a web-based interface, embedded documents, and hyperlink dependencies. The complexity and size of this production, compounded by metadata anomalies, resulted in multiple issues:


1) The metadata covered both the guideline and the individual program guides, shuffled together and receiving concurrent updates whose metadata shared the same columns.

2) The document updates were piecemeal and asynchronous. An update to credit policies on Monday, one to income and asset requirements on Wednesday, and another credit update on Friday was conceivable. To complicate matters further, there was no single index of update-document types, so updates to ostensibly the same update-document were not standardized.

3) Would-be embedded documents used entirely different metadata notation from the update-documents, rendering their organization ambiguous.

4) The updates included both Active and Draft versions, a distinction not denoted in the metadata.

Our Approach

The document needed to be reimagined not simply as a disaggregated set of updates, but as a living document, singular and dynamic, with sections swapped in and out as they were updated or became obsolete. This paradigm shift reoriented our development strategy towards compactness, linearity, and singularity.


The first step in realizing this new strategy required teasing out the two distinct documents (the guideline and program guides). By identifying substantively different structures in the naming conventions of the apparent bookmarks and checking the non-overlapping values on the rest of the fields, it became clear that the so-called ‘Replica Id’ separated the metadata into distinct major documents. With the two singular documents identified in the metadata, we then needed to construct an internal document index within each of the major documents. This involved a mix of splitting and combining numeric data indexes from apparent bookmark fields. This update-document index, verified by our subject matter experts, enabled our planned substitution scheme.
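As a rough illustration of this step, the sketch below groups metadata rows by a replica identifier and derives a sortable section index from the numeric parts of each bookmark. The field names (`replica_id`, `bookmark`, `updated`), the sample values, and the bookmark layout are all assumptions made for the example; the actual production's fields and the exact splitting and combining rules are not described in detail here.

```python
import re
from collections import defaultdict

# Hypothetical metadata rows; field names and values are illustrative only.
rows = [
    {"replica_id": "A1", "bookmark": "3.2 Credit Policy", "updated": "2006-03-06"},
    {"replica_id": "B7", "bookmark": "Program 12 - Sec 4 Income", "updated": "2006-03-08"},
    {"replica_id": "A1", "bookmark": "3.10 Assets", "updated": "2006-03-10"},
    {"replica_id": "A1", "bookmark": "3.2 Credit Policy", "updated": "2006-03-10"},
]


def section_index(bookmark):
    """Pull the numeric parts out of a bookmark and combine them into a tuple.

    Tuples of integers sort numerically, so (3, 2) comes before (3, 10),
    which a plain string sort of "3.2" and "3.10" would get wrong.
    """
    return tuple(int(n) for n in re.findall(r"\d+", bookmark))


def split_by_replica(rows):
    """Group rows by replica id and attach the constructed index to each row."""
    docs = defaultdict(list)
    for row in rows:
        docs[row["replica_id"]].append({**row, "doc_index": section_index(row["bookmark"])})
    return dict(docs)


docs = split_by_replica(rows)
```

Here `docs["A1"]` and `docs["B7"]` stand in for the two major documents, and repeated `doc_index` values within one document mark successive updates to the same section.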


Next, we resolved the problem of floating document attachments by identifying a grouping index, an apparent attachment identifier field. Using this field, we pushed the constructed document index, along with the bookmark and date data from the source update-documents, down to the apparent attachments.
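The propagation described above amounts to a keyed lookup from each attachment to its parent update-document. The following is a minimal sketch, assuming a hypothetical `group_id` field shared by both record types and assuming each group maps to exactly one update-document; attachments with no matching parent are set aside rather than guessed at.

```python
# Hypothetical records; "group_id" stands in for the apparent attachment identifier.
update_docs = [
    {"group_id": "G1", "doc_index": (3, 2), "bookmark": "3.2 Credit Policy", "updated": "2006-03-06"},
    {"group_id": "G2", "doc_index": (3, 10), "bookmark": "3.10 Assets", "updated": "2006-03-10"},
]
attachments = [
    {"group_id": "G1", "file": "rate_sheet.pdf"},
    {"group_id": "G2", "file": "asset_matrix.xls"},
    {"group_id": "G9", "file": "orphan.doc"},
]


def inherit_parent_fields(update_docs, attachments, fields=("doc_index", "bookmark", "updated")):
    """Copy index, bookmark, and date from each parent update-document to its attachments."""
    parents = {doc["group_id"]: doc for doc in update_docs}
    enriched, orphans = [], []
    for att in attachments:
        parent = parents.get(att["group_id"])
        if parent is None:
            # No parent found: keep the attachment aside for manual review.
            orphans.append(att)
            continue
        enriched.append({**att, **{f: parent[f] for f in fields}})
    return enriched, orphans


enriched, orphans = inherit_parent_fields(update_docs, attachments)
```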


Finally, we needed to distinguish draft versions from active versions of the documents, or else risk incorrectly replacing an active version with an archived or draft version. With no apparent metadata to support this distinction, we turned to the files themselves. After manually identifying examples of both kinds of documents, we used their extracted text as keys to mine the text and classify each update-document. We then merged this constructed active/draft index with the original metadata to subset and purge the draft update-documents, along with their respective embedded dependencies.

With the metadata sufficiently enriched, we needed to turn this one static metadata document into thousands: an instruction set for every day in the range of updates. After applying a text overlay to the compiled update-documents, we compiled the whole-document guidelines, incorporating or substituting update-documents as they became active. We further included the apparent document index and document classifiers as bookmarks for ease of navigation for the user.
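The active/draft classification and purge can be pictured with a simple key-phrase matcher. This is only a sketch: the key phrases, record fields (`doc_id`, `group_id`), and sample text below are hypothetical stand-ins for the keys taken from the manually identified examples, and the real text mining may well have been more involved.

```python
# Hypothetical key phrases taken from manually labeled example documents.
DRAFT_KEYS = ("draft", "not yet effective")
ACTIVE_KEYS = ("effective date",)


def classify(text):
    """Label extracted text as 'draft', 'active', or 'unknown' by key-phrase match."""
    lowered = text.lower()
    if any(key in lowered for key in DRAFT_KEYS):
        return "draft"  # draft markers take precedence over active markers
    if any(key in lowered for key in ACTIVE_KEYS):
        return "active"
    return "unknown"


def purge_drafts(update_docs, attachments, texts):
    """Keep active update-documents and their attachments; flag unknowns for review."""
    status = {doc["doc_id"]: classify(texts[doc["doc_id"]]) for doc in update_docs}
    active = [doc for doc in update_docs if status[doc["doc_id"]] == "active"]
    review = [doc for doc in update_docs if status[doc["doc_id"]] == "unknown"]
    active_groups = {doc["group_id"] for doc in active}
    kept_attachments = [att for att in attachments if att["group_id"] in active_groups]
    return active, kept_attachments, review


# Illustrative inputs only.
update_docs = [
    {"doc_id": "U1", "group_id": "G1"},
    {"doc_id": "U2", "group_id": "G2"},
    {"doc_id": "U3", "group_id": "G3"},
]
attachments = [
    {"group_id": "G1", "file": "rate_sheet.pdf"},
    {"group_id": "G2", "file": "draft_matrix.xls"},
]
texts = {
    "U1": "Credit Policy. Effective Date: 03/06/2006",
    "U2": "DRAFT - for internal review. Income requirements",
    "U3": "Scanned cover page",
}

active, kept_attachments, review = purge_drafts(update_docs, attachments, texts)
```

Routing unmatched documents to a review list, rather than treating them as active or draft by default, is a design choice of this sketch; it keeps an ambiguous file from silently replacing an active section.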


  • The results were true-to-form versions of the guideline and program guide for every day in the provided metadata date range, text-searchable and bookmark-indexed as intended.
  • The attached embedded documents lay in adjacent, bookmark-based folders, active according to their master update-document, essentially creating a longitudinal view of the guideline and program guides over time.
  • From this amalgam of metadata, we constructed an archival format of the guideline, its program-specific guideline components, and the subprime manual.
  • It took a paradigm shift in our approach, accompanied by several technology-enabled techniques, including text mining and data aggregation and analysis, but the resulting reconstructed documents enabled the Mortgage Underwriting Due Diligence team to provide superior work in an expedited time frame.
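The per-day reconstruction behind these results can be sketched as a lookup that, for each calendar day, selects the most recent active update-document for every section and lists them in index order. The record fields, file names, and dates below are invented for illustration, and the sketch stops at producing the ordered file list; the text overlay and the actual document assembly are outside its scope.

```python
from datetime import date, timedelta

# Hypothetical active update-documents after enrichment; values are illustrative.
updates = [
    {"doc_index": (3, 2), "updated": date(2006, 3, 6), "file": "credit_v1.pdf"},
    {"doc_index": (3, 10), "updated": date(2006, 3, 8), "file": "assets_v1.pdf"},
    {"doc_index": (3, 2), "updated": date(2006, 3, 10), "file": "credit_v2.pdf"},
]


def snapshot(updates, day):
    """Return the files that make up the whole document as of the given day."""
    latest = {}
    # Walk updates oldest to newest so later ones replace earlier ones per section.
    for update in sorted(updates, key=lambda u: u["updated"]):
        if update["updated"] <= day:
            latest[update["doc_index"]] = update
    return [latest[index]["file"] for index in sorted(latest)]


def daily_instructions(updates):
    """Build one ordered file list for every day in the range of update dates."""
    start = min(u["updated"] for u in updates)
    end = max(u["updated"] for u in updates)
    days = (start + timedelta(days=n) for n in range((end - start).days + 1))
    return {day: snapshot(updates, day) for day in days}


plan = daily_instructions(updates)
```

With these sample rows, `plan` holds five daily instruction sets; the entry for 10 March lists `credit_v2.pdf` in place of `credit_v1.pdf`, while `assets_v1.pdf` carries forward unchanged.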
