Enterprise search as a systems problem: How we're building the discovery layer at Iron Mountain

Stop blaming the data. Iron Mountain's discovery layer uses parallel RAG and SQL retrieval with intent-based reranking to deliver auditable, traceable enterprise search answers.

Aditya Sarwade
Associate Machine Learning Engineer
12 May 2026 · 7 mins

The failure mode everyone recognises

A compliance officer needs to verify whether a specific customer has a documented exception to the organisation's standard seven-year data retention policy. The exception, if it exists, could be in a contract amendment, a signed addendum stored as a scanned PDF, or a record in the compliance management system. She searches the internal portal. The top results are a three-year-old version of the general retention policy, a training slide deck, and a video walkthrough of the policy framework - none of which contain what she needs. She runs the same query in the document management system, then in the compliance tool. Nothing surfaces the right record. She eventually messages a colleague who remembers seeing the amendment during onboarding.

The organisation had the answer. The discovery layer could not find it.

The instinct in this situation is to blame the data - index more content, clean up the taxonomy, add more metadata. In most organisations we have worked with, data quality is not the root problem. What is missing is a discovery layer capable of reaching across different content formats and systems to return the right answer from whichever of them holds it. Iron Mountain addresses this challenge by building a definitive discovery layer that uses parallel retrieval signals to return the right, auditable answer regardless of where the data is stored.

Why a single retrieval strategy is insufficient

Most enterprise search systems pick one retrieval method and apply it to all queries. The problem is that different question types require structurally different retrieval paths, and committing to a single method means returning poor results for a significant portion of queries.

Consider three cases:

"Show me all employees earning more than $50,000 per month" has a precise, deterministic answer in a structured table. Keyword and semantic retrieval are both wrong here - they surface documents that discuss compensation rather than contain the actual records. A SQL query against the relevant table is the correct retrieval path for this question.

"Summarise the compliance landscape for this account" is structurally different. There is no single database field holding a compliance summary. The right approach is retrieval-augmented generation: identify the most relevant document chunks using vector similarity over embeddings, inject those chunks into the LLM context window, and generate a response grounded in the retrieved content.
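
The retrieve-then-inject step can be sketched roughly as follows. This is a minimal illustration, not the InSight DXP implementation: the `Chunk` type, the `build_prompt` function, the prompt wording, and the sample chunks and scores are all hypothetical, and the similarity scores are assumed to have been computed already.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source_id: str  # hypothetical identifier of the originating document
    text: str
    score: float    # similarity to the query, assumed precomputed


def build_prompt(question: str, chunks: list[Chunk], top_k: int = 2) -> str:
    """Keep the top_k highest-scoring chunks and inject them into the prompt."""
    selected = sorted(chunks, key=lambda c: c.score, reverse=True)[:top_k]
    context = "\n".join(f"[{c.source_id}] {c.text}" for c in selected)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Made-up chunks for illustration only.
chunks = [
    Chunk("policy-2021", "Standard retention period is seven years.", 0.62),
    Chunk("amendment-14", "Customer is granted a ten-year retention exception.", 0.91),
    Chunk("training-deck", "Overview of the policy framework.", 0.30),
]
prompt = build_prompt("Summarise the compliance landscape for this account", chunks)
print(prompt)
```

The low-scoring chunk never reaches the model, and each injected chunk keeps its source identifier so the generated answer can point back to it.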

"What are the standard payment terms for retail vendors, and have there been any exceptions?" - the first part has a structured answer in a database record, the second requires reading across contracts, amendments, and correspondence. Neither SQL alone nor RAG alone handles this adequately. Both retrieval paths need to run and their outputs need to be ranked coherently against each other.

That third case represents a substantial portion of real enterprise queries. A system designed around a single retrieval method has no good answer to it. This is what motivated the core architectural decision: run both retrieval paths simultaneously on every query at enterprise scale and resolve them at reranking.

Dig a little deeper: parallel retrieval and reranking

When a query is submitted in Iron Mountain InSight® DXP, SQL retrieval and RAG-based retrieval execute simultaneously across the database stack. Both return candidate result sets. An intent scoring engine evaluates the query in real time - classifying whether it points at a structured record, requests synthesis, or requires both.
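
As a rough sketch of what simultaneous execution can look like, the example below submits two retrievers to a thread pool and waits for both. The functions `sql_retrieve`, `rag_retrieve`, and `retrieve_all` are hypothetical placeholders that return made-up candidates; the real retrievers and their result shapes are not described in this article.

```python
from concurrent.futures import ThreadPoolExecutor


def sql_retrieve(query: str) -> list[dict]:
    # Placeholder for the structured path; returns a made-up candidate.
    return [{"id": "row-7", "source": "sql", "score": 0.8}]


def rag_retrieve(query: str) -> list[dict]:
    # Placeholder for the vector path; returns a made-up candidate.
    return [{"id": "chunk-3", "source": "rag", "score": 0.7}]


def retrieve_all(query: str) -> dict[str, list[dict]]:
    """Submit both retrievers at once and wait for both result sets."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        sql_future = pool.submit(sql_retrieve, query)
        rag_future = pool.submit(rag_retrieve, query)
        return {"sql": sql_future.result(), "rag": rag_future.result()}


candidates = retrieve_all("standard payment terms for retail vendors")
print(candidates)
```

Neither call waits on the other, so when both retrievers are I/O-bound the wall-clock time is roughly that of the slower one.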

The ranking logic:

The architecture is designed for robustness. The system collects the raw result sets from each retrieval path (RAG, SQL, etc.). The intent engine then performs two critical steps:

  1. Deduplication: Results surfaced by more than one engine are merged to produce a clean final list.
  2. Intent-based reranking with additive bias: The final result set is reranked using the generated intent scores. Assets originating from the retrieval path the intent engine favours are boosted. Crucially, top results from the non-favoured path are also kept. Because the bias is additive rather than a filter, it improves ranking without the risk of losing relevant content.
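
The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the production ranking logic: the candidate fields, the `weight` parameter, the intent scores, and the premise that SQL and RAG scores sit on a comparable scale are all hypothetical.

```python
def dedupe(candidates: list[dict]) -> list[dict]:
    """Merge candidates sharing an id, keeping the best score and all sources."""
    merged: dict[str, dict] = {}
    for c in candidates:
        seen = merged.get(c["id"])
        if seen is None:
            merged[c["id"]] = {"id": c["id"], "score": c["score"], "sources": {c["source"]}}
        else:
            seen["score"] = max(seen["score"], c["score"])
            seen["sources"].add(c["source"])
    return list(merged.values())


def rerank(candidates: list[dict], intent: dict[str, float], weight: float = 0.3) -> list[dict]:
    """Add a bias scaled by the intent score of each candidate's best source.

    The bias is additive, so no candidate is dropped: results from the
    non-favoured path are only ranked lower.
    """
    def final_score(c: dict) -> float:
        boost = max(intent.get(s, 0.0) for s in c["sources"])
        return c["score"] + weight * boost

    return sorted(dedupe(candidates), key=final_score, reverse=True)


raw = [
    {"id": "row-7", "source": "sql", "score": 0.70},
    {"id": "doc-2", "source": "rag", "score": 0.75},
    {"id": "doc-9", "source": "rag", "score": 0.40},
    {"id": "doc-9", "source": "sql", "score": 0.35},  # surfaced by both paths
]
intent = {"sql": 0.9, "rag": 0.1}  # hypothetical intent-engine output
ranked = rerank(raw, intent)
print([c["id"] for c in ranked])
```

With a SQL-leaning intent, `row-7` overtakes `doc-2` even though its raw score is lower, while `doc-2` and `doc-9` remain in the list. Swapping the intent scores puts `doc-2` back on top without removing anything.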

A simpler design would be a query router: classify the query first, pick one retrieval path, execute it. The problem with a router is that it requires the classifier to be correct before any results are seen. When the classifier is wrong (and on genuinely hybrid queries, no classifier is always right), the correct answer is simply not retrieved.

Running both paths simultaneously changes the failure mode. Parallel retrieval costs more compute, but because the paths execute concurrently, end-to-end latency is bounded by the slower path rather than by the sum of the two. We consider this a worthwhile performance-cost tradeoff, since it significantly improves search accuracy for users. A misclassification at reranking costs the best result its top position; it does not cost the user the answer. The correct result is still in the candidate set, still returned, just ranked slightly lower than it should be.

The intent score does not gate retrieval - both engines always run - but determines which results lead the final ranked output.

The classifier operates on query semantics rather than keyword patterns, and was developed iteratively against real query logs from production usage. "What does person X earn?" and "What can you tell me about person X's compensation history?" look structurally similar. The first is answered correctly by a SQL row lookup. The second benefits from RAG over document content - performance reviews, project records, correspondence. A classifier trained on surface features routes both queries identically; evaluating semantic intent is what correctly distinguishes them.

Retrieval signals driving our search today

Three retrieval signals are currently empowering our users to search across their content in InSight DXP.

Keyword matching with proximity weighting

This signal uses vector similarity for word matching and synonyms. The system does not require exact lexical matches to return a result; instead, it identifies conceptually similar terms and uses proximity analysis to favour documents in which those terms appear close together.
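
One way to picture proximity weighting is the sketch below. It is an assumption-laden illustration: the `RELATED` table is a hand-written stand-in for vector-based term matching, and the scoring formula (the inverse of the smallest window covering one match per query term) is hypothetical rather than the formula InSight DXP uses.

```python
from itertools import product

# Hypothetical stand-in for vector-similarity term expansion.
RELATED = {
    "vendor": {"vendor", "supplier"},
    "agreement": {"agreement", "contract"},
}


def proximity_score(query_terms: list[str], text: str) -> float:
    """Score is higher when matches for every query term sit close together."""
    if not query_terms:
        return 0.0
    tokens = [tok.strip(".,;:") for tok in text.lower().split()]
    positions = []
    for term in query_terms:
        accepted = RELATED.get(term, {term})
        hits = [i for i, tok in enumerate(tokens) if tok in accepted]
        if not hits:
            return 0.0  # a query term has no match at all
        positions.append(hits)
    # Smallest window containing one match for each term (brute force).
    span = min(max(combo) - min(combo) for combo in product(*positions))
    return 1.0 / (1.0 + span)


near = proximity_score(["vendor", "agreement"], "The supplier contract was renewed.")
far = proximity_score(
    ["vendor", "agreement"],
    "The supplier was late. Months later a contract followed.",
)
print(near, far)
```

Both texts match without containing the literal query words, but the one where the related terms are adjacent scores higher. The brute-force window search is fine for a sketch and would need a linear-time sweep for long documents.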

Semantic retrieval via vector embeddings

Document chunks are embedded at index time. At query time, nearest-neighbour search identifies chunks whose semantic content aligns with query intent, independent of exact term overlap. "Vendor agreement" and "supplier contract" resolve to the same results not through synonym mapping but because their vector representations are geometrically close in the embedding space.
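
The geometric intuition can be shown with a brute-force nearest-neighbour search over toy vectors. The three-dimensional embeddings below are invented for illustration (real embedding models produce far higher-dimensional vectors), and `index`, `nearest`, and `vendor_agreement` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Made-up embeddings, keyed by the chunk they represent.
index = {
    "supplier contract": [0.88, 0.12, 0.05],
    "holiday schedule": [0.05, 0.10, 0.95],
    "invoice template": [0.40, 0.85, 0.10],
}


def nearest(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the k chunks whose vectors are most similar to the query."""
    ranked = sorted(index, key=lambda name: cosine(query_vec, index[name]), reverse=True)
    return ranked[:k]


vendor_agreement = [0.90, 0.10, 0.05]  # assumed embedding of the query
print(nearest(vendor_agreement))
```

No synonym table is involved: "supplier contract" is returned for a "vendor agreement" query purely because the two vectors point in nearly the same direction.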

Structured SQL retrieval

For queries directed at relational records - employee data, financial records, compliance logs - the SQL path applies structured filtering: ordering by last modified date for recency, and pre-filtering by document type, asset subtype, or timeframe. These are operations that keyword and vector retrieval handle poorly or cannot perform at all.
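
A small sketch of structured filtering, using an in-memory SQLite database. The `assets` table, its columns, and the sample rows are hypothetical; the actual schema behind InSight DXP is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE assets (id TEXT, doc_type TEXT, last_modified TEXT)")
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?)",
    [
        ("a1", "contract", "2023-04-01"),
        ("a2", "contract", "2025-09-15"),
        ("a3", "invoice", "2025-11-02"),
        ("a4", "contract", "2021-01-20"),
    ],
)


def recent_assets(doc_type: str, since: str, limit: int = 10) -> list[str]:
    """Pre-filter by type and timeframe, then order by recency."""
    rows = conn.execute(
        "SELECT id FROM assets "
        "WHERE doc_type = ? AND last_modified >= ? "
        "ORDER BY last_modified DESC LIMIT ?",
        (doc_type, since, limit),
    ).fetchall()
    return [row[0] for row in rows]


print(recent_assets("contract", "2022-01-01"))
```

Dates are stored as ISO 8601 strings so that text comparison and ordering match chronological order. Exact filters and orderings like these are deterministic, which similarity-based ranking cannot guarantee.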

Two additional properties are worth noting. First, the system supports conversational follow-up with source attribution. Prior query context and retrieved chunks are carried into subsequent retrieval and LLM prompting as session context, so follow-up queries are interpreted against what came before rather than as isolated lookups. Responses carry reference tags linking back to source content so each answer is traceable to its origin - which is what makes results auditable rather than merely plausible.

Second, retrieval quality improves as the content corpus grows. As more documents are ingested, the embedding index expands with additional chunks, increasing the probability that a semantically close chunk exists for any given query. SQL coverage deepens as more structured records are added.
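
The conversational follow-up and source attribution described above might be modelled along these lines. Everything in this sketch is hypothetical: the `Session` class, the `tag_sources` function, the bracketed tag format, and the sample sentences are illustrations, not the product's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """Carries prior queries and retrieved chunks into the next turn."""
    history: list[str] = field(default_factory=list)
    chunks: list[dict] = field(default_factory=list)

    def contextual_query(self, query: str) -> str:
        # Prepend earlier queries so a follow-up is not read in isolation.
        return " | ".join(self.history + [query])

    def record(self, query: str, retrieved: list[dict]) -> None:
        self.history.append(query)
        self.chunks.extend(retrieved)


def tag_sources(sentences: list[tuple[str, str]]) -> tuple[str, dict[int, str]]:
    """Attach a numbered reference tag to each sentence and return the tag map."""
    refs: dict[str, int] = {}
    parts = []
    for text, source_id in sentences:
        tag = refs.setdefault(source_id, len(refs) + 1)  # reuse tag for repeat sources
        parts.append(f"{text} [{tag}]")
    return " ".join(parts), {number: source for source, number in refs.items()}


session = Session()
session.record("retention policy for Acme", [{"id": "amendment-14"}])
follow_up = session.contextual_query("any exceptions?")

answer, sources = tag_sources([
    ("Standard retention is seven years.", "policy-2021"),
    ("An exception extends it to ten years.", "amendment-14"),
    ("The policy applies to all customers.", "policy-2021"),
])
print(follow_up)
print(answer, sources)
```

Each sentence in the answer resolves to a source through the tag map, which is the property that lets a reader audit a response instead of taking it on trust.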

From architecture to answers

The compliance officer searching for a customer's data retention exception should get the right answer on the first attempt - whether it lives in a scanned contract amendment, a compliance database record, or a signed addendum - without needing to know which system holds it or which retrieval path found it. This is the core discovery problem Iron Mountain is helping to solve today: retrieving the right, auditable answer regardless of its format or location.


Collaborators: Jimmy Jose, Abhishith Muchipalli, Zhi Chen, Akshat Sharma