Cloudficient Blog | Cloudficient

Understanding Unindexed Items in eDiscovery

Written by Shelley Bougnague | Jul 10, 2023 10:09:40 AM

“What are unindexed items in eDiscovery?” This question often arises among business leaders and decision makers, especially those managing large volumes of data. Unindexed (or partially indexed) items can significantly reduce the efficiency and accuracy of content searches, with serious consequences if crucial data is missed.

What are Unindexed Items in eDiscovery? 

Unindexed items, also known as “partially indexed items”, are documents or emails in Exchange Online, SharePoint, or OneDrive that cannot be fully processed for indexing. According to Microsoft’s updated guidance, this typically occurs for the following reasons: 

  • Unsupported or unrecognized file types 
  • Encrypted or password-protected files (including non-Microsoft encryption technologies) 
  • Large file size or too many attachments 
  • File corruption or errors during the indexing process 

What are Unindexed Items? 

An unindexed item is essentially content in your organization’s information repository that the search index could not fully process. Such items may contain critical content, yet they appear as “unindexed” in content search results reports, which can cause confusion or inaccuracies during critical discovery stages.

Why do Certain Files Become Unindexed? 

There are many reasons why indexing can fail, and not all of them sit neatly in Microsoft’s bucket of explanations. Beyond encrypted or oversized files, failures can also stem from poor data hygiene practices, outdated or incompatible file formats still lingering in archives, or even temporary system issues during ingestion and crawl. Inconsistent metadata, corrupted headers, or custom applications that generate proprietary formats can also frustrate indexing engines. In other cases, environmental factors like throttling limits, storage latency, or misconfigured connectors contribute to items slipping through the cracks and ending up unindexed. 

Unindexed items are like a book disrupted not just by missing pages, but by printing errors, torn chapters, or even incompatible bindings. You might get the gist of the story, but critical details are broken, incomplete, or inaccessible. In discovery, the same holds: technical issues, bad data hygiene, or environmental glitches create blind spots. These gaps add complexity and require proactive management strategies to prevent small flaws from undermining the bigger investigative picture. 

Impact of Unindexed Items on Search Results 

Partially indexed items frequently appear in content search reports as “unindexed” because they were only partially processed. This creates gaps in eDiscovery investigations that need to be addressed. 

How do Unindexed Items Affect Search Accuracy? 

These items introduce uncertainty into your discovery process because they are not fully processed for indexing. They may: 

  • Fail to appear in search results 
  • Omit relevant content, lowering accuracy 
  • Limit accessibility for review teams 

Microsoft’s guidance indicates that while partially indexed items typically make up less than 1% of items by count, they may represent up to 12% of total data by size, because the affected files tend to be large. This means the risk lies more in the weight of the missed data than in its frequency.
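To see how the two figures can diverge, here is a minimal Python sketch that computes both ratios as a simple percentage of the total. The function name and the tenant numbers are hypothetical and chosen only for illustration; substitute the totals from your own search statistics.

```python
def partial_index_ratios(total_items, total_bytes, partial_items, partial_bytes):
    """Return (percent by item count, percent by size) of partially indexed content."""
    if total_items <= 0 or total_bytes <= 0:
        raise ValueError("totals must be positive")
    by_count = partial_items / total_items * 100
    by_size = partial_bytes / total_bytes * 100
    return by_count, by_size


# Hypothetical tenant (illustrative numbers, not measured data):
# 2,000,000 items totalling 1,000 GB, of which 15,000 items
# totalling 110 GB are partially indexed.
GB = 1024 ** 3
by_count, by_size = partial_index_ratios(2_000_000, 1_000 * GB, 15_000, 110 * GB)
print(f"{by_count:.2f}% of items, {by_size:.1f}% of data size")
```

In this made-up example, fewer than one item in a hundred is affected, yet those items account for roughly a ninth of the data, which is why size deserves as much attention as count.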

Potential consequences of missing crucial data 

Missing key data during discovery can have serious legal or compliance consequences. In extreme cases, organizations risk failing to meet legal obligations or losing cases due to missing evidence. Proactively managing partially indexed content ensures search defensibility and compliance readiness. 

Investigating Unindexed Items with PowerShell and Advanced Indexing

When it comes to eDiscovery, visibility into your data is critical. Microsoft provides tools to help:

1. PowerShell Reporting 

PowerShell remains a valuable tool for reporting on partially indexed items. Administrators can generate reports on item types, sizes, and counts to better understand the scope of the issue. 
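Once such a report has been exported to CSV, a short script can show where the problem is concentrated. The Python sketch below is one possible way to group items by error type; the column names (Location, FileType, SizeBytes, ErrorTag) and the sample rows are assumptions made for illustration and will need to be mapped to the columns in your actual export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per partially indexed item.
# Column names and values are illustrative, not a real report layout.
SAMPLE_REPORT = """\
Location,FileType,SizeBytes,ErrorTag
ceo@contoso.com,zip,52428800,attachmentsize
ceo@contoso.com,pdf,10485760,parsererror
monitor@contoso.com,log,2048,parserunknowntype
https://contoso.sharepoint.com/sites/legal,zip,73400320,attachmentsize
"""


def summarize(report_file):
    """Group partially indexed items by error tag: item count and total bytes."""
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for row in csv.DictReader(report_file):
        bucket = summary[row["ErrorTag"]]
        bucket["count"] += 1
        bucket["bytes"] += int(row["SizeBytes"])
    return dict(summary)


summary = summarize(io.StringIO(SAMPLE_REPORT))
# Largest contributors by size first.
for tag, stats in sorted(summary.items(), key=lambda kv: kv[1]["bytes"], reverse=True):
    print(f"{tag:20} {stats['count']:>3} items {stats['bytes'] / 1024**2:>8.1f} MB")
```

To run it against a real file, replace the `io.StringIO(SAMPLE_REPORT)` argument with an opened CSV file. Sorting by size rather than count reflects the point made earlier: a handful of large items can outweigh many small ones.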

2. Advanced Indexing in Microsoft Purview

A major development in 2023 was the introduction of Advanced Indexing in eDiscovery (Premium). When custodians or data sources are added to a case, Microsoft automatically attempts to re-index items that were previously partially indexed or flagged with errors. This significantly reduces blind spots in investigations.

Note: Unindexed items generated by monitoring or system reporting software usually have less impact on investigations than communication data from executives or key custodians. 

The Prevalence and Impact of Partial Indexing in Large Volumes 

The challenge is not just scale, but also content type. For example, encrypted messages (OME or sensitivity-labeled), oversized graphics, and zipped files are consistently difficult to index. 

Common Document Types Prone to Partial Indexing 
  • Encrypted or sensitivity-labeled items 
  • Graphic-heavy files (e.g., large PDFs, images) 
  • Compressed/zipped archives 

Understanding these patterns allows organizations to anticipate where partial indexing is most likely and develop mitigation strategies. 

Strategies for Minimizing Partial Indexing Occurrences 

By identifying root causes and leveraging Microsoft’s modern eDiscovery tools, organizations can reduce partially indexed items. 

Efficient Management Tactics 
  • Audit Your Data: Regularly check repositories for file types or sizes known to cause indexing failures. 
  • Improve Data Hygiene: Eliminate duplicates, archive non-essential data, and ensure files comply with Microsoft indexing limits. 
  • Leverage Advanced Indexing: Use Microsoft Purview’s re-indexing features in Premium eDiscovery cases to recover partially indexed items. 
  • Educate Staff: Provide training on storing and sharing files in formats that are more index-friendly. 

These strategies improve discovery efficiency and reduce the chances of critical data being missed during investigations.
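As a starting point for the audit step, the following Python sketch walks a folder (for example, a local export or file share) and flags files that match risk patterns discussed in this article. The size threshold and the extension list are assumptions for illustration only; check Microsoft’s current indexing limits and supported formats before relying on them.

```python
from pathlib import Path

# Assumed values, for illustration only. Confirm against Microsoft's
# published indexing limits and supported file formats.
MAX_BYTES = 150 * 1024 * 1024
RISKY_EXTENSIONS = {".zip", ".7z", ".rar"}


def flag_risky_files(root, max_bytes=MAX_BYTES, risky_extensions=RISKY_EXTENSIONS):
    """Yield (path, reason) for files that may end up only partially indexed."""
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        if path.suffix.lower() in risky_extensions:
            yield path, "compressed archive"
        elif path.stat().st_size > max_bytes:
            yield path, "exceeds size threshold"
```

Calling `flag_risky_files("path/to/folder")` and printing each result gives a simple worklist for cleanup or conversion. The check is deliberately crude: it cannot detect encryption or corruption, which would need format-specific inspection.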

Retirement of Legacy eDiscovery Tools 

It’s important to note that Microsoft is retiring legacy eDiscovery tools. Content Search (Legacy) and eDiscovery Standard (Legacy) have already been retired, and eDiscovery Premium (Legacy) is scheduled for retirement in August 2025. Organizations are strongly encouraged to transition to the modern Microsoft Purview eDiscovery (Unified) experience. 

Transition to Modern Purview eDiscovery 

With the retirement of legacy tools, Microsoft is steering organizations towards the modern Microsoft Purview eDiscovery (Unified) experience. This unified platform not only consolidates capabilities but also introduces: 

  • Enhanced search and review workflows that handle partially indexed items more effectively. 
  • Integration with Microsoft Graph APIs and PowerShell for automation and reporting. 
  • Advanced indexing and reprocessing features that improve the discoverability of challenging file types. 
  • Improved scalability and compliance controls that align with evolving regulatory requirements. 

How Cloudficient Products Can Help 

While Microsoft Purview provides the core platform for managing partially indexed items, organizations can further strengthen their governance and discovery posture with complementary solutions from Cloudficient. Expireon and CaseFusion work hand-in-hand to address both the upstream and downstream challenges of unindexed content. 

Expireon enforces defensible retention and deletion policies, helping to reduce the buildup of legacy or redundant data. By cleaning repositories and ensuring only relevant, properly managed information remains, it minimizes the likelihood of oversized, corrupted, or obsolete files that frequently cause indexing failures. At the same time, CaseFusion provides legal and compliance teams with an integrated case management solution that ensures any partially indexed items that do persist are captured, tracked, and incorporated into the eDiscovery process. This allows investigators to maintain visibility and reduce risks even when data cannot be fully indexed. 

Together, Expireon and CaseFusion create a continuous lifecycle approach: Expireon proactively improves data hygiene and archiving practices before discovery, while CaseFusion ensures downstream case-centric investigations remain comprehensive and defensible. By intertwining these two solutions, organizations enhance accuracy, reduce legal and compliance risks, and achieve far greater efficiency when managing unindexed items. 

Conclusion 

Unindexed and partially indexed items continue to present challenges in eDiscovery. However, Microsoft’s updated guidance and advanced indexing tools provide organizations with more effective ways to manage and mitigate the issue. At the same time, the message from Microsoft is clear: the legacy Purview eDiscovery tools are retired or will be soon. Content Search and eDiscovery Standard are already gone, and eDiscovery Premium (Legacy) will not be supported beyond August 2025. The old ways of working are no longer viable. 

Business leaders need to understand that although unindexed items are rare by volume, their potential impact is disproportionately large. Simply relying on outdated tools is a risk to compliance and litigation readiness. By adopting proactive management tactics, leveraging reporting and lifecycle management, and combining modern discovery capabilities with solutions like Expireon and CaseFusion, organizations can ensure comprehensive coverage across all data types and maintain defensibility, without overreliance on Microsoft’s ecosystem alone. 

Now is the time to act. Explore how solutions like Expireon and CaseFusion can complement your Purview migration, delivering stronger governance upstream and more efficient case management downstream. Don’t let unindexed items or retired tools put your organization at risk.