Why Identification is the Nexus of Defensibility in eDiscovery

Written by Brandon D'Agostino | Feb 6, 2026 12:00:00 PM

Spoliation tends to get lumped in with failed preservation efforts. While that framing is technically accurate, the real culprit is usually improper or incomplete identification. Here are a few examples we have seen frequently in recent matters:

  • Mobile device settings are left at their defaults, so ephemeral messaging data is deleted every 30 days
  • Chat data is not adequately preserved – again, settings are not changed to prevent automatic deletion, or the platform's licensing level does not allow proper eDiscovery operations to be performed
  • Additional custodians are not identified in a timely manner (or at all), so their data is not preserved
  • Custodian self-collection is not adequately supervised

In each of these scenarios, more rigorous or robust identification would yield more defensible results. While we as practitioners cannot force clients to comply, better identification would lead to better outcomes in most cases.

I have also observed that up-front identification is often a rushed process with the least supporting information (compared to downstream phases).

But this is where risk is introduced. If identification is flawed or incomplete:

  • Preservation gaps may look like tooling issues
  • Over-collection looks like prudence
  • Review inefficiency feels unavoidable

The Static Custodian Myth

Traditional identification relies on a deeply flawed assumption:

"A custodian is a person, frozen in time."

In modern organizations, this is simply no longer true.

People do the following:

  • Change roles
  • Participate on cross-functional teams
  • Contribute to shared repositories they do not own
  • Rely on documents stored elsewhere to make decisions

When custodians are identified based on current titles or current org charts, the historical reality that really matters is lost.

Consider Jane, a product manager who works remotely from her home office in New Jersey. Jim, a product marketing manager, asks Jane to collaborate on product positioning briefs. Jane and Jim begin sharing documents back and forth on Teams via hyperlinks. Those documents are stored in each of their OneDrive repositories, on the Marketing SharePoint, and in the Marketing team’s Notion dashboard.

If we were asked to use M365 Purview to preserve data for all custodians who worked on the new messaging project, we might miss documents stored in Jane’s OneDrive that are referenced in the Teams messages and emails of our custodians (unless Jane is explicitly added as a custodian).

What is missing in this scenario is the context that Jane worked cross-functionally on an ad hoc team for the project in question.
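The gap in the Jane scenario can be sketched as a simple cross-check: take the links that custodians shared in their messages, look up who owns each linked document, and flag any owner who is not on the custodian list. This is a minimal illustration in Python, not a Purview feature. The `Message` record, the `owner_by_url` lookup, and the example URLs are all hypothetical stand-ins for whatever export your tooling actually produces.

```python
import re
from dataclasses import dataclass


# Hypothetical record for an exported chat or email message.
@dataclass(frozen=True)
class Message:
    sender: str
    body: str


URL_PATTERN = re.compile(r"https?://\S+")


def find_unlisted_owners(messages, owner_by_url, custodians):
    """Map each non-custodian document owner to the links that surfaced them.

    owner_by_url is an assumed lookup (URL -> owner) built from repository
    metadata. Only messages sent by listed custodians are scanned.
    """
    gaps = {}
    for msg in messages:
        if msg.sender not in custodians:
            continue
        for url in URL_PATTERN.findall(msg.body):
            url = url.rstrip(".,;)")  # drop trailing sentence punctuation
            owner = owner_by_url.get(url)
            if owner is not None and owner not in custodians:
                gaps.setdefault(owner, set()).add(url)
    return gaps


# Illustrative data only: Jim is a listed custodian, Jane is not.
messages = [
    Message("jim", "Draft is here: https://example.invalid/onedrive/jane/brief-v2.docx"),
    Message("jim", "Final copy: https://example.invalid/sharepoint/marketing/brief-final.docx"),
]
owner_by_url = {
    "https://example.invalid/onedrive/jane/brief-v2.docx": "jane",
    "https://example.invalid/sharepoint/marketing/brief-final.docx": "jim",
}
gaps = find_unlisted_owners(messages, owner_by_url, custodians={"jim"})
```

Here `gaps` names Jane as an owner to consider adding, along with the link that surfaced her. A check like this only raises candidates; the custodian interviews still decide who belongs in scope.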

I would want to interview Jane, Jim, and the other custodians to determine if:

  • Any other employees or contractors were brought into this project from other teams
  • Documents or data were stored in any other non-Microsoft apps (besides Notion)
  • The members of the project team used other communication channels besides Teams such as iMessage, WhatsApp, Slack, etc.

Discovery requests are drafted with the relevant timeframe in mind, not today’s org charts or roles.
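One way to make the timeframe point concrete is to query role history by date overlap rather than by current membership. The sketch below assumes a hypothetical `Assignment` record with start and end dates, such as might be assembled from HR or directory history. The team names, people, and dates are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


# Hypothetical role-history record; real sources and fields will differ.
@dataclass(frozen=True)
class Assignment:
    person: str
    team: str
    start: date
    end: Optional[date]  # None means the assignment is still active


def custodians_for_period(assignments, team, period_start, period_end):
    """Return people whose time on `team` overlaps the given period."""
    found = set()
    for a in assignments:
        if a.team != team:
            continue
        ends = a.end or date.max
        # Two date ranges overlap when each starts before the other ends.
        if a.start <= period_end and ends >= period_start:
            found.add(a.person)
    return found


history = [
    Assignment("jane", "positioning-project", date(2024, 3, 1), date(2024, 9, 30)),
    Assignment("jim", "positioning-project", date(2024, 1, 15), None),
    Assignment("jane", "platform", date(2024, 10, 1), None),
    Assignment("raj", "positioning-project", date(2025, 2, 1), None),
]

# Relevant timeframe named in the (hypothetical) discovery request.
relevant = custodians_for_period(
    history, "positioning-project", date(2024, 4, 1), date(2024, 8, 31)
)

# What a "today's org chart" lookup would return, with an assumed as-of date.
as_of = date(2026, 1, 1)
current = custodians_for_period(history, "positioning-project", as_of, as_of)
```

With this sample data, the period-based query returns Jane and Jim, while the as-of query returns Jim and Raj: it drops Jane, who has since moved teams, and adds Raj, who joined after the relevant period.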

Repositories Do Not Define Scope, Behavior Does

Another common pitfall is repository-based scoping.

We may assume:

  • A SharePoint site maps neatly to a team
  • A OneDrive reflects an individual’s work
  • A Teams channel represents a bounded group
  • Employees and contractors are adhering to (or even know about) information governance policies

In practice, however:

  • Repositories are reused
  • Access outlives purpose
  • Names are misleading
  • Usage patterns evolve
  • Employees organize their work around messages rather than folder structures

A repository’s relevance is defined by how it was used, by whom, and when – not by its label or permissions hierarchy. This is behavioral context, and without it, scoping is guesswork.
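As a rough sketch of scoping by behavior rather than by label, the example below counts activity by the relevant actors within the relevant period, per repository. It assumes a hypothetical `ActivityEvent` record drawn from some audit or activity log. The repository names, actors, and dates are invented, and a real log would carry far more detail (event type, document ID, and so on).

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


# Hypothetical audit-log entry: who touched which repository, and when.
@dataclass(frozen=True)
class ActivityEvent:
    repository: str
    actor: str
    day: date


def repositories_in_scope(events, actors, period_start, period_end, min_events=1):
    """Count in-period activity by the relevant actors, per repository."""
    counts = Counter(
        e.repository
        for e in events
        if e.actor in actors and period_start <= e.day <= period_end
    )
    return {repo: n for repo, n in counts.items() if n >= min_events}


events = [
    ActivityEvent("Marketing SharePoint", "jim", date(2024, 5, 2)),
    ActivityEvent("Marketing SharePoint", "jane", date(2024, 5, 3)),
    ActivityEvent("Jane OneDrive", "jane", date(2024, 6, 10)),
    ActivityEvent("Sales Archive 2019", "jim", date(2024, 6, 11)),   # misleading name, real use
    ActivityEvent("Marketing SharePoint", "jim", date(2025, 1, 20)),  # outside the period
    ActivityEvent("HR Policies", "lee", date(2024, 5, 5)),           # not a relevant actor
]

scope = repositories_in_scope(
    events, {"jane", "jim"}, date(2024, 4, 1), date(2024, 8, 31)
)
```

In this sample, "Sales Archive 2019" lands in scope despite its name because a relevant actor used it during the period, while "HR Policies" stays out. Raising `min_events` is one possible way to filter incidental access, though the right threshold is a judgment call for the matter at hand.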

Over-Collection is an Identification Failure

Over-collection is often considered the safe bet. “Just get everything, and we will sort it out in processing or review.” Sound familiar? Later, that over-collection is framed as a cost problem, when it is really a confidence problem.

If you cannot trust the context available during scoping, the duty to preserve (and thus to avoid spoliation) requires over-collection. In some cases, proportionality concerns have not even been raised or discussed because the impact on downstream review and hosting costs is not yet known. And even if those concerns were raised, you may not have had enough information to support the claim.

When identification is grounded in context:

  • Scope narrows naturally
  • Decisions are data-driven and easier to justify
  • Proportionality becomes defensible, not aspirational

Context-Aware Identification as a Discipline

Context-Aware eDiscovery™ treats identification as a reconstruction exercise, not a checklist.

It asks:

  • Who were the relevant actors at the time?
  • What data did they actually interact with?
  • Where did collaboration occur in practice?
  • What context needs to be preserved before it disappears?

Identification becomes an act of defensible reasoning, supported by evidence, not assumptions.

Why Identification is the Nexus of Defensibility

Everything in eDiscovery flows downstream from identification. It is the moment where defensibility is won or lost.

When identification improves:

  • Preservation crosses custodian boundaries and includes the real actors
  • Collection is more targeted and proportional
  • Review volume drops without sacrificing recall
  • Defensibility improves organically with an audit trail

Context-Aware eDiscovery starts where risk begins. It brings together identity, behavior, and document context in a reconstruction-grade record that reflects what actually happened, not merely what content exists today.

This article is part of Cloudficient’s Context-Aware eDiscovery series leading up to Legalweek.