ENow Blog | M365 - SharePoint/OneDrive Center

Performing eDiscovery Against a Specific Folder

Written by Vasil Michev MVP | Aug 15, 2017 1:00:00 PM

One common request when running an eDiscovery search or using the Search-Mailbox cmdlet has long been the option to search a specific folder only. Typical scenarios include deleting/purging items from a (sub)folder or copying them to a different mailbox. Similarly, an “exclude folder” functionality can sometimes be useful. With the latest additions to the service, we are now able to scope eDiscovery/Content searches to specific mailbox, SharePoint or OneDrive for Business folders!

This new functionality is made possible thanks to two new keywords accepted by the KQL syntax: the folderid keyword, which applies to mailbox folders, and the path keyword, which applies to SharePoint and OneDrive for Business folders. There are more differences between the two than just the keyword name, so let’s go over the different scenarios.

Searching Specific Mailbox Folders

For this scenario, the folderid keyword must be used and the values it accepts are… strange. For whatever reason, the devs decided to avoid using an easy-to-obtain property such as the actual folder ID and instead require you to transform it from a string to binary, then perform some bitwise operations and finally convert it back to a string. I imagine it would have been easy enough to code this “server-side”, so that the necessary transforms are performed after the user provides the “regular” folder ID.

Anyway, the process is detailed in the documentation and a sample script is provided to generate the necessary folderids. For the purposes of this article, a somewhat cleaner version of the code is used to build a function that generates the folder id based on mailbox and folder name input.
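To make the transform described above a bit more concrete, here is a minimal Python sketch. It is not the function from the original article or the documentation's script. It assumes the conversion works the way the documented sample script appears to: Base64-decode the FolderId value reported by Get-MailboxFolderStatistics, skip the first 23 bytes, keep the next 24, and write each byte as two uppercase hex digits. The function name and the byte offsets are assumptions that should be checked against the official script before relying on them.

```python
import base64

HEX_DIGITS = "0123456789ABCDEF"

# Assumed layout (to be verified against the official sample script):
# 23 leading bytes are skipped and the following 24 bytes form the folderid.
SKIP_BYTES = 23
TAKE_BYTES = 24


def folder_id_to_query_value(folder_id: str) -> str:
    """Convert a Base64 FolderId string into the 48-character hex value
    expected by the folderid keyword (hypothetical helper)."""
    raw = base64.b64decode(folder_id)
    if len(raw) < SKIP_BYTES + TAKE_BYTES:
        raise ValueError("FolderId is shorter than expected")
    core = raw[SKIP_BYTES:SKIP_BYTES + TAKE_BYTES]
    # The "bitwise operations": split every byte into its high and low
    # nibble and map each nibble to a hex digit.
    return "".join(HEX_DIGITS[b >> 4] + HEX_DIGITS[b & 0x0F] for b in core)
```

Under these assumptions, each 24-byte slice yields a 48-character value of the same shape as the Inbox example used later in this article.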

Using either the sample script or this function, one can generate the folderid value for the folder in question and use it to run the search query. For example, if I want to scope the search to my Inbox folder only, I would have to use the folderid value of “483DF0CCC723FF458858F0CF319803D500000000010D0000”. Once the ID is generated, you simply follow the same steps you would with any other content search or eDiscovery query, be it via the Security and Compliance Center UI or PowerShell. You can of course combine the folderid keyword with other keywords as needed, but if you simply want to list all items in a particular folder, a query consisting of the keyword and its value is enough, for example folderid:483DF0CCC723FF458858F0CF319803D500000000010D0000.
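As a small illustration of how such a query string could be assembled, the following Python sketch builds a folder-scoped query and optionally ANDs it with further KQL conditions. The helper name, the 48-hex-character validation and the sample subject condition are assumptions made for the example and are not part of the service or its documentation.

```python
import re
from typing import Optional

# Assumption: a folderid value is 48 hexadecimal characters,
# like the Inbox example quoted in the article.
_FOLDERID_VALUE = re.compile(r"[0-9A-Fa-f]{48}")


def folder_query(folderid_value: str, extra_kql: Optional[str] = None) -> str:
    """Return a KQL query scoped to one mailbox folder (hypothetical helper)."""
    if not _FOLDERID_VALUE.fullmatch(folderid_value):
        raise ValueError("expected a 48-character hexadecimal folderid value")
    query = f"folderid:{folderid_value}"
    if extra_kql:
        # Parentheses keep any OR inside the extra condition from
        # escaping the folder scope.
        query = f"{query} AND ({extra_kql})"
    return query


INBOX = "483DF0CCC723FF458858F0CF319803D500000000010D0000"
print(folder_query(INBOX))
print(folder_query(INBOX, 'subject:"quarterly report"'))
```

The resulting string is what you would paste into the query box in the UI or pass as the content match query in PowerShell.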

Once the search has completed, there are a few things to note in the results. First and most importantly, no results from any subfolders are returned. It is important to remember this not only because the behavior differs from SharePoint Online or OneDrive for Business searches, but also because it might cause you to miss some of the results in an important real-life eDiscovery case. This is illustrated in the screenshots below:

The left image corresponds to the result from my Inbox folder, where 4007 items totaling 447.50MB were discovered. The result matches what I see in Outlook (middle inset); however, it clearly does not include the 307 items found in the “Book” subfolder. To return those items, a separate search must be performed, using the folderid of the “Book” subfolder (right image). Alternatively, one can obtain a list of all subfolders and their corresponding folderid values via the Get-MailboxFolderStatistics cmdlet, then combine them via a logical OR when performing the search, in the form folderid:<first value> OR folderid:<second value>.
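The OR-combination mentioned above can be sketched as follows. This is only an illustration of the string being built: the helper name is made up, and the second folderid is a placeholder standing in for the “Book” subfolder, not a real value.

```python
from typing import Iterable


def any_folder_query(folderid_values: Iterable[str]) -> str:
    """Join several folderid values with OR so that one search covers a
    folder together with its subfolders (hypothetical helper)."""
    values = list(folderid_values)
    if not values:
        raise ValueError("at least one folderid value is required")
    return " OR ".join(f"folderid:{value}" for value in values)


INBOX = "483DF0CCC723FF458858F0CF319803D500000000010D0000"
BOOK_PLACEHOLDER = "0" * 48  # placeholder only; generate the real value per folder

print(any_folder_query([INBOX, BOOK_PLACEHOLDER]))
```

Because subfolders are never included implicitly, every subfolder you care about needs its own folderid clause in the list.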

Next, you will probably note that 2 mailboxes were searched in the examples above, and you can guess that the second one is the archive mailbox associated with my account. The question that pops up here is whether search results from the corresponding folder in the archive mailbox are included. The answer is no, as this folder has a different folderid. For the same reason, even if you include all mailboxes in the search, only results from my Inbox will be returned. Unfortunately, this means that while it is certainly possible to create a search that covers only the Inbox folders in multiple mailboxes, the process will be a bit more complicated than expected. In addition, the different folderid values across different mailboxes mean that it is not possible to configure a search permission filter that limits users to only running searches against specific folders.

Another thing to note is that information about unindexed items is returned in the results estimate.

Searching SharePoint or OneDrive for Business Folders

Luckily, searching SharePoint or OneDrive for Business folders doesn’t require you to use any complicated scripts; all you need to do is provide the path to the folder. There are multiple ways to obtain the path, ranging from simply copying the address from your browser or using the Copy Link functionality once you have selected a folder, all the way to programmatically enumerating all the folders via CSOM. The example script in the documentation takes another approach: it creates a Content search for the targeted site with the contenttype:folder keyword, in order to return a list of all the folder paths.

Regardless of how you obtain the folder path, to create a Content search scoped to particular folder(s) only, you will need to use the path keyword. Unlike the mailbox folder searches, subfolders will be included in the results, including system/hidden folders (which might return zero matches, but will still be present in the results list). You can, of course, combine the path keyword with any additional keywords relevant to the case at hand in order to narrow down the results, or simply exclude a particular folder by using the NOT operator.
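A path-scoped query with exclusions could be put together along these lines. This is a sketch under assumptions: the helper name is invented, the contoso URLs are hypothetical, and quoting the URL in the path clause is a choice made here to cope with spaces in folder names rather than something the article prescribes.

```python
from typing import Iterable


def path_query(folder_url: str, exclude: Iterable[str] = ()) -> str:
    """Build a KQL query scoped to a SharePoint/OneDrive folder, optionally
    excluding some of its subfolders with NOT (hypothetical helper)."""

    def clause(url: str) -> str:
        if '"' in url:
            raise ValueError("folder URL must not contain double quotes")
        return f'path:"{url}"'

    query = clause(folder_url)
    for subfolder_url in exclude:
        query += f" NOT {clause(subfolder_url)}"
    return query


# Hypothetical OneDrive for Business folder URLs, for illustration only.
O365 = "https://contoso-my.sharepoint.com/personal/user_contoso_com/Documents/O365"
print(path_query(O365))
print(path_query(O365, exclude=[O365 + "/Drafts"]))
```

Since subfolders are included by default here, the exclusion list is the only place where individual subfolders need to be spelled out.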

In the following example, I have run a search against my OneDrive for Business site, focusing on the “O365” folder. In the second image, the search criteria are adjusted to exclude a subfolder from the results. Again, this behavior differs from the Exchange mailbox search case, where subfolders are excluded by default.

Another thing to consider when running searches against SharePoint or OneDrive for Business locations is unindexed items. In my case, I’ve run a simple search query that only uses the path keyword, thus any unindexed items found in the target folder(s) will still match the query. Those items are not counted towards the number of unindexed items in the estimate results section, which explains the zeroes in the above screenshots. However, if you export the search statistics together with the unindexed items stats, you will be able to get information about those matches. More details on how unindexed items are treated can be found in the official documentation.

Summary

In this article, we gave a short overview of the newly introduced functionality that allows us to perform eDiscovery or content searches against specific folders in Office 365. This functionality is supported for both Exchange mailboxes and SharePoint or OneDrive for Business sites, and although in some cases preparing the search query might require some additional steps, it’s relatively easy to use. Some additional quirks are also observed, but overall it’s a useful addition to the ever-growing set of compliance tools available to Office 365 administrators.

Unfortunately, the Search-Mailbox cmdlet doesn’t recognize the new query keywords, so this functionality can only be used via the Security and Compliance Center UI or PowerShell cmdlets. The examples provided here and in the official documentation about the feature should be enough to get you started, although they might be a bit too simplistic for any real-life eDiscovery case. Still, the feature will definitely have its uses in situations where you want to further limit the search results or delete all items in a particular folder.