
Contents of .pdf files not being indexed in Windows 8/8.1

Summary

I broke Windows Search indexing because I didn’t give the SYSTEM user full permission on the folders I wanted indexed.

Background

We’ve been using a Fujitsu ScanSnap S1300i to scan all incoming paperwork (e.g. receipts, bills) since October 2012.  The ScanSnap includes ABBYY FineReader OCR functionality (not ABBYY FineReader itself; instead ScanSnap links to ABBYY DLLs to do the actual OCR).  When the “Searchable PDF” option is used, the ABBYY engine OCRs the .pdf and embeds the searchable text.

Copy and paste of selected text in PDF (

A collection of searchable-PDFs is only useful if something indexes them and you can search that index.  For most Windows users, the built-in Windows Search feature more than handles the task.

A few months ago I profiled Windows boot performance to find out why the initial logon was slow, only to find that Windows Search itself appeared to be aggressively reading the disk.  I culled the list of indexed folders to lessen the load, and at the same time rearranged folders to optimize disk usage and simplify backups.  Around that time I noticed that searching on keywords that previously returned .pdf results now returned “No items match your search”.


If at this point I had tried searching for known PDFs in other locations (outside of my D:\Scans directory), I might have found that they were being returned.  However, since the vast majority of my PDFs are within D:\Scans, I didn’t even bother checking.  Since other document types turned up in search results, I assumed it was just a PDF-indexing problem.

Troubleshooting steps I tried (which might help you)

1) Double-checked I hadn’t removed my scan folder from Indexing Options. I also tried removing and re-adding that folder.  I clicked the “delete and rebuild index” button between some changes thinking it’d make a difference.

It didn't; it just made the whole process take longer.

2) Ran the Windows Search troubleshooter: Control Panel > search “windows search” and click “Troubleshooting: Find and fix problems with Windows Search”.  I checked the “Files don’t appear in search results” checkbox, though I now suspect this is just a CEIP (Customer Experience Improvement Program) checkbox.  I always got “Issue not present” for each of the issues checked, including “Incorrect permissions on Windows Search directories”, ha!

3) Checked and changed the HKEY_CLASSES_ROOT\.pdf\PersistentHandler registry value per these steps from Adobe: http://helpx.adobe.com/acrobat/kb/pdf-search-breaks-110-install.html .  I spent a while on this step (and the next few) because I had installed the guilty version of Adobe Acrobat Reader before and I had even installed the Adobe PDF iFilter (v11.0.01) before I learned that Windows 8 includes PDF indexing out of the box. (It’s now uninstalled because the built-in Windows PDF indexing is just fine)

4) Reset Windows Search settings.  Setting REG_SZ value SetupCompletedSuccessfully to “0” at HKLM\SOFTWARE\Microsoft\Windows Search\ reset all “Index these locations” folders in Windows Search.  Re-adding the scan directory still didn’t help get those PDFs indexed.  While I was at it I configured Windows Search to index more aggressively since I was spending time waiting for index rebuilds.

5) Checked CLSIDs and .dll registration for .pdf indexing.  (This is worth doing if the “Filter Description” shown for .pdf isn’t “Reader Search Handler” and you don’t have the Adobe PDF iFilter installed.)


To do this:

a) Stop the Windows Search service.  Open services.msc, find “Windows Search” and right-click it to stop.

b) Default value at HKEY_CLASSES_ROOT\.pdf\PersistentHandler should be {1AA9BF05-9A97-48c1-BA28-D9DCE795E93C}

c) Default value at HKEY_CLASSES_ROOT\CLSID\{1AA9BF05-9A97-48c1-BA28-D9DCE795E93C}\PersistentAddinsRegistered\{89BCB740-6119-101A-BCB7-00DD010655AF} should be {6C337B26-3E38-4F98-813B-FBA18BAB64F5}

d) If you’re running Windows 8.x:

  • Default value at HKEY_CLASSES_ROOT\CLSID\{6C337B26-3E38-4F98-813B-FBA18BAB64F5}\InProcServer32 should be %systemroot%\system32\glcndFilter.dll
  • In an administrative command prompt, run: regsvr32 %systemroot%\system32\glcndFilter.dll  and confirm you get “DllRegisterServer in C:\WINDOWS\system32\glcndFilter.dll succeeded.”

e) If you’re running Windows 10:

  • Default value at HKEY_CLASSES_ROOT\CLSID\{6C337B26-3E38-4F98-813B-FBA18BAB64F5}\InProcServer32 should be %systemroot%\system32\Windows.Data.Pdf.dll

f) Restart the Windows Search service

g) If you made any changes to the registry values, rebuild your search index
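
The registry checks above can also be scripted.  Below is a minimal Python sketch, offered as an illustration only: the function names (check_pdf_filter_chain, read_default_values) are hypothetical, the GUIDs and key paths are copied from the steps above, and the filter DLL is compared by file name only because the stored path may or may not use %systemroot%.  It assumes the built-in Windows PDF filter is the one you want; with the Adobe PDF iFilter installed, different values would be expected.

```python
import sys

# GUIDs and key paths from the steps above (paths are relative to HKEY_CLASSES_ROOT).
PERSISTENT_HANDLER = "{1AA9BF05-9A97-48c1-BA28-D9DCE795E93C}"
FILTER_CLSID = "{6C337B26-3E38-4F98-813B-FBA18BAB64F5}"
KEY_HANDLER = r".pdf\PersistentHandler"
KEY_ADDIN = ("CLSID\\" + PERSISTENT_HANDLER + "\\PersistentAddinsRegistered"
             "\\{89BCB740-6119-101A-BCB7-00DD010655AF}")
KEY_DLL = "CLSID\\" + FILTER_CLSID + "\\InProcServer32"
# File names of the built-in filter DLLs named above (Windows 8.x, Windows 10).
EXPECTED_DLLS = ("glcndfilter.dll", "windows.data.pdf.dll")


def check_pdf_filter_chain(values):
    """Compare default values (dict: key path -> string or None) with the expected chain.

    Returns a list of human-readable problems; an empty list means everything matched.
    """
    problems = []
    expected = {KEY_HANDLER: PERSISTENT_HANDLER, KEY_ADDIN: FILTER_CLSID}
    for path, want in expected.items():
        got = values.get(path)
        if got is None or got.lower() != want.lower():
            problems.append(f"{path}: expected {want}, found {got}")
    dll = values.get(KEY_DLL)
    # Compare by file name only; the directory part is not checked.
    dll_name = dll.replace("/", "\\").rsplit("\\", 1)[-1].lower() if dll else None
    if dll_name not in EXPECTED_DLLS:
        problems.append(f"{KEY_DLL}: unexpected filter DLL {dll}")
    return problems


def read_default_values():
    """Read the three default values from the registry (Windows only)."""
    import winreg
    values = {}
    for path in (KEY_HANDLER, KEY_ADDIN, KEY_DLL):
        try:
            with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, path) as key:
                values[path] = winreg.QueryValueEx(key, "")[0]
        except OSError:
            values[path] = None  # key or default value is missing
    return values


if __name__ == "__main__" and sys.platform == "win32":
    found = check_pdf_filter_chain(read_default_values())
    for line in found or ["PDF filter registry chain matches the expected values"]:
        print(line)
```

The comparison is kept separate from the registry read so it can be tried against values copied by hand from regedit.  The sketch only reads the registry; it changes nothing.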

 

6) Checked the contents of the Windows Search ESE database (Windows.edb) to determine whether the indexer was failing to see (or erroring out on) the files in question, or whether the problem was in storing the indexed values in the database.  Windows.edb is a standard ESE (JET Blue) database.

I also reset Windows Search again (see step #4 above) and configured it to index only a few small directories, including a sub-folder of my much larger D:\Scans directory, to keep the number of indexed values to a minimum.

To inspect the database:

  • Stop the Windows Search service (via services.msc)
  • Copy file Windows.edb found at C:\ProgramData\Microsoft\Search\Data\Applications\Windows to another location.
  • Download and run ESE Database View (note: this isn’t my file and I cannot 100% attest to its safety but at least you don’t need to elevate when running it).  Open the previously copied Windows.edb file.
  • From the drop-down, choose “SystemIndex_PropertyStore” and do a CTRL-F search for files which should be indexed.  If a file shows up, it has been indexed; if not, it hasn’t.
  • Note: if you get “0 record(s)” after selecting the “SystemIndex_PropertyStore” table, it’s possible your windows.edb file is too large or that table is too large.  My smaller windows.edb file is 232MB, but now that I’ve got Windows indexing a much larger set of files it’s now 2.4GB.  It’s possible ESEDatabaseView cannot open ESE databases over a certain size.  Still a handy utility to know about.
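
The stop-and-copy part of the steps above could be scripted along these lines.  This is a minimal Python sketch under a few assumptions: it runs from an elevated prompt, the service behind the “Windows Search” display name is WSearch, and the function name copy_index_database and its run parameter are made up for illustration.  Unlike the manual steps, it also restarts the service after copying.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Location of the index database, as given in the steps above.
EDB_PATH = Path(r"C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Windows.edb")
SERVICE = "WSearch"  # service name for the "Windows Search" display name


def copy_index_database(destination_dir, source=EDB_PATH, run=subprocess.run):
    """Stop Windows Search, copy the index database, then start the service again.

    `run` is injectable so the sequence can be exercised without touching a real service.
    """
    destination = Path(destination_dir) / Path(source).name
    run(["net", "stop", SERVICE], check=True)
    try:
        shutil.copy2(source, destination)
    finally:
        # Restart the service even if the copy fails.
        run(["net", "start", SERVICE], check=True)
    return destination


if __name__ == "__main__" and sys.platform == "win32" and len(sys.argv) > 1:
    print("Copied to", copy_index_database(sys.argv[1]))
```

Pass the destination folder as the only argument, then open the copied file in ESE Database View as described above.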


7) Lastly, since I didn’t see the PDF files I wanted indexed in the Windows.edb database, I compared this workstation to one where PDF indexing worked.  I compared all of the above HKLM and HKCR values between the two systems: no difference.  I then compared the file security permissions of two files: one on the working system which turned up in search results and one on my busted system which didn’t.  At the same time I checked which account the SearchIndexer.exe process runs as.


There’s the problem: SearchIndexer.exe runs as SYSTEM and I hadn’t added SYSTEM to the security permissions on D:\Scans.


I quickly granted the SYSTEM user permissions on all directories I wanted indexed and then rebuilt the search database.  Very quickly thereafter I started getting the PDFs showing up in search results.

Add the SYSTEM user either via the “Edit…” dialog in the folder Properties > Security tab, or run something like the following in an elevated command prompt (F grants full control, and the (OI)(CI) flags make the grant inherit to files and subfolders):

icacls "D:\Scans" /grant SYSTEM:(OI)(CI)F
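
To check whether a folder already has the grant, the output of icacls can be inspected.  The following is a minimal Python sketch, not a definitive check: the function names are hypothetical, it assumes an English-language Windows where the account is listed as NT AUTHORITY\SYSTEM, and it only looks for an allow entry with full control (F) on the folder itself, ignoring group memberships and subfolders with inheritance disabled.

```python
import re
import subprocess
import sys

# Matches an access entry for SYSTEM and captures its flags, e.g. "(OI)(CI)(F)".
SYSTEM_ACE = re.compile(r"NT AUTHORITY\\SYSTEM:((?:\([^)]*\))+)", re.IGNORECASE)


def system_has_full_control(icacls_output):
    """Return True if the icacls listing contains a non-deny SYSTEM entry with (F)."""
    for match in SYSTEM_ACE.finditer(icacls_output):
        flags = [flag.upper() for flag in re.findall(r"\(([^)]*)\)", match.group(1))]
        if "F" in flags and "DENY" not in flags:
            return True
    return False


def icacls_report(folder):
    """Run icacls on a folder and return its text output (Windows only)."""
    result = subprocess.run(["icacls", folder], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__" and sys.platform == "win32":
    folder = sys.argv[1] if len(sys.argv) > 1 else r"D:\Scans"
    print(folder, "- SYSTEM has full control:", system_has_full_control(icacls_report(folder)))
```

Running it against each folder listed in Indexing Options should show quickly which ones the indexer cannot read.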

Additionally, in my forum-crawling for solutions, I saw that others had success with copying files into the directory again to get indexing to work.  I suspect this works because a copied file doesn’t keep the source file’s ACL; instead it inherits permissions from the destination folder, which may be accessible to the SearchIndexer.exe process.

Categories: Troubleshooting, Windows 8
  1. October 29, 2014 at 5:16 am

    Great post, thanks. Seemed following a few of these tips fixed my issue.
