Sunday, April 06, 2008

The case of "thumbs.db" file

My Windows skills were recently challenged by a blog post of Cédric Blancher about the "thumbs.db" file internals.

It is widely documented that this file is an OLE container for holding thumbnail information, when the corresponding Explorer option is checked (which is the default configuration). Some Open Source tools even exist to parse the "thumbs.db" file.

However, there is one more question that has been left unanswered: "how is custom image ordering preserved?".

Naive approach

A quick test yields the following empirical result:
  • Within Explorer, browse a folder which has a sub-folder where some images are stored. A "thumbs.db" file is created in this sub-folder, if necessary.
  • Enter the sub-folder and move images around. "Thumbs.db" file size increases.
  • Backup the existing "thumbs.db" file with the following commands:
attrib -r -s -h thumbs.db
copy thumbs.db backup.db

  • Shuffle images again. Compare the new "thumbs.db" file with the backup, using the following command:
attrib -r -s -h thumbs.db
fc /b backup.db thumbs.db


Files should be exactly the same!

First trail

In a sense, this is perfectly logical in Windows world, since image ordering is a per-user setting. Two users sharing the same computer could order images differently without affecting each other's view. It would make no sense to store this is information in a single, shared file.

Per-user settings could be stored in a configuration file (e.g. ".ini" file) inside the %UserProfile% directory, but this is very "Windows 3.1" style.

At this point, we rather suspect that settings are stored under the HKCU registry hive.

Chasing the culprit

Process Monitoring the Explorer process could quickly become exhausting, given the amount of registry keys that are accessed during normal system operation. We will rather try to pinpoint the system component that manages the "thumbs.db" file.

A fast and efficient approach is to search for string references in system directories:

C:\WINDOWS\SYSTEM32>strings *.dll | findstr /i thumbs.db

C:\WINDOWS\SYSTEM32\mydocs.dll: thumbs.db
C:\WINDOWS\SYSTEM32\shell32.dll: Thumbs.db
C:\WINDOWS\SYSTEM32\shell32.dll: thumbs.db
C:\WINDOWS\SYSTEM32\wmp.dll: thumbs.db


In this case, we use strings.exe from SysInternals, which has the big advantage over various "grep" ports to be able to handle ANSI and Unicode strings all together.

A quick look inside mydocs.dll shows a string reference from CleanupSystemFolder() function, which does not seem to be related to our matter. Windows Media Player library (wmp.dll) does not seem to be a valid candidate either. Therefore, the core processing should be done in shell32.dll.

Shell32 internals

Shell32 is a rather old and complex system component - a complete analysis is out of question.

However a quick look inside this component yields interesting information:
  • It makes heavy use of the COM model.
  • It holds many interestingly named C++ classes, like CThumbnailMenu, CThumbStore and CEnumThumbStore.
  • Thumbnail processing is done in the background by a worker thread. Therefore changes are not immediately reflected, which hampers the Process Monitoring approach.
The key point is that CThumbStore class seems to implement IPersistFile, IPersistFolder, IPersistStorage and IShellImageStore interfaces, among others.

This should ring a bell about property bags, which is the standard way for a COM object to store opaque, persistent data. Therefore we will make a great leap forward, and search directly for the "bags" keyword inside the binary file.

Beginning to see the light

The search for "bags" is successful: there is very few references, with very interesting content.

The first reference comes from this registry key:
HKCU\Software\Microsoft\Windows\ShellNoRoam\Bags

It is referenced from:
CDefView::_SaveGlobalViewState()
CDefView::_ResetGlobalViewState()


The second reference comes from this registry sub-key:
DUIBags\ShellFolders\{00000000-0000-0000-0000-000000000000}

It is referenced from:
CDUIView::_InitializeShellFolderPropertyBag()

Under the "ShellNoRoam" registry key, we could find thousands of numeric subkeys, which in turn hold values of interest, like the coordinates of the image. After digging a little more, we gather the following information:
  • Monitoring the "ShellNoRoam" key with Process Monitor reveals that registry information is updated only when exiting the folder.
  • Custom image ordering will be used on folder re-opening only if "remember each folder's view settings" is checked in Explorer configuration. Otherwise default layout is used.
  • There is a bug in Windows XP pre-SP2 that prevents creating more than 200 custom views :)
At this point, there are still open questions, like "how does the ItemPos binary blob relates to effective image position?". This would require in-depth analysis of CViewState::LoadPositionBlob() maybe.

But most of the question is answered for now!

Conclusion

With a minimal amount of code analysis, we were able to pinpoint the code block that manages the "thumbs.db" file, and how persistent image location data is internally managed by the Explorer process.

Final note: this article relates to Windows XP SP2 only. Windows Vista might exhibit different behaviour.

3 comments:

Maggy said...

When I'm browsing through my picture folders, several folders containing thousands of gif's or jpeg's each, Windows Explorer is very slow in building it's thumbs.db. It would be nice if there would be a tweak that would force Explorer to build the entire thumbs.db for an entire folder at one the moment you enter the folder. Another nice solution would be a dedicated utility that would be able to create thumbs.db files for several folders in the background even before I open them.

Anonymous said...

Useful publication and excellent presentation!

Anonymous said...


The web site loading speed is amazing.