Accessing photo metadata with the Windows Runtime (WinRT) is a fairly simple matter. StorageFile contains functions for accessing various types of properties – photo-related ones included.
Some properties, however, need to be accessed in a different manner. Specifically, Adobe’s XMP metadata cannot be read directly from StorageFile; you have to go through the BitmapDecoder class instead. This metadata area is important because it’s what companies such as Google (via Picasa) and Microsoft use to embed rich metadata inside photos. Using it, once you figure out the syntax, is actually fairly easy. You instantiate a BitmapDecoder with the stream of a file and call GetPropertiesAsync() on its BitmapProperties with the appropriate metadata query path (similar to XPath, but not exactly… :)).
This is what the call looks like, for example, to get the collection of Microsoft-specific facial recognition rectangles (as put there by tools like Live Photo Gallery) inside a photo:
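A minimal sketch of such a call follows. The query path string is an assumption based on the Microsoft people-tagging XMP schema, and the class and method names are made up for illustration. The caller is assumed to own the stream and to keep it open while working with the result.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;
using Windows.Storage.Streams;

public static class FaceRegionReader   // hypothetical helper name
{
    // Assumed query path for the people-tagging region collection.
    private const string RegionsQuery = "/xmp/MP:RegionInfo/MPRI:Regions";

    // Returns the typed value holding the regions, or null when the photo
    // carries no such metadata. The caller keeps ownership of the stream.
    public static async Task<BitmapTypedValue> ReadRegionsAsync(IRandomAccessStream stream)
    {
        BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);

        // Ask the decoder's property view for exactly one query path.
        BitmapPropertySet props =
            await decoder.BitmapProperties.GetPropertiesAsync(new[] { RegionsQuery });

        BitmapTypedValue regions;
        return props.TryGetValue(RegionsQuery, out regions) ? regions : null;
    }
}
```

The stream would typically come from something like `await file.OpenAsync(FileAccessMode.Read)` on the StorageFile. Image formats without metadata support may make GetPropertiesAsync() throw rather than return an empty set, so real code probably wants a try/catch around it.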
Once you get the regions, you can use the returned value to iterate over the contained XML elements and extract the information you want from them.
That’s all well and good. There is one catch, though.
The memory leak
When you use this mechanism, you will find that you are leaking a little less than 20 KB per BitmapDecoder. That means that if you expect your user to only open a handful of photos and do stuff with them, you can safely ignore the rest of this post. At worst, your app will leak a few hundred KB, and the next time it restarts all will be fine and dandy.
However, if you want to scan an arbitrary collection of photos, you will quickly run out of memory: scanning 2000 photos will leak about 150-170 MB.
Working around the issue requires you to do two things. One is obvious, the other less so.
First, the obvious one: you need to request primitive values directly. Doing that seems to remove part of the leak. This is done in a fashion very similar to the previous calls. The only catch is that you cannot really interrogate the XML for its structure (since you never get it), so you need to make some assumptions and detect the end of the data in some other way. For example, to get the person name and face rectangle of the first person from the Microsoft XMP extensions, you need to make the following call:
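Here is a rough sketch of that call. The two query paths (with `{ulong=0}` selecting the first entry of the regions collection) are assumptions based on the Microsoft people-tagging XMP schema, and the helper names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;

public static class FirstFaceReader   // hypothetical helper name
{
    // Assumed query paths; {ulong=0} addresses the first region entry.
    private const string NameQuery =
        "/xmp/MP:RegionInfo/MPRI:Regions/{ulong=0}/MPReg:PersonDisplayName";
    private const string RectQuery =
        "/xmp/MP:RegionInfo/MPRI:Regions/{ulong=0}/MPReg:Rectangle";

    // Returns (name, rectangle) as strings; either item is null when absent.
    public static async Task<Tuple<string, string>> ReadFirstFaceAsync(BitmapDecoder decoder)
    {
        // Request both primitive values in a single round trip.
        BitmapPropertySet prop =
            await decoder.BitmapProperties.GetPropertiesAsync(new[] { NameQuery, RectQuery });

        BitmapTypedValue name, rect;
        prop.TryGetValue(NameQuery, out name);   // leaves name null if missing
        prop.TryGetValue(RectQuery, out rect);   // leaves rect null if missing

        return Tuple.Create(
            name != null ? name.Value as string : null,
            rect != null ? rect.Value as string : null);
    }
}
```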
In this fashion, you will get the two primitive values (both strings) into the BitmapTypedValue and you can use them to determine what the name of the first person is and where they are located on the photo.
Now, to iterate on all the people in the photo, you will need to have a running counter starting out from 0 and make these calls, each time incrementing the counter by one. When the call to prop.TryGetValue() returns false, you will know that there are no more faces to be had and you can stop iterating.
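The counter loop could be sketched like this. The query paths are again assumptions, the helper names are hypothetical, and stopping when neither the name nor the rectangle is present at an index is my own choice of termination test (a tag may legitimately lack one of the two).

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;

public static class FaceEnumerator   // hypothetical helper name
{
    // Assumed base path of the people-tagging region collection.
    private const string RegionsPath = "/xmp/MP:RegionInfo/MPRI:Regions";

    // Collects (name, rectangle) string pairs for every tagged face.
    public static async Task<List<KeyValuePair<string, string>>> ReadAllFacesAsync(
        BitmapDecoder decoder)
    {
        var faces = new List<KeyValuePair<string, string>>();

        for (int i = 0; ; i++)
        {
            // Build the two primitive-value queries for entry i.
            string entry = RegionsPath + "/{ulong=" + i + "}";
            string nameQuery = entry + "/MPReg:PersonDisplayName";
            string rectQuery = entry + "/MPReg:Rectangle";

            BitmapPropertySet prop =
                await decoder.BitmapProperties.GetPropertiesAsync(new[] { nameQuery, rectQuery });

            BitmapTypedValue name, rect;
            bool hasName = prop.TryGetValue(nameQuery, out name);
            bool hasRect = prop.TryGetValue(rectQuery, out rect);

            // Nothing at this index: assume we have run past the last face.
            if (!hasName && !hasRect)
                break;

            faces.Add(new KeyValuePair<string, string>(
                hasName ? name.Value as string : null,
                hasRect ? rect.Value as string : null));
        }

        return faces;
    }
}
```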
This, however, does not solve the leak. It makes it less pronounced – at about 4k per call. Much better, but still not great if you want to scan 50,000 photos.
To solve the leak completely (or at least, to such a degree that scanning 50,000 photos will have no discernible effect), you need to do something weird. You need to wrap the stream returned to you from StorageFile in a stream of your own making. The stream you create simply needs to forward the calls to the original stream. This is what the wrapper looks like:
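A sketch of such a wrapper, assuming it implements IRandomAccessStream and that disposing the wrapper should also dispose the wrapped stream. The class name is made up; only the m_stream field name comes from the description here.

```csharp
using System;
using Windows.Foundation;
using Windows.Storage.Streams;

// Hypothetical name. Usage: BitmapDecoder.CreateAsync(new ForwardingStream(original))
public sealed class ForwardingStream : IRandomAccessStream
{
    private readonly IRandomAccessStream m_stream;

    public ForwardingStream(IRandomAccessStream stream)
    {
        if (stream == null) throw new ArgumentNullException("stream");
        m_stream = stream;
    }

    // IRandomAccessStream: every member simply forwards to m_stream.
    public bool CanRead { get { return m_stream.CanRead; } }
    public bool CanWrite { get { return m_stream.CanWrite; } }
    public ulong Position { get { return m_stream.Position; } }

    public ulong Size
    {
        get { return m_stream.Size; }
        set { m_stream.Size = value; }
    }

    public IRandomAccessStream CloneStream() { return m_stream.CloneStream(); }

    public IInputStream GetInputStreamAt(ulong position)
    {
        return m_stream.GetInputStreamAt(position);
    }

    public IOutputStream GetOutputStreamAt(ulong position)
    {
        return m_stream.GetOutputStreamAt(position);
    }

    public void Seek(ulong position) { m_stream.Seek(position); }

    // IInputStream
    public IAsyncOperationWithProgress<IBuffer, uint> ReadAsync(
        IBuffer buffer, uint count, InputStreamOptions options)
    {
        return m_stream.ReadAsync(buffer, count, options);
    }

    // IOutputStream
    public IAsyncOperationWithProgress<uint, uint> WriteAsync(IBuffer buffer)
    {
        return m_stream.WriteAsync(buffer);
    }

    public IAsyncOperation<bool> FlushAsync() { return m_stream.FlushAsync(); }

    // IDisposable: assumed to own the wrapped stream and release it.
    public void Dispose() { m_stream.Dispose(); }
}
```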
As you can see – it has no logic of its own other than the construct/dispose – all the actual interface calls simply call into the m_stream field.
Once you pass this stream into BitmapDecoder.CreateAsync(), you will see the leaks all but go away.