A cautionary tale: New French president’s portrait contains lots of metadata, but all the wrong kinds
Last week’s release of new French president Emmanuel Macron’s official portrait, by photographer Soazig de la Moissonnière, caused a stir on Twitter. Because, well, what doesn’t? Metadata on the version of the photo released on the government’s website revealed that somebody had the picture open in Photoshop for some fifteen hours. Oh my! That’s certainly something to set Twitter tongues a-wagging.
On the other hand, the Macron photo lacked a few details that normal people might want or need. Like, who is the dude in the picture? What’s his job? Under what rights terms is the picture being released? …..Crickets…… None of that was written down for us. Oops.
It looks like the Élysée Palace messed up their metadata – coming and going.
Secrets revealed?
What about that Photoshop logging data? (The illustration at the top of this post is but a small sampling.) Is there really evidence of a Photoshop conspiracy in there?
It’s not like metadata is authenticated, either. Anyone can cut and paste or edit metadata. If I wanted to impress someone with how hard I work, I could paste that 3,500-word Macron Photoshop log onto one of my photos. (It’s in French, but what the heck.) In fact, how do we know that somebody didn’t do that to the Macron picture?
If logging data doesn’t tell us anything useful, we can ignore it, right?
No. Because the Twitterverse won’t.
Twitterers move on
As it happened, citizens of Twitterville didn’t obsess for long teasing imaginary conspiracies out of the Photoshop log on Macron’s picture. They quickly moved on to the more pressing (by Twitter standards) work of cutting Macron’s body out of La Moissonnière’s photo and pasting it onto all sorts of amusing backgrounds.
Same metadata, sadly a different story
Swedish photographer Paul Hansen didn’t fare as well as La Moissonnière and Macron. After he won the 2012 World Press Photo contest, he was victimized by an internet troll, who “analyzed” a copy of his photo and, based largely on what he thought he saw in the XMP metadata, proclaimed that “This year’s ‘World Press Photo Award’ wasn’t given for a photograph. It was awarded to a digital composite that was significantly reworked.” It was clear in context that he was claiming that his sleuthery had unmasked a forgery. A storm of derogatory (and false) internet conversation and news coverage ensued.
Hansen’s photo was subsequently examined by the contest organizers and two different (real) forensic examiners and declared genuine. It turns out that when Hansen had prepared his photo for the contest, he used an old-school burn-and-dodge technique that relies on layering the file. That bit of craftsmanship caused notations in the Photoshop logging metadata that the troll imagined were consistent with whatever he wanted them to be consistent with, and the whole sorry affair went viral.
I’m pretty sure that none of us want to go through what Hansen did.
Read Wired’s account of the matter here.
What to do
We need to think about what metadata we do and don’t include when we release or publish photos that have the potential for viral unpleasantness. Every photo does need caption and copyright information. Not that many have the potential to become troll-bait, but those that do could do with having that potential bomb defused.
Some background, then a How-To
Some issues to consider:
What does logging metadata actually record about the processing of an image and should we leave it alone in the interest of transparency?
When would it be appropriate to strip that metadata off an image?
And finally, how, specifically can we do that?
What’s recorded in the metadata?
There are different kinds, actually.
Exif metadata is mostly logging information from the camera. It tells us the serial number of the camera that shot an image, what the settings were, and that sort of thing. It also holds GPS coordinates if geotagging is enabled on the camera (or phone) that made the picture.
In the XMP metadata, applications, like Adobe Photoshop and Adobe Lightroom, record logging or processing information, among other things. XMP is an open platform. Any application can write pretty much whatever it wants as XMP metadata.
XMP metadata also holds IPTC metadata, the stuff we need and/or are legally required not to alter. That means we have to be careful about a scorched-earth approach to redacting XMP metadata. If GPS data is vital to us, we’ll need to be careful of the otherwise largely useless Exif, too.
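To see why a scorched-earth strip is risky, it can help to look at which vocabularies share a single XMP packet. What follows is a minimal Python sketch, not a forensic tool. It assumes the file carries one plain-text XMP packet wrapped in an x:xmpmeta element and finds it with a crude byte search. The function names and the sample bytes are made up for illustration; the namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Set

# Standard XMP namespace URIs and the kind of data they usually carry.
KNOWN_NAMESPACES = {
    "http://purl.org/dc/elements/1.1/": "Dublin Core (caption, creator, rights)",
    "http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/": "IPTC Core",
    "http://ns.adobe.com/photoshop/1.0/": "Photoshop (also holds some IPTC fields)",
    "http://ns.adobe.com/xap/1.0/mm/": "XMP Media Management (save history)",
    "http://ns.adobe.com/camera-raw-settings/1.0/": "Camera Raw / Lightroom Develop",
}


def extract_xmp_packet(data: bytes) -> Optional[str]:
    """Return the first x:xmpmeta element found in the raw bytes, or None."""
    start = data.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_tag = b"</x:xmpmeta>"
    end = data.find(end_tag, start)
    if end == -1:
        return None
    return data[start:end + len(end_tag)].decode("utf-8", errors="replace")


def namespaces_used(xmp: str) -> Set[str]:
    """Collect the namespace URI of every element and attribute in the packet."""
    found = set()
    for element in ET.fromstring(xmp).iter():
        for name in (element.tag, *element.attrib):
            if name.startswith("{"):
                found.add(name[1:name.index("}")])
    return found


# Invented stand-in for a photo file: some bytes, an XMP packet, more bytes.
SAMPLE_FILE_BYTES = (
    b"...image bytes..."
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about=""'
    b' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    b' xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"'
    b' photoshop:Credit="Example Agency">'
    b'<dc:rights><rdf:Alt><rdf:li xml:lang="x-default">(c) Example</rdf:li>'
    b"</rdf:Alt></dc:rights>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b"...more image bytes..."
)

if __name__ == "__main__":
    packet = extract_xmp_packet(SAMPLE_FILE_BYTES)
    if packet is not None:
        for uri in sorted(namespaces_used(packet)):
            print(uri, "->", KNOWN_NAMESPACES.get(uri, "other"))
```

In this sample the credit line sits in the Photoshop namespace and the copyright notice in Dublin Core, so deleting "everything Photoshop" or "everything XMP" would take required information with it.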
Photoshop
If you ask it to, Photoshop will record a history log in the XMP metadata. The log can be fairly detailed, or you can choose to only record when a file is saved.
That was the setting used in the Macron photo – just saves. (But lots and lots of them.) Maybe at the time, for the people involved, all those save times and the Adobe IDs of the file’s ancestors had some actionable meaning. But for us, now, they’re just wasted bandwidth and fodder for speculation.
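For the curious, here is roughly how someone might turn a list of save times into a headline number. This is a hedged Python sketch that assumes the save events are stored as an xmpMM:History sequence with stEvt:action and stEvt:when fields, which is how Adobe applications commonly write them. The sample packet, its dates, and the function names are invented; they are not taken from the Macron file.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XMPMM = "http://ns.adobe.com/xap/1.0/mm/"
STEVT = "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#"


def _field(item: ET.Element, name: str) -> Optional[str]:
    """Read a stEvt field stored either as an attribute or as a child element."""
    qname = f"{{{STEVT}}}{name}"
    value = item.get(qname)
    if value is None:
        child = item.find(qname)
        value = child.text if child is not None else None
    return value


def history_events(xmp: str) -> List[Tuple[str, datetime]]:
    """Return (action, when) pairs from xmpMM:History, in document order."""
    events = []
    root = ET.fromstring(xmp)
    for history in root.iter(f"{{{XMPMM}}}History"):
        for item in history.iter(f"{{{RDF}}}li"):
            action, when = _field(item, "action"), _field(item, "when")
            if action and when:
                # Assumes ISO 8601 timestamps with a numeric UTC offset.
                events.append((action, datetime.fromisoformat(when)))
    return events


def save_span(events: List[Tuple[str, datetime]]) -> Optional[timedelta]:
    """Time between the first and last 'saved' event, or None if there are none."""
    saves = [when for action, when in events if action == "saved"]
    return max(saves) - min(saves) if saves else None


# Invented sample packet with made-up timestamps.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
    xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
    xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#">
   <xmpMM:History>
    <rdf:Seq>
     <rdf:li stEvt:action="created" stEvt:when="2017-01-01T08:55:00+01:00"/>
     <rdf:li stEvt:action="saved" stEvt:when="2017-01-01T09:00:00+01:00"/>
     <rdf:li rdf:parseType="Resource">
      <stEvt:action>saved</stEvt:action>
      <stEvt:when>2017-01-01T14:30:00+01:00</stEvt:when>
     </rdf:li>
     <rdf:li stEvt:action="saved" stEvt:when="2017-01-02T00:00:00+01:00"/>
    </rdf:Seq>
   </xmpMM:History>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""

if __name__ == "__main__":
    events = history_events(SAMPLE_XMP)
    print(len(events), "events; first-to-last save span:", save_span(events))
```

Note what the number is and isn't: it is the gap between the first and last recorded save. It says nothing about how long anyone actually spent working on the file, which may have sat open and untouched for most of that time.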
Similarly, the Exif data might have served the photographer well if she was troubleshooting a malfunctioning camera, but at this point, it’s just excess baggage.
Lightroom
If you use Adobe Lightroom instead of Photoshop, you should be aware that Lightroom will (if enabled) record to the XMP (in the [XMP-crs] properties specifically) its Develop Module settings for the current state of a file.
This allows a file that is taken out of Lightroom and re-imported to appear as it last did in Lightroom. The file may be in color, but if it was last seen as black and white within Lightroom, Lightroom knows to present it in black and white. That’s generally a good thing.
If a photo is exported from Lightroom though, the Develop data is still seen in the metadata of the newly-written export file. In the black and white example, the exported file really will be black and white. It no longer can be turned back into color. At that point, knowing the settings that made it black and white doesn’t serve a purpose that I can see.
So, Lightroom users can face the same challenge as Photoshop users: once useful, but now obsolete, metadata wasting space and muddying up the waters.
When can we expunge metadata?
The important concept here is that metadata needs to be considered in the context of the lifecycle of the photo.
Regular readers of this blog know that I take archival data preservation seriously. But at the point in its lifecycle when a copy of a photo is ready to be published, it’s by no means an archival document. A file at that point in its life has been toned, resized, and sharpened in anticipation of display on certain devices. It has been through at least one vicious round of lossy compression.
(Need I point out that in the previous paragraph, the key word is “copy”? Don’t ever do any of this stuff – resizing, sharpening, compressing – or the redacting trick I’m about to share – to an original or archive copy of a photo!)
Getting rid of “bad metadata” without losing “good metadata”, step-by-step
An ounce of prevention never hurts: If you use Photoshop, consider whether you’ll ever need that history log.
And while you’re there in Photoshop’s preferences, if you’re preparing photos for the web, you might consider turning off embedded previews or setting Photoshop to ask before it adds them, as well.
NOTE: With the history log turned off, Photoshop will still make metadata entries that can be potential Twitter fodder.
(That’s Photoshop ‘Preferences > History Log’, for the log, and ‘Preferences > File Handling’ for the previews.)
Lightroom users should probably NOT turn off the embedding of that develop data. See this post for an explanation. However, there is a setting in Preferences that allows you not to embed Develop data, while still embedding normal metadata, if that’s the way you want to go.
Cleaning up your metadata
Removing wasteful or potentially harmful metadata that’s already on your photo can be easy or less easy, depending on your needs and what other metadata is already on the photo.
If the photo hasn’t had its IPTC metadata applied yet, just use the ‘Delete Metadata’ function in Photo Mechanic or any of a dozen other programs and strip out all the metadata. Then, apply your important IPTC metadata and you’re good to go.
In the more likely case that the photo already has caption and copyright metadata, or maybe if you want to preserve the GPS coordinates, things get trickier.
The steps
Let’s say there’s IPTC metadata already on the photo, and you want to be rid of the XMP-Photoshop data. But remember, your IPTC data lives in the XMP, too. In this case, you’ll need to copy out all the IPTC metadata, including the parts that live in XMP, and paste it back later.
(Yes, a subset of IPTC data also exists elsewhere in the file in the old IIM format. We can ignore that right now. See this post for an explanation.)
If you work in Photo Mechanic, choose ‘IPTC Snapshot > Take’ from the right-click context menu for your photo. This will copy only IPTC metadata. Then, in the main menu, go to ‘Tools > Delete Metadata’ and delete all the metadata. Now go back to IPTC Snapshot and paste back your valuable IPTC metadata. You will now have all the “good” metadata and none of the obsolete junk.
NOTE: Not all metadata is stripped away. A very minimal subset, some of which is needed for the file to function, will remain.
If you use XnView, the procedure is similar. Instead of making an IPTC Snapshot, you overwrite a sacrificial metadata template with IPTC data from your photo and load it back again after using the ‘Tools > Metadata > Clean’ function to strip the unwanted data.
NOTE: XnView cannot write extended IPTC metadata to XMP. So, you’ll only get basic IPTC metadata when you paste back. If there is information in the extended IPTC fields, it will be lost.
In other programs, you will most likely have to cut and paste from the metadata to a text file and back again, field by field.
Practice on a file that doesn’t matter to you (a copy of a copy) before you try any of this for real!
Preserving GPS data
Let’s say you want to preserve GPS information from the Exif metadata.
For Photo Mechanic users, I have a fairly simple suggestion. But it won’t work for everybody.
In Photo Mechanic, we can create an IPTC Stationery Pad template (Snapshot) that appends variables for GPS data – and whatever supporting text we want – to the caption of a picture. The template will look something like this:
(In Caption. Remember to select “Append”)
Latitude: {latitude} Longitude: {longitude} Altitude: {altitude}
Once applied to the photo(s) (this step can be done in a batch), the caption will contain the GPS information.
Now if you do the ‘IPTC Snapshot > Take’ and ‘IPTC Snapshot > Paste’ procedure from above, all the Exif and XMP data will be gone. The GPS coordinates will appear in the caption, and you’ll be good to go.
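If you don’t have Photo Mechanic, the same idea (append the coordinates to the caption before stripping) is easy to picture in a few lines of Python. This is only an illustration of the text substitution: the function name is hypothetical, the coordinate strings are invented, and the exact way Photo Mechanic formats its variables may differ.

```python
# Same wording as the caption template shown earlier in this post.
TEMPLATE = "Latitude: {latitude} Longitude: {longitude} Altitude: {altitude}"


def append_gps_to_caption(caption: str, latitude: str, longitude: str,
                          altitude: str) -> str:
    """Append the filled-in GPS template to an existing caption."""
    gps_text = TEMPLATE.format(
        latitude=latitude, longitude=longitude, altitude=altitude
    )
    # strip() keeps the result tidy when the original caption is empty.
    return f"{caption} {gps_text}".strip()


if __name__ == "__main__":
    # Invented caption and coordinates, for illustration only.
    print(append_gps_to_caption(
        "Official portrait, example caption.", "48.8704 N", "2.3167 E", "42 m"
    ))
```

Because the coordinates are now ordinary caption text, they survive any step that preserves the caption, whatever happens to the Exif block.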
Good to go, that is, if GPS data in the caption works for you. What if you want to use a script or a WordPress plugin that looks for the GPS data in the Exif?
In that case, you could just decide to leave all the Exif data alone. In most cases, it is pretty innocuous. (Just sayin’, ’cause removing all the rest of the Exif and leaving the GPS info won’t be fun.)
You could copy the GPS data out to a text file and write it back into its proper place in the Exif metadata, one field at a time. Adobe Bridge can do this. It can write values back to the Exif GPS fields, but it has no multi-field copy and paste and no metadata stripping function. It looks tedious. But if you’ve got to, you’ve got to.
An even more powerful tool
Then there’s ExifTool. ExifTool is a powerful command line program. It can read and write virtually any property, and can, in theory, do any darned thing you want to photo metadata. (I used ExifTool extensively in the preparation of this post.) It appears that ExifTool can strip away all the Exif tags from our metadata, while leaving the GPS tags intact and in place. Here’s the command:
exiftool -EXIF:all= -tagsfromfile @ -GPS:all FileOrDirectory
(Note that each option starts with a plain hyphen. If you paste the command from a web page, make sure none of them has been turned into a typographic dash.)
I’ll go into detail in a future post. In the meantime, a search on the ExifTool users’ forum should find the thread where a very kind person helped me find the command.
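For anyone who would rather run that ExifTool command from a script than type it, here is a small Python sketch. It assumes ExifTool is installed and on your PATH, and the function names are my own invention. The arguments are exactly the ones in the command above, with plain ASCII hyphens before each option, passed as a list so that file names with spaces don’t need quoting.

```python
import shutil
import subprocess
from typing import List


def strip_exif_keep_gps_command(target: str) -> List[str]:
    """Build the argument list: clear all Exif tags, then copy the GPS tags
    back from the same file ('@' means 'the file being processed')."""
    return ["exiftool", "-EXIF:all=", "-tagsfromfile", "@", "-GPS:all", target]


def strip_exif_keep_gps(target: str) -> None:
    """Run the command on a file or directory. Use a COPY, never an original."""
    if shutil.which("exiftool") is None:
        raise RuntimeError("exiftool was not found on PATH")
    subprocess.run(strip_exif_keep_gps_command(target), check=True)


if __name__ == "__main__":
    # Only prints the command; call strip_exif_keep_gps() to actually run it.
    print(" ".join(strip_exif_keep_gps_command("copy_of_photo.jpg")))
```

By default ExifTool keeps the untouched file alongside the rewritten one, with _original appended to its name, which is a welcome safety net. It is no substitute for working on a copy in the first place.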
Not an everyday threat, but stay mindful
It can be a little fiddly, but if you have a picture with internet flame potential, apply an ounce of prevention and you should be able to keep the trolls at bay while giving users and suppliers the meta-information they need and deserve.
If you do want to use ExifTool to prepare photos for publication, post in the comments and I’ll do some research and post what I find. If you know some great ways to use ExifTool, please share them with us.
Yes, there are “off-brand” desktop programs that edit and strip metadata, probably dozens of them. To be blunt, they pretty much all suck. It gives me a headache looking at them. If you find one that doesn’t suck, please, by all means, post in the comments and I’ll check it out.
It turns out that you CAN strip Exif data and leave your GPS information intact. I updated the ExifTool paragraph accordingly. It looks like you can also strip all XMP data while leaving the IPTC information that lives in XMP in place, too. But, for most people, the copy-strip-paste back method that I outlined in this post would be a better choice.
Thanks to the very helpful people on the ExifTool forum for turning me on to the commands to accomplish this magic.
I’ll have an ExifTool “How-To Lite” post shortly that will explain how to use the tool to handle a couple tasks that our GUI tools have trouble with.