26 comments

  1. Jim says:

    Nice article, thank you. I have a question though about WordPress and metadata. When WordPress makes versions of your images at smaller sizes, does the metadata, like copyright and author, stay with the new images? Often it’s a smaller version used in a page, so it wouldn’t be good if those data were missing.

    • Carl Seibert says:

      Good question. I just took a quick look and I’m sorry to say that you might not like what I found. This is pretty gnarly.

      Windows 11 properties dialog wrote its “Title” field to the Exif Image Description field, a field in the Exif that Microsoft has chosen to call “XP Title”, the IPTC / IIM Caption (at least that can be read by standards-compliant software, although in the real world the caption is not the same thing as the title), the XMP Title / Object Name field, and to round it out, the XMP Description/Caption field.

      Windows wrote “Tag”, which I presume to be keywords, to something in the Exif that it calls “XP Keywords”. Holy moly. Windows did, at least, also write the keywords/“Tags” to the IPTC / IIM Keywords field and the IPTC’s XMP Subject (keywords) field. So your keywords are fairly safe.

      The “Subject”, which I presume to be the caption, was written only to “XP Subject” in the Exif.

      There are also “XP Author” and “XP Comment”. Altogether, there are five only-in-Microsoft’s-alternative-world nonstandard fields in the IFD0 directory of the Exif. The Exif is really not the place for descriptive metadata in the first place, but oh, come on! There are standard Exif fields for Creator (Artist in Exif), Copyright, and a caption (Description in Exif). Some programs will copy values out of those fields into the appropriate IPTC fields. But putting information in fields that nobody’s ever heard of?!

      They did OK with the Creator and Copyright fields, writing them to all three standard places (in the XMP, IIM, and Exif).

      If “subject” means caption, then the caption was buried so you couldn’t find it with a Geiger counter.

      They wrote “Title” to the places where a caption should go (As well as the Object Name field). So, if you know that, I suppose you could write captions in that field.

      Now, there’s no guarantee that Windows will write this data in the same places consistently. Microsoft has been known to apply logic to make finding the metadata a dynamic adventure. Even if it is consistent, this is pretty much a mess.

      The Windows Photos app appears to be able to write only a Description/Caption, which it wrote to the XMP Description and Exif Description fields, but not the IIM one, oddly enough, as well as to the crazy “XP Description” field in the Exif. Plus the Title/Object Name field (overwriting any legit info that might have been there).

      Remember that the OS itself wrote the “Title” to the IIM Caption field. The IIM matters because, while it was supposed to be phased out a decade ago, it remains the only metadata block that many programs can read. For the life of me, I couldn’t find any sort of keywords field in Photos.

      So there you have it. It’s crazy enough that unexpected behavior can be expected. Metadata is important. My advice is, as always, to use an IPTC standard-compliant program to write your metadata, so you can be assured that it can be read later. Photo Mechanic, Lightroom, ON1 RAW, Bridge, Capture One, and, if used carefully, XnView can all do a good job.

      -Carl

  2. Carl Seibert says:

    Hi Jim,

    >so it wouldn’t be good if those data were missing.

    Darn right!

    Whether WordPress honors metadata on the various versions of the file created when you upload to the Media Library depends on the imaging library your server is using. By default, WordPress will use ImageMagick, if it is installed and activated. If that’s the case, you’re good to go. All your versions will have proper metadata. There’s a bunch of information on this in this post: https://www.carlseibert.com/wordpress-honors-metadata-sort-of/

    If ImageMagick isn’t on your server (like, say, here; we haven’t got it set up on the server that serves this site yet, sadly), WordPress will use the GD library (that’s the name, not what I think about it) that ships with PHP. GD destroys metadata. It doesn’t do as well with the images themselves, either. If you’re stuck with GD, there is a workaround, which is described in this post: https://www.carlseibert.com/wordpress-metadata-workaround/

    Upcoming, I plan on doing a chart of hosting providers that support ImageMagick.

    • Carl Seibert says:

      Good work! I took the liberty of saving a copy of your chart for future reference.

      It’s a pity Microsoft appears to be making progress in the wrong direction.

      When I was working on my Creative Commons posts, I came across several documents from about the time Adobe released XMP as an open standard that were all full of optimism about how everybody was going to migrate to XMP and it would all make sense – very soon – from that point in 2009 or so. Microsoft was mentioned. Of course, it hasn’t exactly worked out that way 🙂

      • Sure, it is a Google Docs document. I am still finding out some behaviors and revising it. Most recently I added some old Digital Image Library keyword tags which WPG also reads, and then, interestingly, moves to XMP when edited.

        For quite some time I have been digitizing all of my family’s photos, painstakingly cataloging and adding metadata as a hobby. I started using Microsoft Digital Image Suite’s Library app, which later evolved into Windows Photo Gallery, which I still use today for adding “People Tags” even though it is now unsupported. I also use GeoSetter, which I highly recommend for geotagging and also adding other metadata. Among the things I like about GeoSetter is that it uses exiftool and allows you to execute post-editing exiftool commands, which I use to ensure certain tags are in sync.

        I post from time to time about my metadata struggles on my blog – https://jmoliver.wordpress.com

  3. Joep Bord says:

    Now that I want to start a database for my pictures, I came across your fine article. I have just started to struggle with comparing all kinds of programs. I need two things: a good metadata system and a good program to process my ARW and CR2 RAW files, with the possibility of local editing. I would prefer to combine those two wishes. For metadata I followed your advice to use DigiKam or XnView.
    I found out that I can use Darktable for RAW editing. The problem is that it is very hard to get these programs to interchange data. Darktable reads all the keywords, titles, etc. from DigiKam, but after changing or removing a keyword in Darktable, DigiKam doesn’t read that back.
    In your comparison you didn’t mention Darktable, which is an interesting program.
    What do you think of this program, and do you know a solution for this problem? It can be one program that does both or a good combination of two programs. What about Capture One, for example? Quite expensive for private (but serious) use, but cheaper than Lightroom.

    • Carl Seibert says:

      Hi Joep,

      I’ve only briefly played with Darktable, so I’m of no help there.

      Since you mention Capture One, I’m going to guess that you’re on Windows or Mac. In that case, I would commend ON1 Photo RAW to your consideration. It’s an all-in-one, like Lightroom or Capture One, but it has some advantages that I take seriously over either. By default, it does not depend on a central database for its non-destructive editing information. It handles that in sidecar files. That means that you can work on photos and archive them any way you want without having to worry about that database. ON1 is completely standards-compliant with metadata, and its metadata functionality is efficient to use. On the archiving side, you simply set watch folders and it catalogs anything that’s put in them (like the way DigiKam works). I like its RAW editing, too. It’s the program I actually use. I don’t usually use it alone. I do a lot of stuff in Photo Mechanic, which works perfectly with ON1. ON1 is inexpensive. Right now, it’s only $100 USD, and frequently they have it on sale for even less.

      -Carl

      • joep Bord says:

        Hi Carl
        Thanks for your advice. Yes, I am on Windows. I will try ON1. BTW, I found out that Darktable also makes sidecar files, like DigiKam. Although the XMP files have a different structure, there is an interchange of tags/labels between the XMP files of Darktable and DigiKam. You can find the tags made in DigiKam in the XMP files of Darktable and vice versa. But Darktable itself doesn’t show the tags and titles made in DigiKam.
        Both DigiKam and Darktable have problems connecting to my Sony a7III.
        Darktable and the camera work reasonably well with tethering, but not with bringing files directly from the camera to the program/computer.
        Joep

      • Joep Bord says:

        Dear Carl,
        I followed your suggestion and tried ON1. It was worth a try. ON1 does a perfect job with metadata and has some cool editing features. The problem with ON1 is that it is too slow in editing and uses too much computer capacity. The CPU often runs at about 95% while the GPU is used at 25% at most. At this moment I prefer other programs for editing, in combination with DigiKam or XnView for searching. I hope ON1 will soon develop into a perfect solution. It is promising but not there yet.
        Joep

        • Carl Seibert says:

          Indeed, ON1 is not the fastest bunny in the race. I find it exports significantly slower than Lightroom and (along with every other program) WAY slower than Capture One. Although, frankly, the slow export hasn’t really bothered me. I don’t see it hogging machine resources on my laptop. Right after starting the program, while it is updating my cataloged folders (of which I have a fair number), it will make the fans spin fast. But that scan is a low-priority process and it doesn’t seem to affect anything, including ON1 itself. Good news is that they make it faster with each release. So, maybe someday….

  4. hackerb9 says:

    Thanks for the informative post!

    I am writing a command line program to quickly view and delete images
    within a terminal window. I wanted to put in the ability to edit
    comments, but had no idea what a can of worms I was opening. After
    reading the IPTC recommendations I was even more confounded —
    “headlines” sound more like what I call “titles” and “titles” are what
    I’d call “filenames” — but this blog post did help, somewhat.

    I do have a couple of questions (for you or your knowledgeable readers):

    1) What is your suggested order for reading metadata? I was going to
    go with XMP > IIM/IPTC > EXIF, but it seems like I’d be in the
    minority. And, since some programs you mentioned cannot actually
    write XMP, it seems like the EXIF or IPTC data has a better chance
    of being current.

    2) Which tags should I make it easy for people to edit and what should
    I call them? “Caption” (AKA “Description”) seems important, but
    what about “Comment”, “Title”, “Headline”, “Subject”, “Label”, etc?
    Which ones do people actually use? And how do the tag names people
    are familiar with map to the underlying XMP/IPTC attributes?

    • Carl Seibert says:

      Hi,

      Thanks for reading!

      For a reading order, I usually recommend a simple XMP > IIM > Exif. The machinations in the Metadata Working Group guidelines of 2010, in addition to being complicated, were predicated on assumptions that don’t make sense anymore. (To me, anyway.) Back then, it was assumed that the world would move quickly to XMP and IIM would soon be deprecated. Hasn’t happened. In that world, it was assumed that rogue, out-of-sync data would generally be written by an archaic IIM-only-writing program. In today’s world, out-of-sync data is just as likely to be written in the XMP. Exif has always been a political football.

      My logic is that XMP doesn’t pose tag length restrictions and is thus more likely to be authoritative than the possibly truncated IIM. If any descriptive data exists in the Exif, it was probably put there by the camera, would most likely have been superseded by, or transferred to, the IPTC fields. If it’s out of sync, it’s pretty likely NOT to be as authoritative as the more recently written IPTC. Few programs can read it anyway. It’s just not where descriptive data is SUPPOSED to be.
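
A minimal sketch of that XMP > IIM > Exif reading order, in Python. It assumes the three candidate values for a field have already been extracted by some other tool; the `pick_value` function and the `"xmp"`/`"iim"`/`"exif"` dictionary keys are hypothetical names chosen for illustration, not part of any library.

```python
from typing import Mapping, Optional

# Priority order suggested above: XMP first, then IIM, then Exif.
READ_ORDER = ("xmp", "iim", "exif")


def pick_value(values: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty value, checking blocks in READ_ORDER."""
    for block in READ_ORDER:
        value = values.get(block)
        # Treat missing, empty, and whitespace-only values as "not present".
        if value is not None and value.strip():
            return value
    return None


# Hypothetical values for one caption field, as read from a single file.
caption = pick_value({
    "xmp": "",
    "iim": "Grandma at the lake",
    "exif": "OLYMPUS DIGITAL CAMERA",
})
print(caption)
```

The fallback only moves to the next block when the higher-priority one is empty, so an out-of-sync file silently yields the XMP value. Detecting the mismatch is a separate check.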

      I have a post here describing all the fields and, IIRC, their many and various names through the years. I have to confess I don’t do much to help the confusion. I often refer to fields by their traditional names, rather than their proper ones.

      (It also doesn’t help that ExifTool uses its own naming convention, with IIM fields usually called by their old-school names and XMP ones by the new names. The fields, of course, are the very same fields. The field is one entity in the schema; it’s just written in duplicate.)

      Fields that I think matter: Caption/Description is obvious. As are the Creator and Copyright fields.

      (Google Images supports Creator, Copyright, and Credit, BTW. Credit is poorly understood and rarely used outside of the publishing and stock photo industries. That Google chose to include it is a testimony to the fact that they were, shall we say “urged” to do the right thing by a pending lawsuit by a company entrenched in both of those industries.)

      Keywords is important, too. I would avoid dallying with the special Lightroom hierarchical keywords field. Just the normal ones will do fine. It’s very important to always put keywords in the standard IPTC place. There are unfortunately programs out there that write them to idiosyncratic places, causing tragic data losses when people migrate away from them.

      Title/Object Name and Headline are interesting in that they are used differently by different organizations. Headline is supposed to be just exactly that. Object Name is supposed to be a “slug”, a human-readable not necessarily unique identifier for the image. But usage in the real world is all over the place. I doubt most individual photographers use either one of them.

      “Subject” is the XMP place for keywords. It is synchronized with the IIM “Keywords” field. The pair are always labeled “Keywords”.

      Rating and Label are the fields that carry the star rating and color label fields that most GUI programs use. Both live in the XMP. Image Rating is an official IPTC field and its use is standardized. Label is not part of the standard and its use is all over the place. I recommend using the Adobe Lightroom color set as it’s something of a lowest common denominator. (I covered labels in, I think, the Photo Mechanic 6 post or video.)

      Comments is actually not an IPTC field. It’s in the Exif and is not terribly widely supported. It makes no sense, frankly. The Exif is generally written to by the camera and how is a camera going to comment on anything? The closest thing to a real comments field is Special Instructions. (Which, refreshingly, is never called anything else.)

      Now that I’ve brought it up, there are three Exif fields that are semantically equivalent to IPTC ones, “Artist”, which is Creator by yet another name, Copyright, and Description. Caption/description is useless because cameras can’t write anything useful there. Cameras can do just fine with Copyright and Creator because those are standing values, the same for every picture. But information in those fields doesn’t do much good unless it is later transferred to the proper IPTC fields.

      The IPTC standard has the current official names for all of the fields and you can glean from it which fields are synchronized between the IIM and XMP. (There’s actually a chart in the standard. But it omits a couple of obscure fields that are actually synchronized, so it isn’t as authoritative as poring over the field descriptions themselves.)

      There is a link here to download the current IPTC standard. Just use the search function.

      And I would be happy to check over your work if you’d like.

      -Carl

      • Scott Meyers says:

        I’m embarking on a project to scan our family’s old and sizable slide collection, and your remark about “tag length restrictions” made my blood run cold. My plan had been to write metadata about scanned slides to the three description fields (Exif Image-Description, IIM Caption-Abstract, XMP dc:Description) in a bid to (1) ensure consistency between these fields and (2) make the information visible via as many programs as possible (hello, Windows’ File Explorer). I don’t anticipate writing giant amounts of information per slide, but the thought of length restrictions is troubling, and the remark at https://support.captureone.com/hc/en-us/articles/360003412157-XMP-and-IPTC that IPTC fields are limited to 32 characters is positively horrifying. Your mentioning of IPTC field truncation makes it worse. I’m now concerned that if I write a description longer than 32 characters, it will be truncated in the IPTC Caption-Abstract, but not in the Exif or XMP description fields, thus giving rise to the inconsistency writing to all three fields is intended to prevent.

        I’d be grateful for any information you can provide regarding tag length restrictions on these description fields, and I’d very much welcome your advice regarding the best way to represent general free-form descriptive metadata for scanned images such that the information is likely to be viewable (in its entirety) by as many people using as many programs on as many platforms as possible. The thing about scanning family slides is that the resulting images are likely to be distributed to lots of different people with various levels of technical sophistication who will view the images in a variety of different computing environments. I want to maximize the likelihood that what I do will work for everybody, both now and in the future.

        Unrelatedly, thank you *very* much for publishing the original article. It represents a huge amount of work (as does the writeup itself). It’s enormously helpful to those of us grappling to get a handle on the practical realities of working with this kind of metadata.

        • Carl Seibert says:

          Actually, the character limit for the caption/description in the IIM is 2,000 characters. So, in any but the most extreme cases, you should be OK. Different IIM fields have different character limits. Some are indeed 32 characters.

          Your concern is well-placed though. Character limits aren’t the only thing that can bite when reading back text fields from the IIM or Exif. XMP effectively has no character limits and has good character set support, which is particularly important in captions, since many names have diacritics and ASCII doesn’t do diacritics. Exif, I should point out, doesn’t have set character limits – although authoring software might – and some Exif fields can support more than ASCII, but this one doesn’t. (Pretty much everything about Exif sets my teeth on edge.)
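
As a rough illustration of those two pitfalls, here is a small Python sketch that flags a caption likely to be damaged on its way into the IIM or Exif copies. The 2,000 figure is the IIM caption limit quoted above; measuring it in UTF-8 bytes rather than characters is an assumption, and `caption_warnings` is a hypothetical helper, not part of any metadata library.

```python
from typing import List

# IIM caption limit quoted above. Counting UTF-8 bytes instead of
# characters is an assumption made for this sketch.
IIM_CAPTION_LIMIT = 2000


def caption_warnings(text: str) -> List[str]:
    """List the ways a caption might not survive outside the XMP."""
    warnings = []
    if len(text.encode("utf-8")) > IIM_CAPTION_LIMIT:
        warnings.append("IIM caption may be truncated")
    if not text.isascii():
        warnings.append("Exif Image Description may not preserve non-ASCII characters")
    return warnings


print(caption_warnings("José and Zoë at the lake, 1962"))
```

A plain-ASCII caption under the limit returns an empty list, which is the case where all three copies can be expected to match.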

          All of this is why I advise any software developer who asks to prefer – for fidelity’s sake – to read XMP and then go to IIM and Exif, in that order, if the XMP doesn’t exist.

          In order to populate all three fields, you’ll need metadata authoring software that will write to all three fields. Of those that I cover, ON1 Photo RAW, Lightroom Classic, and the Adobe products that use Photoshop’s metadata functionality (Photoshop, Bridge, Illustrator, and some others) will all write your caption to all three places.
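
For anyone comfortable with the command line, ExifTool (mentioned elsewhere in this thread) can also write one caption to all three places. The Python sketch below only builds the argument list and does not run anything. The `exiftool_caption_args` helper is hypothetical, and running the result assumes ExifTool is installed and on the PATH.

```python
from typing import List


def exiftool_caption_args(caption: str, path: str) -> List[str]:
    """Build an ExifTool command that writes one caption to all three fields."""
    return [
        "exiftool",
        f"-EXIF:ImageDescription={caption}",   # Exif Image Description
        f"-IPTC:Caption-Abstract={caption}",   # IIM caption
        f"-XMP-dc:Description={caption}",      # XMP description
        path,
    ]


args = exiftool_caption_args("Grandma at the lake, 1962", "slide_0001.jpg")
print(args)
# To actually write the file, pass the list to subprocess.run(args, check=True).
```

Passing the arguments as a list avoids shell quoting problems when a caption contains spaces or quotation marks.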

          Photo Mechanic will (generally) write to the Exif field that’s semantically equivalent to an IPTC one only if Photo Mechanic finds that field already populated. Why, you might ask? Well, it’s a bit complicated. There’s a general reluctance to write stuff into logging data, which Exif is. That’s one thing. But there’s data in the Exif that might be incorrect and might need to be fixed. Think the Create Date, or Date/Time Original as it’s known in Exif. Or the Creator and Copyright fields. All of which can be written by the camera and can be wrong, meaning that a program like Photo Mechanic is obliged to overwrite those fields to keep the file correct and in sync. Every developer resolves the “Exif conflict” differently. Photo Mechanic tries not to fix it if it ain’t broke.

          This matters for you because the Exif Image Description field is a super-prime candidate for getting out of sync down the road. With some software reading it or writing to it and some software not, it’s easy to see what might well happen if your captions are edited in the future.

          Thank you for your well-founded comment and especially for caring about your photos’ legacy. It’s people like you who motivate me to do this.

  5. Roger says:

    Doesn’t look like he ever replied to, or possibly even read your thorough and authoritative response – but I did, and wanted to thank you for putting out all this very useful information, that I’m sure took many years to collect and analyze.

    You make it look obvious, simple even, in a way that only a true subject expert can master. Thanks for sharing this knowledge and your expert advice!

  6. I just copied the Photos library to my external Archive drive, then I deleted the default macOS Photos library, and made the new one my default in the Photos app. So, now only the pictures I take with my iPhone are transferred to the hard drive + backed up.

    I found this to be the easiest/best way, and not take up any space on my machine (because I don’t use Photos on my Mac at all), but at the same time, I get my pictures from the phone archived and a good backup.

  7. Scott Meyers says:

    Regarding your comments about Apple Preview, I just did some tests using the most recent version of MacOS, and at least some XMP fields are now read if Exif and IIM information is missing. So Apple Preview now seems to be IIM > Exif > XMP. Shockingly, my tests indicate that this is regardless of the tab you’re looking at in the Preview Inspector. So if all of Exif, IIM, and XMP metadata are in the file, you’ll see IIM metadata on the Exif and TIFF tabs, not just on the IPTC tab. If IPTC and Exif metadata is missing, but XMP is present, you’ll see XMP field values on the Exif and IPTC tabs (as well as the TIFF tab). I don’t know what the TIFF tab is supposed to show, anyway, but since all metadata in all tabs seems to follow IIM > Exif > XMP, the different tab labels seem to be purely cosmetic.

    If you have time to run some tests of your own, I’d be interested to know if you can reproduce the above. I was really shaken to see IPTC information on the Exif tab.

    • Carl Seibert says:

      Very cool. Thank you for doing that. This is great news (I think) as far as some support for XMP data is concerned. The weirdness of things popping up here, there, and everywhere is a bit troubling. But a small step forward is still a step forward.

      I have been massively busy lately and I haven’t devoted as much attention to the blog as I should. I need to go back and refresh this whole post, looking for developments like the one you have so kindly documented.

      • Scott Meyers says:

        It’s interesting that you focus on the XMP support, because I can’t get past a tab labeled “Exif” that shows IPTC metadata if the file has IPTC metadata in it. If everything is in sync, this is harmless, but if you’re looking at the metadata to check to see *whether* the IPTC and Exif info are consistent, you’ll come away thinking they are, even if they aren’t. Ditto if the file has only XMP data, because you’ll then be presented with tabs for Exif and IPTC claiming to show metadata that is not actually in the file.

        I can’t forgive an app that lies, and Apple’s Preview does when metadata in Exif, IPTC, and XMP are not consistent. That XMP can now be part of the lie is hardly consolation.

        • Carl Seibert says:

          “Unforgivable” is indeed a good word for it. And, honestly, this stuff isn’t that hard. If teeny little companies can get it right, the world’s most “valuable” corporation should be able to get it right.

          You can check to see if a file is in sync with the IPTC’s online tool at getpmd.iptc.org (There’s a link below in the footer.) Out-of-sync files are a real problem. I don’t know of any programs that people would be expected to use to work with their files that check to see if the file is in sync. Yes, a good program like Photo Mechanic will write your edits in sync. But if the file was out of whack, you’ll never know what you just wrote over.
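
The idea behind such a sync check can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the IPTC tool works: it assumes the values for one field have already been read from each block, and `is_in_sync` and the dictionary keys are hypothetical names.

```python
from typing import Mapping, Optional


def is_in_sync(values: Mapping[str, Optional[str]]) -> bool:
    """True if every populated block holds the same text for a field.

    Empty or missing blocks are ignored. Runs of whitespace are collapsed
    so that trivial formatting differences do not count as a mismatch.
    """
    populated = {
        " ".join(value.split())
        for value in values.values()
        if value and value.strip()
    }
    return len(populated) <= 1


print(is_in_sync({"xmp": "Grandma at the lake", "iim": "Grandma at the lake", "exif": None}))
print(is_in_sync({"xmp": "Grandma at the lake", "iim": "Grandma", "exif": ""}))
```

Note that a legitimately truncated IIM copy would be reported as out of sync by this comparison, which is arguably what you want to know before overwriting anything.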

  8. RK says:

    Hi, thanks for a very informative article. I mostly do tagging, and I do it in Windows Explorer or Photo Gallery. Since you mention how Windows obtains the metadata it displays, it would be great if you could also write about where it updates data, e.g. when the user views the Title of an image and modifies it. I think they update at least both Exif and XMP. Thanks
