The principal of Carl Seibert Solutions and the owner of this site, Carl Seibert has become a metadata crusader. From clients who need to bring order to their asset collections, to website owners, to Creative Commons activists, the digital world needs to take advantage of better metadata. Carl has made it a mission to spread the [meta]word.
What are keywords? Why do you want them? Why is there air? Keywording is probably the trickiest wicket in the whole metadata game. Your keywording regime requires more forethought than most any other component of your workflow.
A good keywording approach depends heavily on a specific understanding of your collection, your searching needs, and the capabilities of your archive system.
There are lots of shades of gray here. Keywording can be controversial.
Looking for something? I just added a search engine to this blog.
If, say, you’re interested in using ExifTool to work with GPS data, you can now search exiftool gps, instead of reading through every post. (ExifTool and GPS are both mentioned in several posts, but closest to what you want, as of this writing, is semi-hidden near the bottom of the Emmanuel Macron portrait post. So, yeah, it was becoming pretty obvious that I needed to add that search functionality.)
(Note that I’m using orange type when I talk about search terms here because quotation marks have a particular meaning in the world of search and it would be confusing as all get out if I used them for, you know, quotations.)
There is a search box in the footer of every page and post, and the main menu at the top of every page and post now has a link to a search page.
Let’s get Boolean
The new search engine connects search terms with a Boolean AND operator. That’s like the default in Google back in the day or the “Must Include All Of” option found in many search functions.
So, if you enter two search terms, like joe photographer, a (hypothetical) post that included “Joe Smith is a great guy” and “Suzy is a great photographer” would return.
AND means that a content item that contains both this and that and some-other-thing meets the criteria and will return. AND searches return few results (hopefully including what you were looking for). AND can be hard to wrap your head around. Another way of thinking about AND is that, on a Venn diagram, it’s the intersection. If thinking about Venn diagrams is how you roll.
You wouldn’t be blamed for thinking at first that an AND search would return, all summed up together, the results of individual searches for this, that, and some-other-thing. (That would be a union, in Venn terms.) You’d be wrong, but you wouldn’t be blamed.
OR searches are like summing up the results of separate searches. An OR search for joe photographer would return any posts that mention Joe in any way, plus any posts that mention photographer in any way. OR searches typically return tons of results that aren’t what you want. Venn-wise, OR is a union. In some programs, OR would be “Includes Any Of”.
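If sets are how you roll, here is a minimal sketch of the difference. It is not how the search engine on this site is actually built. The three-post "blog" is made up for illustration (only the first post echoes the hypothetical example above), and the function names are my own.

```python
# Hypothetical mini-blog: post id -> text. Only post 1 echoes the example above.
POSTS = {
    1: "Joe Smith is a great guy. Suzy is a great photographer.",
    2: "Joe bought a new lens.",
    3: "Every photographer needs a metadata template.",
}


def posts_containing(term):
    """Return the set of post ids whose text contains the term, ignoring case."""
    term = term.lower()
    return {post_id for post_id, text in POSTS.items() if term in text.lower()}


def search_and(*terms):
    """AND: a post must match every term (the intersection of the result sets)."""
    return set.intersection(*(posts_containing(t) for t in terms))


def search_or(*terms):
    """OR: a post may match any term (the union of the result sets)."""
    return set.union(*(posts_containing(t) for t in terms))


print(sorted(search_and("joe", "photographer")))  # [1]
print(sorted(search_or("joe", "photographer")))   # [1, 2, 3]
```

AND hands back the one post that mentions both terms. OR hands back everything that mentions either, which is why OR results pile up so quickly.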
That said, if you want to do OR searches here, I can buy an upgrade that makes that possible. Speak up in the comments. If enough people pester me about it, I’ll do that.
Our new search engine allows you to use double quotes to search for an exact phrase. So “joe photographer” would return only posts that mention Joe Photographer specifically, excluding examples like the Joe Smith one above.
Partial strings are supported if the missing letters are at the beginning or end of a word. photo and grapher will both return posts with the word photographer. But graph will not.
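For the curious, the phrase and partial-string rules described above can be sketched in a few lines. This is my own illustration of the behavior, assuming words are split on anything that isn't a letter or digit. The real engine may well do it differently, and the function names are hypothetical.

```python
import re


def words(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_partial(term, text):
    """True if the term matches the start or the end of some word, not the middle."""
    term = term.lower()
    return any(w.startswith(term) or w.endswith(term) for w in words(text))


def matches_phrase(phrase, text):
    """True if the words of the phrase appear next to each other, in order."""
    target = words(phrase)
    found = words(text)
    n = len(target)
    return any(found[i:i + n] == target for i in range(len(found) - n + 1))


post = "Joe Smith is a great guy. Suzy is a great photographer."
print(matches_partial("photo", post))           # True
print(matches_partial("grapher", post))         # True
print(matches_partial("graph", post))           # False
print(matches_phrase("joe photographer", post)) # False
```

The last line is the Joe Smith case: both words are in the post, but never side by side, so the quoted phrase doesn't match.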
NOT searches are not supported. Sadly.
Fancy search engines that I can’t afford (and would not likely be found in the sort of desktop software that most of you will use to manage your photos) allow users to string Booleans together like mathematical equations to make elegant searches. (joe OR photographer) NOT smith would return any posts that include either of the words joe or photographer, but would exclude anything that mentions that Smith guy.
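In set terms, that fancy query is a union followed by a subtraction. Here is a rough sketch with three made-up posts. The posts and the hits helper are hypothetical, and this is not a description of any particular search product.

```python
# Hypothetical posts, invented for this illustration.
POSTS = {
    1: "Joe Smith is a great guy.",
    2: "Suzy is a great photographer.",
    3: "Joe bought a new lens.",
}


def hits(term):
    """Return the set of post ids that mention the term, ignoring case."""
    term = term.lower()
    return {post_id for post_id, text in POSTS.items() if term in text.lower()}


# (joe OR photographer) NOT smith
# OR is a union (|), NOT is a set difference (-).
result = (hits("joe") | hits("photographer")) - hits("smith")
print(sorted(result))  # [2, 3]
```

Post 1 matches joe, but it gets tossed because it also mentions Smith.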
So today we have a new search tool and a primer on Boolean searching. Enjoy!
Not finding what you’re looking for, even with the search functionality? It’s entirely likely that I haven’t written about it yet. Boot me into action in the comments.
Which IPTC metadata fields do you need to fill out for each of your pictures? Which ones do you take care of with your template? Do you need to add metadata to all your photos, or just a subset? Enquiring minds want to know.
Users discover that their iPhones are using AI image recognition technology to tag their pictures. Of their underwear. Gasp! A ripple on the internet ensues. But for real, can machine learning image recognition be useful? After some snarkiness, we take a look.
OK. So what, exactly, is it that I want you to do about this metadata thing?
If you give birth to photographs – label them properly with a caption, copyright notice, and some contact information before you send them out into the world.
If you operate the means of publishing or distributing pictures, or if you’re just a cog in a great machine that does that, read the label to be sure you know what’s what and that you have rights to publish whatever it is before you publish.
If you run a website, make sure your server doesn’t strip the metadata labels, also known as Copyright Management Information, off of works that are published or distributed on your site.
What do I (you) get out of it?
If you’re a photographer, you get the warm and fuzzy of knowing that your work has a fighting chance of surviving. Maybe, years from now, somebody will look at that picture, understand what it is about, and who you are. Maybe that somebody calls you up to buy a license instead of stealing your work. (Or to ask your permission to use it, even.) Heavens to legacy.
In your own life, it means that when you have 50,000, or 500,000, or a million photos in your collection, you’ll be able to find the one you’re thinking of without spending hours or days looking for it.
If you’re licensing your work to the future through Creative Commons or some similar means, it means that, well, that will actually work. Your work won’t just go in the dustbin after one use. Your name, the license information, and supporting data will be right there in the metadata and your work can be used again and again.
If you’re a publisher, metadata on a photo gives you the opportunity to be an honest person. (Without having to break your back about it.) That doesn’t suck. You know that you really do have rights to use that photo. You know for sure who’s in the photo.
You’re preserving culture
By not removing that copyright information, you’ll be following the law. The new, disruptive, novel, one-weird-trick way to not get sued in the intellectual property biz is to follow the copyright law. (A bold strategy if there ever was one. We should make up an acronym for it.) It’s an easy warm and fuzzy. Taking one more threat that might destroy your business, even if it isn’t a statistically huge threat, off the table is a good thing in my book any day. See this post.
And, if you have zillions of assets, you’ll be able to find the one you want, too.
How do you accomplish all this goodness?
Labeling your work with metadata is usually a two-step process.
Your copyright and contact information goes on your pictures automatically (All, or just the ones you might publish, or some that will serve as “signposts” when you are searching through your collection. It depends.) Depending on what software you use, templated information like that goes on your pictures all by itself when you download them from your camera cards, or it might take a couple clicks and a few seconds for each batch of photos. (Look around this site for software recommendations and instructions, metadata explainers, and even downloadable starter templates.)
Then, it will take (a little) effort to caption and keyword your final selections. Maybe a minute for each published photo.
(Read what the copyright office has to say about registering copyrights. It’s not really a metadata thing, but since we’re here…)
Website operators, or agencies, or publications:
When a photo comes to you, look at it. Are the rights OK? Does the caption seem to be accurate? It only takes a second (literally) to look.
Insist/encourage photographers, clients and whoever might supply pictures to you to label them properly in the metadata. If – excuse me, when – they don’t, (and some always won’t) mark up the picture yourself. Trust me, you’ll save more time, money and lawsuits than you invest.
Software to do this? Pretty much every creative on the planet has the Adobe suite. Adobe Bridge will get the job done. Not pretty, but done. XnView works great and it’s so cheap it’s ridiculous. One way or the other, you’ve got to look at the picture. It doesn’t really take any extra work to see what the metadata says. See my software articles for specifics.
If you run the backend of a website, make sure your server doesn’t strip away IPTC metadata where all that culturally and legally important information lives. (See this post and this one for more information on how metadata is structured within an image file.)
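If you'd like to check what your server actually does, one rough approach is to compare a picture before upload with the copy your site serves. The sketch below walks the header segments of a JPEG and reports which metadata blocks it sees. It assumes the usual homes for this data: Exif and XMP in APP1 segments, and IPTC-IIM inside the Photoshop APP13 segment (resource 0x0404). The function name is my own, and a real tool like ExifTool handles far more edge cases than this does.

```python
import struct


def metadata_blocks(jpeg_bytes):
    """Return the names of the metadata blocks found in a JPEG's header segments."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing start-of-image marker)")
    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost sync with the segment structure; give up
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:
            break  # start of scan: compressed image data follows, headers are over
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.add("Exif")
        elif marker == 0xE1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            # Heuristic: look for the IPTC resource (8BIM, id 0x0404) in the block.
            if b"8BIM\x04\x04" in payload:
                found.add("IPTC-IIM")
        pos += 2 + length
    return found
```

To use it, read both files as bytes (open the file in "rb" mode) and compare the two sets. If IPTC-IIM or XMP shows up in the original but not in the copy downloaded from your site, something on the server is stripping metadata.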
In the interest of full disclosure: You will pay a small – insignificant, really – price in page load time for the 8 KB or so of metadata that you’re preserving. We’re talking about a millisecond and a half per picture for fixed broadband in the US (2017), and about four milliseconds for mobile devices. By way of comparison, it takes 300 to 400 milliseconds to blink your eye. So – not too bad a bargain.
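If you want to run that arithmetic for your own audience, it's just size divided by speed. In the sketch below, the two connection speeds are assumptions I backed out of the timings quoted above, not measured figures, so plug in your own numbers. It also ignores latency and protocol overhead.

```python
def transfer_ms(size_bytes, megabits_per_second):
    """Milliseconds needed to move size_bytes at the given line speed."""
    bits = size_bytes * 8
    seconds = bits / (megabits_per_second * 1_000_000)
    return seconds * 1000


METADATA_BYTES = 8 * 1024  # "8 KB or so" of metadata per picture

# Assumed speeds, chosen only to be consistent with the timings in the text.
FIXED_BROADBAND_MBPS = 43.7
MOBILE_MBPS = 16.4

print(f"{transfer_ms(METADATA_BYTES, FIXED_BROADBAND_MBPS):.1f} ms")  # 1.5 ms
print(f"{transfer_ms(METADATA_BYTES, MOBILE_MBPS):.1f} ms")           # 4.0 ms
```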
If your website runs on WordPress, all you need to do is make sure your server is using ImageMagick (instead of GD) as its imaging library and important metadata will be preserved by default. Most hosting providers support ImageMagick, and many enable it by default. In the latter case, you don’t have to do a darn thing – except choose one of those providers. (In an upcoming post, I’ll publish the first edition of a chart listing providers who support or enable ImageMagick.)
If the provider supports ImageMagick but doesn’t enable it by default, it’s usually just a matter of contacting customer support (it’s chat, usually) and the deed is done in a couple minutes.
If your site is on a different CMS, it’s more or less the same idea. You might have to specify a different imaging library or change the configuration of the one you have. Most big-time industrial CMSes already use ImageMagick as their imaging library. In those cases, we’re probably talking about updating a config file.
Hold the phone
I hear someone in the shadows calling out “What about social media? What about phones? Aren’t those things dominating the media landscape now?”
Sort of. We’re not really talking about throw-away content here. That’s the whole point.
But throw away or not, professional content has to be, well, professional. It’s critically important for facts to be right. We can’t afford to accidentally use the wrong photo, or the photo the social media user didn’t authorize. And the quantities of content in the omnichannel world are staggering. Great metadata, great digital asset management and care and attention to rights and attribution help make the difference between living and dying for people working in a social media world.
Social media tends to strip away metadata. But you still need to keep track of your assets. You should make sure every picture you put out there has metadata, regardless. If nothing else, you’ll be better able to keep track of the asset later. That stripped-off-by-the-media-company metadata may or may not carry the day in some future legal hassle, but it sure isn’t going to hurt. See this post for more on what metadata needs to be on a photo you release and what metadata shouldn’t be.
By the way, your copyright and byline will survive a round trip through Facebook. Everything else ends up on the cutting room floor.
Social media companies may seem like such behemoths that we can never change their behavior. A little pressure won’t hurt, though.
Make metadata on mobile
As for phones – tons of photos are made with phones today. More and more each day. While most pictures that find their way to publication pass through a computer-based workflow on their way there, some don’t.
Not to worry! There are good metadata authoring apps available for both Android and iPhone. I’ll be writing about the best for each platform soon.
Will doing this really help? Will it make a dent?
Yes. It will help you. It will make the environment around you better. Your life will be better and easier.
I just suggested that publishers and agency people insist that photos they pay for be properly marked up. Poof! In one stroke, most of the pictures on your plate will be find-able and easier to use. You’ll save time and money. Life will be good. (Or better, at least. Your health, your family life – those things metadata probably won’t help.)
Photographers will save back the time and effort of marking up their stuff and then some. And just how many calls offering reuse fees does it take to make your day brighter?
If push comes to shove and one day you need to sue a copyright infringer, and that CMI in the metadata makes the difference between a lawyer taking the case and getting a judgment or not, that investment in metadata will make for a happy day.
Good works can go viral
There are trillions of photos floating around out there. In terms of that giant pile, good efforts by you and your friends might not make a statistical dent. But the balance of karma around you will improve. Your life will be a little better. The business environment in your segment will be a little better. That’s better than a dent.
And communities are interconnected. Trends take hold. The content creation and publishing communities are big, no mistake. But if the players in your niche start doing a good thing, it will spread to ever wider and wider circles of influence. Good ideas can spread through whole industries in no time at all.
Have you done something with metadata that we all should feel good about? Dive into the comments. Brighten our day!
You can use web-based tools to view metadata on photos. While I doubt that’s earth-shattering news to any of you, a quick Google search on the subject returns breathless posts. “OMG! There’s metadata! Look! See!” Granted, we have a lot of educating to do if we are to improve the environment in which photos must live online, but it’s a bit over the top. Let’s exhale and see what, if any, useful resources we can find here.
I used to have an icon in my footer that linked to their manifesto, at http://www.embeddedmetadata.org/embedded-metatdata-manifesto.php (It’s not a link. Copy and paste it into a browser.)
I upgraded this site to secure all its traffic with SSL. The link to their still-non-SSL site was causing web browsers to issue security warnings to my visitors. I want you to be comfortable here, so that wasn’t good.
I should point out that neither their site nor mine (before the upgrade) is/was a danger. The old thinking was that SSL was only needed for sites that dealt with confidential information, like credit card data. Now, the feeling is that everybody should do SSL, and Google is making it a requirement for ranking in search results. Every website operator is somewhere in the process of switching over. The IPTC’s main site is already SSL-friendly, for example.
I’m a manifesto kind of guy, so for the time being, I’ll just quote the manifesto in its entirety for you right here.
Embedded Metadata Manifesto
How metadata should be embedded and preserved in digital media files
Photographers, film makers, videographers, illustrators, publishers, advertisers, designers, art directors, picture editors, librarians and curators all share the same problem: struggling to track rapidly expanding collections of digital media assets such as photos and video/film clips. With that in mind we propose five guiding principles as our “Embedded Metadata Manifesto”:
Metadata is essential to describe, identify and track digital media and should be applied to all media items which are exchanged as files or by other means such as data streams.
Media file formats should provide the means to embed metadata in ways that can be read and handled by different software systems.
Metadata fields, their semantics (including labels on the user interface) and values, should not be changed across metadata formats.
Copyright management information metadata must never be removed from the files.
Other metadata should only be removed from files by agreement with their copyright holders.
More details about these principles:
1: All people handling digital media need to recognize the crucial role of metadata for business. This involves more than just sticking labels on a media item. The knowledge which is required to describe the content comprehensively and concisely and the clear assertion of the intellectual ownership increase the value of the asset. Adding metadata to media items is an imperative for each and every professional workflow.
2: Exchanging media items is still done to a large extent by transmitting files containing the media content and in many cases this is the only (technical) way of communicating between the supplier and the consumer. To support the exchange of metadata with content it is a business requirement that file formats embed metadata within the digital file. Other methods like sidecar files are potentially exposed to metadata loss.
3: The type of content information carried in a metadata field, and the values assigned, should not depend on the technology used to embed metadata into a file. If multiple technologies are available for embedding the same field the software vendors must guarantee that the values are synchronized across the technologies without causing a loss of data or ambiguity.
4: Ownership metadata is the only way to save digital content from being considered orphaned work. Removal of such metadata impacts on the ability to assert ownership rights and is therefore forbidden by law in many countries.
5: Properly selected and applied metadata fields add value to media assets. For most collections of digital media content descriptive metadata is essential for retrieval and for understanding. Removing this valuable information devalues the asset.
Now we’ll have a short chorus of “amens!”, please.
There’s good content on the Embedded Metadata site. Mostly, it tells you the same stuff I’ve been telling you. Which means you’ve been getting the straight dope here. I take that as a good sign. I encourage you to take a look around.
Hopefully, it won’t be long and I’ll be able to link to their site again.
Which instance of the IPTC metadata does your favorite application prefer? Inquiring minds want to know.
Let’s step back for a moment for some background. Because all things that should be dead simple usually aren’t, the IPTC metadata - important information like the caption, your byline, and copyright notice - is stored in multiple places in your file.
If you have captions on your photos, WordPress will place them on your page (or post) along with the pictures. If the details in the caption were correct when the photographer - you or whomever - originally captioned the picture, they’ll be correct on your site. That means less chance to make an error. (And less room for excuses if you do.)