Google Images will include copyright-related IPTC metadata
Google announced today that Google Image Search will support some IPTC metadata. In a blog post dated today, September 27, 2018, Google Images product manager Ashutosh Agarwal says that “Starting today, we’ve added Creator and Credit metadata whenever present to images on Google Images. … Over the coming weeks, we will also add Copyright Notice metadata.”
Google will read from the IPTC Creator, Creditline, and Copyright fields to expose the metadata information.
In the IPTC’s own press release, photo metadata guru Michael Steidl says, “Embedded IPTC photo metadata has an essential role for photos posted on a website. These fields easily show people searching for images who its creator and copyright owner is. We encourage all parties who post images on the web to fill in these IPTC fields.”
Damn right!
This is a huge win, folks. The IPTC, CEPIC (the Centre of the Picture Industry, which is an IPTC member and collaborated in the effort) and Google have made a giant stride. The motto of this blog says that “metadata empowers honest people”, and that’s just what happened today. Google has used its enormous weight to push forward the role of metadata, enhance copyright protections, and by extension, promote honesty itself.
What’s new
Users who find an image in a Google Images search will see a link for “Image Credits” on the photo’s search results page. Clicking through will reveal the metadata, first from the Creator and Creditline fields, and soon from the Copyright field as well. A further Google search should produce contact information for the copyright holder, from whom a license to use the image legally could be obtained.
What you need to know: The IPTC has published a Quick Guide for metadata for Google Images here.
Full faith and force
The key takeaway is that the “force of Google” has been imposed. Users with professional level skills know that copyright management information metadata can be read from an image on Google Images by simply downloading the image and looking at its metadata. Such users also understand that scarcely any images include any metadata, either because their creators didn’t bother to put it there, or because some website stripped it away.
And no one could help but appreciate the irony in Google’s “Images may be subject to copyright” disclaimer.
That changes today
As of now, photographers are on notice that if they wish their rights to be taken seriously, they need to sign their work in the metadata.
Website operators who want to please Google – which is to say all of us – will need to check to make sure their websites preserve embedded metadata. (See this post, and this one for information on making your site metadata-friendly.)
Designers and ordinary users will be able to see at a glance who owns a photo, assuming that person has labeled their work. (And there won’t be any excuse for not looking.)
Internet hosting providers now have more incentive to provide metadata-friendly default settings for their customers.
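For website operators, one rough way to check whether a site preserves embedded metadata is to compare the file you uploaded with the file the site actually serves. The sketch below is only a heuristic, not a real metadata parser: it looks for the byte signatures that normally introduce an XMP packet and a Photoshop/IIM block in a JPEG. The function names are made up for this illustration, and a signature match does not prove that the Creator, Creditline, or Copyright fields themselves are filled in.

```python
# Byte signatures that normally introduce embedded metadata blocks in a JPEG.
# XMP packets sit in an APP1 segment that starts with this namespace string;
# IIM (legacy IPTC) data sits in an APP13 segment that starts with "Photoshop 3.0".
XMP_MARKER = b"http://ns.adobe.com/xap/1.0/"
IIM_MARKER = b"Photoshop 3.0"


def embedded_metadata_blocks(data: bytes) -> dict:
    """Report which metadata block signatures appear in the raw file bytes.

    Heuristic only: it does not parse the blocks or check individual fields.
    """
    return {"xmp": XMP_MARKER in data, "iim": IIM_MARKER in data}


def metadata_was_stripped(original: bytes, served: bytes) -> bool:
    """Return True if a block present in the original is missing from the served copy."""
    before = embedded_metadata_blocks(original)
    after = embedded_metadata_blocks(served)
    return any(before[name] and not after[name] for name in before)
```

To use it, read both files in binary mode (for example, the image as exported from your editor and the same image downloaded back from the live page) and pass the two byte strings to `metadata_was_stripped`. A result of `True` suggests that something in the publishing pipeline is discarding metadata.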
Quick copyright refresher
The creator of a work owns the copyright to the work, unless the creator is an employee whose job is the creation of copyrighted content, or the creator has explicitly transferred the copyright to another party. Thus, it follows that a copyright owner will be identified in the Creator or Creditline metadata fields. Later, when Google exposes the Copyright field, it will not only identify the copyright holder directly but will often contain contact information, such as a telephone number or web address.
SEO implications?
Google hasn’t said that it will consider IPTC metadata, and a website’s treatment of it, in calculating page rank. Let’s just say it wouldn’t surprise me. I’d be shocked if they don’t, frankly. If not now, soon. Google’s oft-stated mission is to surface the best quality, most relevant, content. The concept of “authority” has long been a critical means to that end. Respect for copyright and specificity in description of content certainly seem like markers of “authority” to me.
A mere hint that Google might value something usually causes a stampede of activity as SEO consultants spread the word to their clients, who, in this case, will surely pass on requests (or requirements) for metadata to their content providers.
It doesn’t hurt that sensible metadata is basically free Google “juice”. A reasonably full set of metadata adds only a millisecond or so to page load time. And setting a web server to be metadata-friendly doesn’t cost a darn thing.
Will Google look further into embedded metadata in the future? Will they, for example, compare the contents of a photo’s own caption with the caption on the web page and use the results to predict the relevance and freshness of the content? I sure would, in their place. As a human photo curator, that’s a strategy that I do use. Google is famously tight-lipped about their ranking algorithms. They’re also logical and smart. If Google’s support for embedded metadata makes embedded metadata more commonly available as a potential ranking factor, will they go ahead and use it? I’ll bet they will.
Technical stuff
Google will be reading metadata from the XMP and IIM data blocks, in that order. If XMP is present, it’s read. If not, the IIM will be read. That’s a sensible reading order, and in fact the one I usually recommend. No mention has been made by either Google or the IPTC of reading creator or copyright data from the Exif. That’s fine by me. I never thought that descriptive metadata belonged in the Exif block anyway.
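The reading order described above can be sketched in a few lines. This is my own illustration of the rule "XMP if present, otherwise IIM", not Google's code: the function name, the dictionary keys, and the idea of representing each block as an already-parsed dictionary are all assumptions made for the example.

```python
from typing import Mapping, Optional

# Hypothetical keys standing in for the three fields discussed in this post.
FIELDS = ("creator", "credit_line", "copyright_notice")


def read_credit_fields(
    xmp: Optional[Mapping[str, str]],
    iim: Optional[Mapping[str, str]],
) -> dict:
    """Pick credit fields from XMP when that block exists, otherwise from IIM.

    `None` means the block is absent from the file. The fallback is per block,
    not per field: if XMP is present, IIM is never consulted.
    """
    source = xmp if xmp is not None else iim
    if source is None:
        return {}
    # Keep only the fields of interest that actually carry a value.
    return {name: source[name] for name in FIELDS if source.get(name)}
```

Note the consequence of a block-level fallback: an image whose XMP block exists but lacks a credit line would show no credit line, even if the IIM block has one. Whether Google falls back per block or per field has not been spelled out, so treat that detail as an open question.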
Regular readers will know of the challenges and ambiguities of the Creditline field, and that I’m not too comfortable with using it in the way that is suggested by this development. Watch this space for new guidance as my views on this field are forced to evolve. (A summary of IPTC fields can be found here.)
“Starting today” tends to be an elastic concept for Google. As I write this post in the afternoon of September 27th, I haven’t yet been able to find a working example of the new functionality anywhere on Google Images. I have included the animated GIF Google used to illustrate its own post, but I have yet to see “Image Credits” in the flesh, not even on the image Google used in their own illustration. It may be a while before a Google Images search returns metadata for your images.
You can read metadata from any image with the IPTC’s own online metadata reader. Read about the reader in this post.
In their press release, the IPTC invites website owners and software developers to contact them for help implementing metadata support in products.
This blog is part of a pro bono effort in support of the benefits of good metadata. If you are a developer, a webmaster, or a content producer and you want help with metadata, you may reach out to me, as well. In most cases (and within reasonable limits) I provide help free of charge.
Hi Carl,
Thank you for this very clear article, which gives a much better understanding of the situation than the Google article does.
I would add this point, to complete your analysis:
IMATAG, a French startup that was involved in these discussions with the IPTC and CEPIC, presented a study on the state of image metadata in 2018:
it shows that ONLY 3% OF THE IMAGES PUBLISHED ON THE WEB STILL HAVE CREDIT METADATA!
(The full study is here: https://imatag.com/en/blog/2018/05/11/state-of-image-metadata-in-2018/)
In fact, if you want to click on this famous “Image Credits” button in Google Images, you will have a lot of trouble, because images with the IPTC credit field filled in are extremely rare.
My conclusion is that it is now up to the WEB PUBLISHERS to follow the example of Google and UPDATE their CMS, which is mainly responsible for the stripping of credit metadata …
Christine,
You are so very right. (And by the way, I have quoted your study a couple of times and will likely do so again.)
I’m hopeful that Google’s action may start to break the terrible cycle we face now. Photographers don’t bother with metadata on the excuse that websites strip it away. Websites strip away metadata in defiance of copyright law and make the excuse that it doesn’t matter, because photographers don’t bother with metadata and everybody and his dog steals stuff because – why the heck not? We’ve got to get off that merry-go-round.
Webmasters live to please Google. So, if anything is going to motivate them to action, this should be it.
Interestingly, we can now use Google Images to form a picture of what’s going on. I used a Getty picture of Christine Blasey Ford as an example for my next post. Google reverse image search revealed 40 instances of this picture on news sites. Of those, ONE had not stripped off the metadata! (It was a pool picture, and we can certainly assume that every member of the pool properly labeled the image before they distributed it.) One out of forty is 2.5%. Your study may have been generous. 🙂
You are doing important work at Imatag. Keep on with it!
-Carl