Users discover that their iPhones are using AI image recognition technology to tag their pictures. Of their underwear. Gasp! A ripple on the internet ensues. But for real, can machine learning image recognition be useful? After some snarkiness, we take a look.
Author: Carl Seibert
The principal of Carl Seibert Solutions and the owner of this site, Carl Seibert has become a metadata crusader. From clients who need to bring order to their asset collections, to website owners, to Creative Commons activists, the digital world needs to take advantage of better metadata. Carl has made it a mission to spread the [meta]word.
The Ask
OK. So what, exactly, is it that I want you to do about this metadata thing?
- If you give birth to photographs – label them properly with a caption, copyright notice, and some contact information before you send them out into the world.
- If you operate the means of publishing or distributing pictures, or if you’re just a cog in a great machine that does that, read the label to be sure you know what’s what and that you have rights to publish whatever it is before you publish.
- If you run a website, make sure your server doesn’t strip the metadata labels, also known as Copyright Management Information, off of works that are published or distributed on your site.
What do I (you) get out of it?
If you’re a photographer, you get the warm and fuzzy of knowing that your work has a fighting chance of surviving. Maybe, years from now, somebody will look at that picture, understand what it is about, and who you are. Maybe that somebody calls you up to buy a license instead of stealing your work. (Or to ask your permission to use it, even.) Heavens to legacy.
In your own life, it means that when you have 50,000, or 500,000, or a million photos in your collection, you’ll be able to find the one you’re thinking of without spending hours or days looking for it.
If you’re licensing your work to the future through Creative Commons or some similar means, it means that, well, that will actually work. Your work won’t just go in the dustbin after one use. Your name, the license information, and supporting data will be right there in the metadata and your work can be used again and again.
If you’re a publisher, metadata on a photo gives you the opportunity to be an honest person. (Without having to break your back about it.) That doesn’t suck. You know that you really do have rights to use that photo. You know for sure who’s in the photo.
You’re preserving culture
By not removing that copyright information, you’ll be following the law. The new, disruptive, novel, one-weird-trick way to not get sued in the intellectual property biz is to follow the copyright law. (A bold strategy if there ever was one. We should make up an acronym for it.) It’s an easy warm and fuzzy. Taking one more threat that might destroy your business, even if it isn’t a statistically huge threat, off the table is a good thing in my book any day. See this post.
And, if you have zillions of assets, you’ll be able to find the one you want, too.
How do you accomplish all this goodness?
Photographer:
Labeling your work with metadata is usually a two-step process.
Your copyright and contact information goes on your pictures automatically. (All of them, or just the ones you might publish, or some that will serve as “signposts” when you are searching through your collection. It depends.) Depending on what software you use, templated information like that goes on your pictures all by itself when you download them from your camera cards, or it might take a couple clicks and a few seconds for each batch of photos. (Look around this site for software recommendations and instructions, metadata explainers, and even downloadable starter templates.)
Then, it will take (a little) effort to caption and keyword your final selections. Maybe a minute for each published photo.
(Read what the copyright office has to say about registering copyrights. It’s not really a metadata thing, but since we’re here…)
Website operators, or agencies, or publications:
When a photo comes to you, look at it. Are the rights OK? Does the caption seem to be accurate? It only takes a second (literally) to look.
Insist that (or at least encourage) photographers, clients and whoever else might supply pictures to you label them properly in the metadata. If – excuse me, when – they don’t (and some always won’t), mark up the picture yourself. Trust me, you’ll save more time, money and lawsuits than you invest.
Software to do this? Pretty much every creative on the planet has the Adobe suite. Adobe Bridge will get the job done. Not pretty, but done. XnView works great and it’s so cheap it’s ridiculous. One way or the other, you’ve got to look at the picture. It doesn’t really take any extra work to see what the metadata says. See my software articles for specifics.
If you run the backend of a website, make sure your server doesn’t strip away IPTC metadata where all that culturally and legally important information lives. (See this post and this one for more information on how metadata is structured within an image file.)
In the interest of full disclosure: You will pay a small – insignificant, really – price in page load time for the 8 KB or so of metadata that you’re preserving. We’re talking about a millisecond and a half per picture for fixed broadband in the US (2017), and about four milliseconds for mobile devices. By way of comparison, it takes 300 to 400 milliseconds to blink your eye. So – not too bad a bargain.
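If you want to check that arithmetic yourself, here’s a rough sketch in Python. It counts only the raw transfer time for the extra bytes and ignores latency, compression and protocol overhead, so treat it as a back-of-the-envelope figure. The function names are mine, and I’m assuming “8 KB” means 8 × 1024 bytes. Working backward from the numbers above, a millisecond and a half implies a connection of roughly 44 megabits per second, and four milliseconds implies about 16.

```python
def transfer_ms(size_bytes: int, mbps: float) -> float:
    """Milliseconds needed to move size_bytes over a link of mbps megabits per second."""
    bits = size_bytes * 8
    return bits / (mbps * 1_000_000) * 1000


def implied_mbps(size_bytes: int, ms: float) -> float:
    """Link speed (megabits per second) implied by moving size_bytes in ms milliseconds."""
    bits = size_bytes * 8
    return bits / (ms / 1000) / 1_000_000


METADATA_BYTES = 8 * 1024  # "8 KB or so" (assumed to mean 8 x 1024 bytes)

print(round(implied_mbps(METADATA_BYTES, 1.5), 1))  # 43.7
print(round(implied_mbps(METADATA_BYTES, 4.0), 1))  # 16.4
```

Plug your own connection speed into `transfer_ms` to see what the metadata costs your visitors.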
In WordPress…
If your website runs on WordPress, all you need to do is make sure your server is using ImageMagick (instead of GD) as its imaging library and important metadata will be preserved by default. Most hosting providers support ImageMagick, and many enable it by default. In the latter case, you don’t have to do a darn thing – except choose one of those providers. (In an upcoming post, I’ll publish the first edition of a chart listing providers who support or enable ImageMagick.)
If the provider supports ImageMagick but doesn’t enable it by default, it’s usually just a matter of contacting customer support (it’s chat, usually) and the deed is done in a couple minutes.
If your site is on a different CMS, it’s more or less the same idea. You might have to specify a different imaging library or change the configuration of the one you have. Most big-time industrial CMSes already use ImageMagick as their imaging library. In those cases, we’re probably talking about updating a config file.
And tell your friends to do the same
Check out the IPTC’s Embedded Metadata Manifesto
Hold the phone
I hear someone in the shadows calling out “What about social media? What about phones? Aren’t those things dominating the media landscape now?”
Sort of. We’re not really talking about throw-away content here. That’s the whole point.
But throw away or not, professional content has to be, well, professional. It’s critically important for facts to be right. We can’t afford to accidentally use the wrong photo, or the photo the social media user didn’t authorize. And the quantities of content in the omnichannel world are staggering. Great metadata, great digital asset management and care and attention to rights and attribution help make the difference between living and dying for people working in a social media world.
Social media tends to strip away metadata. But you still need to keep track of your assets. You should make sure every picture you put out there has metadata, regardless. If nothing else, you’ll be better able to keep track of the asset later. That stripped-off-by-the-media-company metadata may or may not carry the day in some future legal hassle, but it sure isn’t going to hurt. See this post for more on what metadata needs to be on a photo you release and what metadata shouldn’t be.
By the way, your copyright and byline will survive a round trip through Facebook. Everything else ends up on the cutting room floor.
Social media companies may seem like such behemoths that we can never change their behavior. A little pressure won’t hurt, though.
Make metadata on mobile
As for phones – tons of photos are made with phones today. More and more each day. While most pictures that find their way to publication pass through a computer-based workflow on their way there, some don’t.
Not to worry! There are good metadata authoring apps available for both Android and iPhone. I’ll be writing about the best for each platform soon.
Will doing this really help? Will it make a dent?
Yes. It will help you. It will make the environment around you better. Your life will be better and easier.
I just suggested that publishers and agency people insist that photos they pay for be properly marked up. Poof! In one stroke, most of the pictures on your plate will be find-able and easier to use. You’ll save time and money. Life will be good. (Or better, at least. Your health, your family life – those things metadata probably won’t help.)
Photographers will save back the time and effort of marking up their stuff and then some. And just how many calls offering reuse fees does it take to make your day brighter?
If push one day comes to shove and you need to sue a copyright infringer, and that CMI in the metadata makes the difference between a lawyer taking the case and getting a judgment or not, that investment in metadata will make for a happy day.
Good works can go viral
There are trillions of photos floating around out there. In terms of that giant pile, good efforts by you and your friends might not make a statistical dent. But the balance of karma around you will improve. Your life will be a little better. The business environment in your segment will be a little better. That’s better than a dent.
And communities are interconnected. Trends take hold. The content creation and publishing communities are big, no mistake. But if the players in your niche start doing a good thing, it will spread to ever wider and wider circles of influence. Good ideas can spread through whole industries in no time at all.
Have you done something with metadata that we all should feel good about? Dive into the comments. Brighten our day!
Online metadata viewer survey
You can use web-based tools to view metadata on photos. While I doubt that’s earth-shattering news to any of you, a quick Google search on the subject returns breathless posts. “OMG! There’s metadata! Look! See!” Granted, we have a lot of educating to do if we are to improve the environment in which photos must live online, but it’s a bit over the top. Let’s exhale and see what, if any, useful resources we can find here.
Support the Embedded Metadata Manifesto
What is the Embedded Metadata Manifesto?
EmbeddedMetadata.org is an effort of the IPTC.
I used to have an icon in my footer that linked to their manifesto, at http://www.embeddedmetadata.org/embedded-metatdata-manifesto.php (It’s not a link. Copy and paste it into a browser.)
I upgraded this site to secure all its traffic with SSL. The link to their still-non-SSL site was causing web browsers to issue security warnings to my visitors. I want you to be comfortable here, so that wasn’t good.
I should point out that neither their site nor mine (before the upgrade) is/was a danger. The old thinking was that SSL was only needed for sites that dealt with confidential information, like credit card data. Now, the feeling is that everybody should do SSL, and Google is making it a requirement for ranking in search results. Every website operator is somewhere in the process of switching over. The IPTC’s main site is already SSL-friendly, for example.
I’m a manifesto kind of guy, so for the time being, I’ll just quote the manifesto in its entirety for you right here.
Embedded Metadata Manifesto
How metadata should be embedded and preserved in digital media files
Photographers, film makers, videographers, illustrators, publishers, advertisers, designers, art directors, picture editors, librarians and curators all share the same problem: struggling to track rapidly expanding collections of digital media assets such as photos and video/film clips. With that in mind we propose five guiding principles as our “Embedded Metadata Manifesto”:
- Metadata is essential to describe, identify and track digital media and should be applied to all media items which are exchanged as files or by other means such as data streams.
- Media file formats should provide the means to embed metadata in ways that can be read and handled by different software systems.
- Metadata fields, their semantics (including labels on the user interface) and values, should not be changed across metadata formats.
- Copyright management information metadata must never be removed from the files.
- Other metadata should only be removed from files by agreement with their copyright holders.
More details about these principles:
1: All people handling digital media need to recognize the crucial role of metadata for business. This involves more than just sticking labels on a media item. The knowledge which is required to describe the content comprehensively and concisely and the clear assertion of the intellectual ownership increase the value of the asset. Adding metadata to media items is an imperative for each and every professional workflow.
2: Exchanging media items is still done to a large extent by transmitting files containing the media content and in many cases this is the only (technical) way of communicating between the supplier and the consumer. To support the exchange of metadata with content it is a business requirement that file formats embed metadata within the digital file. Other methods like sidecar files are potentially exposed to metadata loss.
3: The type of content information carried in a metadata field, and the values assigned, should not depend on the technology used to embed metadata into a file. If multiple technologies are available for embedding the same field the software vendors must guarantee that the values are synchronized across the technologies without causing a loss of data or ambiguity.
4: Ownership metadata is the only way to save digital content from being considered orphaned work. Removal of such metadata impacts on the ability to assert ownership rights and is therefore forbidden by law in many countries.
5: Properly selected and applied metadata fields add value to media assets. For most collections of digital media content descriptive metadata is essential for retrieval and for understanding. Removing this valuable information devalues the asset.
There’s good content on the Embedded Metadata site. Mostly, it tells you the same stuff I’ve been telling you. Which means you’ve been getting the straight dope here. I take that as a good sign. I encourage you to take a look around.
XMP, IPTC/IIM, or Exif; which is preferred?
Which instance of the IPTC metadata does your favorite application prefer? Inquiring minds want to know.
Let’s step back for a moment for some background. Because all things that should be dead simple usually aren’t, the IPTC metadata - important information like the caption, your byline, and copyright notice - is stored in multiple places in your file.
Do you need the dick pics locator?
New website tells where dick pics come from; it's all about metadata
Dick pics. Film at eleven. This week’s internet’s social, er, upheaval has it all. Bad puns. Check. Click bait. Check. Moral outrage. Check. (I guess.) Metadata. Check. Wait. Metadata?
Automatic captions in WordPress
If you have captions on your photos, WordPress will place them on your page (or post) along with the pictures. If the details in the caption were correct when the photographer - you or whomever - originally captioned the picture, they’ll be correct on your site. That means less chance to make an error. (And less room for excuses if you do.)
XnView metadata How-To
Considering its power and low cost, XnView is a must-have. XnView is a photo browsing/editing/metadata tool. It operates in a files and folders environment, like Photo Mechanic, and unlike Adobe Lightroom. It’s available at a price that suggests that there’s no reason not to have a good tool for working with metadata.
A meditation on the caption
Captions connect pictures to the world. That connection between an image and its subjects, time and place (and its author, too) gives a photo the power to endure. Join your Aunt Louise as we explore the power of the caption.
WordPress Metadata Workaround
Replace stripped-away metadata on your WordPress server with this quick workaround
If your WordPress hosting provider makes ImageMagick available for your site, that’s good news. It’s good news for metadata. It’s good news for image optimization. It’s just a great day all around.
But what if you’re stuck with GD? Is there a way to fix the metadata damage that library inflicts? Yes. Probably. Maybe. I have a workaround for you. It works for most sites, depending on what sort of access your hosting provider grants.
You’ll need FTP access
If you have FTP access to your site, this will work. Most hosts that I’m familiar with do provide FTP access for self-hosted sites. But your luck may vary. (And since you’re probably reading this because your provider already doesn’t provide ImageMagick, you might want to hold off on buying that lottery ticket.) WordPress.com, by the way, does not support FTP access for its users.
Your hosting provider may have given you your FTP credentials when you first opened your account. Or, your FTP details may be somewhere on your account’s homepage. Or, maybe there are instructions in the support documentation. Or, you might have to use your account’s control panel to enable an FTP account. Or, quite likely, you’re already well familiar with all that and you’re just waiting for me to get on with this.
If you’re not familiar with using FTP, there are lots of How-Tos on the web. Just Google “FTP and WordPress”.
Step by step
What we’re going to do here is FTP to our WordPress server, grab the image files that have been stripped of their metadata, paste the metadata back on and put the files back where we found them.
Download images
Fire up your FTP client. If you need an FTP client, FileZilla is a good one. I use it. And it’s free. But they all work. FTP isn’t brain surgery.
- Connect to your WordPress server. Now, your FTP root may be your home directory, or it may be the web root for your site, or it may be any arbitrary directory that some admin (or you) thought would be good. We’ll assume that it’s your home directory. If it’s your web root, no worries. You’ll just start a click or so closer to your destination.
- From your FTP root, navigate to your web root. It will be called something like “public_html”, or “htdocs” or the like. Inside it, you’ll see a bunch of WordPress files and directories, “wp-this or that”.
- Find “wp-content” and open it.
- Now find “uploads” and open it.
- Uploads will (usually) contain subdirectories for each year and then, inside, each month. Navigate into the correct month’s subdirectory.
(You can make a shortcut in your FTP client to save drilling through so many directories.)
Once we have safely arrived in the correct directory, we’ll be looking at our media files. Each image will appear along with several resized copies. (Four, by default)
These resized versions of the image are served to visitors responsively, according to the size of the visitor’s viewport. The original, or full-sized, image is exactly the file that you uploaded to your Media Library. It will have its metadata just as it did when it was on your desktop. The smaller ones, which have dimensions appended to their filenames, were created by your imaging library. If that’s GD and not ImageMagick, these files will be stripped of all metadata. We’ll fix that.
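If a directory holds a lot of pictures, it can help to sort the listing into sets before you start. Here’s a small sketch in Python that pairs each full-size file with its resized copies. It assumes the resized files follow the “name-WIDTHxHEIGHT.ext” pattern described above. The function name and sample filenames are made up for illustration.

```python
import re

# Matches names like "beach-300x200.jpg": a stem, then -WIDTHxHEIGHT, then the extension.
SIZE_SUFFIX = re.compile(r"^(?P<stem>.+)-(?P<w>\d+)x(?P<h>\d+)(?P<ext>\.[A-Za-z0-9]+)$")


def group_image_sets(filenames):
    """Map each full-size filename to a sorted list of its resized copies.

    A file that looks resized but whose full-size original is not in the
    listing is treated as an original in its own right.
    """
    names = set(filenames)
    sets = {}
    for name in sorted(names):
        match = SIZE_SUFFIX.match(name)
        if match:
            original = match.group("stem") + match.group("ext")
            if original in names:
                sets.setdefault(original, []).append(name)
                continue
        sets.setdefault(name, [])
    return sets


listing = ["beach.jpg", "beach-300x200.jpg", "beach-150x150.jpg", "logo.png"]
print(group_image_sets(listing))
```

Any full-size file that comes back with an empty list has no resized copies to repair.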
- Download one or more complete sets of pictures, including the full-sized one(s).
(Don’t worry about interrupting visitors’ access to the images. You’re copying them. The files will be available on your site while you work on them. There will only be a period of a second or so, when we put the files back, when each file wouldn’t be available.)
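If you’d rather script the download than click through an FTP client, the navigation and download steps above can be sketched with Python’s standard ftplib module. This is a minimal sketch, not a polished tool. The host, credentials, web root name and local folder in the usage comment are placeholders, and I’m assuming your uploads are organized into year and month folders as described above.

```python
from ftplib import FTP  # used in the usage sketch at the bottom
from pathlib import Path


def uploads_path(web_root: str, year: int, month: int) -> str:
    """Build the remote uploads directory, e.g. public_html/wp-content/uploads/2017/09."""
    return f"{web_root.rstrip('/')}/wp-content/uploads/{year:04d}/{month:02d}"


def download_images(ftp, remote_dir: str, local_dir: str,
                    extensions=(".jpg", ".jpeg", ".png")):
    """Copy every image in remote_dir into local_dir and return the names fetched.

    The files on the server are only read, never changed.
    """
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)
    fetched = []
    for name in ftp.nlst():
        if not name.lower().endswith(extensions):
            continue  # skip anything that is not an image
        with open(target / name, "wb") as fh:
            ftp.retrbinary(f"RETR {name}", fh.write)
        fetched.append(name)
    return fetched


# Usage sketch (hypothetical host, credentials and folder names):
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("username", "password")
#       download_images(ftp, uploads_path("public_html", 2017, 9), "wp-fix")
```

This grabs everything in the month’s folder. If you only want a set or two, filter the returned names before you go on.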
On your computer, paste back your metadata
- Now, you should have the images on your local computer, in whatever folder you chose in your FTP client.
- Open that folder (I just use the desktop) in a suitable metadata-editing tool. Photo Mechanic or XnView are good choices. I’m going to use Photo Mechanic.
If you look at your files’ metadata (“I”, or the tooltip in Photo Mechanic), you’ll see that the full-size one has all its metadata in place. The others are stripped clean. We’re going to simply copy the metadata from the full-size image and paste it on the other files.
- In Photo Mechanic, the easiest way is to take an IPTC snapshot of the big image and paste it on the smaller ones. It takes less time than reading this paragraph. See this post for detailed instructions on Photo Mechanic’s various metadata tools.
In XnView, the process is a little different. You’ll open the IPTC panel for the big image, save its metadata to a reusable template, and apply that to the other files. See my post on moving templates between programs for a fuller explanation.
- Repeat the process for each set of images you downloaded.
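If you have a lot of sets to fix, the copy-and-paste step can also be scripted. To be clear, this is not the Photo Mechanic or XnView method described above. It’s a different route that hands the job to the ExifTool command-line program, which I’m assuming is installed and on your path. The sketch copies only the IPTC and XMP blocks from the full-size image, so Exif details that describe the original’s dimensions don’t land on the smaller files. The function names are mine.

```python
import subprocess


def copy_metadata_command(original: str, resized: list[str]) -> list[str]:
    """Build an ExifTool command that copies IPTC and XMP from original onto each resized file."""
    return [
        "exiftool",
        "-overwrite_original",     # don't leave "_original" backup copies behind
        "-TagsFromFile", original,  # read tags from the full-size image
        "-iptc:all",               # copy the IPTC (IIM) block
        "-xmp:all",                # copy the XMP block
        *resized,                  # files to write the tags into
    ]


def copy_metadata(original: str, resized: list[str]) -> None:
    """Run the command; raises CalledProcessError if ExifTool reports a failure."""
    subprocess.run(copy_metadata_command(original, resized), check=True)


print(copy_metadata_command("beach.jpg", ["beach-300x200.jpg", "beach-150x150.jpg"]))
```

Try it on copies of one set first and check the result in your metadata tool before you trust it with a whole month of pictures.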
Upload the images again
- Now, upload the resized versions of the images right back to where you found them. There’s no need to upload the full-size ones. We didn’t do anything to them. We only copied their metadata. Uploading them would just be a waste of bandwidth.
- During your upload, you’ll need to tell your FTP client to overwrite the files on your server with your new, fixed files. FileZilla has a handy radio button that lets you overwrite all files in your current upload queue. In other FTP clients, you may have to click once per file.
Your repaired images will be exact replacements for the files you downloaded, except for being a few kilobytes bigger and having working metadata.
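The upload can be scripted the same way as the download. Here’s a minimal sketch with Python’s ftplib, again with made-up names. It sends only the resized files you pass in. On the servers I know of, an FTP “STOR” replaces an existing file of the same name without asking, but that’s an assumption worth checking on your own host.

```python
from pathlib import Path


def upload_resized(ftp, local_dir: str, remote_dir: str, resized_names):
    """Upload the repaired resized files to remote_dir and return the names sent.

    ftp is an already-connected, logged-in ftplib.FTP object. The full-size
    originals are left out because nothing about them changed.
    """
    ftp.cwd(remote_dir)
    sent = []
    for name in resized_names:
        with open(Path(local_dir) / name, "rb") as fh:
            ftp.storbinary(f"STOR {name}", fh)  # typically overwrites the server copy
        sent.append(name)
    return sent


# Usage sketch (hypothetical host, credentials and paths):
#   from ftplib import FTP
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("username", "password")
#       upload_resized(ftp, "wp-fix", "public_html/wp-content/uploads/2017/09",
#                      ["beach-300x200.jpg", "beach-150x150.jpg"])
```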
Test to see that it worked
Now you’ll probably want to go to your website and right-click and download one of your pictures to assure yourself that your work-around-ing did, indeed, work. You won’t need to do this next time.
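If you’d like a quick check that doesn’t involve opening a metadata tool, here’s a rough sketch in Python that looks inside a JPEG for the usual metadata containers: the Exif block, the XMP packet, and the Photoshop block that carries IPTC. It only confirms that the containers are present, not that the fields inside them are filled in. It walks the file’s segments in a simplified way and handles JPEGs only. The function name is mine.

```python
import struct


def metadata_segments(jpeg_bytes: bytes) -> set[str]:
    """Return which metadata containers ("Exif", "XMP", "IPTC") a JPEG carries."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG file")
    found = set()
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            break  # lost sync with the segment structure; stop rather than guess
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The two-byte length counts itself but not the two marker bytes.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            found.add("Exif")
        elif marker == 0xE1 and payload.startswith(b"http://ns.adobe.com/xap/1.0/\x00"):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(b"Photoshop 3.0\x00"):
            if b"8BIM\x04\x04" in payload:  # Photoshop resource 0x0404 holds IPTC
                found.add("IPTC")
        pos += 2 + length
    return found


# Usage sketch (hypothetical filename):
#   with open("beach-300x200.jpg", "rb") as fh:
#       print(metadata_segments(fh.read()))
```

A resized file that GD has stripped should come back as an empty set. A repaired one should list at least IPTC or XMP.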
If you put fifty photos on your website whenever you post, fixing your metadata like this could be a fair amount of work. If you’re like me and it’s a half-dozen or so pictures at a time, we’re only looking at a minute or two. It’s just a matter of making it a habit. (Full disclosure: The server on which you’re reading this is ImageMagick-challenged. I haven’t fixed the photos in last week’s post. I’ll take care of them when I post this.)
There you have it. Hopefully, this technique will tide you over until you can get that ImageMagick situation in order. And yes, the technique can be used for other stuff, like running ImageOptim (in lossless mode, please) on your resized files while you have them in front of you. Just be sure not to change the images’ dimensions.
Stay tuned. In future posts, I’ll look at optimizing images for WordPress in a metadata-safe manner, and I’ll do How-Tos for that task in our supported range of software. And, oh yeah, I owe you a metadata How-To for XnView. Dive into the comments and…comment.