Exploring c2patool and IPTC Photo Metadata
We recently did a short demo for a Project Origin workshop showing how the Content Authenticity Initiative’s open source toolkits for C2PA can work with IPTC Photo Metadata.
It shows a practical way that developers and power users can start working with the C2PA Specification straight away.
Here we record the steps we took to create the demo, along with a few notes and suggestions for improvements. The process includes several “manual” steps that involve editing files in a text editor.
Overall, it shows that it is possible to comply with the C2PA spec, but it also shows that there is a lot of work to be done to make it easy for users.
Step 1: Create an image with embedded IPTC Photo Metadata.
We used Adobe Photoshop but many other tools can be used - see IPTC’s list of software that supports IPTC Photo Metadata Standard.
Step 2: Use exiftool to export the embedded metadata in XMP format
Using a Terminal window on my Mac, I ran the following command:
exiftool -xmp -b greenwich_flowers_with_metadata.jpg >greenwich_flowers_xmp.xml
Here is a description of what the extra arguments do:
-xmp
: Export the XMP metadata from the image file
-b
: Show binary data. The XMP block includes byte-order markers which show up as binary data in exiftool output - you can see the <feff> characters at the top of the output shown below.
>greenwich_flowers_xmp.xml
: Send the output to a new file called “greenwich_flowers_xmp.xml”
We have just created an XML file that looks like this: (actually the real version is all on one long line, so I have inserted line breaks here to make it slightly more readable)
<?xpacket begin="<feff>" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 7.2-c000 79.566ebc5b4, 2022/05/09-08:25:55 ">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:xmp="http://ns.adobe.com/xap/1.0/"
xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
xmlns:aux="http://ns.adobe.com/exif/1.0/aux/"
xmlns:exifEX="http://cipa.jp/exif/1.0/"
xmlns:xmpMM="http://ns.adobe.com/xap/1.0/mm/"
xmlns:stEvt="http://ns.adobe.com/xap/1.0/sType/ResourceEvent#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:Iptc4xmpCore="http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/"
xmlns:xmpRights="http://ns.adobe.com/xap/1.0/rights/"
xmlns:Iptc4xmpExt="http://iptc.org/std/Iptc4xmpExt/2008-02-29/"
xmlns:plus="http://ns.useplus.org/ldf/xmp/1.0/"
xmp:CreateDate="2022-07-09T17:47:17" xmp:CreatorTool="15.5"
xmp:ModifyDate="2022-07-21T15:46:32+03:00" xmp:MetadataDate="2022-07-21T15:46:32+03:00"
photoshop:DateCreated="2022-07-09T17:47:17.538" photoshop:LegacyIPTCDigest="74680B31492930D266D0FAE3B244D1BD"
photoshop:ColorMode="3" photoshop:ICCProfile="Display P3" photoshop:AuthorsPosition="Managing Director"
photoshop:Headline="Wildflowers at the Rose Garden, Greenwich Park, London" photoshop:City="London"
photoshop:State="London" photoshop:Country="United Kingdom" photoshop:CaptionWriter="Brendan Quinn"
... and much more ...
<?xpacket end="w"?>
If you’ve never seen an XMP packet before, this is probably quite confusing, but don’t worry - it just shows all the IPTC metadata properties in the XMP format that’s specified by ISO 16684.
Note that exiftool also has a JSON export function (-json), but the format used is slightly different from the JSON-LD version that we need for the C2PA assertion. Another approach could be to generate the JSON output using exiftool -json -struct -G1 and then convert the field names to their correct XMP values.
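As a rough illustration of that alternative, the Python sketch below renames the top-level keys of one exiftool -json -G1 record to XMP-style prefixed names. The two lookup tables are small, incomplete assumptions made for this example (they are not an official mapping), the function name is hypothetical, and fields nested inside structures are left untouched.

```python
import json

# Illustrative, incomplete lookup tables (assumptions for this sketch only).
# With -G1, exiftool prefixes each tag with its family-1 group, e.g. "XMP-dc".
GROUP_TO_PREFIX = {
    "XMP-dc": "dc",
    "XMP-photoshop": "photoshop",
    "XMP-iptcCore": "Iptc4xmpCore",
    "XMP-iptcExt": "Iptc4xmpExt",
    "XMP-plus": "plus",
    "XMP-xmpRights": "xmpRights",
}

# exiftool capitalises tag names, while some XMP properties are lower case.
TAG_RENAMES = {
    ("dc", "Creator"): "creator",
    ("dc", "Subject"): "subject",
    ("dc", "Description"): "description",
    ("dc", "Rights"): "rights",
    ("dc", "Format"): "format",
}


def rename_exiftool_keys(record):
    """Return a copy of one exiftool JSON record using XMP-style key names.

    Keys outside the known XMP groups (for example "SourceFile") are dropped.
    Values, including nested structures, are copied unchanged.
    """
    renamed = {}
    for key, value in record.items():
        group, separator, tag = key.partition(":")
        prefix = GROUP_TO_PREFIX.get(group)
        if not separator or prefix is None:
            continue
        name = TAG_RENAMES.get((prefix, tag), tag)
        renamed[f"{prefix}:{name}"] = value
    return renamed


# Hypothetical record, shaped like one entry of exiftool's JSON array.
sample = {
    "SourceFile": "greenwich_flowers_with_metadata.jpg",
    "XMP-dc:Creator": ["Brendan Quinn"],
    "XMP-photoshop:Headline": "Wildflowers at the Rose Garden, Greenwich Park, London",
    "XMP-iptcCore:CountryCode": "GB",
}

if __name__ == "__main__":
    print(json.dumps(rename_exiftool_keys(sample), indent=2))
```

A complete converter would need a full table of tag names and would also have to rename the fields inside structures such as LocationShown, which this sketch does not attempt.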
Step 3: Convert the RDF/XML to JSON-LD
XMP’s standard data storage format is RDF/XML, which is a representation of the RDF data model in the XML format. To add these fields to a C2PA manifest, we need to convert the RDF/XML to another serialisation of RDF called JSON-LD.
There is a standard for expressing XMP in JSON-LD (ISO 16684-3: JSON-LD serialisation of XMP) but exiftool doesn't export in that format directly, so we have to do some conversion ourselves.
We can use a tool called riot from the Apache Jena project to convert the RDF/XML to JSON-LD. If you don’t have it installed, download Jena or install it with Homebrew on a Mac (brew install jena).
But riot doesn't like the XML processing instructions in the XMP packet. So first, we have to remove the XMP headers by hand. (Side note: if anyone knows of a good way of removing the processing instructions from the command line, please let me know! Probably some sed magic would do the trick?)
Open the XML file in a text editor and remove the <?xpacket ... ?> sections at the start and end of the file. If you have it, you can also use xmllint to make the XML look a bit nicer:
xmllint --format greenwich_flowers_xmp.xml >greenwich_flowers_xmp_pretty.xml
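The hand edit that removes the <?xpacket ... ?> wrappers could also be scripted. The following is a minimal Python sketch that assumes the whole packet has been read into a string; the function name strip_xpacket is made up for this example.

```python
import re

# Matches the <?xpacket begin=... ?> and <?xpacket end=... ?> processing
# instructions that wrap an XMP packet.
XPACKET_PI = re.compile(r"<\?xpacket\b.*?\?>", re.DOTALL)


def strip_xpacket(xmp_text):
    """Remove the xpacket wrappers (and any stray byte-order mark)."""
    without_wrappers = XPACKET_PI.sub("", xmp_text)
    return without_wrappers.replace("\ufeff", "").strip()


if __name__ == "__main__":
    packet = (
        '<?xpacket begin="\ufeff" id="W5M0MpCehiHzreSzNTczkc9d"?>\n'
        '<x:xmpmeta xmlns:x="adobe:ns:meta/"></x:xmpmeta>\n'
        '<?xpacket end="w"?>'
    )
    print(strip_xpacket(packet))
```

To use it on the exported file, read greenwich_flowers_xmp.xml as UTF-8 text, pass the contents through strip_xpacket, and write the result to a new file before running xmllint or riot.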
Then you can run riot on the RDF/XML to create JSON-LD:
riot --syntax rdfxml --output jsonld greenwich_flowers_xmp_pretty.xml >greenwich_flowers_xmp.jsonld
riot converts between various syntaxes of RDF. This command converts the original RDF/XML format to JSON-LD, a JSON-friendly version of RDF.
Unfortunately, that’s not yet the end of our story in preparing metadata for embedding into a C2PA assertion. riot does a straight conversion to JSON-LD, including the structures such as rdf:Bag, rdf:Seq and various “blank nodes” used to handle multi-valued properties, but XMP's use of JSON-LD is much simpler. So we need to do some more conversion by hand.
The output at first looks like this:
{
"@graph": [
{
"@id": "_:b0",
"rdf:_1": "Brendan Quinn",
"@type": "rdf:Seq"
},
{
"@id": "_:b1",
"plus:ImageSupplierName": "IPTC"
},
{
"@id": "_:b2",
"stEvt:changed": "/",
"stEvt:softwareAgent": "Adobe Photoshop 23.4 (Macintosh)",
"stEvt:when": "2022-07-21T15:46:32+03:00",
"stEvt:instanceID": "xmp.iid:ef558a6e-e419-4aae-8890-f6b2b2bed77c",
"stEvt:action": "saved"
},
{
"@id": "_:b3",
"stEvt:changed": "/",
"stEvt:softwareAgent": "Adobe Photoshop 23.4 (Macintosh)",
"stEvt:when": "2022-07-21T15:46:32+03:00",
"stEvt:instanceID": "xmp.iid:f9c65ac7-2ad9-4016-84b3-97065eb121b4",
"stEvt:action": "saved"
},
{
"@id": "_:b4",
"rdf:_1": {
"@language": "x-default",
"@value": "Wildflowers at the Rose Garden, Greenwich Park, London"
},
"@type": "rdf:Alt"
},
... much more ...
All those _:b0, _:b1, _:b2 objects are parts of the XMP packet that can actually be converted into simple arrays and objects in JSON. Right now there is no tool that can do this conversion, so I did it by hand! It took a while... It looks like it would be handy to create a Python and/or JavaScript tool to do it for us.
Ideally exiftool would have an option that outputs XMP in ISO 16684-3-compliant JSON-LD, but until it does, we need to do the work manually.
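As a starting point for such a tool, here is a minimal Python sketch of the flattening step. It assumes that riot's output refers to blank nodes as {"@id": "_:bN"} objects and that containers carry an "@type" of rdf:Seq, rdf:Bag or rdf:Alt, as in the excerpt above. The function name and the sample data are made up for illustration, and the sketch has not been checked against ISO 16684-3.

```python
import json
import re

CONTAINER_TYPES = {"rdf:Seq", "rdf:Bag", "rdf:Alt"}
MEMBER_KEY = re.compile(r"rdf:_(\d+)")


def simplify_xmp_graph(doc):
    """Inline blank nodes: containers become arrays, structures become objects."""
    nodes = {node["@id"]: node for node in doc["@graph"] if "@id" in node}

    def resolve(value, seen):
        if isinstance(value, list):
            return [resolve(item, seen) for item in value]
        if isinstance(value, dict):
            if "@value" in value:  # language-tagged literal: keep the text only
                return value["@value"]
            ref = value.get("@id", "")
            is_blank_ref = set(value) == {"@id"} and ref.startswith("_:")
            if is_blank_ref and ref in nodes and ref not in seen:
                return inline(nodes[ref], seen | {ref})
        return value

    def inline(node, seen):
        types = node.get("@type", [])
        types = {types} if isinstance(types, str) else set(types)
        if types & CONTAINER_TYPES:
            # Collect rdf:_1, rdf:_2, ... in numeric order.
            members = sorted(
                ((int(match.group(1)), member) for key, member in node.items()
                 if (match := MEMBER_KEY.fullmatch(key))),
                key=lambda pair: pair[0],
            )
            items = [resolve(member, seen) for _, member in members]
            # Single-item containers collapse to a plain value, mirroring
            # the hand-made result in this article.
            return items[0] if len(items) == 1 else items
        return {key: resolve(val, seen) for key, val in node.items() if key != "@id"}

    roots = [
        {"@id": node_id, **inline(node, frozenset())}
        for node_id, node in nodes.items()
        if not node_id.startswith("_:")
    ]
    return {
        "@context": doc.get("@context", {}),
        "@graph": roots[0] if len(roots) == 1 else roots,
    }


# Hypothetical miniature of riot's output, for demonstration only.
sample = {
    "@context": {"dc": "http://purl.org/dc/elements/1.1/"},
    "@graph": [
        {"@id": "_:b0", "rdf:_1": "Brendan Quinn", "@type": "rdf:Seq"},
        {"@id": "_:b1", "plus:ImageSupplierName": "IPTC"},
        {"@id": "_:b2", "rdf:_1": {"@id": "_:b1"}, "@type": "rdf:Seq"},
        {"@id": "_:b3", "rdf:_2": "Greenwich", "rdf:_1": "Rose Garden", "@type": "rdf:Bag"},
        {"@id": "_:b4", "@type": "rdf:Alt",
         "rdf:_1": {"@language": "x-default", "@value": "Wildflowers at the Rose Garden"}},
        {"@id": "file:///example.xml", "photoshop:City": "London",
         "dc:creator": {"@id": "_:b0"}, "plus:ImageSupplier": {"@id": "_:b2"},
         "dc:subject": {"@id": "_:b3"}, "dc:description": {"@id": "_:b4"}},
    ],
}

if __name__ == "__main__":
    print(json.dumps(simplify_xmp_graph(sample), indent=2))
```

Collapsing single-item containers to a plain value is a design choice that follows the hand-edited output in this article. A stricter converter might always emit arrays for rdf:Seq and rdf:Bag.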
After a lot of manual reworking effort, here’s what I ended up with:
{
"@context": {
"xmpRights": "http://ns.adobe.com/xap/1.0/rights/",
"aux": "http://ns.adobe.com/exif/1.0/aux/",
"exifEX": "http://cipa.jp/exif/1.0/",
"stEvt": "http://ns.adobe.com/xap/1.0/sType/ResourceEvent#",
"Iptc4xmpExt": "http://iptc.org/std/Iptc4xmpExt/2008-02-29/",
"photoshop": "http://ns.adobe.com/photoshop/1.0/",
"plus": "http://ns.useplus.org/ldf/xmp/1.0/",
"Iptc4xmpCore": "http://iptc.org/std/Iptc4xmpCore/1.0/xmlns/",
"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"xmpMM": "http://ns.adobe.com/xap/1.0/mm/",
"xmp": "http://ns.adobe.com/xap/1.0/",
"dc": "http://purl.org/dc/elements/1.1/"
},
"@graph": {
"@id": "file:///Users/brendan/dev/iptc/c2pa/c2patool-play/greenwich_flowers_xmp_pretty.xml",
"dc:description": "Wildflowers at the Rose Garden, Greenwich Park, London",
"exifEX:LensModel": "iPhone 11 back dual wide camera 4.25mm f/1.8",
"xmp:MetadataDate": "2022-07-21T15:46:32+03:00",
"aux:LensInfo": "807365/524263 17/4 9/5 12/5",
"exifEX:LensMake": "Apple",
"xmp:rights/UsageTerms": "Usage permitted under Creative Commons Attribution (CC-BY 4.0) licence.",
"xmp:ModifyDate": "2022-07-21T15:46:32+03:00",
"photoshop:CaptionWriter": "Brendan Quinn",
"xmp:mm/OriginalDocumentID": "4B267F2683ABA382F9E0C1DE10BF4210",
"photoshop:ColorMode": "3",
"photoshop:AuthorsPosition": "Managing Director",
"plus:Licensor": {
"plus:LicensorEmail": "office@iptc.org",
"plus:LicensorURL": "www.iptc.org/",
"plus:LicensorTelephoneType2": "http://ns.useplus.org/ldf/vocab/work",
"plus:LicensorName": "Brendan Quinn"
},
"dc:subject": [
"Rose Garden",
"Greenwich",
"flowers",
"Wild flowers"
],
"dc:creator": "Brendan Quinn",
"Iptc4xmpCore:CountryCode": "GB",
"Iptc4xmpCore:IntellectualGenre": "https://cv.iptc.org/newscodes/genre/Actuality",
"Iptc4xmpExt:LocationShown": {
"Iptc4xmpExt:WorldRegion": "Europe",
"Iptc4xmpExt:CountryCode": "GB",
"Iptc4xmpExt:CountryName": "United Kingdom",
"Iptc4xmpExt:ProvinceState": "London",
"Iptc4xmpExt:City": "London",
"Iptc4xmpExt:Sublocation": "Greenwich"
},
"photoshop:City": "London",
"photoshop:DateCreated": "2022-07-09T17:47:17.538",
"photoshop:ICCProfile": "Display P3",
"plus:ModelReleaseStatus": "http://ns.useplus.org/ldf/vocab/MR-NAP",
"photoshop:LegacyIPTCDigest": "74680B31492930D266D0FAE3B244D1BD",
"photoshop:State": "London",
"xmp:CreateDate": "2022-07-09T17:47:17",
"dc:format": "image/jpeg",
"plus:CopyrightOwner": {
"plus:CopyrightOwnerName": "Brendan Quinn"
},
"plus:ImageCreator": {
"plus:ImageCreatorName": "Brendan Quinn"
},
"xmp:CreatorTool": "15.5",
"plus:ImageSupplier": {
"plus:ImageSupplierName": "IPTC"
},
"Iptc4xmpCore:Scene": "https://cv.iptc.org/newscodes/scene/011600",
"Iptc4xmpCore:Location": "Greenwich",
"Iptc4xmpCore:CreatorContactInfo": {
"Iptc4xmpCore:CiAdrCtry": "United Kingdom",
"Iptc4xmpCore:CiTelWork": "+44 (0)20 3178 4922 ",
"Iptc4xmpCore:CiUrlWork": "https://www.iptc.org/",
"Iptc4xmpCore:CiEmailWork": "mdirector@iptc.org",
"Iptc4xmpCore:CiAdrPcode": "WC2A 1AL",
"Iptc4xmpCore:CiAdrRegion": "London",
"Iptc4xmpCore:CiAdrCity": "London",
"Iptc4xmpCore:CiAdrExtadr": "25 Southampton Buildings"
},
"Iptc4xmpExt:LocationCreated": {
"Iptc4xmpExt:WorldRegion": "Europe",
"Iptc4xmpExt:CountryCode": "GB",
"Iptc4xmpExt:CountryName": "United Kingdom",
"Iptc4xmpExt:ProvinceState": "London",
"Iptc4xmpExt:City": "London",
"Iptc4xmpExt:Sublocation": "Greenwich"
},
"plus:PropertyReleaseStatus": "http://ns.useplus.org/ldf/vocab/PR-NAP",
"xmp:mm/InstanceID": "xmp.iid:ef558a6e-e419-4aae-8890-f6b2b2bed77c",
"photoshop:Headline": "Wildflowers at the Rose Garden, Greenwich Park, London",
"photoshop:Country": "United Kingdom",
"aux:Lens": "iPhone 11 back dual wide camera 4.25mm f/1.8",
"dc:rights": "Copyright (C) Brendan Quinn 2022. Some rights reserved.",
"Iptc4xmpExt:DigitalSourceType": "http://cv.iptc.org/newscodes/digitalsourcetype/digitalCapture"
}
}
Step 4: Add the JSON-LD to a c2patool assertion block
Now we can finally start using c2patool!
If you don’t yet have c2patool installed, and you’re on a Mac, I recommend installing the Homebrew version:
brew tap contentauth/tools
brew install c2patool
You can try running c2patool on the image file just to see what happens. As we only have IPTC embedded metadata but no C2PA claim included, it should show “No claim found”:
Now we can follow the “manifest definition format” specification in the c2patool docs to turn our IPTC metadata structure into an assertion:
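A manifest definition of this kind can also be assembled with a short script. The Python sketch below is only an approximation: the top-level field names (alg, private_key, sign_cert, claim_generator, assertions) follow our reading of the c2patool documentation at the time and may differ between versions, the assertion label stds.iptc.photo-metadata should be checked against the C2PA specification, and the claim generator string is a made-up placeholder. How the JSON-LD should be nested inside "data" also needs checking against the spec.

```python
import json


def build_manifest_definition(iptc_jsonld):
    """Wrap a JSON-LD metadata object in a c2patool-style manifest definition.

    All field names below are assumptions based on the c2patool docs;
    verify them against the version you have installed.
    """
    return {
        "alg": "es256",
        # Key and certificate file names as used later in this article.
        "private_key": "es256_private.key",
        "sign_cert": "es256_certs.pem",
        # Placeholder claim generator string (hypothetical value).
        "claim_generator": "IPTC_demo",
        "assertions": [
            {
                "label": "stds.iptc.photo-metadata",
                "data": iptc_jsonld,
            }
        ],
    }


if __name__ == "__main__":
    # Tiny stand-in for the full JSON-LD structure produced in Step 3.
    metadata = {
        "@context": {"dc": "http://purl.org/dc/elements/1.1/"},
        "dc:creator": "Brendan Quinn",
    }
    print(json.dumps(build_manifest_definition(metadata), indent=2))
```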
Save that as, say, test_assertion.json. Then we are told that we can run c2patool using this file as a parameter. So let’s try it:
But we get the following response from c2patool:
Note that this message contradicts itself: it requires a folder called ~/.cai, NOT ~/.x509 as the example command says. Also the filenames for the key and certificate don’t match those given in the assertion example JSON, so I changed them.
To make this work, on my macOS system at least, I had to install openssl version 3 using Homebrew:
And then explicitly use the openssl 3 binary when running the command (see below).
So what I had to do (after installing openssl 3) was the following:
When I ran this command, it asked me some questions about what information I wanted to put into the certificate:
It then returned with no information, but my ~/.cai folder now contained files called es256_certs.pem and es256_private.key as I had requested in the command.
Step 5: Embedding the assertion into our image using c2patool
Finally we can now run the c2patool command to sign and embed the assertion:
This returns a full copy of the embedded, signed manifest, which is also embedded into the file:
Step 6: Checking what we’ve done
Firstly, we can use c2patool itself in read-only mode, to view the embedded manifest including our assertion:
So we can see that the signatures match and the image is un-tampered. (A fun exercise might be to try to change the metadata in a hex editor and see what happens!)
We can then load the image into verify.contentauthenticity.org to see what it looks like:
So we can see that the image is “signed by” the string that we entered into the certificate “Organization Name” field. For “produced with,” verify.contentauthenticity.org seems to use the first word in the “claim generator” string (everything up to the first space character).
But none of the IPTC metadata included in the assertion is shown on verify.contentauthenticity.org.
We are in active discussions with the CAI team to work out exactly how to display asserted IPTC metadata on this screen, so watch this space!
How about if we load the image into Photoshop, with Content Credentials (the Photoshop name for C2PA) switched on? Well it doesn’t seem to do much at all…
As can be seen from the screenshot, loading the image into Photoshop and viewing the “Content Credentials Preview” window seems to just show that an Adobe signature would be added when the image is saved. Perhaps I’m missing something here - please let me know if you can see I’m doing something wrong!
(As more C2PA-compliant tools are released, we will add more content to this section.)
Step 7: Tampering with the metadata
To see if the hashing and signing feature works, I tried tampering with the embedded metadata.
To do this I created a hex dump of the C2PA version of the image using xxd on my Mac command line, edited the dump in a text editor, and used xxd again to re-create the binary image from the tampered hex dump.
The specific change I made was to change the copyright year (dc:rights field) from 2022 to 2099.
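The same kind of edit can be made without a hex dump. The Python sketch below is a substitute for the xxd round trip, not what was actually run for this demo: it swaps one byte string for another of the same length, so that nothing else in the file shifts position. The function name is hypothetical, and it changes only the first occurrence it finds.

```python
def tamper_bytes(data, old, new):
    """Replace the first occurrence of `old` with `new` in a bytes object.

    Both values must be the same length so that every other byte in the
    file keeps its original offset.
    """
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    if old not in data:
        raise ValueError("original byte string not found")
    return data.replace(old, new, 1)


if __name__ == "__main__":
    # In-memory stand-in for the file contents.
    original = b"...Copyright (C) Brendan Quinn 2022. Some rights reserved...."
    tampered = tamper_bytes(original, b"Quinn 2022", b"Quinn 2099")
    print(tampered)
```

For a real file, read it with open(path, "rb"), pass the bytes through tamper_bytes, and write the result to a new file so that the signed original is kept for comparison.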
Let’s see what c2patool says about the tampered file:
As you can see in the validation_status section at the bottom of the output, the hashes no longer match. We have detected the tampering!
Loading the image into verify.contentauthenticity.org doesn’t say that the image has been signed, but on the other hand it doesn’t say that it has been tampered with, either. I would have thought that it’s better to signal tampering to users than to simply say that there are no credentials. (Note: I now see that this doesn’t work for the previous image either, so something may have changed on verify.contentauthenticity.org since I took the above screenshots a couple of weeks ago.)
Conclusions
So we have proven the concept and we have shown that it works, albeit in a limited context. What have we learned, and what comes next?
We can see that the technology works: we have created an assertion, hashed and signed it, and we can extract it and check the signature afterwards. This is a big step forward.
Creating the JSON-LD version of the XMP packet is awkward and time-consuming. We need to develop some tools to help.
Unfortunately there are no tools that can currently show IPTC Photo Metadata assertions to users, or Exif assertions for that matter, except for the command-line c2patool which is not accessible to regular users. We are working with CAI and C2PA on addressing this problem.
Notice that I simply typed in “IPTC” as the organisation when I created the certificate, and verify.contentauthenticity.org displayed it unquestioningly. I could have typed “BBC” or “The New York Times”, used it to sign anything, and the tool would dutifully display it as if it is genuine. Obviously this is the opposite of content authenticity! So we need to work on certificate chains and webs of trust (the same way HTTPS certificates work) in the next phase of adoption. We’re already talking about this in Project Origin, CAI and C2PA circles.