Packages - NewsML-G2 Quick Start Guide

Quick Start - Packages

We recommend reading the Quick Start Guide to NewsML-G2 Basics before this Quick Start Guide to News Packages.

1. Introduction

The ability to package together items of news content is important to news organisations and customers. Using packages, different facets of the coverage of a news story can be viewed in a named relationship, such as "Main Article", "Sidebar", "Background". Another frequent application of packages is to aggregate content for news products, for example "Top Ten" news packages such as that illustrated below.

A description of how to create this type of package with ordered components can be found further on in this document.

A Top Ten News Package displayed on the Web

Packages can range from simple collections on a common theme, to rich hierarchical structures.

NewsML-G2 is flexible: a provider may package content that has already been published, or may send a package together with all of its content resources in a single News Message. See the Guidelines section on News Messages.

2. Packages and Links: the difference

The NewsML-G2 <link> property is a useful way to indicate optional supplementary resources that may be retrieved by the end-user when processing or consuming a NewsML-G2 Item. Links should not be used as a lightweight method of packaging news; a NewsML-G2 processor would not be able to distinguish between News Items with some optional resources, and News Items that are intended to be pseudo-packages using links. It is also a basic NewsML-G2 rule that a News Item only conveys one piece of content.

By contrast, Packages:

  • Express structure, allowing news to be packaged as a list, or as a named hierarchy of content resources.

  • Have a mode property that enables the expression of a relationship between the components of a package group.

3. Package Structure

A simple Package has a structure as shown in the example below. The top level for content of a Package Item is one and only one <groupSet> element, which contains at least one <group> structure holding one or more <itemRef> references to content. The <group> structure may be repeated, but this example has only one. The diagram below shows a skeleton of the XML elements in a simple package and a visualisation of the relationship that this structure creates:

Simple package structure
Top-level element view of a simple package, and (right) a visualisation of the structure

All Scheme Aliases used in the listing below indicate IPTC NewsCodes vocabularies, except for the following alias values: ex-staffjobs, ex-mystaff, ex-svc and ex-group.

<?xml version="1.0" encoding="UTF-8"?>
<packageItem xmlns="http://iptc.org/std/nar/2006-10-01/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://iptc.org/std/nar/2006-10-01/ ./NewsML-G2_2.30-spec-All-Power.xsd"
    standard="NewsML-G2"
    standardversion="2.30"
    conformance="power"
    guid="tag:example.com,2008:UK-NEWS-TOPTEN:UK20081220098658"
    version="14">
  <catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_37.xml" />
  <catalogRef href="http://www.example.com/customer/cv/catalog4customers-1.xml" />
  <itemMeta>
    <itemClass qcode="ninat:composite" />
    <provider qcode="nprov:AcmeNews" />
    <versionCreated>2021-11-17T12:30:00Z</versionCreated>
    <firstCreated>2008-12-20T12:25:35Z</firstCreated>
    <pubStatus qcode="stat:usable" />
    <profile versioninfo="1.0.0.2">simple_text_with_picture.xsl</profile>
    <service qcode="ex-svc:uktop">
      <name>Top UK News stories hourly</name>
    </service>
    <title>UK-TOPNEWS</title>
    <edNote>Updates the previous version</edNote>
    <signal qcode="sig:update" />
  </itemMeta>
  <contentMeta>
    <contributor jobtitle="ex-staffjobs:cpe" qcode="ex-mystaff:MDancer">
      <name>Maurice Dancer</name>
      <name>Chief Packaging Editor</name>
      <definition validto="2021-11-17T17:30:00Z">
        Duty Packaging Editor
      </definition>
      <note validto="2021-11-17T17:30:00Z">
        Available on +44 207 345 4567 until 17:30 GMT today
      </note>
    </contributor>
    <headline xml:lang="en">UK</headline>
  </contentMeta>
  <groupSet root="G1">
    <group id="G1" role="ex-group:main">
      <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-A"
          contenttype="application/vnd.iptc.g2.newsitem+xml"
          size="2345">
        <itemClass qcode="ninat:text" />
        <provider qcode="ex-nprov:AcmeNews"/>
        <pubStatus qcode="stat:usable"/>
        <title>Obama annonce son équipe</title>
        <description role="drol:summary">Le rachat il y a deux ans de la propriété par Alan Gerry, magnat local de la télévision câblée, a permis l'investissement des 100 millions de dollars qui étaient nécessaires pour le musée et ses annexes, et vise à favoriser le développement touristique d'une région frappée par le chômage.
        </description>
      </itemRef>
      <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-B"
          contenttype="application/vnd.iptc.g2.newsitem+xml"
          size="300039">
        <itemClass qcode="ninat:picture" />
        <provider qcode="ex-nprov:AcmeNews"/>
        <pubStatus qcode="stat:usable"/>
        <title>Barack Obama arrive à Washington</title>
        <description role="drol:caption">Si nous avons aujourd'hui un afro-américain et une femme dans la course à la présidence.
        </description>
      </itemRef>
    </group>
  </groupSet>
</packageItem>

4. Document structure

The building blocks of the Package Item are the <packageItem> root element, with additional wrapping elements for metadata about the Package (itemMeta), metadata about the content (contentMeta) and the package content (groupSet). The top level (root) element <packageItem> attributes are:

<packageItem xmlns="http://iptc.org/std/nar/2006-10-01/"
    guid="tag:example.com,2008:UK-NEWS-TOPTEN:UK20081220098658"
    version="14"
    standard="NewsML-G2"
    standardversion="2.30"
    conformance="power">

This is followed by Catalog information:

<catalogRef href="http://www.iptc.org/std/catalog/catalog.IPTC-G2-Standards_37.xml" />
<catalogRef href="http://www.example.com/customer/cv/catalog4customers-1.xml" />

5. Item Metadata

The <itemMeta> wrapper contains properties that are aids to processing the package contents.

5.1. Profile

The <profile> element allows a provider to name a pre-arranged template or transformation stylesheet that can be used to process the package, for example "text and picture" could be the name of a template; "textpicture.xsl" would be an XSL stylesheet. The @versioninfo of a <profile> enables the template or stylesheet to be versioned:
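
In the listing in Section 3, for instance, the profile names an XSL stylesheet and carries a four-part version string in @versioninfo:

```xml
<profile versioninfo="1.0.0.2">simple_text_with_picture.xsl</profile>
```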

5.2. Item Metadata in full
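
For reference, this is the <itemMeta> wrapper exactly as it appears in the listing in Section 3; the ex-svc alias is one of the non-IPTC example schemes noted there:

```xml
<itemMeta>
  <itemClass qcode="ninat:composite" />
  <provider qcode="nprov:AcmeNews" />
  <versionCreated>2021-11-17T12:30:00Z</versionCreated>
  <firstCreated>2008-12-20T12:25:35Z</firstCreated>
  <pubStatus qcode="stat:usable" />
  <profile versioninfo="1.0.0.2">simple_text_with_picture.xsl</profile>
  <service qcode="ex-svc:uktop">
    <name>Top UK News stories hourly</name>
  </service>
  <title>UK-TOPNEWS</title>
  <edNote>Updates the previous version</edNote>
  <signal qcode="sig:update" />
</itemMeta>
```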

6. Content Metadata

The <contentMeta> wrapper in this example contains extended metadata about the person who compiled the package, including hours of duty and contact telephone number.
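
The corresponding block from the listing in Section 3 is repeated here for convenience. The @validto attributes on <definition> and <note> appear to be what limits the duty and contact details to the editor's hours; that reading is an interpretation of the listing, not something stated elsewhere in this guide:

```xml
<contentMeta>
  <contributor jobtitle="ex-staffjobs:cpe" qcode="ex-mystaff:MDancer">
    <name>Maurice Dancer</name>
    <name>Chief Packaging Editor</name>
    <!-- Role description, valid until the end of the editor's shift -->
    <definition validto="2021-11-17T17:30:00Z">
      Duty Packaging Editor
    </definition>
    <!-- Contact details, valid for the same period -->
    <note validto="2021-11-17T17:30:00Z">
      Available on +44 207 345 4567 until 17:30 GMT today
    </note>
  </contributor>
  <headline xml:lang="en">UK</headline>
</contentMeta>
```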

7. Group Set

The <groupSet> has a mandatory @root attribute that references the primary child <group> element. The primary <group> element must identify itself using an @id that matches the @root of <groupSet>.

7.1. Group

Although the id attribute is optional, in practice one must be provided to match the mandatory @root attribute of the <groupSet>, even if there is only one <group>. If there is more than one <group> element, one and only one can be identified as the root group.

Each <group> element must also carry a @role attribute to declare its role within the package structure. The role is a QCode; a Scheme of Roles would typically contain values representing "main", "sidebar" or other editorial terms that express how the content is intended to be used in the package.
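
A minimal sketch of the pairing between @root and @id, reduced from the listing in Section 3. The single <itemRef> is kept only so that the group is not empty; ex-group is the example (non-IPTC) role scheme used in that listing:

```xml
<groupSet root="G1">
  <!-- The id of the root group matches the root attribute of groupSet -->
  <group id="G1" role="ex-group:main">
    <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-A" />
  </group>
</groupSet>
```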

7.2. Item Reference

The <itemRef> element identifies an Item or a Web resource using @href and/or @residref. The IPTC recommends that Package Items should reference NewsML-G2 Items if they are available (typically NewsItems) rather than other types of resource, such as "raw" news objects. Referring to other kinds of Web-accessible resource is allowed and is a legitimate use case; however, it has some disadvantages. Resources referred to in this way cannot be managed or versioned: if one of the resources is changed, the entire package may need to be re-compiled and sent, whereas a reference to a managed object such as a <newsItem> may refer to the latest (or a specific) version.

The example versions the referenced Items using @version, and gives processing or usage hints using @contenttype and @size. The @contenttype uses the registered IANA Media Type for a NewsML-G2 News Item:
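
A sketch of an <itemRef> carrying these hints. The @residref, @contenttype and @size values are taken from the listing in Section 3; that listing does not show @version, so version="1" below is an assumed, illustrative value only:

```xml
<!-- version="1" is an assumption for illustration; it is not in the Section 3 listing -->
<itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-A"
    version="1"
    contenttype="application/vnd.iptc.g2.newsitem+xml"
    size="2345" />
```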

The Item Reference includes properties from the referenced Item that have been extracted as an aid to processing:
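
In the listing in Section 3, the extracted properties for the picture item are its item class, provider, publishing status, title and a caption-role description, carried as children of the <itemRef>:

```xml
<itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-B"
    contenttype="application/vnd.iptc.g2.newsitem+xml"
    size="300039">
  <!-- Properties copied from the referenced News Item as processing aids -->
  <itemClass qcode="ninat:picture" />
  <provider qcode="ex-nprov:AcmeNews"/>
  <pubStatus qcode="stat:usable"/>
  <title>Barack Obama arrive à Washington</title>
  <description role="drol:caption">Si nous avons aujourd'hui un afro-américain et une femme dans la course à la présidence.
  </description>
</itemRef>
```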

8. Hierarchical Package Structure

Hierarchies of Groups and Item References can be created by adding multiple Groups to a Package and using <groupRef> to reference other Groups by @idref, as illustrated by the following diagram:

The code listing below shows how such a hierarchical package would be fully expressed in XML in a NewsML-G2 Group Set:

LISTING: Group Set example showing Hierarchical Package Structure

In the example, the "root" group is identified as the group with id="G1". This group has a role of "main" and consists of a text story and a picture of Barack Obama. The group with id="G2" has the role of "sidebar" and contains a text and picture of Hillary Clinton. It is referenced by a <groupRef> in Group G1.
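
The structure described above could be sketched as follows. This is an illustrative reconstruction rather than the original listing: the G1 references reuse the residrefs from Section 3, while the two G2 residrefs (tutorial-item-C and tutorial-item-D), the G2 titles and the ex-group:sidebar role value are hypothetical.

```xml
<groupSet root="G1">
  <!-- Root group: main story and picture, plus a pointer to the sidebar group -->
  <group id="G1" role="ex-group:main">
    <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-A">
      <itemClass qcode="ninat:text" />
      <title>Obama annonce son équipe</title>
    </itemRef>
    <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-B">
      <itemClass qcode="ninat:picture" />
      <title>Barack Obama arrive à Washington</title>
    </itemRef>
    <groupRef idref="G2" />
  </group>
  <!-- Sidebar group: hypothetical residrefs and titles, for illustration only -->
  <group id="G2" role="ex-group:sidebar">
    <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-C">
      <itemClass qcode="ninat:text" />
      <title>Hillary Clinton text (placeholder title)</title>
    </itemRef>
    <itemRef residref="urn:newsml:iptc.org:20081007:tutorial-item-D">
      <itemClass qcode="ninat:picture" />
      <title>Hillary Clinton picture (placeholder title)</title>
    </itemRef>
  </group>
</groupSet>
```

Because G2 is reached only through the <groupRef> in G1, a processor that starts at the @root group and follows references can recover the full hierarchy.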

9. List Type Package Structure: “Package Group Mode”

The @mode attribute indicates the relationship between components of a group using one of three values from the IPTC Package Group Mode NewsCodes (recommended Scheme Alias pgrmod):

  • pgrmod:bag -- an unordered collection of components, for example different components of a web news page with no special order, as in the example below. This is the default @mode.

  • pgrmod:seq -- denotes a sequential package group set in descending order, for example a "Top Ten" list: each sub-group would provide references to a text article and a related picture.

  • pgrmod:alt -- an unordered collection. Each sub-group is an alternative to its peer groups in the set, for example coverage of a news event supplied in different languages.

The diagram above shows a package containing two Items in the root group, and a group reference to a "group of groups" with package mode set to "alt" indicating that the child groups contain alternative content. The example uses groups of associated video suitable for different Android device screen sizes as indicated by the @role of each group.

The code overview shows the root group referencing the two Items and the <groupRef> element referencing the group with @id "G2". Group G2 has its package mode set to "alt" and its components are references to alternate groups G3, G4 and G5, which reference videos at the required rendition for each screen type.

The right-hand image in the diagram is a visual representation of the relationship expressed through this package structure.

Note the <group> that has its Mode set to "alt" -- not the "main" group but the second group with @id "G2". The components of this group are alternatives: each references a group containing the video content. The code example below shows how this relationship is fully expressed in NewsML-G2:

LISTING: Group Set example showing an "alt" Package Group Mode
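
A minimal sketch of such a Group Set, assuming hypothetical residrefs under example.com and hypothetical ex-group role values for the three screen sizes (none of these identifiers come from the original listing):

```xml
<groupSet root="G1">
  <!-- Root group: two Items plus a reference to the group of alternatives -->
  <group id="G1" role="ex-group:main">
    <itemRef residref="urn:newsml:example.com:20081007:story-text" />
    <itemRef residref="urn:newsml:example.com:20081007:story-picture" />
    <groupRef idref="G2" />
  </group>
  <!-- G2 carries the mode: its components are alternatives to one another -->
  <group id="G2" role="ex-group:video" mode="pgrmod:alt">
    <groupRef idref="G3" />
    <groupRef idref="G4" />
    <groupRef idref="G5" />
  </group>
  <!-- One group per screen type, each referencing a suitable video rendition -->
  <group id="G3" role="ex-group:smallscreen">
    <itemRef residref="urn:newsml:example.com:20081007:video-small" />
  </group>
  <group id="G4" role="ex-group:normalscreen">
    <itemRef residref="urn:newsml:example.com:20081007:video-normal" />
  </group>
  <group id="G5" role="ex-group:largescreen">
    <itemRef residref="urn:newsml:example.com:20081007:video-large" />
  </group>
</groupSet>
```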

10. A Sequential "Top Ten" Package

The screenshot at the start of this Chapter shows a "Top Ten" list of news items in order of importance. The Package Group Mode of pgrmod:seq indicates that the components are in descending order; a code skeleton and a visual representation of the package structure are shown in the diagram below:

Note how the <group> sets the Mode for its components: in this case, the component group references of the "main" group are sequentially ordered. The relationship is fully expressed in XML in NewsML-G2 as shown below:

LISTING: Group Set example showing a "seq" Package Group Mode
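
A shortened sketch of a sequential Group Set, showing only the first two of the ten entries. The residrefs and the ex-group:top role value are hypothetical and chosen for illustration:

```xml
<groupSet root="G1">
  <!-- The root group sets the mode: its groupRef children are in rank order -->
  <group id="G1" role="ex-group:main" mode="pgrmod:seq">
    <groupRef idref="G2" />
    <groupRef idref="G3" />
    <!-- further groupRef elements would follow here, one per remaining entry -->
  </group>
  <!-- Each entry pairs a text article with a related picture -->
  <group id="G2" role="ex-group:top">
    <itemRef residref="urn:newsml:example.com:20081220:story1-text" />
    <itemRef residref="urn:newsml:example.com:20081220:story1-picture" />
  </group>
  <group id="G3" role="ex-group:top">
    <itemRef residref="urn:newsml:example.com:20081220:story2-text" />
    <itemRef residref="urn:newsml:example.com:20081220:story2-picture" />
  </group>
</groupSet>
```

In this sketch the ranking is carried by the document order of the <groupRef> elements inside G1, not by the order in which the referenced groups are declared.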

11. Package Processing Considerations

11.1. Other NewsML-G2 Items

In the above examples, the referenced resources in the package have been News Items, but <itemRef> may also refer to other Items, such as Package Items. The following example of <itemRef> shows how a Package Item can be used as part of a Package Item. This type of "Super Package" could be used to send a "Top Ten" package (a themed list of news) where each referenced item is also a package consisting of references to the text, picture and video coverage of each news story.

The advantage of using this "package of packages" approach is that it promotes more efficient re-use of content. Once created, any of the "sub-packages" can be easily referenced by more than one "super-package": a package about a given story could be used by both "Top News This Hour" and by "Today’s Top News". If the individual News Items that make up a sub-package were referenced directly, these references would have to be assembled each time the story is used, either by software or by a journalist, which would be less efficient.

As these sub-packages are managed objects, we use @residref to identify and locate the referenced items. Each referenced item may be a Package Item, shown by the Item Class of "composite" and the Content Type of application/vnd.iptc.g2.packageitem+xml. Each <itemRef> would then resemble the following:
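
A sketch of one such reference. The Item Class and Content Type are those named above; the @residref value and the title are hypothetical:

```xml
<!-- The residref and title are illustrative placeholders -->
<itemRef residref="tag:example.com,2008:UK-NEWS-STORY:UK20081220098001"
    contenttype="application/vnd.iptc.g2.packageitem+xml">
  <itemClass qcode="ninat:composite" />
  <title>Sub-package for one Top Ten story</title>
</itemRef>
```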

11.2. Facilitating the Exchange of Packages

There needs to be some consideration of how such a "Super Package" should be processed by the receiver. The power and flexibility inherent in NewsML-G2 Packages could lead to confusion and processing complexity unless provider and receiver agree on a method for specifying the structure of packages and signalling this to the receiving application. Processing hints such as the <profile> property (described above) are intended to help resolve this issue.

In the example below, we maintain flexibility and inter-operability with potential partner organisations by defining any number of standard package "templates" – termed Profiles – for the Package, among other processing hints. Partners would agree in advance on the Profiles and rules for processing them. All that the provider then needs to do is place the pre-arranged Profile name, or the name of a transformation script, in the <profile> property.

Package profiles could be represented as diagrams like those shown below:

In this example, the Profile Name is intended to signal to the processor that references to each member of the Top Ten list are placed in their own group, and that the Top Ten list is created in the "root" group of the Package Item as an ordered list of <groupRef> elements (as in the "Top Ten" list profile shown in the above diagram).

The properties in <itemMeta> that can be used to provide information on processing are:

<generator>, a versioned string denoting the name of the process or service that created the package.

<profile>, as discussed, names the template or transformation stylesheet to be used to process the package.

<signal> is a flexible type property (may have a @qcode, @uri, or @literal value) that instructs the receiver to perform any required actions upon receiving the Item. An <edNote> may contain natural-language instructions, if necessary, and a <link> property denotes the previous version of the package.
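
The processing hints listed above might be combined as in the following sketch. The <profile>, <edNote> and <signal> lines are taken from the listing in Section 3. The generator name and its version string are hypothetical, and the <link> line assumes that the IPTC Item Relation value irel:previousVersion is used and that the previous version of this package is version 13:

```xml
<itemMeta>
  <!-- Mandatory itemMeta properties such as itemClass and provider are omitted here -->
  <!-- Hypothetical generator name and version string -->
  <generator versioninfo="1.0">ExamplePackageBuilder</generator>
  <profile versioninfo="1.0.0.2">simple_text_with_picture.xsl</profile>
  <edNote>Updates the previous version</edNote>
  <signal qcode="sig:update" />
  <!-- Assumed relation and version number, pointing at the preceding version of this package -->
  <link rel="irel:previousVersion"
      residref="tag:example.com,2008:UK-NEWS-TOPTEN:UK20081220098658"
      version="13" />
</itemMeta>
```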