6.0 Meta-Data

 

Authors: Tom Snyder & Eric Hoffert. Created: 9/15/06. Updated: 2/27/07. 

 

One issue that often arises in the Enterprise CMS (content management systems) and Portals sector is: how can we utilize the value in all our unstructured data?  Unstructured data are uncategorized, unprocessed documents. Humans are able to extract all the information from an unstructured document, albeit slowly, by reading it. Computers cannot.

 

Meta-data are the structured attributes attached to a document. Examples are: the original author, keywords, comments, the project it is associated with, a classification of the purpose of the document, the privacy limitations, the audience it is intended for, rankings of importance, etc. Meta-data are what computers use to execute a work-flow process and to help us find and use the correct documents. Attaching structured meta-data to documents, and keeping it up to date, can substantially improve the efficiency and productivity of a document system.

 

An OpenSAM mashup uses the following mechanisms to capture, transmit, and store any meta-data associated with a document. 

Meta-Data CGI Parameters

When a Home Application launches a Productivity Application to create or edit a document, the Home Application knows a good deal about why the document is being created or changed, its context (account, project, etc.), and the surrounding circumstances. It passes this information to the Productivity Application to be attached to and maintained with the document.

 

Table 6.1 Meta-Data CGI Parameters

StorageMeta_MNAME: One of potentially many attribute/value pairs attached to the document.

The MNAME portion of the CGI Parameter is filled in with the attribute name as determined by the Home Application.

The values for these parameters are strings or URL encoded XML, as defined by the Home Application.
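As an illustration of the naming rule, the following minimal Python sketch shows how a Home Application might append StorageMeta_ parameters to a launch URL. The function name build_launch_url and the endpoint https://productivity.example.com/edit are hypothetical and not part of OpenSAM; only the StorageMeta_ prefix comes from the table above.

```python
from urllib.parse import quote, urlencode


def build_launch_url(base_url, metadata):
    """Append one StorageMeta_<name> parameter per meta-data attribute.

    base_url and this helper are illustrative assumptions, not OpenSAM API.
    """
    params = {f"StorageMeta_{name}": value for name, value in metadata.items()}
    # quote_via=quote encodes a space as %20 rather than '+'
    return f"{base_url}?{urlencode(params, quote_via=quote)}"


url = build_launch_url(
    "https://productivity.example.com/edit",  # hypothetical endpoint
    {"Region": "South America", "Audience": "Internal-Private"},
)
print(url)
```

String values are URL encoded here; a Home Application that passes XML values would encode them the same way.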

 

Storing Meta-Data

When the Productivity Application stores a document via WebDAV, it also issues PROPPATCH requests on the document to set the meta-data values it was given when launched. It does not include the StorageMeta_ prefix portion within the attribute name when performing the PROPPATCH.

 

For example, if the document KeyThoughts.html were created with StorageMeta_Region=South%20America and StorageMeta_Audience=Internal-Private, then the following PROPPATCH would be issued:

PROPPATCH /handlr/KeyThoughts.html?
&StorageSessionId=nrksnx6ockEkHoyiSTd__3ZfWPeP1YrYELU5.XofrryIMM%3D
&StorageOrg=ShareMethods
&StorageUserName=joe%40shareoffice.demo HTTP/1.1
Host: webdav.homeapplication.com
User-Agent: Server-to-Server HTTP_Request
Connection: close
Content-Type: text/xml
Content-Length: 262

<?xml version="1.0" encoding="utf-8" ?>
   <D:propertyupdate xmlns:D="DAV:" xmlns:Z="http://www.w3.com/standards/z39.50/">
     <D:set><D:prop><Z:Region>South America</Z:Region>
                  <Z:Audience>Internal-Private</Z:Audience>
            </D:prop>
     </D:set>
   </D:propertyupdate>
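The prefix-stripping step can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the helper names extract_metadata and build_proppatch_body are hypothetical, the Z namespace URI is copied from the example above, and sending the request over HTTP is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qsl

DAV_NS = "DAV:"
META_NS = "http://www.w3.com/standards/z39.50/"  # copied from the example above
PREFIX = "StorageMeta_"


def extract_metadata(query_string):
    """Return {attribute: value} for each StorageMeta_ parameter, prefix removed."""
    return {
        name[len(PREFIX):]: value
        for name, value in parse_qsl(query_string)  # parse_qsl URL-decodes values
        if name.startswith(PREFIX) and len(name) > len(PREFIX)
    }


def build_proppatch_body(metadata):
    """Build a propertyupdate document that sets every attribute in metadata."""
    ET.register_namespace("D", DAV_NS)
    ET.register_namespace("Z", META_NS)
    root = ET.Element(f"{{{DAV_NS}}}propertyupdate")
    set_el = ET.SubElement(root, f"{{{DAV_NS}}}set")
    prop = ET.SubElement(set_el, f"{{{DAV_NS}}}prop")
    for name, value in metadata.items():
        ET.SubElement(prop, f"{{{META_NS}}}{name}").text = value
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


launch_query = (
    "StorageOrg=ShareMethods"
    "&StorageMeta_Region=South%20America"
    "&StorageMeta_Audience=Internal-Private"
)
meta = extract_metadata(launch_query)
body = build_proppatch_body(meta)
print(body.decode("utf-8"))
```

Parameters without the StorageMeta_ prefix, such as StorageOrg, are ignored by extract_metadata, so only meta-data attributes reach the PROPPATCH body.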

 

If a user performs a Save As... to create a new file from an existing file, the Productivity Application must also apply the meta-data it received in the CGI parameters to the new file.

 

This raises another issue: what happens when the user opens file AA and then saves it as BB? We want BB to inherit the meta-data associated with AA, but perhaps not all of it. Since the meta-data is defined and managed by and for the Home Application, we allow the Home Application to decide which attributes to propagate from AA to BB. We do this by telling the Home Application which file the new file was derived from, using a PROPPATCH of a special attribute:

 

<Z:SavedAsSourceFile>URI of the source file</Z:SavedAsSourceFile>

 

The Home Application is free to perform meta-data propagation, to simply store the property for future use, or to ignore this special property completely.
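One way a Home Application could act on SavedAsSourceFile is sketched below in Python. The propagation policy shown (an explicit list of inheritable attributes, with values already set on the new file taking precedence) is purely an assumption for illustration; OpenSAM leaves the policy to the Home Application. The function name propagate_metadata and the sample attribute values are hypothetical.

```python
def propagate_metadata(source_meta, new_meta, inheritable):
    """Copy inheritable attributes from the source file that the new file lacks.

    Assumed policy: values already set on the new file are never overwritten.
    """
    merged = dict(new_meta)
    for name in inheritable:
        if name in source_meta and name not in merged:
            merged[name] = source_meta[name]
    return merged


# Meta-data of AA (the file named in SavedAsSourceFile) and of the new file BB.
aa_meta = {"Project": "Apollo", "Audience": "Internal-Private", "ApprovedBy": "jane"}
bb_meta = {"Audience": "Public"}

result = propagate_metadata(aa_meta, bb_meta, ["Project", "Audience"])
print(result)
```

In this sketch ApprovedBy is not inherited because it is missing from the inheritable list, which mirrors the idea that BB should receive some, but perhaps not all, of AA's meta-data.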

Visibility of Meta-Data Attributes

In the preceding discussion, the meta-data attributes are defined and consumed by the Home Application and are opaque to the Productivity Application.

 

Users are accustomed, however, to viewing and updating a document's meta-data within a document editing application. For example, a File Properties dialog box often gives access to several panes of document meta-data.

 

To accomplish this in a convenient but also extensible way, OpenSAM defines two ways for a meta-datum to be classified as appropriate to show to a user:

 

1. Directed by CGI Parameters.

Three CGI parameters can be passed by a Home Application to cause a Productivity Application to show specific meta-data to a user. Each parameter contains a list of meta-data attribute names separated by commas.

Table 6.2 Meta-Data Visibility CGI Parameters

StorageShowMetaRW: The RW attributes are presented to the user in an editable field and can be changed.

StorageShowMetaRO: The RO attributes are presented to the user in a read-only display field.

StorageShowMetaWO: The WO attributes are presented to the user in a blank editable field. Users can supply new values for these attributes, but cannot view the previous values.

 

 

2. The Dublin Core Set + Extensions

Any meta-data attribute which is part of the Dublin Core Set or the following extensions will be implicitly understood as appropriate to show to a user, in a read/write mode, within a Productivity Application.

Dublin Core Set:

Title

Subject

Description

Type

Source

Relation
Coverage

Creator 

Publisher

Contributor

Rights

Date
Format

Identifier

Language

 
Extensions:

Company

Department

Owner

Comments

Keywords

Audience

Account

Client

Project

Contract

PONumber

Purpose

Status

Stage

Process

NextStep

ApprovedBy

ExpirationDate

Longevity
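The two classification mechanisms can be combined in a small lookup, sketched here in Python. The text does not say what happens when an attribute is both implicitly visible and named in a visibility parameter, or when it is named in more than one parameter. This sketch assumes that an explicit parameter overrides the implicit read/write default and that the parameters are applied in the order RW, RO, WO. The function name visibility_modes is hypothetical.

```python
from urllib.parse import parse_qs

DUBLIN_CORE = {
    "Title", "Subject", "Description", "Type", "Source", "Relation",
    "Coverage", "Creator", "Publisher", "Contributor", "Rights", "Date",
    "Format", "Identifier", "Language",
}
EXTENSIONS = {
    "Company", "Department", "Owner", "Comments", "Keywords", "Audience",
    "Account", "Client", "Project", "Contract", "PONumber", "Purpose",
    "Status", "Stage", "Process", "NextStep", "ApprovedBy",
    "ExpirationDate", "Longevity",
}
IMPLICIT_RW = DUBLIN_CORE | EXTENSIONS

# Order of application is an assumption of this sketch.
VISIBILITY_PARAMS = (
    ("StorageShowMetaRW", "RW"),
    ("StorageShowMetaRO", "RO"),
    ("StorageShowMetaWO", "WO"),
)


def visibility_modes(query_string, attribute_names):
    """Map each user-visible attribute to 'RW', 'RO', or 'WO'.

    Attributes absent from the result are not shown to the user.
    """
    modes = {name: "RW" for name in attribute_names if name in IMPLICIT_RW}
    params = parse_qs(query_string)
    for param, mode in VISIBILITY_PARAMS:
        for raw_list in params.get(param, []):
            for name in raw_list.split(","):
                name = name.strip()
                if name:
                    modes[name] = mode  # assumed: explicit overrides implicit
    return modes


modes = visibility_modes(
    "StorageShowMetaRO=Region,Status&StorageShowMetaWO=Password",
    ["Title", "Region", "Status", "Password", "InternalCode"],
)
print(modes)
```

Here Title is visible through the Dublin Core Set, Status is downgraded to read-only by the explicit parameter, and InternalCode stays hidden because neither mechanism names it.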

 

Copyright © 2006-2008, the authors.