Implementation of Metadata Discovery tools (exiftool and mediainfo)

Project:RUcore Workflow Management System (WMS)
Component:File Upload Module
Category:feature request

This issue tracks the specifications for implementing metadata discovery in WMS. A similar issue will be created for the Administrative tool. The documentation files for implementing this capability (spec and mappings) are attached to this issue.



Category:specification» feature request
Assigned to:yuyang» ibeard
Status:active» test


Technical metadata for the following content models or digital file types should be automatically populated after files are uploaded. Use the attached spreadsheets as the guide for what to look for:

Microsoft Office documents (doc, docx, xls, …)

Note: Currently, technical metadata for file types not listed above is also automatically populated after files are uploaded. The output fields follow the above specs, but this is experimental: because these file types differ from those in the list above, the output metadata may change as new specs come out in the future.
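
For illustration only, here is a minimal Python sketch of how discovery after upload could work with exiftool. It is not the WMS implementation. It assumes exiftool is on the PATH and relies on its `-json` output option. The `FIELD_MAP` entries and the TechMD field names on the right-hand side are hypothetical placeholders; the actual mappings are in the attached spreadsheets.

```python
import json
import subprocess

# Hypothetical mapping from ExifTool tag names to TechMD field names.
# The real mapping is defined in the spreadsheets attached to this issue.
FIELD_MAP = {
    "MIMEType": "mimeType",
    "FileSize": "fileSize",
    "Software": "creatingApplication",
}


def parse_exiftool_json(text):
    """Return the tag dictionary for the first file in `exiftool -json` output."""
    records = json.loads(text)  # exiftool -json emits a JSON array, one object per file
    return records[0] if records else {}


def map_fields(tags, field_map=FIELD_MAP):
    """Keep only the mapped tags that the file actually carries."""
    return {
        target: tags[source]
        for source, target in field_map.items()
        if source in tags
    }


def discover(path):
    """Run exiftool on one uploaded file and map the result (needs exiftool installed)."""
    completed = subprocess.run(
        ["exiftool", "-json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return map_fields(parse_exiftool_json(completed.stdout))
```

Tags that a file does not carry are simply left out of the result, so a missing field in the output does not by itself indicate a discovery failure.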



Please take a look at the technical metadata for WMS ID - 10414.

Here is what I did:
- Uploaded 100 TIFFs and examined tech MD.
- Removed the first TIFF and uploaded another TIFF file. It removed the first occurrence of the TechMD in the XML and added the new TechMD for the first TIFF file at the end of the XML.

In TechMD:
- Creating Application Version, Photometric Interpretation:ColorSpace, Image Orientation, and creatingApplicationDateCreated-Start are not in the TechMD XML.
- Mimetype and FileSize are not in the TechMD XML but are in the datastream ID section. This may be OK as long as they are preserved.
- I see two dateCreated (today's date) in the XML. Looks like the same date. One date may be creatingApplicationDateCreated but it should not be today's date.
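
A check like the one above could be scripted rather than done by eye. The following is a rough sketch under stated assumptions: the TechMD XML is available as a string, element names carry no namespace prefix, and the element names used in the usage note are placeholders, not the actual TechMD schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def audit_techmd(xml_text, expected):
    """Report expected elements that are absent and those that occur more than once.

    `expected` is a list of element names; it is supplied by the caller
    because the real TechMD schema is not reproduced here.
    """
    root = ET.fromstring(xml_text)
    # Count every element in the document by tag name.
    counts = Counter(element.tag for element in root.iter())
    missing = [name for name in expected if counts[name] == 0]
    repeated = {
        name: count
        for name, count in counts.items()
        if name in expected and count > 1
    }
    return missing, repeated
```

For example, calling `audit_techmd(xml_text, ["dateCreated", "colorSpace", "imageOrientation"])` on a document with two `dateCreated` elements and no `imageOrientation` would list `imageOrientation` as missing and `dateCreated` as repeated.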



A few more objects to inspect. These are all Word documents and seem to have too many dateCreated elements in the TechMD XML.




I just created WMS ID 10424. I see that some fields are being populated on my new upload, including colorspace. It could well be that the unpopulated fields are empty simply because the file lacks that specific metadata.

I am also seeing that date created is duplicated a number of times, and the count seems to increase every time I go back to edit the record.


Accessed a WMS record for editing, and the issue with creatingApplicationDateCreated still appears to be occurring (each time I access the record, another instance of creatingApplicationDateCreated is duplicated). Example record is WMS ID# 10578 in Release 8.1 Test Collection - Chrome.
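
The symptom is consistent with the element being appended on every edit instead of replaced, though the actual cause in WMS is not confirmed here. As a minimal sketch of the replace-instead-of-append pattern, assuming the TechMD is handled as an ElementTree and that `creatingApplicationDateCreated` is a direct child of the element passed in (both assumptions; `set_single_child` is a hypothetical helper name):

```python
import xml.etree.ElementTree as ET


def set_single_child(parent, tag, text):
    """Ensure `parent` has exactly one direct child named `tag`, holding `text`."""
    # findall returns a list, so removing while looping over it is safe.
    for existing in parent.findall(tag):
        parent.remove(existing)
    child = ET.SubElement(parent, tag)
    child.text = text
    return child


# Simulate opening and saving the same record three times.
techmd = ET.Element("techMD")
for _ in range(3):
    set_single_child(techmd, "creatingApplicationDateCreated", "2014-03-01")
```

Because each save first removes any existing instances, repeated edits leave a single element rather than one more per access.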


Assigned to:ibeard» yuyang
Status:test» active

This is still an issue. I am not sure whether this could be addressed in this release. If it can't be, please move to 8-x.


Assigned to:yuyang» ibeard
Status:active» test

Isaiah, please test the duplicate date issue when editing/saving record. -YY


Status:test» fixed

Working now, thanks!


Status:fixed» closed
