Unable to ingest PCSP records due to an error from Solr/Lucene

Project:RUcore SOLR Searching and Indexing
Version:7.6
Component:Code
Category:bug report
Priority:normal
Assigned:liny4
Status:closed
Description

I got an error message while ingesting a couple of PCSP records (system ID: 46713 and 46717) in the "PCSP Archive" collection. The error message read "Error add index to Solr/Lucene: Error with add actions for ..." I viewed the WMS records and found two sets of "Rights metadata", which I cannot explain. Please assist.

Comments

#1

If you look at "edit record" in the WMS, you see only one rights metadata datastream. However, if you look at the FOXML you will see that there is another "hidden" rights metadata datastream. I have seen this occur when metadata is imported from an external source (e.g. RUetd submission system). Something about that metadata prevents it from displaying in the WMS. Take a look at what is in the metadata being imported and see if there is an element or value that is "invalid" on the WMS form. If you have included an element or subelement or attribute that is not in the current WMS, that will prevent it from being displayed in the WMS.

For example, the WMS looks for an authority and ID attribute in the rights declaration, but there is none here.

#2

I think Yu-Hung is batch importing these records. This didn't happen on the test system so we need to look at the process. Also the System IDs are 76713 and 76717 (not 46713 and 46717).

#3

This comment is for additional information. This was indeed a batch import. One interesting thing we discovered is that the invalid FOXML, which had two datastreams with the ID RIGHTS1, actually passed xmllint parsing. James and I tested further using Oxygen and found that the Fedora schema was not attached to the pre-ingest XML, so the file was checked only for well-formedness and the schema's rule against duplicate IDs was never applied. We were able to reproduce the error in Oxygen only after adding the Fedora xsd. When ingest was attempted, however, Fedora validated the ingest file against its internal FOXML xsd (schema) file, and the ingest failed. We tested this with a command-line ingest on rep-test, and I got this error back:
triggs@rep-devel:/mellon/htdocs/dlr/EDIT> php -f restingest.php xmlfile=TESTOBJECTS/DisplayPopup.php.xml
<?xml version="1.0" encoding="UTF-8"?><management:validation xmlns:management="http://www.fedora.info/definitions/1/0/management/" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.fedora.info/definitions/1/0/management/ http://www.fedora.info/definitions/1/0/validation.xsd" pid="unknown" valid="true">
<management:contentModels>
</management:contentModels>
<management:problems>
<management:problem>DOValidatorXMLSchema returned validation exception.
The underlying exception was a org.xml.sax.SAXException.
The message was "URI=null Line=156: cvc-id.2: There are multiple occurrences of ID value 'RIGHTS1'."</management:problem>
</management:problems>
<management:datastreamProblems>
</management:datastreamProblems>
</management:validation>
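
Since a well-formedness check alone does not catch the duplicate ID, a small pre-ingest check could flag it before the file reaches Fedora. The following is only a sketch, not part of our ingest code: the function name duplicate_ids and the sample document are hypothetical, and it assumes the FOXML namespace is info:fedora/fedora-system:def/foxml# and that only ID attributes on FOXML elements need to be unique.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumption: namespace used by FOXML elements in the pre-ingest file.
FOXML_NS = "info:fedora/fedora-system:def/foxml#"


def duplicate_ids(foxml_text):
    """Return a sorted list of ID values used more than once on FOXML elements."""
    root = ET.fromstring(foxml_text)
    prefix = "{" + FOXML_NS + "}"
    counts = Counter(
        el.attrib["ID"]
        for el in root.iter()
        if isinstance(el.tag, str) and el.tag.startswith(prefix) and "ID" in el.attrib
    )
    return sorted(value for value, n in counts.items() if n > 1)


# Hypothetical minimal document that mimics the two RIGHTS1 datastreams.
SAMPLE = """<foxml:digitalObject xmlns:foxml="info:fedora/fedora-system:def/foxml#" PID="demo:1">
  <foxml:datastream ID="RIGHTS1"/>
  <foxml:datastream ID="RIGHTS1"/>
  <foxml:datastream ID="DC"/>
</foxml:digitalObject>"""

if __name__ == "__main__":
    print(duplicate_ids(SAMPLE))
```

Running something like this over each file generated by the batch import would list any repeated datastream IDs. It does not replace validating against the Fedora xsd, which checks far more than ID uniqueness.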

#4

Sorry, I got the ID numbers wrong; they should be 76713 and 76717. I have attached the spreadsheet that I used for batch importing. Thanks. YL

#5

Assigned to:Anonymous» triggs

#6

Assigned to:triggs» liny4

I'm not sure that there is still a problem with these. Yu-Hung will have to answer, but I thought he'd made it all the way through PCSP by now.

#7

Status:active» closed

Modified the rights metadata, and the problem was resolved.
