ETD ingest failed on two records - Staging server

Project:RUcore Workflow Management System (WMS)
Version:8.x
Component:Fedora Ingest
Category:bug report
Priority:normal
Assigned:dhoover
Status:Moved to JIRA
Description

Two ETDs failed on ingest in the Graduate School - New Brunswick collection:

Test? of €TD with Σρecial Charact≤rs in Title (51360)
ETD with Special Character in Abstract (51361)

Empty FID returned from ingest script - check for error.

REST ingest error: Error ingesting rutgers-lib:201383 - url=http://127.0.0.1:8080/fedora/objects/rutgers-lib:201383?label=&format=info%3Afedora%2Ffedora-system%3AFOXML-1.1&encoding=UTF-8&namespace=rutgers-lib&ownerID=fedoraAdmin&logMessage= content_type= http_code=500 header_size=236 request_size=341 filetime=-1 ssl_verify_result=0 redirect_count=0 total_time=1.039957 namelookup_time=6.7E-5 connect_time=0.000111 pretransfer_time=0.000111 size_upload=30370 size_download=0 speed_download=0 speed_upload=29203 download_content_length=0 upload_content_length=30370 starttransfer_time=0.000404 redirect_time=0 certinfo=Array primary_ip=127.0.0.1 redirect_url= code=11
Removing DOI registration ... OK.
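
The error line above is a flat dump of key=value transfer fields, and the one that matters is http_code=500 (the server rejected the upload rather than the connection failing). A minimal sketch of pulling those fields into a dictionary follows; the helper name parse_transfer_info is hypothetical, and the sample string is an abridged copy of the line above, not the full log entry.

```python
import re

def parse_transfer_info(line):
    """Split a 'key=value key=value' dump into a dict; empty values become ''."""
    # Keys are word characters; a value runs up to the next whitespace,
    # so the url value keeps its own query string intact.
    return dict(re.findall(r"(\w+)=(\S*)", line))

# Abridged sample taken from the error line in this report.
sample = (
    "Error ingesting rutgers-lib:201383 - "
    "url=http://127.0.0.1:8080/fedora/objects/rutgers-lib:201383?label=&encoding=UTF-8 "
    "content_type= http_code=500 size_upload=30370 size_download=0 code=11"
)

info = parse_transfer_info(sample)
print(info["http_code"], info["size_upload"], info["size_download"])
```

Reading the fields this way makes it easy to see that the full 30370-byte FOXML upload was sent and nothing came back, which points at the server side.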

Comments

#1

Yang,

Attached are the full fedora log entries for rutgers-lib:201383
and rutgers-lib:201384.

It looks like it may be a MySQL table update problem.

Look in the attached log for entries like:

org.fcrepo.server.errors.StorageDeviceException: Error attempting
FieldSearch update of rutgers-lib:201383
org.fcrepo.server.errors.StorageDeviceException: Error attempting
FieldSearch update of rutgers-lib:201383

and

Caused by: java.sql.SQLException: Incorrect string value:
'\xCF\x83\xCF\x81ec..
.' for column 'dcTitle' at row 1
Caused by: java.sql.SQLException: Incorrect string value:
'\xCF\x83\xCF\x81ec..
.' for column 'dcTitle' at row 1
Caused by: java.sql.SQLException: Incorrect string value: '\xC9\x9Bra
\xC3...'
for column 'dcDescription' at row 1
Caused by: java.sql.SQLException: Incorrect string value: '\xC9\x9Bra
\xC3...'
for column 'dcDescription' at row 1

Dave
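
The escaped bytes quoted in these SQL errors are valid UTF-8 sequences for characters outside the Latin-1 range. The sketch below decodes the byte prefixes copied from the log and checks whether they fit a single-byte Latin-1 encoding. Python's latin-1 codec is used only as a stand-in for MySQL's latin1 charset (the two are not identical), and the helper name fits_latin1 is hypothetical.

```python
# Byte prefixes copied from the 'Incorrect string value' messages above.
title_bytes = b"\xCF\x83\xCF\x81ec"   # from the dcTitle error
abstract_bytes = b"\xC9\x9B"          # first character from the dcDescription error

title_text = title_bytes.decode("utf-8")
abstract_text = abstract_bytes.decode("utf-8")

def fits_latin1(text):
    """Return True if every character can be encoded as ISO-8859-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False

print(title_text, fits_latin1(title_text))
print(abstract_text, fits_latin1(abstract_text))
print("Special", fits_latin1("Special"))
```

Both decoded strings fail the Latin-1 check while plain ASCII text passes, which is consistent with only the special-character ETDs failing.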

#2

Assigned to:yuyang» dhoover

First, I should note that Peter reported the problem on rep-staging. I have just tested the same record Peter was having a problem with on rep-dev, rep-test, and rep-staging: import from ETD - New Brunswick Graduate School, program = ETD with Special Character's in Title, then ingest it into Fedora. On rep-dev and rep-test there were no problems, but on rep-staging I saw the same problem Peter reported. Since it is the same ETD record, Fedora on rep-staging appears to behave differently from rep-dev and rep-test. -YY

#3

Version:7.7» 8.1
Assigned to:dhoover» martyb
Status:active» test

Marty,

I am assigning this to you to test on Staging. I found this as an active entry in the R7.7.1 release.

#4

Version:8.1» 8.x

This issue is being moved to a later release for the reason described below. The problem doesn't exist on the test server but does occur on staging. The production server has utf8 as does the test server, so we'll assume that the code will work on production as it did for the last release.
A related issue is: https://software.libraries.rutgers.edu/node/3511
----------------------------------------------------------
On 5-31, Dave said to Yang:
If you remember, the charset for the fedora database is different:

rep-staging | character_set_database | latin1 |

rep-test (and all other) | character_set_database | utf8 |

I had thought that the install of fedora 3.8.1 was going to require a rebuild
of the database and I was going to try to change the charset at that time but
a rebuild was not required. While I knew we should try the rebuild at some point, I
got bogged down with the other fedora 3.8.1 XACML policy issues and never got
back to it. So we still have a database with the wrong charset.

We can continue on in this fashion as we did last time, or we will need to look
into how to fix the charset problem.
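
The charset comparison above can be repeated on any server with MySQL's SHOW VARIABLES statement. A minimal sketch follows, assuming a DB-API 2.0 cursor from whichever MySQL driver is installed; the function names database_charset and servers_with_mismatch are hypothetical, and the server values in the usage line are the ones reported in this thread.

```python
def database_charset(cursor):
    """Return the value of character_set_database, or None if not reported."""
    cursor.execute("SHOW VARIABLES LIKE 'character_set_database'")
    row = cursor.fetchone()  # expected shape: (variable_name, value)
    return row[1] if row else None

def servers_with_mismatch(charsets, expected="utf8"):
    """Given {server: charset}, list the servers whose charset is not `expected`."""
    return sorted(name for name, cs in charsets.items() if cs != expected)

# Values as reported in this thread; no database connection is made here.
print(servers_with_mismatch({"rep-staging": "latin1", "rep-test": "utf8"}))
```

Note that character_set_database reflects the default database of the current connection, so the cursor should come from a connection opened against the Fedora database.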
---------------------------------------------------------
On June 1, Yang said to Aletia:
The issue is due to the rep-staging MySQL database using a different charset. As Dave indicated, this issue was discovered during testing for the last release, and at the time we chose to ignore the rep-staging result (for this issue only) and trust our rep-test result. We planned to convert the charset to utf8 but never got a chance to do it before this release. So for this release, we have to decide whether to use the same approach as last time and ignore the diacritic ingest issue, or convert staging to utf8 first and then finish testing. Be aware that converting a database charset is neither an easy nor a short task.
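
For a rough sense of what the conversion would involve, the sketch below only generates the MySQL statements that change the default charset of a database and convert its tables. It is an illustration under assumptions, not a tested migration: the function name and the database and table names in the usage line are placeholders (the thread does not name them), and it does not cover backing up, re-checking existing data, or connection settings.

```python
def charset_conversion_statements(database, tables, charset="utf8"):
    """Build ALTER statements to move a database and its tables to `charset`."""
    # Change the default used for tables created later.
    statements = [f"ALTER DATABASE `{database}` CHARACTER SET {charset};"]
    # Convert each existing table, including its text columns.
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset};"
        )
    return statements

# Placeholder names; substitute the real Fedora database and table names.
for statement in charset_conversion_statements("example_db", ["example_table"]):
    print(statement)
```

Generating the statements first, rather than running them directly, allows them to be reviewed and tried against a copy of the staging database.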
--------------------------------------------------------
and followed up with:
Do not test for "ingest a record with diacritics". As I explained in my previous emails, rep-staging MySQL is not configured to handle fedora ingest for records with diacritics. We have tested the scenario on rep-test, and we know it works. No need to spend time fighting with staging.
------------------------------------------------------------

#5

Assigned to:martyb» dhoover
Status:test» active

Assigning to Dave to be consistent with the related issue. (software.libraries.rutgers.edu/node/3511)

#6

Status:active» Moved to JIRA
