Searching on title with apostrophe

Project:RUcore SOLR Searching and Indexing
Version:7.7
Component:Code
Category:bug report
Priority:normal
Assigned:chadmills
Status:closed
Description

On production the following record has a link to search by the uniform title. The title has an apostrophe in it.

https://rucore.libraries.rutgers.edu/rutgers-lib/31515/

The search string formed is:

https://rucore.libraries.rutgers.edu/search/?q1=CNRRA%27s+internal+commu...

No results are found. Looking at the Solr index, the apostrophe has been converted to an &apos; entity. I think this is why no results are found.
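To illustrate the suspected mismatch, here is a minimal Python sketch. It only shows that the %27 in the search URL decodes to a literal apostrophe, which is not the same string as a value holding the &apos; entity. The shortened title and the plain prefix comparison are simplifications made for illustration; the actual Solr analysis chain is not described in this report.

```python
from urllib.parse import unquote_plus

# Start of the query value as it appears in the search URL
# (the full URL is truncated in this report, so only the start is used).
query_fragment = "CNRRA%27s+internal"

# Start of the indexed title, with the apostrophe held as an entity.
indexed_prefix = "CNRRA&apos;s internal"

# %27 decodes to ' and + decodes to a space.
decoded = unquote_plus(query_fragment)
print(decoded)

# The decoded query text does not match the entity form of the title.
print(indexed_prefix.startswith(decoded))
```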

Comments

#1

Here is the entry in the solr index on production for this resource.

<field name="title">CNRRA&apos;s internal communication regarding the transfer of orphans in the Kuling (Guling) Orphanage in Kiangsi (Jiangxi), with two related correspondences attached
</field>

<field name="uniformtitle">CNRRA&apos;s internal communication regarding the transfer of orphans in the Kuling (Guling) Orphanage in Kiangsi (Jiangxi), with two related correspondences attached
</field>

#2

Assigned to:triggs» chadmills
Status:active» test

This is related to that other issue. I've been converting &apos; to "" up till now, so that "CNRRAs" would hit. On devel and test, I've tried changing &apos; to "'". On rep-test you can now find "Jane's Flying Metadata" with the apostrophe in the search string in both dlr/EDIT and rucore:

You searched: Jane's Flying Metadata in Title
1 - 1 of 1
Select All Items | Edit My Search | New Search
1
Title:Jane's Flying Metadata
Author:Otto, Jane Johnson
Date Created:2022

#3

Note: in the debug solr output, it still looks like &apos;. The change is made just before the text is actually posted to Solr.
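As a rough illustration of a substitution made at the last moment before posting, here is a Python sketch. The function name prepare_for_solr and the plain string replacement are hypothetical; the real indexing code is not shown in this thread and may work differently.

```python
def prepare_for_solr(text: str) -> str:
    """Hypothetical final cleanup step, run just before the text is posted."""
    # Turn the entity back into a real apostrophe so that a search
    # containing ' can match the indexed value.
    return text.replace("&apos;", "'")


# What the debug output would still show, since it is captured earlier.
debug_value = "Jane&apos;s Flying Metadata"

# What would actually be sent to Solr after the final step.
posted_value = prepare_for_solr(debug_value)
print(posted_value)
```

Because the replacement happens after the debug output is produced, the two values differ even though they describe the same record.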

#4

Assigned to:chadmills» triggs
Status:test» active

I see the example on test you cited and it does seem to work, but I have two questions.

If apostrophes are being converted to blank entries why are they in the index on production?

Are there any other conversions like this, apostrophes to blank spaces, you are doing? We need to get on the same page so that when I send in a query I am following the same basic replacement/conversion rules.

#5

Assigned to:triggs» chadmills

They have been nulled in the same process that I now use to convert them to ' chars, that is, at the last moment before posting to Solr, so they appear in the debug output as &amp;apos;. (I think doing it like this had something to do with passing these reliably through XSLT.) I only reindexed a couple of objects for my test (Jane's), so the change isn't all over test yet, and won't be until we do a full reindex.

#6

Oh yes. This was modeled on a conversion of &amp;quot; to null. I probably should have done "'" when I first dealt with &apos; rather than follow the rule for &quot;
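For anyone forming queries against the index, the old and new rules can be compared side by side. This is a Python sketch under stated assumptions: the rule tables and the apply_rules helper are hypothetical names, the entities are written in their single-escaped form, and it assumes the quote rule stays as it was.

```python
# Entity -> replacement. Both tables are illustrative, not the real code.
OLD_RULES = {"&apos;": "", "&quot;": ""}    # apostrophes and quotes dropped
NEW_RULES = {"&apos;": "'", "&quot;": ""}   # assumption: quote rule unchanged


def apply_rules(text: str, rules: dict) -> str:
    """Apply each entity replacement in turn (hypothetical helper)."""
    for entity, replacement in rules.items():
        text = text.replace(entity, replacement)
    return text


title = "CNRRA&apos;s internal communication"

old_indexed = apply_rules(title, OLD_RULES)  # form that "CNRRAs" would hit
new_indexed = apply_rules(title, NEW_RULES)  # form that "CNRRA's" would hit
print(old_indexed)
print(new_indexed)
```

Under the old rule a query would need its apostrophes stripped to match; under the new rule the apostrophe can be sent as typed.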

#7

Status:active» test

#8

Status:test» fixed

#9

Status:fixed» closed
