Unexpected search behavior when searching by title

Project:RUcore SOLR Searching and Indexing
Version:7.2
Component:Code
Category:bug report
Priority:normal
Assigned:triggs
Status:closed
Description

I searched for the following title in dlr/EDIT:
Legislatives '98: Démocratie où es-tu?

Fedora XML search finds this title if I enter "Legislatives". If I enter "Démocratie où es-tu?" or "Démocratie où es-tu", Fedora XML search does not find any matching title.

Fedora database search finds the correct record for the title "Démocratie où es-tu?". But if I click on the title, it brings up a different record. This happened yesterday afternoon on the production server.

Comments

#1

Version:5.1.2» 5.3

#2

Version:5.3» 6-x

#3

Version:6-x» 7-x

#4

I can find "Démocratie où es-tu" in the title with the dlr SOLR search and with the database search, which leads to the correct object. "Legislatives '98: Démocratie où es-tu?" causes Solr problems because of the ":" and the "'". The ":" is the Solr field designator. For the database search, the "?" at the end is throwing off the single page view.

#5

Status:active» test

#6

Assigned to:triggs» ananthan

Assigning this bug to myself to test.

#7

Version:7-x» 7.0

#8

Version:7.0» 7-x

This is on the production system. I will test it when R7.0 is installed on the production system. This affects dlr/EDIT end users only.

#9

Status:test» active

In RUcore:

Search: Legislatives '98: Démocratie où es-tu? or "Démocratie où es-tu?" returns 0 results.

Search: Legislatives '98 brings up this record.

#10

I suspect the ? is throwing off the Solr REST interface used by dlr/EDIT.
http://rep-test.libraries.rutgers.edu:8983/solr/select/?q=D%C3%A9mocratie+o%C3%B9+es-tu
works, but
http://rep-test.libraries.rutgers.edu:8983/solr/select/?q=D%C3%A9mocratie+o%C3%B9+es-tu?
does not. Since this is a back-end, utilitarian interface, I'm not sure it is worth trying to address this as an issue.
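
For reference, a minimal sketch of how the query string in the URLs above could be percent-encoded in PHP, so that a trailing "?" reaches Solr as %3F rather than as a raw character. The function name build_select_url is hypothetical and not part of dlr/EDIT, and the source file is assumed to be saved as UTF-8. This only covers URL encoding; Solr's query parser may still treat a decoded "?" as a single-character wildcard.

```php
<?php
// Hypothetical helper (not from dlr/EDIT): builds a Solr select URL
// with the q parameter percent-encoded.
function build_select_url($base, $query) {
    // http_build_query() encodes spaces as "+" and "?" as "%3F" by default.
    return $base . '?' . http_build_query(array('q' => $query));
}

echo build_select_url(
    'http://rep-test.libraries.rutgers.edu:8983/solr/select/',
    'Démocratie où es-tu?'
), "\n";
```

With the trailing question mark included, the q parameter comes out as D%C3%A9mocratie+o%C3%B9+es-tu%3F.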

#11

Project:RUcore dlr/EDIT» RUcore SOLR Searching and Indexing
Version:7-x» 7.4
Assigned to:ananthan» triggs

http://software.libraries.rutgers.edu/node/2022

#12

This behaves the same way with the RUcore search.

#13

Status:active» test

This should work now. To test, search for "Legislatives '98: Démocratie où es-tu?" in the title field on rep-devel (I've added this string to the title of a test object). The ": " string is converted to " " and the "?" character to " " before running the query.
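
The conversion described above could look roughly like the following sketch. The function name strip_title_punctuation is hypothetical; the actual code is not shown in this thread.

```php
<?php
// Hypothetical helper: replaces ": " and "?" with a single space
// before the title is passed to the query.
function strip_title_punctuation($title) {
    return str_replace(array(': ', '?'), ' ', $title);
}

echo strip_title_punctuation("Legislatives '98: Démocratie où es-tu?"), "\n";
```

Note that replacing a trailing "?" with a space leaves a trailing space in the result, which this sketch does not trim.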

#14

Version:7.4» 7.2
Assigned to:triggs» chadmills

#15

Assigned to:chadmills» triggs
Status:test» active

In the RUcore search, the colon character is escaped before being passed to Solr. It has been that way since Solr was implemented. I think we should apply the same rules for consistency. As for the question mark, I am a bit stumped. A bug is open with Solr that mentions it interprets trailing question marks as wildcards. If that is the case, then escaping question marks will not help.

https://issues.apache.org/jira/browse/SOLR-4457

This is currently what is searched for and replaced in the RUcore search.

$match = array('\\', '+', '-', '&&', '||', '!', '(', ')', '{', '}', '[', ']', '^', ':');
$replace = array('\\\\', '\+', '\-', '\&&', '\||', '\!', '\(', '\)', '\{', '\}', '\[', '\]', '\^', '\:');

#16

Status:active» test

I was able to escape both the : and the ? and get the hit for "Legislatives '98: Démocratie où es-tu?"
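
A sketch of what escaping both characters might look like, assuming the RUcore match/replace lists from #15 are applied with str_replace() and extended with an entry for "?". The function name escape_solr_query and the added "?" entry are assumptions; the change that was actually made is not shown in this thread.

```php
<?php
// Hypothetical helper; the name and the added '?' entry are assumptions,
// not the code that was actually deployed.
function escape_solr_query($query) {
    // Backslash comes first so the backslashes added by later entries
    // are not escaped a second time (str_replace applies pairs in order).
    $match   = array('\\', '+', '-', '&&', '||', '!', '(', ')', '{', '}', '[', ']', '^', ':', '?');
    $replace = array('\\\\', '\+', '\-', '\&&', '\||', '\!', '\(', '\)', '\{', '\}', '\[', '\]', '\^', '\:', '\?');
    return str_replace($match, $replace, $query);
}

echo escape_solr_query("Legislatives '98: Démocratie où es-tu?"), "\n";
```

For the test title, this yields Legislatives '98\: Démocratie où es\-tu\? (the hyphen is escaped as well, because "-" is already in the existing list).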

#17

Status:test» fixed

#18

Status:fixed» closed

Seems to be fixed.
