Cannot retrieve collection name probably because of diacritic

Project:RUcore dlr/EDIT
Version:8.x
Component:Miscellaneous
Category:bug report
Priority:normal
Assigned:triggs
Status:Moved to JIRA
Description

In dlr/EDIT Collection Management, a search for 'gimenez' returns no results.

Search for 'daniel' and the collection does come up. The name Giménez has an acute accent on the first 'e'.

Comments

#1

I notice that a search for "Giménez" itself does work. The collection search is an ordinary SQL query. It is UTF-8 aware, but it does not preprocess queries or allow defined character flattening, as a search engine like Solr or Sphinx does. I'll look into it though.
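
To illustrate the kind of character flattening mentioned above, here is a minimal Python sketch using only the standard library. It is not the dlr/EDIT code and not what Solr or Sphinx do internally; the function name fold_diacritics is hypothetical. The idea is to decompose each character, drop the combining accent marks, and lower-case the result, so that both the stored name and the query can be reduced to the same form before comparison.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return text with accents removed and case folded (illustrative helper)."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # casefold() gives a case-insensitive form for comparison.
    return stripped.casefold()


# Both spellings reduce to the same key, so a query without the accent
# could be matched against a stored name that has it.
query_key = fold_diacritics("gimenez")
stored_key = fold_diacritics("Giménez")
```

With this approach the folding happens in the application, so it does not depend on the database's character set or collation settings.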

#2

I think there is a way to set the MySQL character set and collation so that this type of search works:
CHARACTER SET utf8 COLLATE utf8_general_ci
It will take some experimentation on the dev and test servers, but it seems definitely doable.
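
As a rough illustration of how a collation can make an unaccented query match an accented value, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. This is an assumption made for the sake of a runnable example: the table name collections, the sample rows, and the collation name accent_ci are all hypothetical, and SQLite's custom collations are not the same mechanism as MySQL's built-in ones. It only shows the principle that the comparison rule, not the stored data, decides whether 'e' and 'é' are treated as equal.

```python
import sqlite3
import unicodedata


def fold(text):
    # Decompose accented letters and drop the combining marks, then case-fold.
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def accent_insensitive(a, b):
    # Collation callback: return -1, 0, or 1 after folding both strings.
    fa, fb = fold(a), fold(b)
    return (fa > fb) - (fa < fb)


conn = sqlite3.connect(":memory:")
# Register the comparison function under a hypothetical collation name.
conn.create_collation("accent_ci", accent_insensitive)
conn.execute("CREATE TABLE collections (name TEXT)")
conn.executemany(
    "INSERT INTO collections VALUES (?)",
    [("Giménez",), ("Daniel",)],
)

# The equality test is evaluated under the accent-insensitive collation.
rows = conn.execute(
    "SELECT name FROM collections WHERE name = ? COLLATE accent_ci",
    ("gimenez",),
).fetchall()
```

Whether a MySQL collation gives the same behavior for the actual collection search would still need the experimentation on the dev and test servers described above.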

#3

Version:7.8» 8.1

#4

Version:8.1» 8.x

I believe we do not have a complete solution for this yet. I experimented with several suggested MySQL approaches, but did not find an equivalent of the latin1 folding in Solr. The MySQL character sets also appear to differ between servers, which makes testing more difficult.

#5

Status:active» Moved to JIRA
