Sort by descending/ascending ID does not properly handle 3-, 4-, and 5-digit IDs

Project:RUcore SOLR Searching and Indexing
Version:7.6
Component:Code
Category:bug report
Priority:normal
Assigned:rjantz
Status:closed
Description

When I do a descending sort on IDs, the results come back in three batches: first the 3-digit IDs, then the 4-digit IDs, and finally the 5-digit IDs. For example, the descending sort returns the IDs 2702, 2699, and then 26914. This appears to be a character sort when it needs to be an integer sort.
Ron Jantz
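
To illustrate the difference between the two sorts, here is a minimal Python sketch using the three IDs quoted above. It is not the RUcore code, only a demonstration of how a character comparison and an integer comparison order the same values.

```python
# The three IDs quoted in the report, held as strings the way a token field would hold them.
ids = ["2702", "2699", "26914"]

# Character sort: strings are compared one character at a time,
# so "2699" ranks above "26914" because '9' > '1' at the fourth character.
as_strings = sorted(ids, reverse=True)

# Integer sort: the values are compared numerically.
as_integers = sorted(ids, key=int, reverse=True)

print(as_strings)   # ['2702', '2699', '26914']
print(as_integers)  # ['26914', '2702', '2699']
```

The first list matches the order described in the report; the second is the order an integer sort would give.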

Comments

#1

This has always been a problem. But I don't know what we can do about it. It's based directly on the Fedora database, whose "tokens" are the Fedora IDs rather than integers. They don't have an integer field we can use for sorting. That's one reason I was keen to do a real most recent sort like the ones we now have in the Solr interface with dateCreated and dateModified.

#2

Oops. I thought you meant the sort in the database search. We might be able to tweak these into integer fields in the Solr interface, though the dateCreated would achieve the same sort even now.

#3

Status:active» test

I've decided to create a new dynamic field, numid_i, that will allow numerical sorting on the FedoraID. It should be ready to test by this afternoon. I'm running the portalcron starting now.
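
As a rough sketch of the idea, the value stored in numid_i could be derived from the Fedora ID by taking the digits after the colon. The helper name `numeric_id` is hypothetical and the code below is in Python for illustration only; the actual indexing code is not shown in this thread.

```python
def numeric_id(pid):
    """Return the integer after the last colon of a PID such as 'rutgers-lib:486'.

    Hypothetical helper: assumes every PID ends in ':<digits>'.
    """
    _, _, suffix = pid.rpartition(":")
    return int(suffix)


pids = ["rutgers-lib:14308", "rutgers-lib:486", "rutgers-lib:693", "rutgers-lib:1376"]

# Sorting on the derived integer gives numeric order regardless of digit count.
ordered = sorted(pids, key=numeric_id)
print(ordered)
```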

#4

Assigned to:triggs» rjantz

#5

The sort still does not appear to be working. For example, select sort by "Ascending ID" and start at 60. This yields the following: rutgers-lib:14308, rutgers-lib:14313, rutgers-lib:486, rutgers-lib:693.

#6

From earlier email:
lynx -source "http://rep-test.libraries.rutgers.edu:8983/solr/select/?q=*:*&version=2.2&start=60&rows=10&indent=on&sort=numid_i%20asc&fl=id,numid_i"

<result name="response" numFound="15176" start="60">
<doc>
<str name="id">rutgers-lib:14308</str>
</doc>
<doc>
<str name="id">rutgers-lib:14313</str>
</doc>
<doc>
<str name="id">rutgers-lib:486</str>
<int name="numid_i">486</int>
</doc>
<doc>
<str name="id">rutgers-lib:693</str>
<int name="numid_i">693</int>
</doc>
<doc>
<str name="id">rutgers-lib:1376</str>
<int name="numid_i">1376</int>
</doc>
<doc>
<str name="id">rutgers-lib:1379</str>
<int name="numid_i">1379</int>
</doc>
<doc>
<str name="id">rutgers-lib:1523</str>
<int name="numid_i">1523</int>
</doc>
<doc>
<str name="id">rutgers-lib:1568</str>
<int name="numid_i">1568</int>
</doc>
<doc>
<str name="id">rutgers-lib:1721</str>
<int name="numid_i">1721</int>
</doc>
<doc>
<str name="id">rutgers-lib:1724</str>
<int name="numid_i">1724</int>
</doc>
</result>

And now I get the following with the same command:

<result name="response" numFound="15176" start="60">
<doc>
<str name="id">rutgers-lib:486</str>
<int name="numid_i">486</int>
</doc>
<doc>
<str name="id">rutgers-lib:693</str>
<int name="numid_i">693</int>
</doc>
<doc>
<str name="id">rutgers-lib:1376</str>
<int name="numid_i">1376</int>
</doc>
<doc>
<str name="id">rutgers-lib:1379</str>
<int name="numid_i">1379</int>
</doc>
<doc>
<str name="id">rutgers-lib:1523</str>
<int name="numid_i">1523</int>
</doc>
<doc>
<str name="id">rutgers-lib:1568</str>
<int name="numid_i">1568</int>
</doc>
<doc>
<str name="id">rutgers-lib:1721</str>
<int name="numid_i">1721</int>
</doc>
<doc>
<str name="id">rutgers-lib:1724</str>
<int name="numid_i">1724</int>
</doc>
<doc>
<str name="id">rutgers-lib:1743</str>
<int name="numid_i">1743</int>
</doc>
<doc>
<str name="id">rutgers-lib:1819</str>
<int name="numid_i">1819</int>
</doc>
</result>
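
In the first response above, the two documents listed ahead of 486 have no numid_i value, which would be consistent with records that had not yet been reindexed. A small Python sketch for spotting such records in a saved response follows. `SAMPLE` is a shortened copy of that response and `missing_numid` is a hypothetical helper, not part of the Solr setup.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the first response in this comment.
SAMPLE = """<result name="response" numFound="15176" start="60">
<doc><str name="id">rutgers-lib:14308</str></doc>
<doc><str name="id">rutgers-lib:14313</str></doc>
<doc><str name="id">rutgers-lib:486</str><int name="numid_i">486</int></doc>
<doc><str name="id">rutgers-lib:693</str><int name="numid_i">693</int></doc>
</result>"""


def missing_numid(result_xml):
    """Return the ids of <doc> elements that carry no numid_i field."""
    root = ET.fromstring(result_xml)
    missing = []
    for doc in root.iter("doc"):
        if doc.find("int[@name='numid_i']") is None:
            missing.append(doc.findtext("str[@name='id']"))
    return missing


print(missing_numid(SAMPLE))  # ['rutgers-lib:14308', 'rutgers-lib:14313']
```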

#7

Version:7.5» 7.6
Assigned to:rjantz» triggs
Status:test» active

Moving to R7.6.

#8

What is still the problem? This seems to be fixed from what I can see.

#9

Assigned to:triggs» rjantz
Status:active» test

The fix was not confirmed by Ron. It's not clear to me how we can test this.

#10

You can test by selecting the asc and desc sorts and checking the rutgers-lib IDs. The first result (asc) should be 486 or something close to it. To test over a wide range, you should use these sorts with search queries that might pull things from a variety of ID ranges.
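
If checking by eye gets tedious, a page of result IDs can be checked with a few lines of Python. This is only a sketch: `is_sorted_by_id` is a hypothetical helper, and the two lists are IDs already quoted in comments #5 and #6.

```python
def is_sorted_by_id(pids, descending=False):
    """Return True if the PIDs are ordered by the integer after the colon."""
    numbers = [int(pid.rpartition(":")[2]) for pid in pids]
    return numbers == sorted(numbers, reverse=descending)


# IDs reported in comment #5.
before = ["rutgers-lib:14308", "rutgers-lib:14313", "rutgers-lib:486", "rutgers-lib:693"]
# First IDs from the later response in comment #6.
after = ["rutgers-lib:486", "rutgers-lib:693", "rutgers-lib:1376", "rutgers-lib:1379"]

print(is_sorted_by_id(before))  # False
print(is_sorted_by_id(after))   # True
```

For the descending sort, pass `descending=True` with the IDs in the order they appear on the page.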

#11

Status:test» fixed

I did a Fedora Solr search and selected "Order by IDs ascending" and "Order by IDs descending". Ascending order displayed rutgers-lib:486 as the first record. I scanned through the first five screens and the records were displayed correctly. Descending order displayed rutgers-lib:203145 as the first record, and the records were displayed correctly in the next few screens.

#12

Status:fixed» closed
