Google Scholar URLs still use mss3.libraries

Project:RUcore dlr/EDIT
Version:7.3.2
Component:Code
Category:bug report
Priority:normal
Assigned:triggs
Status:closed
Description

Looking at Google Scholar, the advertised URL for accessing records and PDFs is mss3. While redirection is working as expected, these mss3.libraries addresses need to be changed to rucore.libraries.

Perhaps using the lastmod or changefreq elements in the sitemap specification will help this.

http://www.sitemaps.org/protocol.html

Also, there is a reference to mss3 in a comment line at the top of the Google Scholar sitemap. This likely has no effect but needs to be corrected nonetheless.

Comments

#1

I don't think the inclusion of the old $serverbase in an XML comment would affect anything, but I changed it to $rucorebase just in case. As for lastmod and changefreq, these elements are designed for files that change frequently and so need to be reindexed. The lastmod is supposed to be the modification time of the file, and changefreq is an indication of how frequently the file is expected to change. Our files, of course, are not really meant to change at all. We could have lastmod be the date the sitemap cron runs and changefreq be the length of time between crons, though I'd be a bit worried that Google would penalize us for trying to trick them. Maybe we could use a lastmod for each release date (or something near it). Then they might at least find something different in the objects.
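
The per-release lastmod idea above could be sketched roughly as follows. This is a minimal Python illustration, not the actual RUcore sitemap cron (which is not shown in this ticket). The function name build_sitemap is hypothetical, and the date used in the usage note is only an example value.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, release_date):
    """Return sitemap XML where every <url> carries the same <lastmod>.

    Hypothetical helper: release_date stands in for "the date of the
    release", as suggested in the comment above, rather than the date
    the cron job happens to run.
    """
    # Emit the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc in urls:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # The protocol accepts a plain YYYY-MM-DD date for lastmod.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = release_date.isoformat()
    return ET.tostring(urlset, encoding="unicode")
```

For example, build_sitemap(["https://rucore-test.libraries.rutgers.edu/rutgers-lib/1743/"], date(2014, 4, 24)) yields one url entry whose lastmod is 2014-04-24. Whether Google treats such a value as a useful signal is not something this sketch can establish.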

#2

Status:active» test

The rucorebase is now used. We'll open a new "investigation", not tied to 7.3, into how to get Google to replace the files in their cache. I'll reach out to Anurag again to see if he has any ideas.

#3

How would this be tested then?

#4

Any word from Google/Anurag?

#5

I haven't written to him yet. I was waiting till the release took effect so that he could test the live sitemap. I'll try writing him today.

#6

I heard from Anurag. The problem is that I was giving a normal 302 redirect rather than specifying a 301 permanent redirect. I've added
header("HTTP/1.1 301 Moved Permanently");
to the following files on rep-devel:
disseminators/
flvplayer.php outputdjvu.php outputsmil.php
dlr/
showfed.php outputthumb.php output.php outputds.php

To test these, type one of the old style URLs, e.g.
http://rep-test.libraries.rutgers.edu/dlr/showfed.php?pid=rutgers-lib:1743
or
http://rep-test.libraries.rutgers.edu/dlr/outputds.php?pid=rutgers-lib:1743&ds=PDF-1
and make sure it still redirects. You can test the headers on a site like web-sniffer.net where you get:
HTTP Response Header
Name Value Delim
Status: HTTP/1.1 301 Moved Permanently
Date: Thu, 10 Apr 2014 18:55:07 GMT
Server: Apache/2.2.21 (Unix) mod_ssl/2.2.21 OpenSSL/0.9.8h PHP/5.4.14
X-Powered-By: PHP/5.4.14
Location: https://rucore-test.libraries.rutgers.edu/rutgers-lib/1743/
Content-Length: 0
Connection: close
Content-Type: text/html

I suggest a small drop-in tar replacement for the seven files listed above that could be called dlr7.3.2 and disseminators7.3.2.
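
The header check described above can also be scripted instead of using a web sniffer site. The following is a minimal Python sketch under stated assumptions: the helper names fetch_redirect and is_permanent_redirect are hypothetical, and it relies only on the standard library's http.client, which does not follow redirects on its own.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(url, timeout=10):
    """Request url without following redirects.

    Returns (status_code, location_header_or_None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # Rebuild the request target, keeping the query string (pid, ds).
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect(status, location, expected_host):
    """True only for a 301 whose Location points at expected_host."""
    if status != 301 or not location:
        return False
    return urlsplit(location).hostname == expected_host
```

To use it, pass one of the old-style URLs above to fetch_redirect and hand the result to is_permanent_redirect along with the host you expect in the Location header (rucore-test.libraries.rutgers.edu in the sample response above). A 302 response returns False, which is the situation this ticket was fixing.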

#7

Status:test» fixed

Looks to be fixed now.

#8

Version:7.3» 7.3.2
Status:fixed» closed

Put in production on Apr 24th, 2014.
