VMC collection datastream labels and ID irregularities

Project:RUcore Jobs & Reports
Component:Job - production

I noticed there are many resources in the VMC collection where the PDF datastreams have identifiers like PDFA1 or PDFA-3, with labels such as "Transcript1" instead of "Transcript", or "Student Work1" for PDFs that are transcripts and not student work at all.

I am manually going through the collection and fixing these by adding new datastreams with PDF-1, PDF-2 style identifiers and correcting the labels as I go.

After this is complete I will speak with Jie about normalizing/updating the statistics database to move the statistics already recorded to the appropriate datastream IDs, so that statistics are not lost.
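
For reference, a rough sketch of the ID mapping and the statistics move described above. This is illustrative only: it assumes the irregular IDs always look like "PDFA" plus an optional hyphen and a number, that the number carries over unchanged to the PDF-N form, and that statistics can be viewed as hit counts keyed by (resource, datastream ID). The names normalize_dsid and remap_counts are hypothetical, and nothing here reflects the actual statistics database schema.

```python
import re
from collections import Counter

# Assumed pattern: "PDFA", an optional hyphen, then a number (e.g. PDFA1, PDFA-3).
_PDFA_ID = re.compile(r"PDFA-?(\d+)")


def normalize_dsid(dsid):
    """Return the PDF-N form of a PDFA-style ID; leave any other ID unchanged."""
    match = _PDFA_ID.fullmatch(dsid)
    if match is None:
        return dsid
    return f"PDF-{int(match.group(1))}"


def remap_counts(counts):
    """Move hit counts keyed by (resource, datastream ID) onto normalized IDs.

    Counts that land on the same normalized ID are summed, so nothing is lost.
    """
    merged = Counter()
    for (resource, dsid), hits in counts.items():
        merged[(resource, normalize_dsid(dsid))] += hits
    return dict(merged)
```

Summing rather than overwriting matters when a resource already has hits recorded under the PDF-N ID from before the fix.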

Some resources ingested as recently as late 2015 exhibit this behavior, so the WMS workflow for this collection should be reviewed to keep it from recurring.



Closing. I normalized all of the PDFA-* datastreams to the corresponding PDF-* datastreams, maintaining the filenames and labels that were in place. Some were found outside of the VMC collection, in China Boom. Those were corrected as well.

