i’m working on a special-purpose installation of Open edX containing several hundred courses, each of which has been re-imported dozens or even hundreds of times over the last 18 months. in some cases this leaves behind enough orphaned data to significantly slow screen rendering in Course Management Studio.
Unsuccessful attempts include the following:
manage.py cms delete_orphans. this finds and removes some ancillary items, but it misses the much larger volume of orphaned documents.
exporting all courses, deleting edxapp.modulestore.definitions and edxapp.modulestore.structures, and then re-importing the courses. there is a minor bug in the course export related to the name of the course run, and i’ve found myself chasing my own tail trying to work around it.
manual “Search & Destroy” from the MongoDB shell. so far, i have been unable to create sound logic that identifies only orphans. that is, my Mongo queries might also delete documents that are not orphans.
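to make the "only orphans" requirement concrete, here is a minimal sketch of the reachability rule i would want such a query to implement. it is not a tested cleanup script. it assumes the split modulestore layout as i understand it: a modulestore.active_versions collection whose documents carry a versions map of branch name to structure _id, structures that carry previous_version and a blocks list, and blocks that carry a definition _id. those field names are assumptions and should be checked against the real data. find_orphans and keep_history are hypothetical names, and the function works on plain lists of dicts so the logic can be checked without touching Mongo.

```python
def find_orphans(active_versions, structures, definitions, keep_history=False):
    """Return (orphan_structure_ids, orphan_definition_ids).

    Assumed document shapes (verify against your own database):
      active_versions: {"versions": {branch_name: structure_id, ...}}
      structures:      {"_id": ..., "previous_version": ..., "blocks": [{"definition": ...}, ...]}
      definitions:     {"_id": ...}
    """
    by_id = {s["_id"]: s for s in structures}

    # Roots: every structure id that some active_versions document points at.
    stack = [sid for av in active_versions
             for sid in av.get("versions", {}).values()]

    live_structures = set()
    while stack:
        sid = stack.pop()
        if sid in live_structures or sid not in by_id:
            continue
        live_structures.add(sid)
        if keep_history:
            # Optionally treat the whole version history as live too.
            prev = by_id[sid].get("previous_version")
            if prev is not None:
                stack.append(prev)

    # A definition is live if any block of a live structure references it.
    live_definitions = {
        block.get("definition")
        for sid in live_structures
        for block in by_id[sid].get("blocks", [])
    }

    orphan_structures = set(by_id) - live_structures
    orphan_definitions = {d["_id"] for d in definitions} - live_definitions
    return orphan_structures, orphan_definitions
```

with keep_history=False, every structure that is not the current head of some branch counts as an orphan, which is aggressive. keep_history=True spares anything in a head's previous_version chain. either way, the resulting id sets would need to be reviewed against a backup before any delete is issued.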
has anyone else needed to perform maintenance of this nature on Mongo? any suggestions?