dave
(Dave Ormsbee)
September 14, 2022, 3:35pm
1
The first PR for the most recent set of work to remove Old Mongo from the system merged today:
openedx:master
← raccoongang:sagirov/EDXOLDMNG-62
opened 02:14PM - 01 Sep 22 UTC
## Description
This removes user-facing Studio edit support for Old Mongo courses (courses that have a CourseKey of the format {org}/{course}/{run}). This does not affect our normal courses, which have CourseKeys starting with "course-v1:". After this commit:
- Old Mongo courses will continue to appear on the Studio course listing page, but are not clickable.
- Any attempt to directly access an Old Mongo course in Studio via URL will fail with a 404 error.
- Course certificates will still be available for Old Mongo courses.
- Old Mongo courses will continue to be returned by CourseOverviews and get_course_summaries() calls.
We decided against removing Old Mongo courses from the listing entirely because that would require a very expensive CourseOverviews query to filter them out. Making that query more efficient would involve a database migration to add appropriate indexing, which is something else that we are looking to avoid. In general, we want to avoid changing how CourseOverviews work, in order to minimize risk.
This is part of the Old Mongo Modulestore deprecation effort: https://github.com/openedx/public-engineering/issues/62
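As a rough illustration of the distinction the PR description draws, the sketch below classifies course key strings by format and mimics the "404 for Old Mongo in Studio" behavior. The helper names (`is_old_mongo_key`, `studio_status_for`) and the regular expression are assumptions made for this example; they are not taken from edx-platform, which handles keys through the opaque-keys library.

```python
import re

# Hypothetical helpers for illustration only; not edx-platform code.
NEW_STYLE_PREFIX = "course-v1:"
# Three non-empty segments separated by slashes: {org}/{course}/{run}
OLD_STYLE_PATTERN = re.compile(r"^[^/+:]+/[^/+:]+/[^/+:]+$")


def is_old_mongo_key(course_key: str) -> bool:
    """Return True if the string looks like an Old Mongo course key."""
    if course_key.startswith(NEW_STYLE_PREFIX):
        return False
    return bool(OLD_STYLE_PATTERN.match(course_key))


def studio_status_for(course_key: str) -> int:
    """Mimic the described Studio behavior: 404 for Old Mongo, 200 otherwise."""
    return 404 if is_old_mongo_key(course_key) else 200
```

With these assumptions, `studio_status_for("edX/Demo/2012")` yields 404 while `studio_status_for("course-v1:edX+Demo+2012")` yields 200.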
For context on this (long running) DEPR effort:
opened 07:24PM - 10 Mar 22 UTC
DEPR
The DraftModuleStore (also sometimes referred to as "Old Mongo") is the interface used to store courseware content for courses with course run keys of the format "Org/Course/Run", e.g. "edX/Demo/2012". Newer courses of the format "course-v1:Org+Course+Run" use the DraftVersioningModuleStore (also commonly called "Split Mongo"). This has been available since the [Birch release](https://edx.readthedocs.io/projects/open-edx-release-notes/en/latest/birch.html#split-mongo-modulestore) and the vast majority of sites have never used DraftModuleStore. One of the main motivations for the newer modulestore was atomic publishes, as it was possible to get partially applied updates if there was an error during the import process with DraftModuleStore.
Because DraftModuleStore has a very different data structure from DraftVersioningModuleStore, supporting both formats simultaneously has resulted in significant complexity, bugs, and performance issues going as far back as the initial implementation of cohorts. There are also thousands of extra tests run specifically to ensure compatibility across ModuleStores.
Some sites intentionally continued to use DraftModuleStore because of storage-related concerns. DraftVersioningModuleStore did not free up disk space used by old versions of the content. However, this problem has been addressed with tubular's [structures.py](https://github.com/edx/tubular/blob/master/tubular/scripts/structures.py) script, which can be run on a regular basis to prune unused old versions of course content.
## Proposal
* Juniper will display a message in Studio for all courses using DraftModuleStore, saying that this course format will no longer be supported, and urging people to create a re-run of the course (which will make a copy of that course in the DraftVersioningModuleStore).
* Koa will remove all support for DraftModuleStore. This will involve removing or modifying thousands of tests, as well as removing the MixedModulestore proxy class.
* Course Overviews using the old course format should still be supported. Old-style courses should not suddenly disappear from your list of enrollments, but any attempt to access courseware content within them (learning sequences, files and uploads) will fail.
* For the Koa release, there will be no Studio access at all for DraftModuleStore, and it will not be possible to do a data export from Studio for these courses.
Note: In previous conversations, I had discussed the possibility of having a data migration that would convert DraftModuleStore course content into DraftVersioningModuleStore courses while preserving IDs. I created a [proof of concept](https://github.com/edx/edx-platform/pull/17393) for this approach as a hackathon project. This has the upside of letting us get rid of a chunk of the old code without giving up compatibility, but it also had a number of strong drawbacks, including:
1. We would have to maintain a large set of tests that used both ID formats for course keys.
2. It would subtly change opaque keys such that two different keys would serialize identically–we would be adding course-run and version information to "i4x:..." style keys, but for data compatibility reasons we would have to serialize without that information. This has implications for course key caching and would make debugging much trickier–the newer modulestore itself derives its own keys to pass around, and we’d end up in a spaghetti of keys which sometimes have or don’t have run and version information.
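The second drawback can be sketched with a toy key class. This is a simplified model, not the opaque-keys implementation: the `LegacyUsageKey` class, its fields, and the exact `i4x://org/course/category/name` layout are assumptions chosen to show how two distinct keys could serialize identically once run and version information is carried in memory but dropped on serialization.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LegacyUsageKey:
    """Toy stand-in for an "i4x:..." style key (hypothetical, for illustration)."""
    org: str
    course: str
    category: str
    name: str
    run: Optional[str] = None      # extra info the migration would have added
    version: Optional[str] = None  # extra info the migration would have added

    def serialize(self) -> str:
        # For data compatibility, run and version are deliberately omitted.
        return f"i4x://{self.org}/{self.course}/{self.category}/{self.name}"


plain = LegacyUsageKey("edX", "Demo", "problem", "p1")
enriched = LegacyUsageKey("edX", "Demo", "problem", "p1", run="2012", version="abc123")

# The objects differ, yet their serialized forms collide.
keys_differ = plain != enriched
serialized_match = plain.serialize() == enriched.serialize()

# A cache keyed on the serialized string cannot tell them apart.
cache = {plain.serialize(): "plain"}
cache[enriched.serialize()] = "enriched"
```

In this model the cache ends up with a single entry, which is the kind of ambiguity that would complicate course key caching and debugging.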
Because usage of the old modulestore outside of edX itself is limited and we did not want to introduce any more complexity to what is already a major source of bugs in edx-platform, this DEPR is going the simpler route of removing support altogether.
## Compatibility notes
1. We won't explicitly delete the old course content. If an Open edX site upgrades to Koa without realizing the implications and rolls back to Juniper, their DraftModuleStore content should still be there.
2. We won't delete any student course state for these courses, so module state in courseware_studentmodule will be preserved.
3. The relevant key types (SlashSeparatedCourseKeys and Locations) will _not_ be removed from opaque-keys.
4. Our goal would be to preserve the functionality of other pages that are not directly courseware, such as the student dashboard. However, some functionality is so dependent on the modulestore’s existence that it will likely be disabled for these courses rather than ported to work without a backing modulestore. While more investigation needs to be done, it is likely that most if not all of the Instructor Dashboard for old-style courses will be unavailable in Koa.
## Additional Info
Original Jira Issue: https://openedx.atlassian.net/browse/DEPR-58
## Useful comments
From Mike Terry
> Small update on the slow shuttering of access to these courses. I’ve landed a couple fixes that will slow down incoming enrollments:
> * Tests default to split store, instead of Old Mongo.
> * Old Mongo courses are [marked as hidden](https://github.com/openedx/edx-platform/pull/29945) (they no longer show up in prospectus searches, but they do have a page there still that ends up in search engines)
> * Old Mongo courses are [invitation only](https://github.com/openedx/edx-platform/pull/29987)
> * Shortly, I’m going to also mark Old Mongo courses [as non-marketable](https://github.com/openedx/course-discovery/pull/3309) so that prospectus won’t even generate a page for them.
>
> And after enrollment stops, at some point we’ll [turn off all access entirely](https://github.com/openedx/edx-platform/pull/29848) to learners.
## Approach
* Remove user access
* Incrementally remove functionality, to reduce rollout risk
In order to preserve CourseOverview metadata and further reduce risk, the end state is not a complete removal of Old Mongo, but instead a really limited version that only implements `has_course` and the ability to read from the root CourseBlock.
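One way to picture that end state is a store that answers `has_course`, returns the root course block, and refuses everything else. The sketch below is an assumption-laden illustration: `LimitedOldMongoStore`, `RootCourseBlock`, `get_course`, and `get_items` as written here are hypothetical names for this example and do not reflect the actual edx-platform modulestore classes or signatures.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RootCourseBlock:
    """Hypothetical stand-in for the metadata held on a root course block."""
    display_name: str


class LimitedOldMongoStore:
    """Illustrative read-only store: existence checks and root block reads only."""

    def __init__(self, roots: Dict[str, RootCourseBlock]):
        # Maps course key strings such as "edX/Demo/2012" to their root blocks.
        self._roots = dict(roots)

    def has_course(self, course_key: str) -> bool:
        return course_key in self._roots

    def get_course(self, course_key: str) -> RootCourseBlock:
        # Only the root block is readable; enough to build overview metadata.
        return self._roots[course_key]

    def get_items(self, course_key: str):
        # Everything beyond the root block is intentionally unsupported.
        raise NotImplementedError("Old Mongo courseware content is no longer served")
```

Keeping the surface this small is what would let overview metadata continue to be built from the root block while all other content access fails fast.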
## Implementation Tickets
- [ ] https://github.com/openedx/public-engineering/issues/75
- [ ] https://github.com/openedx/public-engineering/issues/76
- [ ] https://github.com/openedx/public-engineering/issues/77
- [ ] https://github.com/openedx/public-engineering/issues/78
- [ ] https://github.com/openedx/public-engineering/issues/79
- [ ] https://github.com/openedx/public-engineering/issues/80
This DEPR was first messaged on these forums in late 2019.
For day-to-day updates of what’s going on, you can join the #tcril-raccoon-gang-old-mongo channel in Slack.
sambapete
(Pierre Mailhot)
September 14, 2022, 3:40pm
2
@dave Like you once told me in a BTR or Contributors meeting a few months ago, we at EDUlib are probably one of the older sites (maybe @pdpinch and MIT too) that might still have some “Old Mongo” remnants. We did remove most, if not all, of the courses way back in 2019 or early 2020 when you first mentioned the “Old Mongo” deprecation. It feels like years ago now…
dave
(Dave Ormsbee)
September 14, 2022, 7:35pm
3
Hey @sambapete! Yeah, I talked to @pdpinch a long time ago about this, and they haven’t run Old Mongo courses on their own sites in years (if they ever did?). Stanford also ran these courses, but that instance wound down when Stanford moved their courses to edX.
MIT definitely ran Old Mongo courses on edx.org, but there was a lot of communication to those course teams over the years to get ready for this (and some really terrifying spreadsheets–thousands of these courses were created between the two production environments edX runs, and all of them had to be tracked down and understood). Actual access to the LMS courseware for Old Mongo courses was shut down before Nutmeg was cut.
But no matter how long the runway is, there’s always going to be something. We’re currently investigating cases where courses on other instances of Open edX link directly to static assets in Old Mongo courses running on edx.org, meaning that when it gets shut off, we’ll be breaking the course experience for someone somewhere. If the list isn’t too long, I’ll try to get in touch with the relevant site owners.
Thank you very much for retiring your courses when you did. You are one of my heroes.