Is deleting courses actually supported?

From time to time, folks ask how to delete a course, because there’s no way to do it in the UI (in Studio or in the LMS).

The answer is usually to use the delete_course management command. There is also a web-based UI for deleting courses in the sysadmin django plugin from my team at MIT.

However, from time to time we see errors that arise from having deleted courses. We have one right now: a user who took a proctored exam in a course that was later deleted gets a 500 error on their dashboard. They can still access their other courses, but they have no way to see a list of all the courses they are enrolled in. The error seems to originate from a missing CourseOverview.

We can work around this bug by restoring the course that was deleted. Or we can submit a PR that gracefully handles the error.

But either way, it leads me to wonder if deleting a course is considered a normal, supported behavior. I think it should be.

TL;DR: I think the answer is going to be “yes” for content (but not other metadata) associated with a course, after work has been done to support the removal of the Old Mongo ModuleStore.

It’s not the case today. Course deletion is almost an afterthought, and can’t always account for the myriad of ways in which our other data models reference courses. Course lifecycle is not really designed for or tested against during feature development (it’s almost a chicken-and-egg problem in that sense).

I agree that it probably should be. The tricky part is figuring out what should happen to all the data associated with the course and developing a framework to categorize that data. Depending on the system, we might want to delete the data (bookmarks?), leave it in place (XBlock user state), or preserve it (certificates).
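To make the categorization idea a bit more concrete, here is a minimal sketch of what such a framework could look like. Everything in it is hypothetical (the `DeletionPolicy` enum, the `POLICIES` registry, and `plan_course_deletion` do not exist in edx-platform); it only illustrates the three buckets mentioned above and the idea of flagging data that nobody has categorized yet.

```python
from enum import Enum


class DeletionPolicy(Enum):
    """Hypothetical buckets for data that references a deleted course."""
    DELETE = "delete"      # remove along with the course (e.g. bookmarks)
    LEAVE = "leave"        # leave in place, untouched (e.g. XBlock user state)
    PRESERVE = "preserve"  # actively keep and protect (e.g. certificates)


# Hypothetical registry: each app declares what should happen to its data.
POLICIES = {
    "bookmarks": DeletionPolicy.DELETE,
    "xblock_user_state": DeletionPolicy.LEAVE,
    "certificates": DeletionPolicy.PRESERVE,
}


def plan_course_deletion(data_kinds):
    """Group data kinds by policy; uncategorized kinds are flagged, not guessed."""
    plan = {policy: [] for policy in DeletionPolicy}
    unknown = []
    for kind in data_kinds:
        policy = POLICIES.get(kind)
        if policy is None:
            unknown.append(kind)
        else:
            plan[policy].append(kind)
    return plan, unknown
```

The useful property here is the `unknown` list: a deletion could refuse to proceed until every system that references the course has declared a policy.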

We also have to think about the failure/mistake/recovery edge cases. Is there an archiving process? What happens if we accidentally delete the wrong course and we want to restore it? What should we do if we try to reuse the same CourseKey for an entirely different set of course content? Etc.

This issue is going to come up with the Old Mongo ModuleStore removal, which will have the effect of deleting a number of courses on edx.org. Note for others browsing this thread: Old Mongo has been deprecated since Birch, so the vast majority of Open edX installs have never used it–the details are in the DEPR ticket.

I believe that the most practical thing we can do in the short-to-medium term is to promote CourseOverview (or something like it) to be a source of truth for catalog-level information within the LMS. In the past, CourseOverview has served as a materialized cache for ModuleStore data, where the ModuleStore always represented truth and the CourseOverview was constantly regenerated when it fell out of sync. The shift here would be to make ModuleStore an input to CourseOverview data, but be in a place where CourseOverviews can live independently as well. So even if we delete the ModuleStore content for a course, it will still have a ModuleStore-agnostic “shell” that other LMS systems like Grades can point to.
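As a rough illustration of that shift, the sketch below uses plain dictionaries as stand-ins for the ModuleStore and the CourseOverview table. The `OverviewShell` class, its `has_content` flag, and `sync_from_modulestore` are assumptions made up for this example, not the real model or API. The point is only that a sync treats content as an input and never deletes the catalog row when the content is gone.

```python
from dataclasses import dataclass


@dataclass
class OverviewShell:
    """Hypothetical catalog-level record that can outlive its course content."""
    course_key: str
    display_name: str
    has_content: bool = True  # assumed flag; no such field exists today


# Plain dicts standing in for the ModuleStore and the overview table.
modulestore = {}
overviews = {}


def sync_from_modulestore(course_key):
    """Refresh the overview from content if it exists; otherwise keep the shell."""
    course = modulestore.get(course_key)
    if course is None:
        # Content is gone: keep whatever catalog data we already have.
        shell = overviews.get(course_key)
        if shell is not None:
            shell.has_content = False
        return shell
    shell = OverviewShell(course_key, course["display_name"])
    overviews[course_key] = shell
    return shell
```

With this shape, something like Grades could keep pointing at the shell after the content is deleted, and a flag like `has_content` would give callers a way to tell the two cases apart.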

That wouldn’t completely solve our course lifecycle issues, but I think it would sufficiently decouple the catalog-like aspects of the course from the content aspects of the course so that you could at least delete the latter without having everything explode. There would be a decent amount of cleanup, as there are tendrils of ModuleStore everywhere, but I think that promoting CourseOverviews gives us a tractable path towards that goal.

Thanks Dave. I think that’s an appealing approach. I don’t suppose there’s anything in CourseOverview that represents the lifecycle of a course, so that we could discover the course is “archived” or “deleted” and handle it appropriately?

In the meantime, does it make sense to catch exceptions from CourseOverview and log them instead of raising them? We traced our proctoring error message back to the credit app (which is a bit puzzling) and opened a WIP PR to catch and log: “fix: catch and log the CourseOverview.DoesNotExist instead of raising” by arslanashraf7 (Pull Request #29834, openedx/edx-platform).
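For anyone following along, the general catch-and-log pattern looks roughly like the sketch below. This is not the code from the PR; the `CourseOverview` class here is a small stand-in so the example runs without Django, and `overviews_for_enrollments` is a hypothetical helper.

```python
import logging

log = logging.getLogger(__name__)


class CourseOverview:
    """Stand-in for the real Django model, just enough to run the example."""

    class DoesNotExist(Exception):
        pass

    _rows = {"course-v1:Demo+Live+2022": "Live Course"}

    @classmethod
    def get_from_id(cls, course_key):
        try:
            return cls._rows[course_key]
        except KeyError:
            raise cls.DoesNotExist(course_key)


def overviews_for_enrollments(course_keys):
    """Hypothetical helper: skip and log missing overviews instead of raising."""
    found = []
    for key in course_keys:
        try:
            found.append(CourseOverview.get_from_id(key))
        except CourseOverview.DoesNotExist:
            log.warning("No CourseOverview for %s; course may have been deleted", key)
    return found
```

The trade-off is that the dashboard keeps working, but the deleted course silently drops out of the list, so the log line is the only trace that something was missing.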

I don’t think there is today, so we’d have to add something to that effect. We’d also have to change how adding new fields to CourseOverviews works, because we do blanket re-generation of the row today and we won’t have the source data going forward.

I think that’s reasonable for most use cases.

Old Mongo removal came up in the last DEPR meeting, and I got a chance to talk to @Michael_Terry, who is pushing that work at edX and recently opened the following PRs:

Something that came out of that conversation is that we might be able to remove more of Old Mongo more quickly if we lobotomize it. We can leave just enough of it so that it can read a root course descriptor and return attributes–i.e. enough so that we can continue to use CourseOverviews the way we do today without refactoring. We could remove a bunch of other stuff along the way, like all write functionality, everything with assets, child traversal, import/export, caching, etc.

I still think that we need to promote CourseOverview to be truly ModuleStore-independent, but this will probably let us get rid of things more quickly and incrementally, with lower risk. Along the way, we’d want to add logging to make sure we emit warnings when things access the Old ModuleStore directly.
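A sketch of how the pared-down store and the warning logs might fit together is below. The class name, method names, and logger name are all made up for illustration and do not match the real ModuleStore interface; the idea is just that reads of root course attributes still work but emit a warning, while write paths are gone.

```python
import functools
import logging

log = logging.getLogger("old_mongo_access")  # hypothetical logger name


def warn_on_direct_access(func):
    """Emit a warning every time the wrapped store method is called."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.warning("Direct Old Mongo ModuleStore access via %s", func.__qualname__)
        return func(*args, **kwargs)
    return wrapper


class ReadOnlyOldMongoStub:
    """Hypothetical pared-down store: reads root course attributes only."""

    def __init__(self, courses):
        self._courses = dict(courses)

    @warn_on_direct_access
    def get_course_attributes(self, course_key):
        return self._courses[course_key]

    def update_item(self, *args, **kwargs):
        # Write support is one of the things that would be removed.
        raise NotImplementedError("Old Mongo write support has been removed")
```

Watching those warnings over time would show which callers still reach into the old store directly, which is the list we’d need to work down before removing it entirely.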