Currently, openedx-platform is split into two parts, CMS and LMS, which get deployed and managed separately. However, they share a codebase (openedx-platform), and there is a very fuzzy boundary between them in some places.
As you probably know, this fuzzy separation causes all kinds of confusion, headaches, and bugs; 1, 2, 3, 4, 5, 6 are just a few examples I could find in a few minutes of searching.
I would like to get people’s thoughts around setting the following as a long-term goal:
Removing all UI/frontend code from openedx-platform (already a goal AFAIK)
Combining the cms and lms REST API servers into a single headless API service, which provides all the same REST APIs that the current lms+cms services provide separately.
Continuing to categorize celery tasks as cms.high, lms.low, etc. so that celery workers can be scaled separately for LMS vs. CMS tasks as needed to prioritize student workloads over staff operations.
I believe that in terms of server URLs and REST APIs, this could be done in a largely backwards-compatible way, but it would obviously require some major changes on the Python and operations side of things.
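To make the celery point concrete, here is a rough sketch of how a single combined service might still send tasks to per-variant queues. The function follows the shape of a Celery `task_routes` router callable, but the task names, the prefix rule, and the priority set are all made up for illustration; the real task layout in openedx-platform may look quite different.

```python
# Sketch only: the prefixes and task names below are hypothetical examples,
# not real openedx-platform task names.
STAFF_TASK_PREFIXES = ("cms.",)  # assumption: authoring tasks live under cms.*
HIGH_PRIORITY_TASKS = {
    "lms.example.tasks.submit_answer",   # hypothetical student-facing task
    "cms.example.tasks.publish_course",  # hypothetical staff-facing task
}


def route_task(name, args=None, kwargs=None, options=None, task=None, **kw):
    """Return routing options for a task name, e.g. {"queue": "lms.low"}."""
    # Anything that is not clearly a staff/authoring task is treated as LMS work.
    variant = "cms" if name.startswith(STAFF_TASK_PREFIXES) else "lms"
    priority = "high" if name in HIGH_PRIORITY_TASKS else "low"
    return {"queue": f"{variant}.{priority}"}
```

With something like this registered as a router (for example `app.conf.task_routes = (route_task,)`), workers could keep subscribing to only the lms.* or only the cms.* queues, so student workloads can still be scaled and prioritized separately even though the web service is shared.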
The benefits would include:
Much simpler system for developers to understand and work with, much more consistency across the board.
Generally more efficient use of resources - for a small installation, I suspect that a single ~2 GB openedx pod would serve a similar level of traffic as a 2 GB lms pod + 2 GB cms pod. Our core applications cache a lot of stuff in memory, and anything cached at the python/django level cannot currently be shared between cms and lms processes.
Simpler operations.
Fewer things to monitor.
Faster deployments (fewer pods to deploy, no need to separately run migrations for CMS vs. LMS).
More standard invocation of ./manage.py without our idiosyncratic service_variant prefix.
Probably simpler authentication and authorization flows.
I imagine there are some downsides though, not the least of which is that it would be a major overhaul of how deployments work. Within the platform codebase, the various modulestore APIs which currently depend on the Django settings to return published vs. draft content would have to be made more explicit. But otherwise, in terms of the code in openedx-platform itself, I think an initial version of this could be achieved more easily than you might expect, as long as we focus initially on consolidating only the urlconf and settings, without moving too much actual code around.
I agree, and I know @kmccormick has also brought this up as a direction we want to work towards in the long term.
I’d be careful about this assumption. Studio exercises the modulestore in different ways, and in the past that has led to large memory leaks. I don’t know the current state of things, but in the past it was not uncommon to see Studio per-process memory usage explode out to 2-3X LMS memory usage, which may become a problem if the pool is shared. In any case, we’d have to test and validate this assumption. I’m hopeful that this will be less of a problem when we shift course content writes to openedx-core.
What were you thinking the rollout would look like? Do we migrate Studio things into “lms” piece by piece until “cms” doesn’t really do anything other than redirect? That way nobody has to change their scripts until they want to, but all the MFEs just start calling into the LMS URLs?
Yeah. Thinking on this a bit more, the shift to openedx-core for content might be a prerequisite anyway. All the openedx-core APIs are fairly explicit about draft vs. published content, but the legacy modulestore APIs do a lot of “automatic” draft vs. published fetching based on the Django settings (i.e. if you’re in LMS vs. CMS) and provide totally different XBlock runtime environments between LMS and CMS. So we’re already laying the foundation for this as we move more stuff from modulestore to openedx-core/openedx_content.
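As a toy illustration of what "explicit about draft vs. published" means, here is a minimal sketch. `ToyStore`, `Branch`, and the method names are stand-ins invented for this example; this is not the real modulestore or openedx-core API. The point is only that the caller names the branch instead of the store inferring it from which service's settings are loaded.

```python
from enum import Enum


class Branch(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


class ToyStore:
    """Hypothetical stand-in for a content store (not the real modulestore)."""

    def __init__(self, blocks):
        # blocks maps a usage key to {Branch: content} for that block.
        self._blocks = blocks

    def get_item(self, usage_key, branch):
        """Fetch one block; the caller must always say which branch it wants."""
        versions = self._blocks[usage_key]
        if branch not in versions:
            raise KeyError(f"{usage_key} has no {branch.value} version")
        return versions[branch]

    def get_published_item(self, usage_key):
        """What learner-facing code would call."""
        return self.get_item(usage_key, Branch.PUBLISHED)

    def get_draft_item(self, usage_key):
        """What authoring code would call."""
        return self.get_item(usage_key, Branch.DRAFT)
```

Because nothing here reads global settings, the same process could serve both learner and authoring requests without the two stepping on each other.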
Yeah, more or less. I haven’t done any detailed planning or evaluation of this yet, so this is just a very preliminary guess. But something like:
Either migrate our course content to openedx-core (talk, slides, code) or convert the modulestore APIs to be more explicit about draft vs. published (so that calls like modulestore().get_item() become published_modulestore().get_item() or modulestore().get_published_item(), etc.).
Hopefully we can just go with the first option, and the modulestore APIs can become a legacy read-only/published-only API without many changes.
Going URL by URL through the CMS URLconf, add each URL to the LMS URLconf; test the behavior and address issues until it returns identical responses whether accessed via the LMS or the CMS domain. Then remove it from the CMS URLconf and replace it with a redirect or a proxy.
Same thing for management commands.
Eventually the CMS is just a redirect/proxy service and doesn’t even need to be a Django app anymore, and can be replaced with some nginx forwarding rules or something like that.
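The URL-by-URL step could be checked with something like the sketch below. It assumes you can supply two fetch functions (one per domain) that return a comparable value such as a `(status, body)` tuple; those fetchers, the helper names, and the example.com domains are all hypothetical, and a real comparison would probably need to ignore things like timestamps or CSRF tokens.

```python
from urllib.parse import urlsplit, urlunsplit


def responses_match(path, fetch_lms, fetch_cms):
    """True when both services return the same result for the given path."""
    return fetch_lms(path) == fetch_cms(path)


def paths_ready_to_redirect(paths, fetch_lms, fetch_cms):
    """List the paths that already behave identically on both domains."""
    return [p for p in paths if responses_match(p, fetch_lms, fetch_cms)]


def redirect_target(cms_url, lms_netloc):
    """Rewrite a CMS URL to the same path and query string on the LMS host."""
    parts = urlsplit(cms_url)
    return urlunsplit(
        (parts.scheme, lms_netloc, parts.path, parts.query, parts.fragment)
    )
```

Only the paths returned by `paths_ready_to_redirect` would be removed from the CMS URLconf; everything else keeps being served by the CMS until the differences are resolved.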