Our plugin system for edx-platform is backwards (from Django practices)

Please forgive the potentially inflammatory title. This is something that’s been on my mind for a while, but I haven’t had time to really think through a decent proposal around the specifics. I still don’t have that time, but I wanted to get the topic out there in case others want to pursue it (or tell me I’m crazy).

To be clear up front, I still believe:

  • Django Plugins were a good thing that significantly advanced our extensibility story and made the best of the situation we had.
  • Hooks (events + filters) are a good thing period, and would continue to exist in some form regardless of anything proposed below.
  • You absolutely should be using these mechanisms to extend edx-platform functionality today.

Our Plugins vs. Django Conventions

A reusable Django app is built with certain conventions that make it easy to pip install and include them into your own Django project, potentially alongside your own apps. Your project defines some top level URLs configuration and then delegates certain directories to be handled by those different apps. But the ultimate control comes from your project file.

The edx-platform repo contains a whole constellation of apps. Many of them are built using conventions that make them theoretically reusable, but the deployment process as a whole assumes that edx-platform is the only thing running. You don’t have a project and load pip-installed edx-platform apps into it–you run edx-platform itself, use its project files, and tell it where to find the configuration values that will customize it to your site. That’s the way it organically grew in the early days, because the overriding priority for edX at the time was to have a functioning site, and extensibility was a lower priority.

The framework that we created to allow for plugins deals with that situation as best we could manage. It doesn’t fundamentally change how edx-platform operates or is configured, but instead gives plugins a way to inject themselves into the process and dynamically add their apps, URL configuration, etc. It’s apps-as-dependency-injection, rather than apps being explicitly included via a separate project settings file.

Why it matters: Plugin CI

Most plugins are going to call into something in edx-platform–enrollments, course content, scheduling metadata, etc. But these plugins can’t call into edx-platform for real during their unit test runs. So instead, they make do with mocks, or they create filters that are built a certain way and assume that they will be called correctly. Their plugin’s CI runs with a bunch of mocked dependencies, edx-platform runs its own internal CI checks, and then the two get mashed together during deployment.

Regressions can easily slip through due to that post-CI combining of code. Maybe someone touching edx-platform code altered the behavior of a filter pipeline in a way that had unexpected consequences. Maybe the API was extended to handle a certain case when a setting was active, but didn’t properly test what happened when said setting was something else. Maybe there was a change to a model that was never part of any api.py file, but was so important and had been unchanged for so long that people just started using it anyway–because it was the only way to get the information that they needed.

Whatever the cause, there’s a wide set of failure modes that neither the plugin nor edx-platform has any way to effectively guard against during its CI builds.

What would the alternative look like?

At a high level:

  • edx-platform becomes a pip-installable thing.
  • There are still settings modules that ship with edx-platform.
  • If you have a generally reusable app that extends edx-platform functionality, it can be in its own repo, run its own CI, and have edx-platform as a dependency.
  • If you need to extend the platform for your site and have a number of customizations, you would make a repo for those apps, create a project in that new repo, add edx-platform’s default apps into your project alongside your own, and run CI for your apps, treating edx-platform like the big dependency it is.
  • We slim down edx-platform’s messy configuration and settings-based app/URL modification to make this sane to work with.
  • We figure out how Tutor and other deployment methods could change to make use of this.

The openedx-events repo still exists in this scenario. We still want a central repository for the versioning and documentation of events, as well as the ability to send these events over a message queue to external services that can read them.

Filters still exist. There are places where we really do want dependency injection, e.g. rejecting an enrollment, modifying Unit rendering, etc. That doesn’t change. The main change would be that we could actually exercise those views with our plugins in them.

Stable APIs are still important. This doesn’t obviate the need for stable, documented APIs. API changes in edx-platform can still break plugin code that builds on top of them. But this at least lets us better catch those issues before deployment.

It would eventually be a replacement for Django Plugins as they exist today. But the migration shouldn’t be terrible, since there are direct Django analogs for almost all the things that plugins actually do–map URLs, add apps, configuration, etc.

Why now?

I wanted to bring awareness to this because there are a lot of promising things on the horizon, where we’re looking to simplify configuration and operation, or expand on the hooks framework for extensibility. The extensibility story for Open edX has never been better, and we’re continuing to make significant investments there on both the back and frontends.

At the same time, we’re going to be whittling down more and more of edx-platform. We already have a project going to extract the XBlocks that live in edx-platform into a new repo. Between that and the ever closer switchover to MFEs, edx-platform will likely lose a lot of its static asset bloat.

I think that all of this opens up a window of opportunity for us to do something that would have been far too much effort back when we were first developing our Django plugins infrastructure. I also think this would help address one of the biggest shortcomings of our backend extensibility story, and improve confidence in testing and deployment.

I don’t think by any means that you are crazy. We need extensibility and we need stable ground for education projects to explore their ideas without making the development or maintenance so costly that nobody is willing to try.

Some of the ideas you are listing in the alternative look similar to the Backstage architecture. The technologies are different, since we mix Django and React, but conceptually it is not too far off.

I’m not sure I understand why you are saying that our plugins don’t match Django conventions. In the Django world, projects are not pip-installable. Are you saying that we should break that convention?

It seems to me that instead of importing the edx-platform project, 3rd-party Django apps should import an Open edX library – and this library would be pip-installable. Is this what you had in mind?

@regis: By backwards, I mean that if I want to create a site that uses allauth for authentication, I pip install django-allauth, create a new project in my own repo, and then add references to the relevant allauth apps and settings to my project file.

I do not download allauth to a local directory and then use a bespoke plugin mechanism that allauth implements to inject the rest of my apps into allauth’s project files and deploy that. The dependency relationship is flipped.

The goal in doing the re-shuffling would be to make it so that the plugins can catch regressions that occur because of changes to the edx-platform pieces they’re functionally dependent on.

It seems to me that instead of importing the edx-platform project, 3rd-party Django apps should import an Open edX library – and this library would be pip-installable. Is this what you had in mind?

I think we might be saying the same thing, assuming the Open edX library in question has all the contents and apps of edx-platform in it. I do think that it would still need to have its own settings files, or settings helpers because there are so many apps and default settings.

So if I have my own customized project built on top of Open edX, my settings file might look something like:

INSTALLED_APPS = openedx.config.base_installed_apps()

INSTALLED_APPS.extend([
    "third_party_openedx_app_made_by_another_company",
    "my_special_app",
    "my_other_special_app",
])

Sorry, my reply was unclear. I meant to say that 3rd-party Django apps should import a small, minimal Open edX library, which would provide a subset of core features from edx-platform. (which is kind of like what you are working on, right?)

I believe that this is the point where your approach would start to break. It would mean that all 3rd-party Django apps would have to create their own project structure, including very complex settings. In my opinion, this reversed dependency would introduce a lot of complexity, with minimal benefits.

If the real problem at hand is the testing of 3rd-party Django apps, then why not start working from there? Do you have a specific app in mind? (maybe the upcoming forum app?)

Is it really that much more complicated than what we have today? Say I create a new project and do:

from lms.envs.production import *

INSTALLED_APPS.extend([
    # my apps here
])

# ROOT_URLCONF is a dotted module path, not a file name.
ROOT_URLCONF = 'my_urls'  # this module would include lms.urls first, and then add my apps

Could something like that work and be simple enough?

I’m not that worried about the forum app because edx-platform calls the forum app, and not vice-versa. The forum app repo will have its own internal tests, and then edx-platform can add integration tests for how it uses that app. If the forums app deviates from edx-platform expectations in some radical way (e.g. an API call is renamed), edx-platform tests will break on the upgrade.

It’s when the dependency goes the other way that I don’t think we have a great solution–we have some workable mitigations, but it could be a lot better. For instance, take 2U’s learning-assistant plugin. It has to import stuff from edx-platform to do its job, so they isolate and wrap all those imports and calls in one module:

Then the test suite of the plugin has to mock those calls out for CI purposes, because their real implementation is in edx-platform. For example:

So if someone at some future point in time moves or renames the get_transcript function in xmodule.video_block.transcripts_utils, this plugin will break, and no CI step will catch it. That function isn’t even part of a public API module, so people might not think twice about moving it around. In fact, all the transcripts code is in the video_block package, so it would probably naturally be extracted when that moves out of edx-platform.
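To make the failure mode concrete, here is a rough, hypothetical sketch of the wrap-and-mock pattern described above. It is not the learning-assistant plugin's actual code: `PlatformImports`, `fetch_transcript`, and `transcript_preview` are names invented for illustration, and the way `get_transcript` is called is an assumption, since its signature isn't shown in this thread. Only the function name and module path come from the discussion.

```python
from unittest import mock


class PlatformImports:
    """Stand-in for a plugin's single "platform imports" module (hypothetical)."""

    @staticmethod
    def fetch_transcript(video_block):
        # The only place that touches edx-platform. The import is deferred so
        # this file still loads in plugin CI, where edx-platform is absent.
        from xmodule.video_block.transcripts_utils import get_transcript

        # ASSUMPTION: the real call signature and return type may differ.
        return get_transcript(video_block)


def transcript_preview(video_block, length=20):
    """Plugin logic under test (hypothetical)."""
    return PlatformImports.fetch_transcript(video_block)[:length]


def mocked_preview():
    """What plugin CI effectively checks: the wrapper is replaced wholesale."""
    with mock.patch.object(
        PlatformImports, "fetch_transcript", return_value="Welcome to the course"
    ):
        return transcript_preview(video_block=object(), length=7)
```

Because the mock replaces the wrapper entirely, `mocked_preview()` keeps succeeding in plugin CI whether or not `get_transcript` still lives at that path. The deferred import only runs once the plugin is deployed inside edx-platform.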

I don’t mean to pick on this plugin in particular–it was literally just the first one I opened up when fishing for an example. We just don’t have good alternatives here, and the more functionality we push into plugins, the worse this problem will get.

We have used that pattern of keeping the imports in a separate module, with lots of mocking, for a long time for our plugins. Lately we have started testing integration by using Tutor to launch a release version of edx-platform with the plugin code installed.

A better way of doing this would be very welcome.

How does this proposal relate to Learning Core (a small pip-installable core that plugins can build against) and to “Attacking the Monolith by Extracting a Core”?

I still think that it makes sense to extract out a smaller core, and that doing so is more feasible than trying to whittle down edx-platform. People should eventually be able to write plugins that only require that extracted core. If a plugin were built with Learning Core as its only dependency, it wouldn’t be as vulnerable to the “dependency I mocked out of existence in my tests actually changed” issue I highlighted above, because it could just import and use the library APIs directly. It would also likely be using a much more stable, well documented API, with much faster build times.

But there’s a long road from here to there for the many, many features that plugins need to use. For the above example, it was transcripts. Maybe another plugin is looking at student verifications, or course modes, etc. Doing this work would be about making these types of plugins more resilient in the short to medium term. I think that it’s worthwhile as long as it doesn’t add too much complexity.

It might also help to accelerate getting some things out of edx-platform more quickly and safely–as a sort of compromise between “wait until all the right APIs are extracted” and “YOLO it together in production”. This is just off-the-cuff speculation from me though.

I guess there is a risk that doing this work could reduce the urgency of extracting pieces out into a smaller core, and be counterproductive in that sense. But I don’t think that’s really the tradeoff being made in practice. The learning assistant plugin uses transcripts, graph traversal of content, course metadata, and user roles–getting all that stuff extracted is too much effort for a small plugin, so it’s just going to do the dangerous unstable dependency mocking and hope for the best. Flipping the plugin/edx-platform dependency relationship would give those plugin authors a much cheaper way to have confidence about their builds and deployments.
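As a loose illustration of the cheapest kind of check this would enable (a sketch, not a proposal for a specific tool): if edx-platform were installed as a real dependency in plugin CI, even a smoke test that resolves the symbols a plugin wraps would catch a moved or renamed function. `REQUIRED_SYMBOLS` and `missing_symbols` are hypothetical names; the one entry shown is the transcript function from the example above.

```python
import importlib

# Dotted module path -> attribute names the plugin relies on (hypothetical list).
REQUIRED_SYMBOLS = {
    "xmodule.video_block.transcripts_utils": ["get_transcript"],
}


def missing_symbols(required):
    """Return 'module.attr' strings that cannot be resolved in this environment."""
    missing = []
    for module_path, names in required.items():
        try:
            module = importlib.import_module(module_path)
        except ImportError:
            # The whole module is gone (or was never installed).
            missing.extend(f"{module_path}.{name}" for name in names)
            continue
        # The module exists, so check each attribute individually.
        missing.extend(
            f"{module_path}.{name}" for name in names if not hasattr(module, name)
        )
    return missing
```

A plugin's test suite could then assert that `missing_symbols(REQUIRED_SYMBOLS)` is empty. This only verifies that the names still exist, not that their behavior is unchanged, so it would complement real integration tests rather than replace them.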

It’s still not a great plugin author experience of course, given how much stuff needs to be spun up for edx-platform to work. But it should still be easier than either fully extracting all the dependencies or spinning up Tutor in CI–especially since a number of folks like 2U and MIT don’t run Tutor in their production environments.

Unfortunately no. Production settings are complex and many of them need to be overridden to work. See for instance the Tutor production settings template (notice the include statements).

But then, someone could theoretically refactor these settings to be plug-n-play and easily importable into 3rd-party apps. That’s a lot of work but hey, it’s just software after all, so that would be doable (and a nice to have in all cases). But it will still not make testing completely reliable, because the testing environment will be very different from the production one. In particular, because of: feature toggles, env variables, static assets, production dependencies (I’m probably forgetting some other items).

So, sure, tests will detect API changes more frequently than a mock. But your solution is attempting to bypass a more deeply rooted problem, which is the lack of a stable Open edX library API. The tests you are describing are typically classified as “integration tests”. Once we start testing the get_transcript API, then we are no longer unit-testing our code: we are testing someone else’s code. And that shouldn’t be the responsibility of the 3rd-party app maintainer.

A “right” solution would be to move the get_transcript function to a pip-installable “libopenedx” package. That library would be small, documented, mypy-typed and stable. The solution that you are proposing is not really leading us closer to that ideal solution.

Apologies for being unclear. I realize that production settings are complex in this manner. What I was clumsily trying to ask was whether we could take the complex production settings module that we already generate, and then make minor adjustments to make them work in this mode–given we’d have to at minimum change the apps loaded and the URL mappings.

True. I don’t think the two strategies are mutually exclusive.

The question for me isn’t whether this is the Right Thing. Having plugins call into a monolithic blob of tangled code that makes no API guarantees is clearly not our end goal for the platform. The question I want to explore is whether this is Less Wrong than what we’re already doing. We shouldn’t let perfect be the enemy of good.

I’ve seen this kind of regression when I was at edX. While at Axim, I’ve poked people to revert changes that would have caused this sort of regression. The fact that eduNEXT is spinning up Tutor for integration tests leads me to think this sort of thing has happened to them as well.

I believe that making plugins safer and easier to build will encourage people to write more of them, and that’s a net win. But even if it doesn’t take us closer to our ideal state, it’s worthwhile if it significantly improves the reliability and testability of plugins in the ecosystem as we’re building those better APIs. (Assuming it’s not impractically expensive to do so.)