Configuration for the Hooks Extension Framework

Hello everyone!

Together with @mgmdi and @morenol, I have been hard at work trying to make the Hooks Extension Framework come to life.

This framework will make the platform highly extendable, and many in the community may want to use it, so we’d like to get some input from you about how we could configure it. Also, there has been a lot of work on the edx-toggles front and on the doc-a-thon effort to document settings, and we don’t want to go against the progress made there.

The issue at hand now is that we need a way to configure a list of functions (actions or filters) that will be called at different places (triggers) in the code of edx-platform.

So, for a string (a trigger name like "openedx.lms.auth.post_login.action.v1"), we need to define a list of functions:

"pipeline": [...],  # ... and so on.

And also some extra variables:

"async": True,  # ... and so on.

And at some point, some heavily configured instance will have a handful of those. Perhaps more. If we get everything right, there will eventually be 44 triggers.
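For concreteness, here is a hedged sketch of what such a configuration dict could look like. The trigger name is taken from examples elsewhere in this thread; the "pipeline" and "async" keys are illustrative, not a settled schema, and the plugin function path is invented:

```python
# Illustrative only: the real schema is still under discussion in this thread.
HOOKS_EXTENSION_CONFIG = {
    "openedx.lms.auth.post_login.action.v1": {
        # Ordered list of dotted paths to the functions to call.
        "pipeline": [
            "myco.myplugin.post_login_action",
        ],
        # Extra per-trigger variables, e.g. whether to run asynchronously.
        "async": True,
    },
    # ... one entry per configured trigger.
}
```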

So we are looking at a complex dict with many keys and nested lists. :dizzy_face:

We have considered two places where it makes sense to write this configuration:

  • A dict in the Django settings.
  • In the AppConfig of your plugin.

There are pros and cons to each method, and the key here is finding a good balance.

About the Django settings:

the good:

  • It is very standard; everyone should know how to change it by now.
  • It can be altered without installing plugins.

the bad:

  • It is hard to document a large dict.
  • It could grow into something difficult to manage.

About the AppConfig:

the good:

  • Each plugin can extend the config to add its own actions and filters without collisions.

the bad:

  • It’s not possible to control the ordering of different actions connected to the same trigger by different plugins.
  • For updates, an operator must install a new version of the dependency, which is usually slower and more difficult than changing variables and restarting.
  • It’s not easy to configure per tenant if you use site configurations.
  • It requires a plugin.

Let’s breathe. If I still have your attention, you are awesome! Have any feedback? Please reach out and let me know.


We looked into SettingDictToggle from edx_toggles.toggles, but it seems to work best for dicts with Boolean values:

SettingDictToggle("HOOKS_CONFIG", "openedx.lms.auth.post_login.action.v1", default={}).is_enabled()

Since our values are nested pipeline definitions rather than Booleans, is_enabled does not fit the bill.

After all this long talk, finally, the questions. Do you have a particular preference here? Are we missing an obvious answer? Is there something you would definitely like us to avoid, given the circumstances?

Other things we considered, but decided against:

  • ConfigModels: we did not want to tie this to a migration and a whole new model. Also, they are very rigid, and the whole point here is flexibility.
  • Waffle Flags: we require more than a boolean flag.

I’ll tag some people that might want to chime in here. Sorry for the noise. @pdpinch @nimisha @braden @dave @regis @jhony_avella @eric.herrera


I’m in favor of putting it in the Django config. It’s simpler on the implementation side, more predictable, and easier to see what’s going on. For serious integrations, it’s likely you’d want a whole settings file that does just this, and gets included into the main settings file (or maybe you pull it programmatically from JSON or some such).

Regardless, one big dict of {str: List[str]} doesn’t sound too bad. If installs get to the point where it’s difficult to manage, then we can figure out alternative mechanisms to compile out a config based on some more terse configuration. But I’m not convinced we’ll really get there or that we know what the ideal ergonomics of that would actually be before folks start managing a lot of these.

Adding a few other comments below. I caveat all of the following with the acknowledgement that I am not the target audience for this. I’ve been in the room for these conversations in the past, but at the end of the day, I realize that I don’t do this part for a living. I offer the items below as suggestions and possibilities, but I rely on you folks to determine whether these use cases are at all plausible or worth the complexity.

Django Signals for broadcasts

Can we split the synchronous transforms and the broadcast notification mechanisms completely, even if the write-ups for both are part of the same proposal? General notifications can be done using Django signals–they’re well understood, and we don’t really have to provide an async option as long as we document the everywhere-in-platform pattern of “catch the signal, queue a celery task”.
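The “catch the signal, queue a celery task” pattern can be sketched in plain Python. Django and celery are omitted here so the sketch is self-contained: a list of receivers stands in for `django.dispatch`, and putting a tuple on a queue stands in for `some_task.delay(...)`; all names are invented for illustration:

```python
from queue import Queue

# Stand-in for a celery broker: tasks are enqueued here instead of .delay().
task_queue = Queue()

# Stand-in for a Django signal: a plain list of receiver functions.
course_published_receivers = []

def receiver(receiver_list):
    """Register a function as a receiver, like django.dispatch.receiver."""
    def decorator(func):
        receiver_list.append(func)
        return func
    return decorator

@receiver(course_published_receivers)
def listen_for_course_publish(sender, course_key, **kwargs):
    # In real code this would be e.g. `recalculate_something.delay(course_key)`.
    task_queue.put(("recalculate_something", course_key))

def send_signal(receiver_list, sender, **kwargs):
    """Stand-in for Signal.send(): call each registered receiver in turn."""
    for func in receiver_list:
        func(sender, **kwargs)

send_signal(course_published_receivers, sender=None, course_key="course-v1:edX+Demo+2021")
```

The web process only does the cheap part (enqueue) and returns immediately; the heavy work happens in a worker.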

Ergonomics of Writing the Hooks

If I’m writing the hook that provides openedx.lms.auth.post_login.action.v1, how is it made clear to me that this isn’t something I can casually refactor, but is in fact a big, public promise? A separate module? A decorator along the lines of @hook('openedx.lms.auth.post_login.action.v1')?
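One possible shape for such a decorator (purely hypothetical; the registry and function names are invented for illustration) would mark the function as a public hook at its definition site and record it under its trigger name, so tooling and reviewers can enumerate the public promises:

```python
# Hypothetical registry of public hook entry points, keyed by trigger name.
PUBLIC_HOOKS = {}

def hook(trigger_name):
    """Mark a function as the public entry point for a trigger.

    The decorator both signals "this is a public promise" at the
    definition site and lets tooling (docs, contract tests) enumerate
    all registered hooks.
    """
    def decorator(func):
        func.trigger_name = trigger_name
        PUBLIC_HOOKS[trigger_name] = func
        return func
    return decorator

@hook('openedx.lms.auth.post_login.action.v1')
def post_login_action(user, **kwargs):
    # Hypothetical hook body: pass the user through unchanged.
    return user
```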

Filter-style hooks sound like middleware

Do filters have the capability of altering parameters to the function (as well as altering the results coming back)? If so, they sound a lot like middleware. I think that’s a good thing actually, but it makes me wonder if these filter functions should be classes instead, so that you could get called with some sort of initialization. Maybe that’s overkill, but it might also be a straightforward way to provide the request object context to something that wouldn’t normally use it, in case your filter needed to check some querystring params or some such.


Django 3.2 is right around the corner, and async views+middleware come with that. I’m not sure when edx-platform will upgrade, but it’s at most a year away (since 2.2 LTS support ends at that time).

Integration hooks may lean towards things that are I/O-heavy and could benefit from being async. I don’t think we should implement this right away, or even really spend too much time specifying the details. But it might be good to think about how we might do this at a high level so that we don’t box ourselves in. This might be another case where having a class would give us the flexibility to do a sync and async path down the road. I expect there are patterns we might copy from middleware here as well.


Hey @dave,

Thanks a lot for the thoughtful answer. About locating the configuration in the Django settings: we were also leaning towards it, because of the lack of control over ordering that we would have using the plugin AppConfigs.

About the other comments you brought up, we are very happy to take your feedback into account even if you might not be the target audience for this. You have extensive experience at a larger scale of operation and might be seeing things in advance that we cannot even imagine.

I’d like to follow up on those.

Ergonomics of Writing the Hooks

Agreed. It is super important to make clear how big of a public promise this is. So far, we thought we would try to maintain this via unit tests: basically, testing that the triggers are called with the correct input and that the output of the filters is used as the parameter for calling the functions they are designed to transform.
Now, for the tests, we had in mind using the definition of the test to highlight the individual documentation of each trigger. E.g.: edx-platform/pre_enrollment_trigger.rst at fmo/initial_hook_framework_docs · eduNEXT/edx-platform · GitHub.
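One very simple form such a unit-test guard could take is pinning the argument names with `inspect.signature`. The filter function here is a hypothetical stand-in with the signature discussed in this thread:

```python
import inspect

# Stand-in for the real filter entry point whose signature is the public promise.
def pre_enrollment_filter(user, course_key, mode, **kwargs):
    return {"user": user, "course_key": course_key, "mode": mode}

def test_pre_enrollment_filter_signature_is_stable():
    # If a refactor renames, removes, or reorders these parameters, this
    # test fails, flagging that the public promise is being broken.
    params = list(inspect.signature(pre_enrollment_filter).parameters)
    assert params == ["user", "course_key", "mode", "kwargs"]

test_pre_enrollment_filter_signature_is_stable()
```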

The same could be done at the place in the code where the trigger is called.

from edx_django_utils.hooks.triggers import trigger_filter

# Public hook. See:
out = trigger_filter(
    'openedx.lms.enrollment.pre_enrollment.filter.v1',
    user=user,
    course_key=course_key,
    mode=mode,
)


This comes from our current implementation of the trigger_action and trigger_filter functions. We have it as an internal PR to our own fork for now (feat: Add Hooks Extension Framework tooling by mariajgrimaldi · Pull Request #6 · eduNEXT/edx-django-utils · GitHub).

Do you think that is enough to prevent anyone from casually changing the public signature of the functions?
I’m interested and curious as to how the decorators could work for this.

Filter-style hooks sound like middleware

Do filters have the capability of altering parameters to the function (as well as altering the results coming back)?


If so, they sound a lot like middleware.

I tend to think of middleware as what runs before and after the Django views, and this does differ a lot from that. But in the broader sense of middleware, yes, it does. Now, the Django middleware list also runs as a pipeline, since what you pass to the first middleware is returned and passed on to the second, and so on. However, we are drawing inspiration from the social-core pipelines for the mechanics of what the pipeline should return and how to stop it or raise an exception from it.
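The social-core-style mechanics described above (each step receives the accumulated data, returns updates to merge in, and can stop the run early) can be sketched in a few lines. Function and exception names here are illustrative, not the proposal’s actual API:

```python
class StopPipeline(Exception):
    """Raised by a step to halt the pipeline early, social-core style."""

def run_pipeline(steps, **data):
    """Run each step in order, merging each step's returned dict into data.

    - A step returning a dict updates the accumulated data.
    - A step returning None leaves the data unchanged.
    - A step raising StopPipeline ends the run with the data so far.
    """
    for step in steps:
        try:
            result = step(**data)
        except StopPipeline:
            break
        if isinstance(result, dict):
            data.update(result)
    return data

def force_honor_mode(mode, **kwargs):
    # Example step: rewrite one field, leave the rest untouched.
    return {"mode": "honor"} if mode == "audit" else None

out = run_pipeline([force_honor_mode], user="alice", mode="audit")
```

This differs from Django middleware in that the steps pass an accumulated data dict rather than a request/response pair, which matches the filter use case better.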

I think that’s a good thing actually, but it makes me wonder if these filter functions should be classes instead, so that you could get called with some sort of initialization. Maybe that’s overkill, but it might also be a straightforward way to provide the request object context to something that wouldn’t normally use it, in case your filter needed to check some querystring params or some such.

I think of filters as “the agreed-upon places” where developers working outside the platform core get access to the data that determines the business rules of the application. I like the idea of the classes, and it ties nicely with the next point, but for now it does seem like overkill. However, doing it now, even with very simple classes, could make the public promise more stable, since we would already have the base class and there would be no need to ask developers in the future to migrate from function-based hooks to class-based ones.


I was unaware of the async views+middleware support coming in Django 3.2. I have made it a task for our team to review it and make plans accordingly. I agree that we might not need to do anything yet, but we need to leave enough space in the design to accommodate it when the time comes. Using classes instead of functions to implement the actions/filters could very well be the way to go.


So far we have been writing this code in the edx-django-utils library. If we want to add support for celery there, we would need to include celery as a dependency. I don’t know if this is a big deal that we should try to avoid, or if we should just go ahead and add it.
If we wanted to avoid it, the only alternative I can come up with is to put this in a different (new) library that depends on edx-django-utils and includes celery. Thoughts on this?

Django Signals for broadcasts

I left this for last because it is the one where I have the strongest opinion, and I wanted the previous context in place before answering.

I see a lot of value in using Django signals, and I was initially the proponent of using signals everywhere. However, after the first rounds of feedback, you helped me understand that we were not actually gaining much from constraining the use of actions to the Django signals pattern.

Since we are introducing a new way of extending the platform, we are asking developers and operators across the community to learn yet another pattern to get Open edX to do what they need. I’m giving this a lot of weight and I’m trying to make it just one pattern that lets you alter the core in 44 ways. This might force a bit of a compromise in the way we write the framework, but I feel strongly against adding 2 patterns which let you alter ~20 places each.

Keeping the two mechanisms similar would make it easier for developers to write an Action that can be used at any trigger in the platform, and could foster a community-supported list of integrations to third-party services (CRMs, for example). It would still be possible if we separated the mechanisms, but it would raise the barrier to entry.

As long as the configuration and calling mechanism are the same, I’m all for using the Django signals pattern internally. But I don’t think this was your idea. Now, if we do it, I still have the question of whether this means creating new signals for the triggers or re-using the existing signals for the Django models that we already have (for example, CourseEnrollment.post_save). I am inclined to use new signals here, because a big part of this proposal is the general and individual documentation that would let you find the available triggers. Connecting to the post_save of an existing model is already permitted in the current form of the plugins, and I have not seen much use of that pattern anywhere.

Am I missing the point of why you support separating the mechanisms? Perhaps I’m putting too much weight on keeping the two things similar. Does anyone else have an opinion here?

Thanks a lot for all the amazing contributions to this conversation.

Hey there @Felipe and @dave!

I want to thank you for your comments on this crucial discussion. Also, I’d like to invite you all to check the updated version of the Hooks Extension Framework: Tooling PR. There, we took into account some of your comments on, for example, how to configure the framework!

Please feel free to leave any comments or feedback!

I think this is a significant concern. I hope that we can make it very easy for people or machines to update the configuration when installing a new plugin.

For example, when we onboard a new developer to one of our bigger projects that’s built on Open edX, we have some scripts that run which help them configure their devstack, like this:

docker exec -t edx.devstack.lms bash -c "grep BLOCKSTORE_API_AUTH_TOKEN /edx/app/edxapp/edx-platform/cms/envs/ || echo BLOCKSTORE_API_AUTH_TOKEN = \'edxapp-insecure-devstack-key\' >> /edx/app/edxapp/edx-platform/cms/envs/"
docker exec -t edx.devstack.lms bash -c "grep PROJECT_PLUGIN_AUTHORIZED_USERNAMES /edx/app/edxapp/edx-platform/lms/envs/ || echo PROJECT_PLUGIN_AUTHORIZED_USERNAMES = \[\'${BACKEND_LMS_API_AUTH_USERNAME}\'] >> /edx/app/edxapp/edx-platform/lms/envs/"

Or as a second example, in a project README you often have installation instructions like “Please add "somelibrary.filters.1st_function" to the LMS_HOOKS setting.”

Something like those are fine, and simple. If typical use cases work like those examples, then I think this is great.

On the other hand, if some plugin projects are going to have to say something like this:

To install this plugin, go to your LMS HOOKS_EXTENSION_CONFIG and find the synchronous (not async) pipeline configured for "openedx.lms.auth.post_login.action.v1". If it doesn’t exist, you can create it using this template:
"openedx.lms.auth.post_login.action.v1": {"pipeline": ["myco.myplugin.post_login_action"], "async": False}.
If it does exist, insert "myco.myplugin.post_login_action" into the pipeline, but ensure it comes before any “otherco.otherplugin.post_login_action” handler or filter, but after any "openedx.lms.auth.utils.user_hook_filter".

…then I think it’s too complicated and it’s going to be a bit of a pain for people, especially people who are just trying to install and use a plugin rather than configuring some elaborate custom use case. In such cases, I think it’s best to try to find a way to push the complexity into the code and have it handled automatically, so that the configuration can be simpler.

Hmm, now that I’ve written this out, I feel like I’ve just re-stated some of the points you made in the original post :stuck_out_tongue:

That’s a good point. I guess in light of that, I also like the Django configuration way because it’s pretty explicit and easy to reason about. If it becomes a pain for people as it gets bigger, we can take another look.

Exactly. At some point I can imagine that some sites/projects could have a complex configuration, such as the one you are mentioning. I do however think that the most common scenario is going to be something like:

HOOKS_EXTENSION_CONFIG["openedx.lms.auth.post_login.action.v1"] = "your_plugin.filters.1st_function"

I also imagine this will go in the settings/ or settings/ module of a plugin. So in theory the simple cases are covered without you needing to do anything outside of installing the plugin.
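A hedged sketch of how a plugin’s settings module might extend that setting additively. The `plugin_settings(settings)` entry point is the convention Open edX plugins use to patch settings, but the setting name, trigger name, and function path here follow the illustrative examples in this thread rather than a finalized schema; `setdefault` plus `append` keeps the edit additive so multiple plugins don’t clobber each other:

```python
def plugin_settings(settings):
    """Hypothetical plugin settings hook: register one hook function."""
    # Create the top-level dict if the operator has not defined one.
    if not hasattr(settings, "HOOKS_EXTENSION_CONFIG"):
        settings.HOOKS_EXTENSION_CONFIG = {}
    # Create the trigger's pipeline if absent, then append our function,
    # preserving whatever other plugins have already registered.
    pipeline = settings.HOOKS_EXTENSION_CONFIG.setdefault(
        "openedx.lms.auth.post_login.action.v1", []
    )
    pipeline.append("your_plugin.filters.1st_function")

class FakeSettings:
    """Stand-in for the Django settings object, for illustration."""

settings = FakeSettings()
plugin_settings(settings)
```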

We are aiming to make this easy to get started with and powerful once you know what you are doing; that is why we made the current version very forgiving in terms of the configuration.

Only very sophisticated cases will need something like option 1 from the PR. Now, if this becomes widely used by the community, I’d be in favour of creating more tooling that makes it easier to review and configure. Here I’m thinking of an “is_staff”-only view to look at your configured hooks, and also functions that make a complex configuration dict easier to handle.


To give more context for everyone commenting or just joining this conversation, I created a PR with the documentation of the general framework and a few hooks. For these, we are already proposing a public signature.

Here is the PR link: [BD-32 ]Hooks framework documentation by felipemontoya · Pull Request #27157 · edx/edx-platform · GitHub

and here is the way the docs would look once rendered: edx-platform/index.rst at fmo/initial_hook_framework_docs · eduNEXT/edx-platform · GitHub

I’m still working on examples, but I’ll be updating the same PR. Maybe you will already see it with no more TBDs.

Hey there! I wanted to add some comments,

About celery

We’ve talked about using celery tasks as a way of executing our pipeline mechanism asynchronously (this pipeline is part of the tooling being implemented in edx-django-utils). This would mean, among other things, adding support for celery.

Well, this is our proposal to make this possible:

Request object serialization

We’ve identified use cases for sending the Request object to the async pipeline, and for that we propose creating a custom serializer for it as a Hook Utility. With that in place, we will handle the serialization programmatically before calling the async pipeline, so any serialization exception is raised early.
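A hedged sketch of what such a serializer could do: pick out the JSON-safe parts of the request and fail fast, before the task is queued, if anything unserializable slips in. The attribute names mirror Django’s request object, but the selection of fields is an invented example, not the proposed utility:

```python
import json

def serialize_request(request):
    """Reduce a request-like object to a JSON-safe dict.

    Done eagerly, before queuing the async pipeline, so serialization
    errors surface in the web process rather than inside a worker.
    """
    payload = {
        "path": request.path,
        "method": request.method,
        "GET": dict(request.GET),
    }
    # json.dumps raises TypeError here if anything non-serializable remains.
    return json.loads(json.dumps(payload))

class FakeRequest:
    """Minimal stand-in for django.http.HttpRequest, for illustration."""
    path = "/login"
    method = "POST"
    GET = {"next": "/dashboard"}

data = serialize_request(FakeRequest())
```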

Please let us know what are your thoughts on this :relaxed:

FWIW, I am also supportive of using Django Settings as the mechanism for registering Hooks - for the aforementioned reasons.

Ergonomics of Writing the Hooks - signaling to devs that this is a public API
To build upon the ideas above of using clearly documented unit tests and doc-annotations, I encourage us to consider piloting Consumer-based Contract testing for these Public APIs. We can establish a standard for where these types of tests are located, how often they are executed, and the process for fixing the APIs when they break.

Can someone clarify what the conclusion is on the dependency on celery? It seems to me that using Celery is a decision of the Hooks developer and need not be coupled with the Hooks framework itself.

Django Signals
It seems we’ve gone back and forth on this a few times. I asked for more clarification on why we wouldn’t use Django signals on the OEP as well.

To be clear, I agree it is an anti-pattern to have Plugins subscribe to Django data-model changes directly. Instead, I imagined Plugins would subscribe to higher-level events and domain concepts. The way I had been thinking about this was that we would have 2 types of Plugin APIs:

  1. Django Events. A published place in the monolith to discover available “core events” for Plugins to subscribe to and act upon (data flow from monolith → Plugin).
  2. Python APIs. A published place to find Python APIs supported by the monolith that Plugins can rely upon (request flow from Plugin → monolith).

Then, eventually, if/when the Plugin becomes a microservice (because its complexity and value has increased):

  • #1 would be replaced by a MessageBus Event (from monolith to microservice).
  • #2 would be replaced by a REST API (or GraphQL) call (from microservice to monolith).

I believe the direction that the Hooks framework is taking is addressing different use cases from what I had in mind. Hence, I’ve requested adding a list of Use Cases on the OEP to clarify what types of extension capabilities we are targeting.



I don’t think you need the celery dependency. It’s an extremely common pattern for people to listen for some sort of signal/trigger in process, and then use that to immediately spawn off a celery worker if they want to.

I hope I haven’t muddied the waters with my previous talk about async. I think that the hooks framework should have a rough plan for eventually having Django async views support baked in, so that hooks that alter the response of a synchronous request don’t unnecessarily block the thread. A scenario for this would be if every request to student dashboard requires a hook-triggered call to a web service to alter the results before returning the page to the user.

I think that the notify-only pattern of “detect when an event occurred and launch a celery task to do something about it” doesn’t need to be baked into the framework. Whether we use signals or a custom thing, this framework only has to get as far as “notify the piece of code that an event has happened”, and we should leave the problem of what happens after that to the listening code. Most of the time, spawning a celery task will likely be the right thing to do. But maybe it’s just writing one row in an audit log, in which case keeping it synchronous is fine.

Ah, I see. I think I get where our disconnect is. Please correct me if I mischaracterize your position:

We’re both trying to find a way to minimize friction by reducing the number of things that developers have to understand to extend the platform. You believe that having both signals for broadcast events and a separate mechanism for filter-style hooks introduces two very different interfaces for the same extensibility use case, so we should consolidate on one. Since signals are not capable of ordering/pipelining, that one mechanism should be these filter functions.

From my perspective, we already have an established convention of using signals to broadcast and catch system events. I count 145 instances of @receiver() decorators to catch these events in edx-platform alone, and the vast majority are logical, lifecycle events like course_published or comment_voted, GRADING_POLICY_CHANGED, etc. (as opposed to low-level database ones). So when I see a proposal to do broadcast and filter-style hooks using a new mechanism, I see duplication and confusion for the broadcast use case. Do I listen for the broadcast piece using the signal version of the event or the hooks version?

I echo @nimisha in supporting making new, semantically meaningful signals based on higher level events and domain concepts (with possible low-level additions for things that need strict replication, like user tables, but that’s a tangent that’s not relevant here).

I believe that the two modes this runs in are actually really different, and that trying to make a unified interface for them will start causing us problems down the line. The broadcast events can be decoupled across time, repos, and eventually across services. The filter functions are very tightly bound to being in-process with the LMS. Even if the broad intent and audience are the same, the data, lifecycle, and dependencies are dramatically different. To me, it’s like comparing a bike with a car–both nominally used for transportation, but in very different ways and with very different tradeoffs.

Assuming these two pieces were split in the implementation, I think my stab at the broadcast half of the API would be:

  1. Create a new repo for a python lib that defines new event signals with proper documentation.
  2. Start importing those signals into edx-platform and, if possible, replace the ones that edx-platform is already emitting with these new ones. In some cases, it won’t be compatible. In some cases, it might be as straightforward as copy/pasting.
  3. Have an eye towards working with OEP-41 and the new message bus work that @feanil and others are working on soon.

I think that a really awesome end-state for the broadcast use case would be to have either one open-edx-events lib (or multiple ones based on subdomain) that defines all the events emitted by the system. It’s easy to test because you have no direct code dependency on edx-platform, as long as you import this events lib for your tests.

This path also allows for the migration pattern @nimisha pointed out, where a plugin migrates out of the process via message bus. When the message queue system is implemented, this library (or another one) could be made to know how to send events to and receive from a message queue. Then you could still listen for the same Django events in some entirely different service, and the library would know how to pick up JSON serialized messages and translate them into those local Django Signals that your app would still listen for in the same way. That does impose the major constraint that the Signals can’t pass along edx-platform internal data structures and models–a constraint that would never work for filters, but would be reasonable for these broadcast events.
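The constraint described above (signals carry only data that survives a JSON round trip, never edx-platform internal structures or models) can be checked mechanically. A sketch, with an invented helper name and example payload:

```python
import json

def assert_bus_safe(payload):
    """Reject event payloads that would not survive a message-bus hop.

    Only JSON-serializable data is allowed, so the same event can be
    delivered in-process today and over a message queue later without
    changing what listeners receive.
    """
    try:
        return json.loads(json.dumps(payload))
    except TypeError as exc:
        raise ValueError(f"Event payload is not bus-safe: {exc}") from exc

# A payload of plain strings round-trips cleanly...
event = assert_bus_safe({
    "course_key": "course-v1:edX+Demo+2021",
    "published_at": "2021-04-01T00:00:00Z",
})
```

A payload containing a model instance or other arbitrary object would raise instead, catching the violation at emit time rather than at the queue boundary.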

To me, the filter-based hooks go in completely the opposite direction. No platform-agnostic serialization, no message queue involvement, no notable overlap with existing conventions. For that use case, having a pipeline style list of functions/classes for manipulations makes sense to me.

Thanks again for doing all the work to put this proposal together and going through these feedback cycles. :smile:


Circling back to the configuration aspect, if we split up the broadcast use case to use a set of signals that lives in a separate repo, then that half of the proposal doesn’t need any explicit configuration. You’d catch the signal in code using built-in Django mechanisms, like:

# My extension module...
from django.dispatch import receiver
from openedx_events.signals.authoring.v1 import COURSE_PUBLISHED

@receiver(COURSE_PUBLISHED)
def listen_for_course_publish(sender, course_key, **kwargs):
    # Do something here
    ...

At that point, the config would only be necessary for the filters, e.g.:

    "FILTERS": {
        "openedx.lms.enrollment.pre_enrollment.filter.v1": [
        "openedx.lms.enrollment.pre_unenrollment.filter.v1": [
    # Other top level config, like debug maybe?

This might also help with the config becoming unwieldy, by reducing the number of things that need to go in it. I’m guessing that people will want to listen for “this happened” events more often than they want to actually alter the results? If so, that means that most of the time they wouldn’t need to put a config at all–they just import from the signals library and use the Django-standard method to receive from it, and they’re done. The config is only used for the more advanced use case of altering results.

Removing the broadcast use case from the config also reduces the number of permutations we’d have to care about a bit. There’s no need to use sets for broadcast receivers for instance–FILTERS can always be a mapping of strings to the list that represents the ordered pipeline of calls.

One part that’s much fuzzier for me is the celery pipeline use case. I had been assuming that any filter pipeline would always execute in-process, so I think I’m just misunderstanding the intent behind adding celery to the mix there.

Thank you.


Last week @nimisha, @dave and I happened to meet at a core committer coffee break, and we used some of the time to discuss the open points here.

I left the meeting full of clarity on how to proceed. I’d like to bring you all up to date on the current version of the proposal.

Django Signals for broadcasts

This is perfect. It sums up exactly where I was coming from, and now I also get your point and Nimisha’s. The current plan is pretty much what you outlined above.

  1. Create a new repo for a python lib that defines new event signals with proper documentation.
  2. Start importing those signals into edx-platform and, if possible, replace the ones that edx-platform is already emitting with these new ones.

What I really like about this is that, since this would be a small library, anyone creating plugins should be able to add it as a dependency and even pin it to the same version the platform is using.
This means that, as a plugin developer, I also get to catch mismatch errors when the library is updated, instead of waiting until I install the plugin in edx-platform and try to start the server.

This does create some confusion about the preferred way of connecting to the signals. I can go at it directly in the code:

# My extension module...
from django.dispatch import receiver
from openedx_events.signals.authoring.v1 import COURSE_PUBLISHED

But also, plugins have a way of connecting to signals via the AppConfig

'signals_config': {
    'lms.djangoapp': {
        'relative_path': 'signals',
        'receivers': [
            {
                'receiver_func_name': 'start_tenant',
                'signal_path': 'django.core.signals.request_started',
            },
        ],
    },
},
But maybe it just works when I put 'signal_path': 'openedx_events.signals.authoring.v1.COURSE_PUBLISHED'. Maybe it’s not even an issue. We’ll be testing this to find the recommended way of using it. I suppose the documentation of the library is the right place to put this (and link to it from the long documentation about the framework).

While we are at it, it makes sense to move what we already have as a PR in edx-django-utils to the events library. I was thinking of calling it openedx_hooks.


The second most discussed point was the inclusion of celery into the framework for async processing, but this was mostly a matter of making it easy for developers to do the right thing in terms of performance. Now that we have established that receiving the signal and then registering the async task is a common pattern, we can leave that to the users of the framework. We will still provide lots of examples and make it clear in the documentation that this is expected and encouraged. However, I agree there is no need to make it part of the library.

This removes previous concerns we had about serializing certain objects, such as user requests, so overall it makes the proposal simpler. A good day at the office.

Ergonomics of Writing the Hooks

This point was not in disagreement; we all know we want to offer a stable environment for developers to build on top of. It is just a matter of how.

I was not familiar with this, but after reading about the concept I got very excited. I want to give this a serious shot and see what we can do. However, given my lack of experience with it, I would still consider it a pilot. I hope it is a successful pilot, though.

Other pending work

That was the big part. Other things in my head now are:

  • Clarifications at the OEP: I will update the OEP to reflect what I just wrote here.
  • Async django views: I agree with Dave, the proposal should at least have a rough plan for when this becomes available.

On the topic of scope, we also agreed that the goal here should be to implement the architecture of the framework and start the implementation of a few key hooks.
I’m currently thinking that the candidates would be actions and filters for before and after:

  • login
  • registration
  • enrollment
  • certificate generation

These key places should be sufficient to open the door. Once that is done, we can get started on the standard Python APIs project. This will allow developers not only to have a stable place to write their own extension logic, but also a stable way to access some platform objects, so that the extension logic can do more interesting things.

I hope I covered everything. Thanks to all of you for all the great ideas, counterpoints and all sorts of feedback. This is making the framework a lot better than we would ever have thought.


Thanks for this excellent summary of our conversation!

Regarding Consumer-based contract testing, also check out this page created from a spike by a former incarnation of the edX Test Engineering team:

cc: @Dawoud_Sheraz who is actively looking into Contract Testing, though he’s probably focused on contracts across services, rather than plugins.

It took me a while, but I just updated the OEP-50 PR.

This makes clear the usage of Django Signals and removes celery from the discussion by making it a recommended pattern for developers using the hooks. I also added some considerations for async views.

Now, the only remaining open topic for me would be the Ergonomics of Writing the Hooks.

I went deep into the documentation and got started with using contracts and the contract broker. However, the implementation is heavily geared towards contracts written for HTTP API calls. The pact library spins up a mock HTTP server and intervenes in the unit-testing process by responding to actual HTTP calls with the test data, or by calling HTTP endpoints on the provider if that is the part of the contract being tested.

We are not using HTTP APIs here; instead, we are going by the more traditional meaning of API. This left me loving the pattern (I will use it for another ongoing project), but at a loss as to how to apply it to this situation with the current library.

I thought of a hacky way forward: write some code that converts the function call into an HTTP API call and vice versa. But then we again hit problems with serializing complex objects. I don’t like it at all.

The second idea I came up with, is to adapt the idea of defining the signals in a new library to the filters.

We define the signals in a new and lightweight library, so that both the platform and the plugins can add it as a dependency pinning the version. This way the definition of the signal is the same for both parts.

We can do the same for filters. We define a function for each specific filter in the openedx-hooks library and we make sure that we call it in the platform. The plugin can then trust that public promise. If someone refactors the platform, they will have to update the openedx-hooks library, and this informs anyone counting on it for their plugin. Update 21.04: after more discussion, we will instead make it two libraries, openedx-events and openedx-filters, so that they can evolve at different speeds.

It makes sense to also put the tooling for the pipeline, filter configuration, and tracing mechanisms in there. Currently we have it as a PR in edx-django-utils.

It looks like this in the library:

from openedx_filters.pipeline import run_pipeline

def pre_enrollment_filter(user, course_key, mode, **kwargs):
    data = {
        "user": user,
        "course_key": course_key,
        "mode": mode,
    }
    out = run_pipeline("openedx.lms.auth.pre_enrollment.filter.v1", data)
    return out.get("data")

And at the platform:

# edx-platform/common/djangoapps/student/ module
from openedx_filters.filters.enrollment.v1 import pre_enrollment_filter
from openedx_filters.exceptions import HookException

# ... somewhere in the enroll method
    try:
        hook_result = pre_enrollment_filter(user, course_key, mode)
    except HookException as err:
        # do something
        ...
This would be instead of having the platform run a generic function like run_pipeline or run_filter directly.

I’m very happy with the way this is turning out. What do you all think?

Thanks a lot for all the great discussion so far.


Thank you all for the feedback. :relaxed:

Following Dave’s suggestion about OEP-41, we made two pull requests: one on Open edX filters naming and the other on Open edX events naming. In those PRs we describe a naming proposal inspired by the OEP.

It’ll be great if you can take a look and advise us on using the naming format @dave :raised_hands:
