Contentstore views refactoring

Hi, I was having a thought exchange with Kyle on architecture of cms/contentstore api views. There are some open questions there, and I’d love to know what you think about them.

Context:
I did a refactoring a little while ago, starting first with the xblock_handler. The reason was that we are creating an experimental authoring API, and wanted to add a public REST API that just wraps the internal studio API. Since there was just an enormous xblock_handler method, we wanted to reuse that logic in the rest_api folder.
Complicating things was that rest apis in contentstore inconsistently can be found in /views, /api/views, rest_api/views. We decided the new API should live in rest_api/views, though the original xblock_handler is defined in /views.
To refactor it, I basically introduced a “Service Layer” the way DDD has it, where the API view does not invoke calls to the modulestore API directly but instead views are just tiny methods that call business logic. As you can see here, the xblock_handler view does nothing but return a call to cms.djangoapps.contentstore.xblock_storage_handlers.view_handlers.handle_xblock .
Generally I’m quite happy with the state of now having slim view methods that just call the business logic from somewhere else, and can now reuse this logic easily.
What I’m not so happy with so far is the naming of various ...storage_handler.py files and that they are just lying around in the contentstore/ folder.

Ensuing Questions:
How should file naming ideally be and where should these service methods with business logic lie?

  • We already determined that there’s a naming conflict with Xblock Services so we should try to not use the term service in this.
  • Since I in fact introduced a service layer following DDD, I was thinking of just sticking these new files into a contentstore/service_layer folder, although again we have a problem with the naming conflict.
  • I may want to propose an ADR that’s only cms/contentstore specific that says due to us having so many different APIs and API folders, we want to follow this pattern of keeping view methods tiny and letting service functions handle the business logic.
  • We discussed whether api service methods should actually go into the api/ folder which is for python API stuff according to OEP-49: Django App Patterns — Open edX Proposals 1.0 documentation. I don’t think so however, since we’re talking about private/internal business logic that is specific to certain views and not a public interface.
  • We discussed whether all this business logic should go into utils/. The caveat there is that we are not talking about some reusable helper functions but about the complete business code for views that in turn calls modulestore. Maybe having a huge amount of business logic in there would be a code smell.
  • Kyle suggested that contentstore is just too big and that “If I had time to do a big refactoring, I think it’d split it into 2-3 apps, and have an api for each app that defined the domain-level operations. That would allow utils to be smaller.”
  • I’m wondering if that would not make things even more complicated - depending how the refactoring looks. Basically the way e.g. xblock_handler works is that the xblock_handler API endpoint is called, and then it does business logic that decides how data is provided and created using modulestore. The concrete database handling and so on is already all already abstracted into contentstore. Is another djangoapp in front of contentstore the way to go? (it’s just a question for me, in the end this may be a very good idea.)
  • From my perspective the issue is more that you didn’t really have an application architecture for contentstore so far: when an endpoint is called, everything in this djangoapp is just done in one huge method in the view file, with a few private methods in the same view file and some helper function in utils. My attempt was to not to just have large methods that do the complete logic for an endpoint in the djangoapp, but to split this into a part that provides the external API interface, and a part that does the business logic and calls modulestore and puts together the result objects and so on.

So far, I have no idea what the best way is to handle the architecture here. Happy for input!

This is a great idea.

So it seems like there are interfaces at a couple different layers here. The interface presented outside the contentstore to other python modules, and the interface we want to present for rest access patterns. Both seem to be accessing the same core business logic is that right?

Perhaps a core module would be useful that is then access by api and rest_api?

Here is the PR for the ADR I mentioned. Feel free to comment on it / do additional review as you see fit, any feedback will be taken into account.

1 Like

That’s a very good idea. I like the naming core, it seems very in line with our naming conventions so far.
Would you say core should be a folder in the contentstore djangoapp? That’s probably how I would do it.

Yes, there are at least 3, potentially 4 APIs/interfaces (e.g. AJAX calls, private REST API, public API), so it’s getting too complex quickly.
There’s further subdivision to v0/v1 and so on.

So my view is it’s completely fine to have multiple APIs as long as you just define the endpoints but have the business logic as a separate “core” as you call it that is common to whatever interface you want to build.

I also get Kyle’s point about contentstore becoming too big and wanting to split it into multiple djangoapps, but I doubt it warrants the effort at this point.

Yea, I agree with this. I think a new core folder under contentstore.

Yea, that makes sense but it may become more obvious where the seams are once all the core logic is separated from the various interfaces that make use of it. So it doesn’t feel like this change would prevent us from breaking things down further in the future.

2 Likes

All the above sounds good to me.

One recommendation I’d add is the use of a data.py module for immutable domain-layer attrs classes (dataclasses are good too, they just weren’t available when that OEP was written) which can be passed around in place of models or entire xblocks. Example. If there are data classes that you’d rather not expose in the public API, maybe you could have two data modules:

  • cms/djangoapps/contentstore/data.py – domain objects exposed by the public python API
  • cms/djangoapps/contentstore/core/data.py – domain objects for internal business logic

Another recommendation is to be wary of deep nesting and long names. There’s a non-trivial cognitive load that is added when we have modules paths like openedx/core/djangoapps/content/foo/bar/bar_providers.py instead of, e.g., common/core/foo/bar.py. I know you’re working within the existing framework of edx-platform’s folder structure, so there’s only so much you can do here :slight_smile:

Lastly, once the refactoring is done, if we like how the end result looks and think it’d generalize well to other apps, I suggest that we update OEP-49 with the structure.

Well, that sounds very reasonable. I looked at OEP-49 once more and what you say about data.py modules seems to follow this. It sounds like these domain objects function a bit like DTOs, right?

Now to clarify: currently our contentstore djangoapp has different APIs, but without digging through every single file, I currently don’t see any specific Python API. Most things are directed at various frontends or API clients. So I am not currently aware of anything we have to refactor exactly in this way, but if we do find something, I agree.

As for internal domain objects - in ..../contentstore/core/data.py - that sounds like a good idea. So the way it would work is I would define such a data class in data.py and then for example if I receive a GET request the view would call some factory or provider function from the /core folder which will fetch data, construct a data domain object, and then return this object to the view, is that correct?

I definitely want to include the part about using a “/core” folder in the existing ADR. A subsequent refactor would be very simply to move the few files that already relate to a service layer - e.g. xblock_storage_handler - into this folder.

Establishing domain objects for internal business logic via a data.py file doesn’t have an easy refactor right now. Should we put this idea in the same ADR as well or define this separately?