Thanks for this detailed TEP @kmccormick! I’ll outline my thoughts below.
It’s really great that you found a new method to make Open edX development faster and more convenient. Persisting the python virtual environment and other folders makes it much easier to install custom requirements and hack on external dependencies.
That being said, I’m worried by a couple of issues raised by your TEP.
dev/local/k8s disparity
In this TEP, named volumes are introduced in dev mode only. I understand this rationale: people should be careful with what they install in their production platforms. Having stateful volumes makes it harder to troubleshoot things in production. Also, this concept does not apply to k8s deployments.
But I expect that users will customize their environments in dev, and they will want to apply the same changes to production. In many companies, it’s the same people who hack on Open edX (dev) and have to handle deployments (devops). They will have trouble understanding why they “can’t just run tutor local run lms pip install their-custom-xblock” and see their xblocks appear in the studio.
We can certainly make it very explicit in the docs that this is a dev-only feature. Still, I expect that it will create some frustration.
Stateful containers
The lack of state in containers is one of the things that make Tutor more intuitive and less risky. Consider the following points:
- To uninstall an Open edX platform, it is sufficient to delete the ~/.local/share/tutor folder.
- Same for backups: migrating from one server to the next requires just one rsync command.
(These items are less important in dev mode than in local. This means that we can’t really resolve the disparity issue mentioned in the previous section.)
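As a rough illustration of how little is involved in the two operations listed above, here is a minimal Python sketch that only builds the corresponding commands without running them. The destination `user@newserver:tutor-backup/` and the helper names are hypothetical, and the root path assumes the default Linux location mentioned above.

```python
from pathlib import Path

# Default Tutor project root on Linux, as named in the list above.
DEFAULT_ROOT = Path.home() / ".local" / "share" / "tutor"


def backup_command(destination: str, root: Path = DEFAULT_ROOT) -> list[str]:
    """Build the single rsync call that copies the whole project root.

    `destination` is a hypothetical rsync target. The trailing slash on the
    source tells rsync to copy the directory contents rather than the
    directory itself.
    """
    return ["rsync", "--archive", f"{root}/", destination]


def uninstall_command(root: Path = DEFAULT_ROOT) -> list[str]:
    """Build the command that deletes the project root."""
    return ["rm", "-rf", str(root)]


# Dry run: print the commands instead of executing them.
print(" ".join(backup_command("user@newserver:tutor-backup/")))
print(" ".join(uninstall_command()))
```

Named volumes, by contrast, are stored by default under Docker's own data directory rather than in this folder, so neither command would cover them.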
The fact that the containers are stateful means that we need an easy way to clear their state (i.e., the named volumes). Which brings us to the following item:
Volume freshness
Of course it would be great if we could automatically detect when volumes should be cleared. But it’s difficult to clear volumes as infrequently as possible and in a way that is not-too-surprising for the end-user.
It would be great if we could clear volumes on image building. Unfortunately, the IMAGE_BUILT action hook would not be relevant, because we are unable to detect when an image build is triggered in some circumstances (for instance, when we run docker compose run --build ...).
Thus, I think that we would mostly have to rely on manual volume deletion. This means that users will have to remember to run this command once in a while.
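To make the manual step concrete, here is a hedged sketch of what such a clearing command might do. It assumes Docker Compose's default naming scheme, where a named volume is called `<project>_<volume>`; the project name `tutor_dev` and the volume names used below are placeholders, not names defined by the TEP.

```python
import subprocess


def select_project_volumes(volume_names: list[str], project: str) -> list[str]:
    """Keep only the volumes whose name carries the Compose project prefix."""
    prefix = f"{project}_"
    return [name for name in volume_names if name.startswith(prefix)]


def removal_command(volumes: list[str]) -> list[str]:
    """Build the `docker volume rm` call for the selected volumes."""
    return ["docker", "volume", "rm", *volumes]


def clear_project_volumes(project: str) -> None:
    """List all volumes, then remove those of one project (requires Docker)."""
    listing = subprocess.run(
        ["docker", "volume", "ls", "--quiet"],
        check=True, capture_output=True, text=True,
    )
    volumes = select_project_volumes(listing.stdout.split(), project)
    if volumes:
        # Docker refuses to remove a volume that is still used by a
        # container, so the project's containers must be removed first.
        subprocess.run(removal_command(volumes), check=True)
```

Calling `clear_project_volumes("tutor_dev")` would then wipe every named volume of that hypothetical project in one go, which is the step users would have to remember to run.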
Opinion: are we applying a band aid on a wooden leg?
I don’t know how to properly resolve the issues listed above. Which leads me to question whether we are doing things the “right” way.
The following is a personal opinion, and an open question: isn’t the fact that it’s so difficult to troubleshoot Open edX by bind-mounting directories from the host an indication that there is a deeper upstream issue (i.e., in edx-platform)? Aren’t we attempting to get around design issues in edx-platform that should not exist in the first place?
I don’t mean to deflect responsibility to you, Kyle. Of course I know that you’ve already worked a lot on making edx-platform more compatible with Tutor, for instance by moving dependencies out of the common/lib trunk. I want to ask if more can be done in edx-platform to bring the repo more in line with modern development practices. For instance:
- Can we move node_modules to a different folder, such that it’s not overwritten when we bind-mount edx-platform? That way we would not have to re-run npm install.
- Is there anything else we can do to avoid collecting static assets in dev?
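On the first question, one reason relocating node_modules could work is that Node.js resolves packages by looking for a node_modules folder in the requiring file's directory and then in each parent directory in turn. The sketch below mimics that lookup order in Python. The container path /openedx/edx-platform is an assumption used for illustration, and the sketch says nothing about build tooling that hard-codes the node_modules path.

```python
from pathlib import PurePosixPath


def node_modules_candidates(start: str) -> list[str]:
    """Return the node_modules folders Node.js would try, nearest first.

    Mirrors the documented lookup: walk from `start` up to the filesystem
    root, skipping directories that are themselves named node_modules.
    """
    path = PurePosixPath(start)
    candidates = []
    for directory in [path, *path.parents]:
        if directory.name == "node_modules":
            continue
        candidates.append(str(directory / "node_modules"))
    return candidates


for candidate in node_modules_candidates("/openedx/edx-platform"):
    print(candidate)
```

Under that assumption, a node_modules folder placed one level above the bind-mounted repository would still be found by the lookup, while remaining outside the mount.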