Last week I was trying to debug an issue with ProblemBlock when it was executing instructor Python code (customresponse). Normally this runs through safe_exec and codejail, which is supported in Tutor via a plugin. I had trouble getting this running properly on my Mac, though, and I ended up hacking something together on my local machine (using unsafe execution) to diagnose my original issue.
Coincidentally, I had also recently read this interesting article on Python WASI support. Python 3.11 has added support for building to wasm32-wasi (though only at tier 3). That means we can invoke it from a WebAssembly runtime like wasmtime.
So with a quick bit of late night hacking, I came up with this proof-of-concept that uses WebAssembly to execute instructor code instead of codejail:
(WARNING!!! THIS IS TERRIBLY HACKY PROOF-OF-CONCEPT CODE THAT YOU SHOULD NEVER, EVER USE IN PRODUCTION!!!)
What can it do?
It will allow customresponse code to access context passed in from the ProblemBlock, like the anonymous_student_id. Variables you set in your customresponse Python will also be returned to the problem, and can be used in the HTML presented to the student. Grading also works. All executing through a WebAssembly sandbox.
All code execution happens on the server side; there is no wasm code being sent to the browser. Just think of it as a WebAssembly version of codejail.
It doesn't support the code prolog though, and there are no libraries like scipy installed. Also, it doesn't do the trick we do with random2 to emulate Python 2.7 random module behavior, so the values you get in randomization aren't going to be the same.
How does it work?
It puts an entire wasm32-compiled Python 3.11 binary and lib in a new directory in edx-platform. We construct the script that's going to run in a special folder, write the globals the sandboxed script is supposed to see via a JSON file, execute the script through wasmtime, and then read out another JSON file for the values to pass back.
FWIW, coding it this way was just to make debugging easier, and any real solution would use dynamically generated tempdirs, or possibly some env related variable passing mechanism, or maybe even interface types, whenever those stabilize and are widely adopted. Any bundled Python images would also go somewhere else. Again, this was a quick hack just to prove the concept could work.
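To make that flow concrete, here's a rough sketch of the round trip using a temp dir. This is not the proof-of-concept code itself: the file names, the run_sandboxed and wasmtime_command helpers, the python.wasm path, and the exact wasmtime flags are all assumptions for illustration, and a real wasm32-wasi Python build would also need its standard library directory made visible to the guest. The runner parameter is only there so the wasmtime call can be swapped out when testing.

```python
import json
import subprocess
import tempfile
from pathlib import Path

# Wrapper that runs *inside* the sandbox: load the globals, exec the
# instructor code, then write back whatever JSON-serializable names remain.
WRAPPER = '''\
import json

with open({globals_path!r}) as f:
    env = json.load(f)
with open({code_path!r}) as f:
    source = f.read()
exec(source, env)

result = {{}}
for name, value in env.items():
    if name.startswith("__"):
        continue
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        continue  # skip functions, modules, and other non-JSON values
    result[name] = value
with open({out_path!r}, "w") as f:
    json.dump(result, f)
'''


def wasmtime_command(python_wasm, workdir, script):
    """Build a wasmtime CLI call (assumed layout; flags may vary by version)."""
    return ["wasmtime", "run", f"--dir={workdir}", str(python_wasm), str(script)]


def run_sandboxed(code, globals_dict, python_wasm="python.wasm", runner=None):
    """Execute `code` with `globals_dict` visible; return the resulting globals."""
    if runner is None:
        # Default: shell out to the wasmtime CLI with a coarse wall-clock limit.
        runner = lambda cmd: subprocess.run(cmd, check=True, timeout=30)
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        globals_path = work / "globals.json"
        code_path = work / "code.py"
        out_path = work / "out.json"
        script = work / "wrapper.py"

        globals_path.write_text(json.dumps(globals_dict))
        code_path.write_text(code)
        script.write_text(WRAPPER.format(
            globals_path=str(globals_path),
            code_path=str(code_path),
            out_path=str(out_path),
        ))

        runner(wasmtime_command(python_wasm, work, script))
        return json.loads(out_path.read_text())
```

With wasmtime and a suitable python.wasm available, calling run_sandboxed("grade = score >= 0.5", {"score": 0.8}) would hand back a dict containing both score and grade. Keeping the in/out contract as plain JSON files is what makes the host side this small, at the cost of only supporting JSON-friendly values.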
Why is this exciting?
I'm excited by the idea of WebAssembly for a number of reasons:
Ops/Deployment
- This should simplify deployment. Whether it's running in the LMS process or on a separate service, it's not going to require AppArmor, which should make it easier for non-Linux platforms to set up and run.
- We can potentially remove scientific libraries from edx-platform's dependencies.
- The wasmtime runtime has a notion of "fuel" that you can give to an invocation, which is a much more deterministic count of instructions executed than the time-based mechanisms we use for codejail. We'd still have to guard against malicious sleep() calls and the like, but doing most of our accounting with fuel credits means that we won't be susceptible to sporadic failures for really intensive problems that happen to be executed during server restarts or other times when the system is under unusual CPU load.
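The difference between counting work and counting time can be shown with a plain-Python analogy. The sketch below is not wasmtime's fuel API; it uses sys.settrace to charge one unit per executed line, and run_with_budget and OutOfFuel are hypothetical names. The point is only that the count comes out the same no matter how loaded the machine is, which a wall-clock timeout can't promise.

```python
import sys


class OutOfFuel(Exception):
    """Raised when the executed-line budget is used up."""


def run_with_budget(source, budget):
    """Run `source`, charging one unit per executed line; return units used."""
    used = 0

    def tracer(frame, event, arg):
        nonlocal used
        if event == "line":
            used += 1
            if used > budget:
                raise OutOfFuel(f"budget of {budget} line events exceeded")
        return tracer  # keep tracing this frame and any frames it calls

    code = compile(source, "<instructor code>", "exec")
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return used
```

Running the same snippet twice through run_with_budget yields the same count both times, and an infinite loop is cut off with OutOfFuel once the budget is spent. A sleep() call, on the other hand, costs a single unit here while burning arbitrary wall-clock time, which is why that case still needs its own guard.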
Instructional Possibilities
Open edX has always been a go-to platform when you really have a particular learning experience you want to deliver and are willing to roll up your sleeves and code it yourself. Embracing WebAssembly could supercharge the things we already do well in this department, and serve as a springboard for new innovation. Just a few examples:
- We can offer course teams multiple versions of a given set of scientific libraries for their course, along with a richer offering of libraries for their grading. We can eventually reach a point where we don't have to choose between breaking content and updating a library to a non-ancient version.
- If we're offering multiple versions/options (e.g. via backwards compatible, opt-in attributes on the <script> tag in ProblemBlock), we might be able to offer all kinds of other options as well. Rust, JavaScript. Or even variants of Python like MicroPython, which are much smaller in size and memory usage than CPython, and may be more suited for more dynamic grading that doesn't need scientific libraries.
- Could we trust this enough to execute student code?
- We could potentially open up many more grading and async grading possibilities, and the sharing of graders through libraries. The line between "new XBlock" and "JS frontend + backend grader content" gets a lot blurrier.
- We could contemplate running course team code in places we've never seriously entertained it before, like analyzing events pulled from OARS.
It's not that we couldn't do these things before, necessarily. But I expect it will become a lot easier to implement them in the next year or two, and that ease will enable a lot of innovation.
What are the challenges?
This space is rapidly developing, but there are still rough edges. Even if Python compiles to wasm now, many interesting libraries are not yet available.
For instance, one major hurdle would be getting the full scientific stack running, because wasm support is lacking in Fortran compilers. Pyodide gets around this by translating Fortran to C and then compiling that, but that only supports older versions of Fortran, and they can't update to even the relatively old versions of those libraries that edx-platform uses. That being said, LFortran looks like they're really focusing on SciPy support (see "LFortran can now parse all of SciPy to AST" on Fortran Discourse, and "LFortran Breakthrough: Now Building Legacy and Modern Minpack"), so I'm hopeful that will work itself out in the coming months.
What's next?
All these possibilities sound like fun, but I think the first step is to seriously evaluate the feasibility of creating a backwards-compatible codejail alternative using WebAssembly, and to make safe_exec optionally use it. That will mean prototyping, figuring out a rough plan for getting the versions of the libraries we currently support, analyzing the security story, measuring performance and memory usage, determining tradeoffs between various wasm runtimes, etc.
Once we have a decent start on that, I think we can take the lessons learned and start having more forward-looking conversations about what else we could do with this in the platform.
More to follow on this, but I'd love to get thoughts from folks. Also, if anyone's already been working on this, please let me know! I'd love to pick their brains.