ADR for WebAssembly-based code sandboxing

I’ve just opened a pull request for an architecture decision record to create an experimental Wasm-based safe code execution system that could one day grow into a replacement for codejail:

The full rationale for this is in the ADR, but the short highlights are:

  • Power
    Multiple runtimes in the long term means that we can use languages other than Python, or offer multiple builds of the same language with different libraries specialized for different fields.

  • Maintenance
    When Python 2.7 reached end of life, many course authors had to update their problems. But if the entire Python 2.7 distribution we used had been a single Wasm file, we could have just kept it around and not asked anyone to update their content…

  • Development
    Codejail requires Linux on the Docker host and can’t run on macOS, making it hard for many of our developers to work on it or debug issues related to it.

I wrote a thread on the topic of using WebAssembly before:

The big thing that’s changed between then and now is that the WebAssembly Component Model has landed. That means that instead of having to deal with memory buffers or hacking shared files like I did in my proof-of-concept, we can now easily have real, typed interfaces between Wasm components and the hosts they run on. Things like:

package example:component;

world example {
    export add: func(x: s32, y: s32) -> s32;
}

It also has support for more abstract types like strings, lists, and records.
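
To make the contrast concrete, here is a minimal sketch of the kind of hand-rolled buffer plumbing that a typed interface replaces. It is an illustration only, not the code from my proof-of-concept: the function names are hypothetical, and `guest_add` is a plain Python stand-in for a Wasm guest so the example runs on its own.

```python
import struct

# Without a typed interface, host and guest must agree on a byte layout by
# hand. Here: two little-endian signed 32-bit integers in, one out, which is
# what the `s32` parameters in the WIT example would look like as raw bytes.


def encode_add_args(x: int, y: int) -> bytes:
    """Pack the two arguments into an 8-byte buffer (hypothetical layout)."""
    return struct.pack("<ii", x, y)


def decode_add_result(buffer: bytes) -> int:
    """Unpack a single signed 32-bit result from a 4-byte buffer."""
    (result,) = struct.unpack("<i", buffer)
    return result


def guest_add(buffer: bytes) -> bytes:
    """Stand-in for the guest side; a real guest would be a Wasm module."""
    x, y = struct.unpack("<ii", buffer)
    return struct.pack("<i", x + y)


print(decode_add_result(guest_add(encode_add_args(2, 3))))  # prints 5
```

With the Component Model, the declaration of `add` carries this layout information itself, so neither side has to write or maintain packing code like the above.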

Anyhow, I think this is really exciting stuff, and I hope you do too! Please let me know what you think either in the ADR PR comments, or in this thread. :smile:

Can we run “sensitive” code – i.e. code that contains solutions to assessments – in this context without worrying about it leaking?

I presume that you can’t “view source” on a WASM. Can you decompile it?

Yes, absolutely. It would work just like codejail in that the code would never be sent to the browser; it would only execute on the server side.

You could decompile it if you had access to the Wasm file, but the browser never sees that file, and your own source code isn’t compiled to Wasm. We’d have a Wasm file for each runtime (e.g. “python 3.11 with scientific packages”, “javascript ES14”), and your code would be fed to that runtime for interpretation at the time of the request.

My proposal is that we extend the <script> tag in ProblemBlock to allow the author to specify a runtime. Based on that value, it finds the right .wasm file to run the code against. Specify no runtime and it goes to codejail just like it does today.
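
As a rough sketch of that dispatch, something like the following could work. Everything named here is an assumption for illustration: the `runtime` attribute name, the runtime identifiers, the file paths, and the `select_backend` helper are all hypothetical and not taken from the ADR.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Hypothetical registry mapping runtime names to Wasm files. The names and
# paths are placeholders, not values from the ADR.
WASM_RUNTIMES = {
    "python-3.11-scientific": "runtimes/python-3.11-scientific.wasm",
    "javascript-es14": "runtimes/javascript-es14.wasm",
}


def select_backend(script_xml: str) -> Tuple[str, Optional[str]]:
    """Decide where the code in a <script> element should run.

    Returns ("wasm", path_to_wasm_file) when a known runtime is specified,
    or ("codejail", None) when no runtime is given.
    """
    element = ET.fromstring(script_xml)
    runtime = element.get("runtime")  # hypothetical attribute name
    if runtime is None:
        # No runtime specified: keep today's behaviour and use codejail.
        return ("codejail", None)
    if runtime not in WASM_RUNTIMES:
        raise ValueError(f"unknown runtime: {runtime!r}")
    return ("wasm", WASM_RUNTIMES[runtime])


print(select_backend('<script type="loncapa/python">x = 1</script>'))
print(select_backend(
    '<script type="loncapa/python" runtime="python-3.11-scientific">x = 1</script>'
))
```

Raising on an unknown runtime, rather than silently falling back to codejail, is a design choice in this sketch so that a typo in the attribute does not run the code somewhere the author did not intend.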

I’m currently planning to hold this ADR open until the middle of next week and merge on the 22nd (Wednesday). If you are interested in this topic and want to review the ADR but don’t have time to do so in the next week, please just drop a comment to that effect on the PR and I’ll hold it open for longer.

This ADR has been merged, but I’ll still watch it for a while if folks have comments/questions. Thanks folks. :smile: