Build-Test-Release notes (Lisbon 2022)

Notes for the BTR portion of the meeting

Interests

  • Common tool for deploying in k8s +5
  • Automating deployments
  • Testing blocker for Nutmeg

Nutmeg

  • June 9th release date
  • Need to upgrade every official tutor plugin
    • mfe, notes, xqueue, indigo, minio, discovery, ecommerce (increasing difficulty order)
  • Why? Plugin v1 API?
    • v1 upgrade is not the reason we need to upgrade
    • every Open edX release has breaking changes, and plugins need to be upgraded accordingly
    • Topic on the tutor forum with instructions (LINK)
      • roughly, for every plugin:
        • bump the version, build the images, see if it works
      • Regis could use help!
      • Issues anywhere for these upgrades?
    • Folks with unofficial plugins also need to upgrade
      • Felipe: As maintainers of a tutor plugin, when/how should we upgrade?
      • if you test with the new release and it works, bump the tutor version that the plugin depends on!
  • Is tutor’s release already out?
    • There is a WIP pull request. If you are upgrading a plugin, work off of that branch
  • Testing for Nutmeg
    • Dean is coordinating this. He has a comprehensive testing plan and leaderboard (LINK)
    • Plugin testing in test plan?
  • Testing blocker for Nutmeg
    • Files and uploads interface - broken in Nutmeg branch
    • studio-frontend :frowning:
      • Context: it is React in Studio, but not an MFE
    • Felipe: EduNext has a fix for this on Lilac
      • We should land this in master and backport it to Nutmeg.
    • Why isn’t this broken on edx.org?
      • (Confirmation: it isn’t broken on edx.org)
      • Phil will reach out
    • First-time contributor found the solution
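
The plugin upgrade step discussed above ("bump the tutor version that the plugin depends on") can be sketched roughly as follows. This is a minimal illustration, assuming the plugin declares its Tutor dependency as a range of the form `tutor>=X.0.0,<Y.0.0`; the helper name `bump_tutor_requirement` and the version numbers are hypothetical and only serve the example.

```python
import re


def bump_tutor_requirement(requirement: str, new_major: int) -> str:
    """Rewrite a 'tutor>=X.0.0,<Y.0.0' range so it targets new_major.

    The requirement format is an assumption made for this sketch.
    """
    pattern = r"^tutor>=\d+\.\d+\.\d+,<\d+\.\d+\.\d+$"
    if not re.match(pattern, requirement):
        raise ValueError(f"unexpected requirement format: {requirement!r}")
    return f"tutor>={new_major}.0.0,<{new_major + 1}.0.0"


if __name__ == "__main__":
    # Hypothetical version numbers, used only for illustration.
    print(bump_tutor_requirement("tutor>=13.0.0,<14.0.0", 14))
```

After changing the range, the remaining steps from the notes still apply: build the images and check that the plugin works against the new release.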

Kubernetes

  • How are people doing it?
    • OpenCraft built Grove
      • Currently in testing
    • Lawrence’s cookiecutter
      • In use for ~6 large production platforms
      • What could be done other than Jinja templating to standardize things across instances/repos?
      • A lot of maintenance to keep terraform working
        • AWS is always changing… terraform is always changing
        • Lot of work to keep these things up to date
        • OC has terraform code that works for DigitalOcean. Not being bound to AWS would be very nice.
        • Some sort of infra-as-code that is not vendor-specific would be a powerful thing for the BTR to provide
        • Contribution of manifests for different environments
        • Tutor has a templating engine. Why have different tools with their own templating engines?
          • Tutor uses Jinja2 to render its config environment
        • Shared maintenance of Tutor plugins for different vendors.
        • Or: tutor-terraform plugin
        • As Terraform APIs change, it can take a lot of time to get Terraform to plan & apply again
        • A couple of different pieces: (1) underlying terraform to specify manifest files for different vendors, and (2) the terraform data sources on top of them.
      • Sometimes hours are spent doing something in Terraform that would take ~40 minutes to do by hand.
      • Upside: An hour to build an environment from scratch.
      • ^^ ends up being a wash when one person is doing it. Could be valuable if we are cooperating, but as an individual it’s not worth it
      • DigitalOcean is lighter weight and more stable, more amenable to terraform maintenance
      • If enough of us were using AWS, would AWS be willing to reach out and help keep an AWS deployment of Open edX stable?
        • This is happening / has happened - there is a reference deployment on AWS for Open edX that they’d like to have available.
      • OpenCraft has not found terraform for AWS to be a huge maintenance burden. They keep as much of their Terraform in common between providers as possible, which helps.
    • EduNext is building a k8s pipeline (Shipyard)
      • In production for one site, in use for multiple staging sites
      • How does it work?
        • Uses Tutor build and test image
          • Using Tutor to build image, but not using the overhangio images.
        • Terraform to create clusters
        • in-house templating engine to make manifest files to publish to GH, whatever
        • ArgoCD syncs between manifests and infra
      • Felipe thanks folks for sharing their different pipelines
      • Differences wrt other pipelines: Needing to add boilerplate to different repositories vs. central management
      • Grove
    • @fghaas 's tox pipeline
      • Using for production pipelines
  • We could keep doing our own different solutions, or we could collaborate. Or both! Common solution + everyone’s own spices and flavours
  • Does a shared solution end up in the openedx organization, or another home? Who maintains it, decides how it evolves?
  • Is there a common core that we can agree upon?
  • Building support into Tutor itself?
    • Regis is very excited about seeing all these projects on top of tutor
    • but does not want to maintain a central solution himself

Grove & Tutor

  • Grove is trying to move as much of its work on top of Tutor as possible.

  • Move as much of Grove’s work into Tutor plugins as possible.

  • Move as much instance-specific code into Tutor plugins as possible

  • OC & EduNext deploy multiple instances to the same cluster

    • OC uses repository per-cluster.
    • Grove has a UI for mapping instances to clusters.

Codejail

  • What is it?
    • Safe execution sandbox
    • Limits the libraries that Python can use
    • Uses AppArmor for resource limiting
    • Native Installation would do this in-process with edx-platform
      • This is not amenable to k8s
      • So, in Tutor, you need a micro-service. edx-platform calls out to the service, the service grades the code, and sends it back.
      • tutor-codejail plugin handles this for Tutor
        • AppArmor needs to be added to kernels of servers separately, though.
          • This can be done at the Docker level?
            • Kinda, but the profile is loaded from the host.
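
The call-out flow described above (the platform sends code to a separate service and gets a result back) can be sketched as follows. This is only an illustration of the request/response shape, not the tutor-codejail API: the JSON fields, the handler, and `run_in_sandbox` are hypothetical, and the sandbox step is a placeholder that deliberately executes nothing.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def run_in_sandbox(code: str) -> dict:
    """Placeholder for the sandboxed execution step.

    A real service would run the code under codejail/AppArmor; this stub
    executes nothing and only reports what it received.
    """
    return {"status": "not-executed", "lines": len(code.splitlines())}


class SandboxHandler(BaseHTTPRequestHandler):
    """Service side: accept submitted code and reply with a JSON result."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(run_in_sandbox(payload["code"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def submit(url: str, code: str) -> dict:
    """Platform side: POST the code and wait for the service's answer."""
    data = json.dumps({"code": code}).encode()
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    # Bypass any configured HTTP proxy, since the demo talks to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.load(response)


def demo() -> dict:
    """Start the stub service on a free local port and make one call."""
    server = HTTPServer(("127.0.0.1", 0), SandboxHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return submit(f"http://{host}:{port}/", "x = 1\ny = x + 1\n")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because the two sides only share an HTTP contract, the sandbox can live in its own container, which is what makes this arrangement workable on k8s. The AppArmor profile still has to be loaded on the host, as noted above.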

How can tutor help you share, publicize your work?

  • Regis’s current thinking:
    • List of plugins, with links out to maintainers of those plugins
  • There is a balance between individual providers’ differentiation vs. what we want to share as a community
  • Grove is free and open
  • Public and collaborative are not the same thing (source available vs. open source)
  • Some folks just learned about Grove, Shipyard
  • Lawrence is sharing his tool for collaboration, but would rather be contributing to a common project

Commonalities between these solutions

  • K8s
  • Terraform
    • Different vendors, but lots of terraform logic is common between the different vendors
  • What are the things everybody needs to do
    • first it was dockerfiles,
    • now k8s, and terraform
    • these are different layers
    • what layer do we want open vs not?
    • We have Tutor for generating templated Dockerfiles, et al
      • And K8s
      • Let’s focus on making those good using this engine we have, before we go too far into other layers
  • Build the map with all the holes, so folks can easily hook into those holes
  • Cooking
  • This might be its own space / group / meeting series
    • And/or just write things down
    • Or just opening PRs
    • Lawrence would love to put his code in someone else’s project!
    • Lists of plugins
    • Demos of plugins
      • At BTR
      • Posted videos
      • Public vs BTR?
        • public
      • Livestreams
        • Sync questions can be posted
        • Async questions can be on the forum or whatever
        • Less work than videos sometimes - no editing, OK if things go wrong

Here are the notes from the frontend working group meeting that followed right after the BTR meeting at the Lisbon conference today.

We are hijacking the BTR WG meeting to talk about frontend, which makes sense in the context of MFE deployments…

  • If we keep going with the current MFE implementation we’ll face serious problems.
    • significant build time & bundle size
  • Config 2.0: where do we stand?
    • Shitij has implemented something to make it easier to configure the discussion MFE. (is this it?)
  • Felipe: because MFEs are the shiny new thing, we keep adding more and more of them. But have we reached an agreement that MFEs work for us? By default eduNEXT does not turn all MFEs on.
    • We have concerns about MFEs:
      • Example: theming, header+footer customization
      • People are stuck on Lilac because of MFEs (ex: OpenCraft, Edulib)
      • Concern: as backend Python devs we have trouble switching to React.
      • Feature parity lacking: in authentication (Peter Pinch), ecommerce
    • Adolfo: the stated original goal is to have MFEs for every page in Open edX. The initial target was end of 2019.
      • We have 11 different standards.
      • tCRIL: we need to finish before we switch to a different standard. (if any)
      • Régis: I’m fine with MFEs. But the initial plan needs to be adjusted.
      • Adolfo: what do we do next?
        • Régis: CONFIGURATION!!! (summary: Configuration is currently part of the build step, which means that all MFEs need to be rebuilt for every config change). OpenCraft and eduNEXT are both working on this issue.
          • Felipe: their solution loads configuration at runtime. There is a small issue with a blocking HTTP call but this is an implementation detail.
          • Régis: I would love to have a quick-n-dirty solution such that people can pull the mfe Docker image.
        • Felipe: theming.
          • Adolfo: How do we do theming? 2U deploys themes via node brand packages. Maybe we can load themes at runtime.
            • Kyle: it seems difficult to load the theme at runtime.
            • Keith: Shitij has a solution to switch themes at runtime.
      • We need to rally around these two issues and solutions: 1) dynamic configuration to fix build time 2) runtime theme switching to reduce the number of images we have to handle.
      • Which MFEs should be included in Nutmeg? There are 3 candidates: authn, discussions, and ?
    • Keith: The “Getting started with MFEs” (in the devstack) instructions are really hard (here). Tutor does not work much better.
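
The runtime-configuration idea discussed above (serve settings to MFEs when they start, rather than baking them in at build time) can be sketched on the server side as follows. This is a minimal illustration, not the OpenCraft or eduNEXT implementation; the setting names, MFE names, and the `runtime_config` helper are hypothetical.

```python
import json

# Settings shared by every MFE (hypothetical names and values).
COMMON_CONFIG = {
    "SITE_NAME": "Example",
    "LMS_BASE_URL": "https://lms.example.com",
}

# Optional per-MFE overrides (hypothetical).
PER_MFE_OVERRIDES = {
    "discussions": {"SITE_NAME": "Example Discussions"},
}


def runtime_config(mfe_name: str) -> str:
    """Return the JSON document an MFE would fetch at startup.

    Changing a value here needs no rebuild of the MFE bundle, which is
    the point of moving configuration out of the build step.
    """
    merged = {**COMMON_CONFIG, **PER_MFE_OVERRIDES.get(mfe_name, {})}
    return json.dumps(merged, sort_keys=True)


if __name__ == "__main__":
    print(runtime_config("discussions"))
```

With this shape, one prebuilt image could serve differently configured sites. The cost is the extra HTTP call at startup that Felipe mentions.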

FYI: On the studio issue

It didn’t fail on edx.org because edX uses npm ci[1] to install the packages in the platform’s package.json, while Tutor uses npm install[2]. The difference is that the former uses the exact same versions without trying to install newer ones, while the latter might use a newer version (depending on how the version is defined). Hence the patch that fixes it in Maple explicitly pins the version. I have also found the original root cause, which is an incorrect Babel configuration[3].
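
To illustrate "depending on how the version is defined": a caret range such as `^7.1.0` in package.json accepts any later release below the next major version, whereas a pinned version accepts exactly one. The sketch below is a simplified model of that resolution, assuming plain `x.y.z` versions with no pre-release tags; the version numbers are hypothetical and it does not reproduce npm's full behaviour.

```python
def parse(version: str) -> tuple:
    """Turn 'x.y.z' into a tuple of ints so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def caret_upper_bound(base: tuple) -> tuple:
    """Exclusive upper bound of a caret range: bump the left-most non-zero part."""
    major, minor, patch = base
    if major > 0:
        return (major + 1, 0, 0)
    if minor > 0:
        return (0, minor + 1, 0)
    return (0, 0, patch + 1)


def resolve_caret(range_spec: str, available: list) -> str:
    """Pick the highest available version that satisfies a '^x.y.z' range."""
    base = parse(range_spec.lstrip("^"))
    upper = caret_upper_bound(base)
    candidates = [v for v in available if base <= parse(v) < upper]
    return max(candidates, key=parse)


if __name__ == "__main__":
    # Hypothetical published versions of some dependency.
    available = ["7.1.0", "7.1.5", "7.4.0", "8.0.0"]
    print(resolve_caret("^7.1.0", available))  # prints 7.4.0
    print("7.1.0")  # a pinned version always stays the same
```

This is why pinning the version makes the build reproducible: a later release can no longer be picked up silently.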

[1] tutor/Dockerfile at 1a4c904d7f2b1473a56b9972225f43fcdf67ec72 · overhangio/tutor · GitHub
[2] configuration/deploy.yml at f676c356a5424a52ebff01da7a8a7d96189f2579 · openedx/configuration · GitHub
[3] fix: empty build due babel incorrect configuration by ghassanmas · Pull Request #341 · openedx/studio-frontend · GitHub
