Meeting 2023-01-10
Thanks to everyone for joining! I won’t be cc’ing everyone since we had quite a large cohort today. Video recording to follow.
Cliff notes:
- 2U representatives discussed points of collaboration:
- Very interested in how the community deploys codejail
- Autoscaling
- Possible open sourcing of their helm charts
- A new #wg-devops Slack channel has been created.
- Moises submitted the first PR to tutor-contrib-multi.
- A new initiative for Devops collaboration was announced.
Transcribed meeting notes
Xavier
- Introductions
- Catch up from Xavier on what we’ve done so far for the newcomers.
- Are there any areas that they’re interested in collaborating on, etc?
- Requested further recap from Felipe
Felipe
- We decided to collaborate based on Braden’s thread
- This evolved into this working group which is working on tutor-contrib-multi
- Obviously with the different issues that it entails
Xavier
- The configuration repo has been deprecated, and we were all working on the same thing. Is it possible to get something nice/maintainable?
- Last month we started reviewing the current approach, with specific steps to verify that it works and to validate that it is a good base to build on.
Adam
- Joined edX 4 years ago and researched deploying to EKS clusters.
- Took the notes app and containerised it. It’s been the only service running in k8s for 2.5 years.
- Was deployed using Kustomize. They needed better customization especially with respect to liveness probes.
- It’s still closed source, but it’s running about 7 new Django API endpoints. Some are public; the rest are not, due to license concerns, etc.
- Considered using an umbrella chart like tutor-contrib-multi, but migration is difficult, especially codejail.
- They’re still considering how they’ll be able to deploy all services using k8s instead of the old configuration repo.
- Asked who is running codejail behind Flask.
Felipe
- eduNext used 3 approaches to deploy codejail.
- Docker in docker didn’t work.
- Created tutor plugin
- Changed edx-platform enough to choose whether it runs within the same process
- Now they’re using a Flask service.
- GitHub repos: eduNEXT/tutor-contrib-codejail (Tutor plugin to configure/run Codejail using a REST API service) and eduNEXT/codejailservice
Adam
- 2U will consider eduNEXT’s approach to codejail
- 2U has mostly figured out autoscaling and can contribute to the effort.
- Currently using Nginx, but interested in what the community is using, as that could turn out to be useful.
- Best practices for Kubernetes come up within the org:
- How to do liveness probes
- Processes that have permission to write to disk
- Etc.
- There’s a Fairwinds article: Kubernetes Configuration Benchmark Report
Xavier
- Are there things that we’ve worked on that 2U is interested in?
Adam
- Codejail behind a Django API (not a Flask one) would be great, as they appreciate the consistency of it.
- They’ve got an internal testing environment and will test it from there.
Xavier
- Meeting is too short, what about next steps?
- A good step after this meeting might be for Adam to comment on any of the open tickets, like the codejail one.
- Then we could discuss there and have async discussion.
Braden
- Collaboration helps us all even if not directly using the helm chart, because everyone benefits from the small changes.
Felipe
- Working group for Devops
- How do we move codejail to a common/shared roadmap?
- We could tackle multiple projects at a time instead of limiting ourselves to a single goal.
Regis
- Created a new initiative for the Devops working group.
- There are already projects that are devops related (three)
- It doesn’t make much sense to have a working group for all of these projects.
- Instead Ed proposed a Devops working group, where all the Devops related projects will live.
- Each project can have its own Slack channel, leadership, governance, rules.
- Otherwise it can be handled within the Devops working group.
- GitHub project: openedx/wg-devops (issue repository for the DevOps Working Group)
Adam
- How does this differ from BTR?
Regis
- BTR is more concerned with creating code releases and as such is distinct from Devops.
Xavier
- What’s the approach to communicating with the Devops WG?
Regis
- Avoid synchronous meetings if possible.
- Most of how the group is going to work is described in https://openedx.atlassian.net/wiki/spaces/COMM/pages/46793351/Open+edX+Working+Groups
Adam
- How do we decide whether something fits within Devops or somewhere else?
- How would we spin up/down working groups depending on the project?
Xavier
- We try to take care of the issues for deploying larger instances.
- Helm was a good starting point, but there could be changes in future especially with differences between small/large providers.
- Is 2U interested in continuing the discussion?
Adam
- Not sure.
Xavier
- Adam mentioned some of the current issues that 2U is also facing. Could be helpful to comment on the tickets.
- Or on the forum?
Regis
- Trying to push forward refined/groomed issues. Good first issues to engage folks to start working on them.
- Can this project define important work to attract newcomers?
Adam
- 2U is trying to figure out the best way to deploy Kubernetes in terms of stability, etc.
- There’s nothing on scaling yet. Expects that 2U can start there.
- 2U can contribute, but will very likely not use the project for a while.
- Internal helm chart is fully featured. Enterprise ready at this stage.
Daniel
- Discussed with legal on open sourcing the helm chart.
- Following up with them again this week.
- Expects no opposition, legal just doing their due diligence in terms of license/contributor guidelines.
- Should hopefully be done by the end of this month.
- Best practices are really important.
- Scaling/Liveness Probes/Metrics used for Scaling/Observability/Monitoring
- All of the above are good areas for collaboration
Jeremy
- Trying to think about how to smooth the learning curve for deploying in production.
- We’re deploying to k8s, but not developing with k8s.
- Hoping to find a way to have more consistency across environments.
Adam
- Autoscaling
Braden
- Lawrence would be the best person to speak to at the moment.
Lawrence’s mic wasn’t working.
Xavier
- Go over issues. Discussed the Nginx + Cert manager task with Moises.
- HPA with Jhony.
My sound dropped out here (low headphone battery), so I didn’t get the full conversation
Jhony
- Talked a bit about Karpenter and approaches to autoscaling.
Xavier
- Any objections to the next meeting at the same time in 2 weeks?
- Quickly went over the issues in the Git repo.
- Further discussion to happen async.