I’m using tutor v17 and tutor-contrib-pod-autoscaling.
I kept getting HTTP 500 errors.
LMS pages loaded only occasionally; about 9 times out of 10 they failed.
I tried to check everything.
MongoDB and MySQL were running fine.
Worker node storage had a lot of free space.
Worker node usage was low (<50% CPU and memory).
All pods were running except one or two in CrashLoopBackOff status, but more than one pod was still running.
I checked tutor k8s logs and the only error I saw was: “no python application found, check your startup logs for errors”.
The number of users was low at that time, around 40 users/min.
Honestly, I have no idea what happened or what to do next.
I reverted to tutor v16.1.7 and it is more or less running fine now.
I'm still not sure what the root cause was, but killing the worker nodes and replacing them with new ones did fix everything.
The old worker nodes were in Ready state and healthy though.
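
In case it helps anyone debugging something similar: since replacing the nodes fixed it even though they reported Ready, it may be worth checking whether the crash-looping pods all sit on the same nodes. Below is a minimal Python sketch of that check, not something I ran during the incident. It assumes kubectl is configured for the cluster and that Tutor deployed into the openedx namespace (adjust if yours differs); the function names are made up for illustration.

```python
import json
import subprocess
from collections import defaultdict


def crashlooping_by_node(pod_list):
    """Group containers stuck in CrashLoopBackOff by the node they run on.

    pod_list is the parsed JSON from `kubectl get pods -o json`.
    Returns {node_name: [(pod_name, container_name, restart_count), ...]}.
    """
    by_node = defaultdict(list)
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        pod_name = pod.get("metadata", {}).get("name", "<unknown>")
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for cs in statuses:
            # A crash-looping container sits in the "waiting" state
            # with reason "CrashLoopBackOff".
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                by_node[node].append(
                    (pod_name, cs.get("name"), cs.get("restartCount", 0))
                )
    return dict(by_node)


def fetch_pods(namespace="openedx"):
    """Fetch the pod list via kubectl. The namespace default is an assumption."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling crashlooping_by_node(fetch_pods()) returns a dict keyed by node name. If every entry lands on the same one or two nodes, that points at the node rather than at the Open edX release, and running kubectl logs --previous on one of the listed pods should show the startup error behind the “no python application found” message.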