Hello
We are facing a strange issue in uamx:
- from time to time, randomly, the LMS container stops accepting requests and our platform returns a “Bad Gateway 502” error, preventing users from accessing the platform.
- checking `tutor local status`, all containers are up and running, but inside the LMS container the uwsgi workers do not respond to requests.
- we need to kill the uwsgi processes inside the LMS container (`tutor local exec lms bash`, then `kill` the uwsgi processes). The uwsgi processes are then restarted automatically by a daemon.
- this can also be achieved with `tutor local restart lms`. `tutor local exec lms reload-uwsgi` does nothing, as uwsgi does not respond to requests.
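As a stopgap while the root cause is unknown, the manual recovery described above could be automated. The following is only a sketch under assumptions, not something we run: the probe URL (`LMS_URL`), the timeout (`PROBE_TIMEOUT`) and the function names `lms_responds` and `restart_lms_if_stuck` are hypothetical; only `tutor local restart lms` comes from the steps above.

```shell
#!/bin/sh
# Sketch of a watchdog for the manual workaround described above.
# Assumptions: LMS_URL and PROBE_TIMEOUT are placeholder values,
# and both function names are made up for this example.
LMS_URL="${LMS_URL:-https://uamx.uam.es/}"
PROBE_TIMEOUT="${PROBE_TIMEOUT:-30}"

# Succeeds only if the LMS answers within the timeout and does not
# return an HTTP error status (curl --fail treats >= 400 as failure).
lms_responds() {
    curl --fail --silent --output /dev/null --max-time "$PROBE_TIMEOUT" "$LMS_URL"
}

# Restart the LMS container only when the probe fails.
restart_lms_if_stuck() {
    if lms_responds; then
        echo "lms ok"
    else
        echo "lms unresponsive, restarting"
        tutor local restart lms
    fi
}
```

Calling `restart_lms_if_stuck` periodically on the host would only hide the symptom; it does not explain why the workers hang.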
We are running Tutor behind a proxy, following these instructions. The Caddy logs look like the following:
tutor_local-caddy-1 | {"level":"error","ts":1693380559.2195,"logger":"http.log.access.log0","msg":"handled request","request":{"remote_ip":"150.244.22.164","remote_port":"47169","proto":"HTTP/1.1","method":"GET","host":"uamx.uam.es","uri":"/"},"user_id":"","duration":59.821887904,"size":0,"status":502}
tutor_local-caddy-1 | {"level":"error","ts":1693380559.2194364,"logger":"http.log.error.log0","msg":"EOF","request":{"remote_ip":"150.244.22.164","remote_port":"47169","proto":"HTTP/1.1","method":"GET","host":"uamx.uam.es","uri":"/","headers":{"Accept-Language":["en-US,en;q=0.5"],"Dnt":["1"],"Sec-Fetch-User":["?1"],"User-Agent":["Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"],"Accept-Encoding":["gzip, deflate, br"],"Connection":["keep-alive"],"Upgrade-Insecure-Requests":["1"],"Accept":["text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8"],"Sec-Fetch-Mode":["navigate"],"Sec-Fetch-Site":["none"],"Cookie":[],"Sec-Fetch-Dest":["document"]}},"duration":59.821887904,"status":502,"err_id":"1w94a8mr2","err_trace":"reverseproxy.statusError (reverseproxy.go:1299)"}
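To see when the 502 responses occur and how long each request waited, the access-log entries above can be reduced to a few fields. This is a rough sketch using plain text matching rather than a JSON parser; it assumes one log entry per line and the key order shown above (`ts`, then `uri`, then `duration`), and the function name `caddy_502s` is made up for this example.

```shell
# Print "timestamp duration uri" for every Caddy access-log entry with status 502.
# Reads log lines on stdin. Relies on the key order seen in the entries above.
caddy_502s() {
    grep -F '"logger":"http.log.access' |
        grep -F '"status":502' |
        sed -E 's/.*"ts":([0-9.]+).*"uri":"([^"]*)".*"duration":([0-9.]+).*/\1 \3 \2/'
}
```

Assuming the entries are read with `tutor local logs caddy`, the output of that command could be piped into `caddy_502s`. For the access-log entry above it prints `1693380559.2195 59.821887904 /`, that is, the request waited almost 60 seconds before Caddy returned the 502.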
We are running 16 uwsgi workers for the LMS and 2 uwsgi workers for the CMS. According to this discuss topic they seem to be properly configured (though we need to grep “wsgi” instead of “processes”):
While monitoring our machine’s memory and CPU usage, neither was saturated when the issue started:
So we are now facing these questions:
- is our system setting the workers properly?
- can it be a problem with docker/tutor installation?
- how can we prevent this from happening?
Thank you very much in advance