While testing koa.test01 on Ubuntu 20.04, I’ve encountered a few issues with database communication.
We’re running MySQL on a backend system and we never had database issues with MySQL 5.6 and previous Open edX releases.
It all starts with a few error messages of the type “2020-11-25T12:35:32.282538Z 2997 [Note] Aborted connection 2997 to db: ‘discovery’ user: ‘discov001’ host: ‘ip-10-0-0-71.ec2.internal’ (Got timeout reading communication packets)”. It could be any user connecting to the database.
Then at one point, I get “too many connections”, the LMS and CMS just freeze, and I need to restart MySQL to unfreeze them.
I was wondering if anyone testing Koa right now has encountered these error messages in their MySQL logs? I was able to reproduce this on two different installations now. I am therefore a little bit worried.
By the way, does Open edX provide specific parameters to use for MySQL? I don’t remember seeing anything special.
So far, my best guess is that MySQL 5.7 handles the persistent connections kept open by Django's CONN_MAX_AGE setting differently than MySQL 5.6 did.
Since I am running on a test environment, I decided to change CONN_MAX_AGE from 60 to 0 and ATOMIC_REQUESTS from false to true in /edx/etc/discovery.yml. Seems to work…
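For anyone who wants to try the same change, here is roughly what it corresponds to on the Django side. This is only a sketch: CONN_MAX_AGE and ATOMIC_REQUESTS are real Django database options, but the host, password, and port below are placeholders I made up, and the actual layout of the keys in /edx/etc/discovery.yml may differ.

```python
# Sketch of the Django DATABASES setting that the YAML change maps to.
# HOST, PASSWORD and PORT are placeholder values, not taken from a real install.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "discovery",
        "USER": "discov001",
        "PASSWORD": "change-me",           # placeholder
        "HOST": "mysql.example.internal",  # placeholder
        "PORT": "3306",                    # placeholder
        # 0 = close the connection at the end of every request
        # (no persistent connections left sleeping on the MySQL side).
        "CONN_MAX_AGE": 0,
        # Wrap every request in a transaction.
        "ATOMIC_REQUESTS": True,
    }
}
```

With CONN_MAX_AGE at 0, each request pays the cost of opening a new connection, so this is probably fine for a test environment but may need a second look before going to production.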
Using the mytop utility, I could definitely see fewer discovery threads sitting in a sleeping state, waiting to reach the default timeout of 28800 seconds before the connection is closed.
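If you don't have mytop handy, the same check can be approximated by counting sleeping connections per user from the output of `SHOW PROCESSLIST`. The helper below is a minimal sketch: it assumes the rows have already been fetched as dictionaries with the standard `User`, `Command` and `Time` columns (for example through a dict-style cursor), and the function name, the 60 second threshold, the sample rows and the `edxapp001` user are my own illustration, not output from my servers.

```python
from collections import Counter


def sleeping_by_user(rows, min_idle_seconds=60):
    """Count connections per user that have been in 'Sleep' for a while.

    rows: iterable of dicts using SHOW PROCESSLIST column names
          (User, Command, Time). Time is the idle time in seconds.
    """
    counts = Counter()
    for row in rows:
        if row["Command"] == "Sleep" and int(row["Time"]) >= min_idle_seconds:
            counts[row["User"]] += 1
    return dict(counts)


# Made-up sample rows, shaped like SHOW PROCESSLIST output.
sample_rows = [
    {"User": "discov001", "Command": "Sleep", "Time": 120},
    {"User": "discov001", "Command": "Sleep", "Time": 3000},
    {"User": "discov001", "Command": "Sleep", "Time": 5},
    {"User": "edxapp001", "Command": "Query", "Time": 0},
    {"User": "edxapp001", "Command": "Sleep", "Time": 90},
]

print(sleeping_by_user(sample_rows))
```

Running this before and after the CONN_MAX_AGE change should show whether the number of long-idle connections for a given user actually drops.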
I did check the previous behaviour under Juniper, and it seems I had the same issue with sleeping threads in our production environment running Juniper, Ubuntu 16.04 and MySQL 5.6. I guess MySQL 5.7 under Koa and Ubuntu 20.04 behaves a little bit differently and does not like it when Django tries to reuse a “supposedly” dead connection.
From what I read today, this could be what is causing the “Aborted connection to db” and “Got timeout reading communication packets” messages. There could also be a connection that is not being closed in discovery, which didn’t trigger this behaviour under MySQL 5.6. I’ve read a few reports of people saying their application was working fine with MySQL 5.6 but had issues with aborted connections under MySQL 5.7.
It would be really nice if people testing Koa right now could check their MySQL logs to see whether I am the only one experiencing this issue. Thanks in advance.