Hi everyone,
apologies for the hodgepodge of a subject; I am trying to wrap my head around a few things as I write this.
We are currently in the process of planning our transition to Palm. All our production platforms are upgrades that were originally installed multiple releases back; we are not launching any new ones on Palm. We run with tutor k8s, with RUN_MYSQL: true, meaning our RDBMS instances run in a container in the same Kubernetes cluster and namespace as the rest of each Open edX platform.
For certain reasons we have always run with MariaDB 10.4, which is a MariaDB release that is binary-compatible with MySQL 5.7. With Palm, Tutor moves from MySQL 5.7 to MySQL 8, which means we are now considering migrating to Oracle MySQL. But it doesn't seem like that's the worst of our worries right now; instead, it's the transition from the utf8 character set to utf8mb4.
@dave discussed this in a post last year:
And then much more recently, the encoding transition appears to have caused some major issues for people who upgraded from Olive to Palm, as evidenced by this emergency fix that @regis put into Tutor, and which precipitated the Tutor 16.1.0 release:
I'm having a bit of difficulty reconstructing what happened around this issue in the interim (that is, between July 2022 and August 2023), since the release notes for Olive and Palm are both silent about MySQL encoding considerations. (I also checked the Nutmeg ones for good measure.)
What I can do is run the following query against one of our Olive databases:
SELECT COUNT(CONCAT(TABLE_SCHEMA, '.', TABLE_NAME, '.', COLUMN_NAME)) AS column_count,
CHARACTER_SET_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA='openedx'
GROUP BY CHARACTER_SET_NAME;
... and that returns:
+--------------+--------------------+
| column_count | CHARACTER_SET_NAME |
+--------------+--------------------+
|         2555 | NULL               |
|         1559 | utf8               |
|            7 | utf8mb4            |
+--------------+--------------------+
So for some reason, there are already some utf8mb4 columns mixed in among the many utf8 (aka utf8mb3) ones. (I wonder why, by the way. @jmbowman's comment on the aforementioned Tutor PR sounds to me like this is unexpected.)
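To see which columns those seven are, a follow-up query along these lines should do. This is a minimal sketch that assumes the same openedx schema name as in the query above; it only reads from INFORMATION_SCHEMA and changes nothing.

```sql
-- List every column in the openedx schema that already uses utf8mb4,
-- together with its collation, so the affected tables can be identified.
SELECT TABLE_NAME,
       COLUMN_NAME,
       CHARACTER_SET_NAME,
       COLLATION_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'openedx'
  AND CHARACTER_SET_NAME = 'utf8mb4'
ORDER BY TABLE_NAME, COLUMN_NAME;
```

The table names in the result might hint at which app or migration created those columns.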
So, what's the way forward in this situation? Ultimately I believe we want everything to be utf8mb4, but the discussion on Tutor PR 890 seems to indicate that this is far from trivial. What Tutor does now (as of 16.1.0) is set --character-set-server=utf8mb3 --collation-server=utf8mb3_general_ci when invoking mysqld, and I'm not sure what effect that has on data that lives in columns that are already set to utf8mb4.
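One thing that can at least be checked without changing anything is which character set and collation apply at each level (server, database, table). The following is a sketch, again assuming the schema is called openedx. As far as I understand the MySQL documentation, the server-level setting only acts as a default for databases created without an explicit character set, but I have not verified how that plays out in this upgrade.

```sql
-- Server-level defaults (what --character-set-server and
-- --collation-server control).
SHOW VARIABLES LIKE 'character_set_server';
SHOW VARIABLES LIKE 'collation_server';

-- Database-level default for the openedx schema.
SELECT SCHEMA_NAME,
       DEFAULT_CHARACTER_SET_NAME,
       DEFAULT_COLLATION_NAME
FROM INFORMATION_SCHEMA.SCHEMATA
WHERE SCHEMA_NAME = 'openedx';

-- Table-level default collations, tallied across all base tables.
SELECT TABLE_COLLATION,
       COUNT(*) AS table_count
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'openedx'
  AND TABLE_TYPE = 'BASE TABLE'
GROUP BY TABLE_COLLATION;
```

Comparing this output before and after the upgrade should show whether anything other than the server-level default actually changes.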
Perhaps someone can shed some light on this. I guess the extremely condensed version of my question is this:
If one upgrades
- from Open edX Olive running on MySQL 5.7 or MariaDB 10.4
- to Open edX Palm running on MySQL 8.1 with --character-set-server=utf8mb3 --collation-server=utf8mb3_general_ci,
then
- is there any manual in-place data conversion that needs to be done, and
- are any problems expected to arise from the fact that an existing database might already contain utf8mb4-charset columns?
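To be concrete about what "manual in-place data conversion" could mean here: something like the per-table statement below. This is purely an illustrative sketch, not something taken from Tutor or the Open edX documentation. The table name auth_user and the target collation are just examples, and I have not run this against a production database.

```sql
-- Hypothetical example: convert one table, including all of its
-- character columns, from utf8 (utf8mb3) to utf8mb4.
-- auth_user and utf8mb4_unicode_ci are placeholders for illustration.
ALTER TABLE auth_user
  CONVERT TO CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;
```

Since utf8mb4 needs up to 4 bytes per character instead of 3, indexed string columns may run into index key length limits during such a conversion, which is presumably part of why this is not considered trivial.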
Thanks in advance!
Cheers,
Florian