Upgrading MySQL charset to utf8mb4

Just tested on the latest version of MySQL. Everything seems to be working fine… at least for the moment.

I noticed only one strange thing during the migration: the title of one of the courses, “Apprendre à apprendre”, became “Apprendre à apprendre”. Only the course title is affected; everything else displays correctly.
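A common cause of this kind of garbled accent is UTF-8 bytes being read back as Latin-1 somewhere along the way. The sketch below only illustrates that mechanism; it is an assumption, not a confirmed diagnosis of what happened to this course title.

```python
# Hypothetical reproduction: UTF-8 bytes misread as Latin-1.
title = "Apprendre à apprendre"

# "à" is two bytes in UTF-8 (0xC3 0xA0); decoded as Latin-1,
# each byte becomes its own character ("Ã" and a no-break space).
garbled = title.encode("utf-8").decode("latin-1")
print(repr(garbled))

# Reversing the mistake restores the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the stored title round-trips like this, the data was most likely double-encoded rather than lost.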

Course outline screenshot

For the migration, one of @regis's plugins was forked and modified. The modified plugin is available here: Tutor plugin to install the latest version of MySQL and adding support of utf8mb4 · GitHub

We can see that Tutor uses the latest version of MySQL with the collation utf8mb4_0900_ai_ci:

Terminal screenshots

UPDATE:

Seems to work well on a fresh install

Course outline from fresh install

How the fresh installation was made:

  • Add charset: "utf8mb4" to tutor/templates/apps/openedx/config/partials/auth.yml (so that DATABASES['default']['OPTIONS']['charset'] = 'utf8mb4')
tutor plugins install https://gist.githubusercontent.com/Abdess/3ed9ed1d42821d00a5cf2481870df26f/raw/tutor-mysql8utf8mb4.yml
tutor plugins enable mysql8utf8mb4
tutor config save
tutor local quickstart
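For reference, the auth.yml change in the first step is meant to end up as a Django database option. A minimal sketch of the resulting setting, assuming the standard Django MySQL backend; only the charset entry comes from the steps above, and every other value is a placeholder.

```python
# Sketch of the Django setting the auth.yml change is meant to produce.
# Only OPTIONS["charset"] comes from the post; other values are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "openedx",  # database name used in the commands of this post
        "OPTIONS": {
            # Ask the MySQL client connection to use utf8mb4.
            "charset": "utf8mb4",
        },
    }
}

print(DATABASES["default"]["OPTIONS"]["charset"])
```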

I extracted the database schema from a fresh install with the charset: "utf8mb4" parameter to see the difference with and without the plugin:
openedx_schema5.sql (498.6 KB)
openedx_schema8.sql (515.6 KB)

Example of a MySQL5 (left) and MySQL8 (right) table
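One way to compare the two dumps is to count which character sets each one declares. This is a minimal sketch, assuming the dumps use the usual mysqldump `DEFAULT CHARSET=` and `CHARACTER SET` clauses; the sample statement is invented for illustration.

```python
import re
from collections import Counter

# Matches table-level "DEFAULT CHARSET=x" and column-level "CHARACTER SET x".
CHARSET_RE = re.compile(r"(?:CHARSET=|CHARACTER SET )(\w+)", re.IGNORECASE)

def count_charsets(schema_sql: str) -> Counter:
    """Count how often each character set is declared in a schema dump."""
    return Counter(name.lower() for name in CHARSET_RE.findall(schema_sql))

# Invented sample; for real use, read openedx_schema5.sql / openedx_schema8.sql.
sample = (
    "CREATE TABLE `demo` (\n"
    "  `title` varchar(255) CHARACTER SET utf8mb3 COLLATE utf8mb3_bin NOT NULL\n"
    ") ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;\n"
)
print(count_charsets(sample))
```

Running it on both dumps would show how many utf8mb3 declarations remain in each.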

For the experiment, I replaced the remaining utf8mb3 occurrences:

Replacing remaining utf8mb3
docker exec tutor_local_mysql_1 /usr/bin/mysqldump -Q -d -uroot -pXXXXXXXX --default-character-set=utf8 --skip-set-charset openedx | sed 's/utf8mb3/utf8mb4/gi' | sed 's/utf8_bin/utf8mb4_0900_ai_ci/gi' | sed 's/utf8mb4_general_ci/utf8mb4_0900_ai_ci/gi' > openedx_schema.sql
docker exec tutor_local_mysql_1 /usr/bin/mysqldump -Q --insert-ignore -t -uroot -pXXXXXXXX --default-character-set=utf8 --skip-set-charset openedx > openedx_data.sql
docker exec -it tutor_local_mysql_1 bash
mysql -uroot -pXXXXXXXX
DROP DATABASE openedx;
create database openedx default charset utf8mb4 collate utf8mb4_0900_ai_ci;
exit
exit
cat openedx_schema.sql | docker exec -i tutor_local_mysql_1 /usr/bin/mysql -u root --password=XXXXXXXX openedx
cat openedx_data.sql | docker exec -i tutor_local_mysql_1 /usr/bin/mysql -u root --password=XXXXXXXX openedx
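To make the effect of the three sed substitutions easier to inspect, they can be replayed on a single line. This is a sketch in Python that mirrors the sed expressions in the same order (case-insensitive, global); the sample lines are made up and are not taken from the real dump.

```python
import re

# Same three substitutions as the sed pipeline, applied in the same order.
SUBSTITUTIONS = [
    ("utf8mb3", "utf8mb4"),
    ("utf8_bin", "utf8mb4_0900_ai_ci"),
    ("utf8mb4_general_ci", "utf8mb4_0900_ai_ci"),
]

def rewrite_schema(sql: str) -> str:
    """Apply each substitution to the whole text, ignoring case."""
    for old, new in SUBSTITUTIONS:
        sql = re.sub(re.escape(old), new, sql, flags=re.IGNORECASE)
    return sql

# Made-up table options, as they might appear at the end of a CREATE TABLE.
print(rewrite_schema(") ENGINE=InnoDB DEFAULT CHARSET=utf8mb3 COLLATE=utf8_bin;"))
print(rewrite_schema("COLLATE utf8mb3_bin"))
```

Two things stand out when replaying the rules: a collation spelled utf8mb3_bin ends up as utf8mb4_bin rather than utf8mb4_0900_ai_ci, and columns that move from a _bin collation to utf8mb4_0900_ai_ci switch from binary comparison to accent- and case-insensitive comparison.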

The encoding bug appears once utf8mb3 is replaced by utf8mb4:

Screenshot

But when I make a modification in Studio, the issue disappears:

Screenshot

I also just noticed another bug: the LMS displays the previous (n-1) version of each change instead of the latest one.
For example:

  • If I replace “introduction” with “introduction s”, the LMS displays “introduction”
  • “introduction s” → “introduction d”: the LMS displays “introduction s”
  • “introduction d” → “introduction”: the LMS displays “introduction d”