New `edxapp_csmh` MySQL database in Juniper

Juniper introduces a new MySQL database that edx-platform uses to store the state changes of problem modules for every student. As far as I understand, these data are managed by the coursewarehistoryextended app. This database is known as student_module_history in the edx-platform code base.

Once the ENABLE_CSMH_EXTENDED feature flag has been set to true (which is the default in Juniper), there is no turning back, because it affects the behaviour of the 0011_csm_id_bigint migration from the courseware app.

I would like to know if there are specific precautions to be taken to create this database and to migrate an existing MySQL database from Ironwood.

The Juniper page on Jira states that “We are looking at doing large migrations for the courseware_studentmodule table to prevent running out of primary keys. These will likely be large, semi-manually done migrations on CSM and it’s history table(s). Nothing is settled yet, but if we get this in before Juniper we will need to share our runbooks for anyone upgrading to it.” (cc @bmedx)

I wonder if this comment concerns the edxapp_csmh database?

I found a couple of migration scripts in the configuration repository that lead me to believe that special precautions need to be taken when upgrading. On the other hand, these scripts date back to 2016…

I have faced the following error when viewing the courseware in the LMS:

django.db.utils.ProgrammingError: (1146, "Table 'openedx_csmh.coursewarehistoryextended_studentmodulehistoryextended' doesn't exist")

I managed to resolve this error by running the migrations of the coursewarehistoryextended app specifically on the student_module_history database:

./manage.py lms migrate --database=student_module_history coursewarehistoryextended

However, I did not find any trace of such a migration command in the configuration playbooks, so I wonder how it all works.


The silence is deafening!

I did some investigation and found the explanation. The student_module_history database has existed for a long time, but was optional until Juniper. The use of this database is controlled by the ENABLE_CSMH_EXTENDED feature flag. This feature flag was False by default in Ironwood (although it was enabled by the configuration playbooks) and was switched to True in Juniper: https://github.com/edx/edx-platform/commit/0befab339b8731f2a7742bba1b4f08e9a7b3e99e

The explanation, given in the comment, is the following:

Write new CSM history to the extended table. This will eventually default to True and may be removed since all installs should have the separate extended history table.

As a DevOps engineer, I must say I was surprised by the above statement. From an optimization perspective, it makes zero sense to have separate databases if they both run on the same physical server. Thus, 90-99% of Open edX administrators have nothing to gain by enabling this feature. Indeed, most people just apt-get install mysql and don’t rely on cloud services to host their database. Most often, this is the best way to proceed.

Moreover, the deployment playbooks suggest that all LMS migrations should be run on this student_module_history database. Running those migrations is precisely the most time-consuming step in the installation process, which is too bad if most users don’t need it.
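In case it helps to picture how migrations end up in one database or the other, here is a rough sketch of a Django database router. It is a simplified illustration, not the actual edx-platform code: the class name HistoryRouterSketch and the two constants are hypothetical, and only the method names (db_for_read, db_for_write, allow_migrate) come from Django's standard router interface.

```python
# Simplified, hypothetical sketch of a Django database router.
# This is NOT the edx-platform implementation; names are for illustration only.
HISTORY_APP = "coursewarehistoryextended"  # app label seen in the error message
HISTORY_DB = "student_module_history"      # database alias passed to --database


class HistoryRouterSketch:
    """Send the history app to its own database alias, everything else elsewhere."""

    def db_for_read(self, model, **hints):
        if model._meta.app_label == HISTORY_APP:
            return HISTORY_DB
        return None  # no opinion: Django falls back to the "default" alias

    def db_for_write(self, model, **hints):
        return self.db_for_read(model, **hints)

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == HISTORY_APP:
            # The history tables may only be created in the history database.
            return db == HISTORY_DB
        if db == HISTORY_DB:
            # Tables of other apps are kept out of the history database.
            return False
        return None  # no opinion for other apps on other databases
```

With a router along these lines, a plain migrate on the default database would skip the coursewarehistoryextended tables, which would be consistent with the "Table ... doesn't exist" error until the migration is run with --database=student_module_history.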

Anyway, setting ENABLE_CSMH_EXTENDED = False solved my problem. I’m writing this in case having a separate student_module_history database becomes mandatory one day. Then I’ll have a post to point to :wink:
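For reference, the override amounts to something like the snippet below. This is a minimal sketch that assumes the flag is read from the FEATURES dictionary of the LMS settings; the first line is only a stand-in for the value inherited from the Juniper defaults, so that the snippet runs on its own.

```python
# Stand-in for the FEATURES dict inherited from the platform defaults.
# In Juniper the flag defaults to True; this line is an assumption made
# so that the snippet is self-contained.
FEATURES = {"ENABLE_CSMH_EXTENDED": True}

# Override placed in a private settings module: keep writing history
# to the default database instead of the separate student_module_history one.
FEATURES["ENABLE_CSMH_EXTENDED"] = False
```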