Migrate data from Koa version to Sumac version

Hi everyone,
I currently have an Open edX Koa instance deployed using the native (Ansible) installation. I am planning to upgrade to Sumac and migrate the deployment to Kubernetes.

The application upgrade part is manageable. However, I am unsure about the data migration process for:

  • MySQL (edxapp, ecommerce, credentials, etc.)

  • MongoDB (modulestore)

My main concerns are:

  1. Data integrity (especially course structure, user progress, and grades)

  2. Minimizing system downtime during migration

I’m looking for:

  • Recommended migration steps or workflow when upgrading from Koa → Sumac alongside moving to K8s

  • Any best practices or lessons learned from real migrations

  • Whether it is safe to dump/restore databases directly and run migrations, or if the upgrade needs to follow intermediate releases

If anyone has done a similar migration, I would really appreciate your guidance or references.
Thank you!

Hi @vuthehuyht - unfortunately Koa is so old, and the Ansible deployment has been deprecated for so long, that I'm not sure who would still know about this. Searching for “koa migration” yielded a number of results: Search results for 'koa migration order:latest' - Open edX discussions

You might also search for posts that describe the migration from the old-style deployments to Tutor deployments.

@vuthehuyht Hi, we recently did a Juniper to Sumac migration, and we did it by getting hands-on with DB dumps. The steps were broadly:

  1. Dump the MySQL/MariaDB
  2. Dump the MongoDB
  3. Create a test instance of Sumac with all the plugins that we needed
  4. Dump the Sumac DB
  5. String-manipulate the Juniper DB dump to set the collation to utf8mb4 to match the Sumac dump
  6. Load both dumps into independent instances of MySQL 8
  7. Use Atlas to do a schema diff between Juniper & Sumac
  8. Apply the diff on the Juniper DB instance → this migrates the old Juniper data to match the Sumac schema.
  9. Dump the migrated Juniper data.
  10. Manipulate the dump to include things like TRUNCATE statements for tables that had default data on Sumac, plus some custom SQL to migrate Django’s internal tables.
  11. Load the dump onto the Sumac instance
  12. Run tutor init and a few edx-platform management commands to update things like course overview, index, etc.
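
A minimal sketch of what the string manipulation in step 5 might look like in Python. This is not the actual script used in the migration; it assumes the old dump declares the legacy utf8 (or utf8mb3) character set with utf8_* collations, and that keeping each collation suffix while switching to utf8mb4 is what matches the Sumac dump. The function names `rewrite_line` and `rewrite_dump` are hypothetical. Compare the result against the real Sumac dump before loading anything.

```python
import re

# Assumption: the old dump declares the legacy utf8/utf8mb3 character set.
_CHARSET_RE = re.compile(r"\b(CHARSET|CHARACTER SET)(\s*=?\s*)(?:utf8mb3|utf8)\b")
_COLLATE_RE = re.compile(r"\bCOLLATE(\s*=?\s*)utf8(?:mb3)?_(\w+)")


def rewrite_line(line: str) -> str:
    """Upgrade utf8/utf8mb3 declarations on one dump line to utf8mb4."""
    # Leave row data alone so stored content that mentions "utf8" is not altered.
    if line.startswith("INSERT INTO"):
        return line
    line = _CHARSET_RE.sub(r"\g<1>\g<2>utf8mb4", line)
    # Keep the collation suffix (general_ci, unicode_ci, bin, ...) unchanged.
    return _COLLATE_RE.sub(r"COLLATE\g<1>utf8mb4_\g<2>", line)


def rewrite_dump(src_path: str, dst_path: str) -> None:
    """Stream the dump line by line so a multi-gigabyte file is not held in memory."""
    with open(src_path, encoding="utf-8", errors="surrogateescape") as src, \
         open(dst_path, "w", encoding="utf-8", errors="surrogateescape") as dst:
        for line in src:
            dst.write(rewrite_line(line))
```

Lines that already use utf8mb4 pass through unchanged, so running it twice on the same dump should be harmless.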

The instance we migrated had about 12 GB of uncompressed MySQL data and about 30 GB of MongoDB data. The MongoDB migration itself was fairly straightforward, although it required writing a custom script to migrate course overview data into MySQL. We ran multiple tests on staging data, still had to make adaptations for production data, and estimated about 6-8 hours of migration time. Despite some challenges that came up during the actual migration, we were able to get everything migrated with about 8-10 hours of downtime (I am a bit fuzzy here).

Thanks for your reply.

I agree with your approach to the MySQL upgrade.

However, I still have some questions about dumping MongoDB. If I understand your approach, you export the data from MongoDB on Juniper (it seems Juniper uses MongoDB version 5) and then import it into MongoDB version 7 on Sumac, right? Did you run into any issues during the data migration?

The MongoDB migration was uneventful. There are two ways to do it, IIRC: mongoexport and mongodump. One of them produces a binary archive dump, and that’s what we used for the migration. We did do a JSON export of the modulestore collections after the migration, because we needed to update some URLs in the content to point to the MFE URLs instead of the old LMS-based ones. But that’s only needed if the content needs to be updated.
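
A rough sketch of how the URL update on a JSON export could be done in Python, assuming the export has one JSON document per line (the default mongoexport output when `--jsonArray` is not used). The function names and the example URLs are hypothetical, and this is not the script that was actually used.

```python
import json
from typing import Any


def replace_in_strings(value: Any, old: str, new: str) -> Any:
    """Recursively replace `old` with `new` in every string value of a document."""
    if isinstance(value, str):
        return value.replace(old, new)
    if isinstance(value, list):
        return [replace_in_strings(item, old, new) for item in value]
    if isinstance(value, dict):
        # Keys are left untouched; only values are rewritten.
        return {key: replace_in_strings(item, old, new) for key, item in value.items()}
    return value  # numbers, booleans and None pass through unchanged


def rewrite_export(src_path: str, dst_path: str, old: str, new: str) -> None:
    """Rewrite an export file one document (one line) at a time."""
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                doc = replace_in_strings(json.loads(line), old, new)
                dst.write(json.dumps(doc, ensure_ascii=False) + "\n")
```

For example, `rewrite_export("modulestore.json", "modulestore.fixed.json", "https://lms.example.com/dashboard", "https://apps.example.com/learner-dashboard")` would write a copy that can be re-imported, with both the file names and the URLs being placeholders.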

What issues are you running into?