PostgreSQL and Open edX

If you run Open edX on PostgreSQL (or have tried to), could you please write a bit about your experiences here? There have been recent discussions in areas like full text search or specifying collation settings in fields that could potentially couple us more tightly to MySQL. We can maintain abstractions to try to avoid that, but that comes at a cost, and I’m trying to understand to what extent PostgreSQL support is valuable to folks.

Full disclosure: My personal position for a while has been that Postgres is a better database both on its own merits and in terms of its support in Django, but that the benefits don’t outweigh the switching costs. Or at least, the work necessary to safely migrate everything is not more important than the dozens of other things we could do with that developer/ops time and effort.

That being said, I have no idea how many folks are actually running PostgreSQL in production already, how important it is for folks who want to run it, etc.

Thank you.


We don’t have any clients that run Open edX on Postgres, but… it would be nice to leave that door open.

I dream of a world where we can run the stack with only a single relational database service (and without mongo… unrelated thread, but here are links in case people want them, see Store modulestore’s course indexes in Django/MySQL and Replace cs_comments_service with pluggable alternatives). One of the working discussion plugins is Discourse, which requires Postgres, and it would be so fantastic if one day we could run both Open edX and Discourse on the same SQL database service.


I just want to add my personal opinion on this:
It’s also worth considering that the time and effort needed to support PostgreSQL is a setup cost, meaning it occurs once, while the benefit of using PostgreSQL is continuous. So the tradeoff might be worth it from a long-term perspective.


I respectfully disagree with that. PostgreSQL would be a one-time setup cost if we could pull a trigger and migrate everyone over in a relatively short period of time. That’s how I would think of it if this were a commercial codebase at my company. But I believe that as an open source project, we’d be supporting both databases simultaneously for multiple Open edX releases. We would only really be able to enjoy the benefits of Postgres when MySQL was completely dropped for a given service, and we could stop distorting our schema to appease it. Maybe it’s still worth it, but we’re probably looking at a years-long transition to ensure that we’re not leaving behind big chunks of the community.

I can see at least four paths:

  1. Status quo: MySQL (+ MariaDB?) is official and well tested, but aim for PostgreSQL compatibility and accept patches to fix any issues.
  2. Fully support both databases, by increasing our testing with PostgreSQL.
  3. Lean into MySQL, by allowing more MySQL-specific features via libraries like django-mysql.
  4. Lean into PostgreSQL, with an eye towards long term migration.

In addition, we could try to fund development efforts to improve Django’s support for MySQL, which would fit approaches 1-3.

It turns out that we added django-mysql as an edx-platform dependency back in 2018, so it’s extremely unlikely that PostgreSQL has worked for the LMS or Studio at all since that time.

Edit: After reading a little more, it looks like we use django-mysql just to use its ListCharField, which seems to just build on CharField, and shouldn’t lock out other databases (which makes sense, since the tests are backed with in-memory SQLite).
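To illustrate why a list field built on a plain character column need not be tied to MySQL, here is a minimal plain-Python sketch of the general idea: serialize the list to a single delimited string on the way in, and split it on the way out. This is not django-mysql's code. The helper names `to_db` and `from_db`, the `demo` table, and the comma separator are all assumptions made for illustration, and the sketch uses in-memory SQLite only because that is what the tests mentioned above run against.

```python
import sqlite3

SEPARATOR = ","  # assumed delimiter, chosen for illustration


def to_db(values, max_length=255):
    """Join a list of strings into one delimited string for a VARCHAR column."""
    for value in values:
        if SEPARATOR in value:
            raise ValueError(f"item {value!r} contains the separator")
    joined = SEPARATOR.join(values)
    if len(joined) > max_length:
        raise ValueError("joined value exceeds max_length")
    return joined


def from_db(raw):
    """Split the stored string back into a list; an empty string is an empty list."""
    return raw.split(SEPARATOR) if raw else []


# Nothing here is MySQL-specific: the column is an ordinary character column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, tags VARCHAR(255))")
conn.execute("INSERT INTO demo (tags) VALUES (?)", (to_db(["lms", "studio"]),))
row = conn.execute("SELECT tags FROM demo").fetchone()
restored = from_db(row[0])
conn.close()
```

Because all of the list handling happens in Python and the database only ever sees a string, the same approach should carry over to any backend with a character column type, PostgreSQL included.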