Has anyone used file caching for course structures?

The course_structure_cache is where the SplitModuleStore shoves the very large structural metadata about a version of a course. These cache entries have the following odd characteristics:

  • Often quite large (MBs in size)
  • Extremely high cache hit rate, often upwards of 99.99% if your site gets a decent amount of traffic.
  • Traffic is often extremely imbalanced in a sharded cache environment: a couple of large, popular courses could send 50% additional traffic to one of your nodes, forcing you to upgrade all the nodes as a set even when two out of three nodes are actually underutilized.

Given this, I’m curious if anyone has tried using Django filesystem caching to store this particular cache, and what your experiences were if you have. It would mean that each webapp frontend does some redundant work in writing to its own local cache (and MongoDB does a little more work), but I think that tradeoff would be acceptable for most folks given how high the cache hit rate is in general.
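To make the idea concrete, here is a rough sketch of what the settings override might look like. The backend path and the option names are Django's standard file-based cache settings; the location, timeout, and entry limit are placeholder values I made up, and I haven't checked whether the existing course_structure_cache entry needs other keys carried over (key prefix, key function, and so on).

```python
# Hypothetical override for the "course_structure_cache" alias.
# All values below are illustrative assumptions, not tested recommendations.
COURSE_STRUCTURE_FILE_CACHE = {
    # Django's built-in file-based backend: one file per cache entry.
    "BACKEND": "django.core.cache.backends.filebased.FileBasedCache",
    # Assumed path on each webapp host's local disk (not a shared volume).
    "LOCATION": "/var/tmp/course_structure_cache",
    # Entry lifetime in seconds (placeholder: one day).
    "TIMEOUT": 60 * 60 * 24,
    "OPTIONS": {
        # Once this many entries exist, Django culls some on the next write.
        "MAX_ENTRIES": 1000,
        # Cull 1/3 of entries when MAX_ENTRIES is reached.
        "CULL_FREQUENCY": 3,
    },
}

# In a settings module where CACHES is already defined, this would be:
# CACHES["course_structure_cache"] = COURSE_STRUCTURE_FILE_CACHE
```

Since every webapp host would keep its own copy, MAX_ENTRIES effectively bounds disk usage per host rather than for the whole deployment.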

My biggest concern would be how the local file cache performs. I would normally expect the OS’s page cache to make those reads perform well, since only a few of those entries would be “hot” at any given time and they should stay resident in memory. But I’m not sure how this works out in a Dockerized world, as Docker-based deployments seem to have had a very negative impact on file IO performance in other areas.
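If someone wants a quick sanity check before trying the real thing, something like the following could be run both on the host and inside a container to compare. It is only a sketch: the 4 MB size and the repeat count are assumptions picked to resemble a large structure entry, it reads a raw file rather than going through Django's cache backend, and because the file has just been written, it is most likely measuring warm (page-cached) reads only.

```python
import os
import statistics
import tempfile
import time


def time_repeated_reads(size_bytes=4 * 1024 * 1024, repeats=20, directory=None):
    """Write one blob of size_bytes to a temp file, then time each full read.

    Returns a list of per-read durations in seconds. Pass `directory` to
    place the file on the filesystem you actually want to measure.
    """
    payload = os.urandom(size_bytes)
    timings = []
    with tempfile.TemporaryDirectory(dir=directory) as tmpdir:
        path = os.path.join(tmpdir, "entry.bin")
        with open(path, "wb") as f:
            f.write(payload)
        for _ in range(repeats):
            start = time.perf_counter()
            with open(path, "rb") as f:
                data = f.read()
            timings.append(time.perf_counter() - start)
            # Guard against short reads skewing the numbers.
            assert len(data) == size_bytes
    return timings


if __name__ == "__main__":
    results = time_repeated_reads()
    print(f"median read: {statistics.median(results) * 1000:.2f} ms")
    print(f"slowest read: {max(results) * 1000:.2f} ms")
```

Pointing `directory` at the volume the cache would actually live on matters here, since a container's writable layer, a bind mount, and a named volume may behave quite differently.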

This is not something I’m likely to have time to experiment with anytime soon. It was just an idea that occurred to me while I was mucking around in that code for other reasons, and I thought I’d mention it.