Help please: Very slow load time (10 seconds) for courseware on sections with several subsections and xblocks!

Problem Overview

We have a course with a sub-section that’s loading pretty slowly – about 20 seconds of CPU time in both devstack and production instances with pretty good specs.

Testing Environments

We’ve checked on both vanilla master upstream devstack instances and our Hawthorn production instances and we found consistent slow load time.

Performance Analysis

Execution time of CombinedSystem.render

The sub-section consists of 8 units with about 6 blocks (Xmodules or XBlocks) on each unit. In each page view there’s a total of 57 calls for xmodule.CombinedSystem.render ranging from a couple of milliseconds up to one second per call. Which is alarming and adds up pretty quickly to pretty slow load times.

Number of calls to x_module.get_child

What’s more alarming and confusing is the number of calls to x_module.get_child. There are in total 523 calls to the get_child method.

:memo: Living document :memo:

I’m still debugging the issue and will be adding more details to it including a cleaned and redacted export of the course to allow testing it on even more environments.

Questions

  • Is this peformance normal? Have anyone else faced this issue?
  • Is it a caching problem that we forgot to configure?
  • Is having about 48 blocks in a single sub-section too much, even if it looks pretty usable for the learners (except for the load time of course).

Logs

We internally use Honeycomb.io to debug our performance issues and I’ve included a screenshot of the trace we have, as you can see it’s pretty tall trace which isn’t healthy.

To illustrate my point I’ve also added a log line (diff is in this PR) and it shows the log lines below:

The load time gets doubled in Juniper (upstream master) because each xmodule render gets called twice compared with only once in Hawthorn:

edx.devstack.lms                  | 2020-08-18 14:43:29,576 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,187 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,188 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,289 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,339 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,568 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,568 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,584 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,722 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,722 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:30,858 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,269 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,269 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,284 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,495 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,511 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,561 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,612 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,613 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:31,776 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,004 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,020 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,067 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,116 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,117 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,137 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,416 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,432 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,468 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,517 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,517 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,536 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,777 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,793 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,825 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,872 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,873 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:32,892 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,172 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,188 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,232 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,233 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,249 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,505 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,520 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,564 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,565 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,580 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,848 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,874 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,891 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,909 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,926 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,941 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:33,957 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,013 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,013 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,030 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,050 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,064 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,327 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,355 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,370 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,386 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,403 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,422 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,439 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,496 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,497 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,515 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,535 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,551 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,814 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,859 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,859 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,881 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,897 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,916 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,935 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:34,954 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,009 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,010 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,029 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,293 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,340 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,341 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,360 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,378 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,394 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,413 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,430 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,485 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,486 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,503 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,770 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,786 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:35,802 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,068 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,085 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,101 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,384 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,400 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,418 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,686 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,702 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,718 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:36,983 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,000 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,015 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,030 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,765 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,781 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,797 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,813 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]

The last two entries shows the difference more clearly:

edx.devstack.lms                  | 2020-08-18 14:43:29,576 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
edx.devstack.lms                  | 2020-08-18 14:43:37,813 CRITICAL 4102 [xmodule.x_module] [user 3] x_module.py:1975 - xmodule.CombinedSystem.render: block.display_name=[redacted]
1 Like

This is fascinating.

  1. Did you replicate the issue in dev mode? that would allow you to profile the issue with the django debug toolbar.
  2. Do you have a minimal exported course that you could share to help us replicate the issue?

Thanks Regis!

Did you replicate the issue in dev mode? that would allow you to profile the issue with the django debug toolbar.

Yes, the issues has been reproduced with consistent results in in the following environments:

  • Vanilla master (post-Juniper) devstack.
  • Our Hawthorn devstack.
  • Our Hawthorn production environment.

Do you have a minimal exported course that you could share to help us replicate the issue?

Not yet, I’m working on this, it’s a manual process that is a big tedious since I need to redact all course data. But I’ll do it soon.

1 Like

@omar @regis It’s fairly simple to replicate this, you just keep adding components and the subsection load times get worse and worse (which you’d expect, there’s more to load).

Here’s a basic course I threw together for you guys to use for testing in case it helps: Slow Course (268.9 KB)

It’s a exaggerated for effect, but I’ve clocked a 13.934 second load on the “Burn it all down” subsection and 2-10 second loads on the others. So you should be able to use it to replicate this issue.

2 Likes

We’re dealing with a course like this right now – turned out that it’s using python to render 12+ blocks in the sequence. That means each user needs to spin up 12+ codejails just to load the sequence. I don’t understand what the blocks are doing yet, but each one is using importing numpy.

@dave likes to collect these courses. It’s be nice if they weren’t all from MIT. =)

@pdpinch CodeJail could be the key to this! That’s pretty unfortunate. Thanks a lot for the tip. I’ll try my luck next week and see how can we resolve this issue.

I managed to replicate the issue (and by the way I’m not running Codejail). I narrowed down the issue to the CoursewareIndex.render method, so I added some profiling around this method. Here is the result: coursewareindex.render.prof.txt (464.7 KB)

The first few lines are:

         12218256 function calls (11968233 primitive calls) in 22.663 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   17.992   17.992 /openedx/edx-platform/lms/djangoapps/courseware/views/index.py:437(_create_courseware_context)
    104/1    0.001    0.000   15.660   15.660 /openedx/venv/lib/python3.5/site-packages/xblock/core.py:200(render)
    104/1    0.001    0.000   15.660   15.660 /openedx/edx-platform/common/lib/xmodule/xmodule/x_module.py:1962(render)
    104/1    0.002    0.000   15.660   15.660 /openedx/edx-platform/common/lib/xmodule/xmodule/x_module.py:1419(render)
    104/1    0.001    0.000   15.660   15.660 /openedx/venv/lib/python3.5/site-packages/xblock/runtime.py:803(render)
        1    0.000    0.000   14.841   14.841 /openedx/edx-platform/common/lib/xmodule/xmodule/seq_module.py:272(student_view)
        1    0.000    0.000   14.586   14.586 /openedx/edx-platform/common/lib/xmodule/xmodule/seq_module.py:390(_student_or_public_view)
      140    0.001    0.000   14.555    0.104 /openedx/venv/lib/python3.5/site-packages/debug_toolbar/panels/sql/tracking.py:197(execute)
      140    0.008    0.000   14.554    0.104 /openedx/venv/lib/python3.5/site-packages/debug_toolbar/panels/sql/tracking.py:130(_record)
      143    0.003    0.000   14.504    0.101 /openedx/venv/lib/python3.5/site-packages/django/db/models/sql/compiler.py:1112(execute_sql)
        1    0.000    0.000   14.480   14.480 /openedx/edx-platform/common/lib/xmodule/xmodule/seq_module.py:365(_get_render_metadata)
        1    0.001    0.001   14.249   14.249 /openedx/edx-platform/common/lib/xmodule/xmodule/seq_module.py:498(_render_student_view_for_items)
  217/145    0.003    0.000   13.582    0.094 /openedx/venv/lib/python3.5/site-packages/django/db/models/query.py:1240(_fetch_all)
      208    0.005    0.000   12.845    0.062 /openedx/venv/lib/python3.5/site-packages/django/db/models/query.py:49(__iter__)
       19    0.000    0.000   11.234    0.591 /openedx/edx-platform/common/lib/xmodule/xmodule/vertical_block.py:114(student_view)
       19    0.002    0.000   11.234    0.591 /openedx/edx-platform/common/lib/xmodule/xmodule/vertical_block.py:45(_student_or_public_view)
      144    0.000    0.000    9.103    0.063 /openedx/venv/lib/python3.5/site-packages/django/db/models/query.py:259(__iter__)
      140    0.075    0.001    8.535    0.061 /openedx/venv/lib/python3.5/site-packages/debug_toolbar/utils.py:48(tidy_stacktrace)
    21777    0.061    0.000    8.178    0.000 /opt/pyenv/versions/3.5.9/lib/python3.5/posixpath.py:370(realpath)
    21777    0.577    0.000    7.664    0.000 /opt/pyenv/versions/3.5.9/lib/python3.5/posixpath.py:378(_joinrealpath)
...

I think the key is the xblock.core.render method: it has 1 primitive call and 104 recursive calls (104/1). These recursive calls come from the xmodule.seq_module.py._render_student_view_for_items:

if render_items:
    rendered_item = item.render(view, context)
    fragment.add_fragment_resources(rendered_item)
    content = rendered_item.content
else:
    content = ''

In this piece of code, the render_items variable is defined as follows:

render_items = not context.get('exclude_units', False)

Because the context does not contain “exclude_units”, render_items is true. This context is created in the CoursewareIndex._create_section_context method. If I add "exclude_units": True there, then the page load time is considerably reduced – but then none of the units are working anymore :sweat_smile:

Just a wild guess: Could it be that with the switch to horizontal sections it is no longer necessary to render all units when displaying one of its siblings?

At this point I don’t have any real idea what I’m doing anymore, so I guess I’d better stop…

I love, love talking performance (thank you for tagging me on this @pdpinch!), and I will come back here later this week when I have more time to look at some profiler output. A few high level things I’ll say in case it’s useful:

  1. The courseware does much of the work of rendering the entire sequence because there was metadata that needed to be grabbed w.r.t. the contents of each unit, and at the time, it was doing 90% of the work of rendering everything when showing a single unit anyway. The balance of this has shifted, both because we put less metadata in the sequence header row now, and because certain XBlocks are slower to execute.
  2. As pointed out in this thread, CodeJail is a known source of performance problems.
  3. CCX and particularly the intersection of CCX + inline discussions is another known source of courseware performance problems (the way inline discussions XBlock makes calls to the root course object breaks Modulestore-level caching).
  4. You’ll generally want to disable django-debug-toolbar when profiling, because it adds a significant amount of ORM-introspection overhead, which can distort your results.
  5. Even though it’s not visible, something like 10-20% of the load time of CoursewareIndex.get is spent on building up the table of contents that used to be visible in the left hand navigation, but is now only used to put in values for “previous” and “next” buttons when they need to go to the previous or next sequence.
  6. I’ve personally shifted to the view that the best long term thing we can do about XBlock runtime performance is to take the navigational responsibilities away from XBlock entirely in the LMS, and to only invoke the XBlock runtime one unit at a time–i.e. stuff it into the smallest box possible. This is what’s being done with the new Courseware MFE, and is part of the thinking behind the Arch Decision Record for the Role of XBlocks.
  7. Thank you, thank you, thank you for including precise course content for reproduction. :smiley:
2 Likes

Okay, I did some profiling of this last night. I’m doing a very quick/rough writeup before I start my work day because I’m further behind on my PR reviews than usual these past two weeks, and I don’t have time to do a more thorough analysis. :frowning_face: There’s a lot to dig into here.

Reproduction Steps

If you want to check it out for yourself:

  • Use the perf_debug_slow_seq branch I put up with hacky profiling middleware.
  • Create a course and import the Slow Course into it. I named mine course-v1:Appsembler+OmarMattPerfTestCourse+2020-08-28
  • Create a stats folder in your edx-platform directory.
  • Load http://localhost:18000/courses/course-v1:Appsembler+OmarMattPerfTestCourse+2020-08-28/courseware/7a62b1fd2b2a4a22a351a6c6d55af2f1/b215a03a16fa4d1687d8921521fda1a5/ once without profiling to make sure any process-level startup overhead is accounted for.
  • Add profile_name=/edx/app/edxapp/edx-platform/stats/omar_seq1.stats as a querystring parameter to that URL (with the first Unit selected), and make the request.
  • Examine the newly genereated stats file with snakeviz in a separate virtualenv.
  • Or just look at the stats file I generated locally. omar_seq1.stats (541.5 KB)

Running on my local devstack:

  • This sequence takes 8+ seconds to load with the django debug toolbar enabled, on all requests after the first one (which is much longer).
  • It’s about 5 seconds with the debug toolbar disabled (which is why it’s disabled in my branch).

Analysis

Some big items stick out:

  • ~31% of the time is spent in add_staff_markup, which is what builds all the debug information and UI that is visible only to course teams. In general, staff views will take longer to render than students (though there are some penalties in the other direction, particularly since being staff short circuits various access checks). There is some crazy old legacy stuff in there, and I did not have time to dive into the details.
  • ~20% of the time is spent doing various template variable substitutions (replace_static_urls, replace_jump_to_id_urls, replace_course_urls) which boil down to a whole bunch of regular expressions sub calls.
  • ~25% of the time is spent on edx-when’s fallback behavior, where it walks up the entire ancestry tree whenever you ask for an attribute value related to start/due dates. I believe that’s actually a backwards-compatibility feature so that edx-when doesn’t break when it’s encountering a course that has not been published since edx-when was created.
  • An underlying issue that is deep under the covers everywhere is that OpaqueKeys are actually surprisingly slow and wasteful. I had a big PR that really tweaked that performance but had to revert because of safety issues. You can see that whole saga in my PRs for opaque-keys. This has a potential of maybe 10% savings across this (based on time seen in OpaqueKey’s __eq__ method, though that potential savings probably heavily overlaps with the potential edx-when savings.

Remediation: Low Level

There are things here that we can dive into:

More efficient template substitutions

Template variable substitution can be done more efficiently. The vast majority of content carries no substitutions at all. Even for those that do, all that information is available to us at the time of publishing form Studio. The things we’re making substitutions with are configured in LMS, but detecting whether the substitutions exist and where the slots are could be done at publish time.

edx-when

The backwards compatibility behavior in edx-when’s ancestry crawling is something that could either be turned off or made more efficient. This an example of a problem that’s endemic to the way XBlocks work–it’s a very obect-centric instead of data-centric view of the world. The interface for every individual block that is fetching field data looks up its ancestry list without realizing that all its peers are doing the exact same thing and recalculating the same ancestors over and over again. It’s possible to add caching to make it work under the covers, but we have to careful of memory leaks. A request cache might work reasonably well here though.

opaque-keys

There be dragons here. That being said, I think that the PR that’s been reverted a few times now is actually pretty close to landable. But it’s possible to break things in really subtle and dangerous ways, as the OpaqueKeys are both absolutely critical to everything in Open edX and way more complicated than they have any right to be.

Staff-only Debug Code

I honestly haven’t looked at this in ages, but I’d bet it hasn’t been seriously profiled for optimization in years.

Remediation: High Level

Okay, with all that being said, the time taken to actually render the XBlock for the Unit directly (as you would for a mobile view) is about 250 ms. This would be a request that looks like: http://localhost:18000/xblock/block-v1:Appsembler+OmarMattPerfTestCourse+2020-08-28+type@vertical+block@057c430ebfa644abb7f39552a22f5bb0

We have mounds of complexity in edx-platform with respect to the XBlock runtime, and much of that comes from supporting all kinds of weird access patterns and maintaining backwards compatibility. I’ve fought this battle for a long time in various PRs (like this one). Various types of caching feed the complexity of the system because the core shape of the APIs weren’t built with it in mind. It’s too easy to introduce regressions that spiral out of control because the problems scale with the size of the entire sequence.

I’ve come to the conclusion that the only sustainable path out is not to make the XBlock runtime do the things it does faster, but to ask it to do less. For 99% of use cases, XBlocks only need the world that is their own Unit. That’s a much more tractable problem, and it’s the direction we’ve gone in with the new Courseware MFE work. We’re building a new Course Outline piece that runs entirely outside of the XBlock runtime (and is much, much faster). I talk a lot more about this in my post on Arch Decision Record for the Role of XBlocks


Apologies to anybody reading this who is waiting on a PR review from me. Performance has been something that I’ve been pushing for a long time at edX, and I can’t miss the opportunity to have this conversation with the community at large. :stuck_out_tongue:

Take care.

3 Likes

Thanks @dave and @regis! That’s a lot of help. At least now I know that I haven’t broken my fork and it’s a common issue. While the solution – mostly MFE – is not immediate, we’re heading to it and that’s a great news.

Now all my questions were answered, we need to come up with some workaround to those problems.

Hey @dave we have been facing a similar problem but with course outline page. We are using memcache as a caching mechanism. The problem we are having right now is loading the course outline page for the first time takes about 10-11 seconds which is way too much.

On a little more profiling we found out that the issue is Block Structure.

block_structure:836:_collect_requested_xblock_fields is something that is taking a lot of time, we though of adding a persistent block structure cache but that gets signaled when a course is update or added and we are showing completion of course on the same page so we have user related data as well and maintaining that in a persistent cache doesn’t make a lot of sense.

Would be really helpful if you can point me to some direction where I can optimize the performance of this page.

Thanks in advance.

This code should be running during the course publish phase, not during LMS rendering of the course outline. The idea behind Block Transformers is that the “collect” phase has the expensive operations that build the appropriate data structures when you make a change to the content of a course, and only the “transform” steps run when rendering in the LMS.

I can think of a few of things that might cause this off the top of my head:

  1. This is an older version of the code that did not properly persist the collect phase to S3. I think the very first implementation only backed it with memcached, and there was a bug for a while even when we first added the S3 support where it wasn’t being properly updated.
  2. The signal handling is broken for celery tasks. This is a very weird task in that it’s a signal emitted by Studio but the data building happens in an LMS worker process. We use a custom router to make that happen, and it’s possible that this configuration is somehow broken for you. Since then, I’ve come to the conclusion that this is wrong and we should run the collect() phase entirely in Studio, but that’s my personal opinion at the moment and is not reflected in code.
  3. Your Block Transformers configuration isn’t set to use S3/storages at all (this was added after the initial implementation).

Good luck!

1 Like

Hey @dave,

Thanks a lot for the help here.

We are using AWS ElastiCache for our use cache, what I observe was the eviction rate on the server was high in memcache and the time validity for course outline cache was only for 24 hours.

So the conclusion that we landed up on with all these evidence was if we increase the cache server size and make cache persistent by making the Block Structure configuration time validity to None so that it doesn’t hit timeout.

I have not been able to test it with a lot of users but objectively it looks like this is a variable solution which can help us to reduce course outline page load time.

Thanks for taking out time to reply to my query @dave it means a lot. Do let me know what do you think about this solution.

Cheers!

That’s the “site is falling down from load and we need to make a quick adjustment with no new infrastructure” mitigation, but I would strongly urge you to instead specify a backend storage strategy in your EDXAPP_BLOCK_STRUCTURES_SETTINGS dict (STORAGE_CLASS, STORAGE_KWARGS, DIRECTORY_PREFIX, PRUNING_ACTIVE). The PRUNING_ACTIVE attribute is supposed to help control storage usage due to frequent publishing. I don’t remember what release that was a part of, but you should be able to see references to it in edx-platform code if your version supports it.

The problem with putting it into the cache is that it gives you the illusion of permanence, but it has some nasty edge cases that can bite you at the worst possible times.

For instance:

  1. You have a popular course that suddenly crosses the threshold for the max size of the slabs that memcached will store (1 MB by default). Your cache set calls won’t return errors, but they also won’t do anything, and you suddenly see an unexplainable, site crushing surge in CPU usage as your code runs the collect phase on each request.
  2. Your site traffic has dramatically increased due to some recent publicity, and the CPU usage on your memcached cluster is dangerously high. So you decide to upgrade the cluster, but that means that you’re getting all new machines and all of a sudden every collected piece of block structure data is thrown out. Now every request is triggering a full block transformers collect phase call. Worse than that you’re recalculating the same courses many times because maybe eight separate gunicorn processes get the cache miss and each of them thinks, “I’d better recalculate this so it’s available next time.”

If you’re running a low traffic site, this might be fine. But these are real issues we’ve run into in the past.

Best of luck!

2 Likes

Hey @dave ,

I did manage to read about EDXAPP_BLOCK_STRUCTURES_SETTINGS, I have few doubts which I want to clear before I proceed with any change.

Right now we don’t have STORAGE_CLASS, STORAGE_KWARGS configured for BLOCK_STRUCTURE_SETTINGS so by default it will be using memcache. Which for a low traffic is and less courses is not an issue but if we go ahead for high traffic and lot of courses it will start creating issues. Because of slab size and multiple gunicorn worker trying to access the cache.

Am I understanding your explanation correctly?

So now if we use STORAGE_CLASS, STORAGE_KWARGS and supply a S3 bucket and configure it to use it we can make course outline page faster and also increase the availability because now gunicorn workers will get the cache without timing out?

Please let me know if I am understanding this correctly.

Thanks a lot in advance.

Hey @dave

This is just a follow up on what we did and an explanation of what I understood from the code.

The investigation of code led me to two old PRs:

https://github.com/edx/edx-platform/pull/14571/

https://github.com/edx/edx-platform/pull/14770/

These PR points to: https://openedx.atlassian.net/wiki/display/MA/Block+Structure+Cache+Invalidation+Proposal

which can give more idea but it’s not publically accessible so I had to read through.

From what I saw it felt like we switched from BlockStructureCache to BlockSturctureStore. This means we did move away from the cache only approach to the cache and storage approach. This is what is called Tiering. (I got to learn about this in the process)

Essentially what is happening is a request tries to access BS data when it is not found in cache it enters Collect Mode and generates that data for the course which is a compute-heavy process. And as Dave pointed out in the above comment if there are more than one workers accessing it and missing cache this process is carried for more times.

This leads to slower response time of the course. The first solution we went ahead with is because we have smaller cache size we thought if we vertically scale it and make the cache persistent that should solve our problem. That’s because cache will always be present and should serve the request.

Having said that and as Dave pointed out slab size of Memcache is 1 MB so if the BS is greater than that then it would not store it in cache and not even raise an error.

Here comes our solution of adding a S3 storage here, this is the AWS Config to extend in lms.yml. I am using s3 storage as the storage strategy.

BLOCK_STRUCTURES_SETTINGS:
  COURSE_PUBLISH_TASK_DELAY: 30
  TASK_DEFAULT_RETRY_DELAY: 30
  TASK_MAX_RETRIES: 5
  STORAGE_CLASS: 'storages.backends.s3boto.S3BotoStorage'
  STORAGE_KWARGS:
    AWS_ACCESS_KEY_ID: '<placeholder>'
    AWS_SECRET_ACCESS_KEY: '<secretkey>'
    AWS_STORAGE_BUCKET_NAME: 'bucket-name'
DIRECTORY_PREFIX: "/directory-name/"
PRUNING_ACTIVE: true

Once you have done this you probably need to regenerate and store the BS data in S3 bucket you can use

./manage.py lms generate_course_blocks --all_courses --with_storage

This will save it in the bucket, now if there is a cache miss the workers won’t go in Collect Mode rather make a network call and get the data from the bucket. This made a ~10 second into a ~5 second call.

There are few switches you need to activate:

block_structure.storage_backing_for_cache

block_structure.raise_error_when_not_found

block_structure.invalidate_cache_on_publish

Also introduce a BlockStructureConfiguration with version 1 and cache expiration to None.

Let me know if I missed something or I am understanding it wrong.

3 Likes