Enhancing the learning experience (when a learner navigates between units inside a sequence)

This was originally posted on the FWG Slack channel ref

I originally wrote:

Hey folks: recently I've been thinking about, and also running tests on, how to improve the performance of the learning MFE, in particular when a learner is navigating between units in a sequence.

The current flow of the MFE is that it loads an iframe whose src points at an XBlock component;
that iframe src requests the courseware-chromeless.html template/page from the platform.

This means that on every unit-id change, the MFE sets iframe.src to a new link corresponding to the unit the learner just navigated to.
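As a rough sketch of that flow (the URL shape and query parameter below are assumptions for illustration, not the MFE's exact code), the MFE side amounts to rebuilding the iframe src whenever the unit id changes:

```javascript
// Hypothetical sketch of deriving the iframe src for a unit. The /xblock/
// path stands in for the chromeless rendering endpoint; the query
// parameter is illustrative, not the MFE's exact one.
function buildUnitIframeSrc(lmsBaseUrl, unitId) {
  return `${lmsBaseUrl}/xblock/${encodeURIComponent(unitId)}?show_title=0`;
}

// On every unit change, the MFE in effect does:
//   iframe.src = buildUnitIframeSrc(lmsBaseUrl, nextUnitId);
```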

Assuming the following:
Fact: typically, after finishing a unit, the learner navigates to the next one.

If we agree on that fact, we can enhance the user/learner experience by pre-rendering the next unit. However, since we are using an iframe to render the unit, whose responsibility is it to pre-render the next one?
From some quick testing, it seems pre-rendering won't be useful if done by the parent document/MFE; it would only help if 1) the pre-rendering is done by the iframe and 2) the navigation is done by the iframe as well. The implementation would be, first, that the MFE sends/posts a message to the iframe asking it to pre-render, e.g.

if (event.data.type === 'prerender') {
  const link = document.createElement("link");
  Object.assign(link, { rel: "prerender", href: event.data.nextUnitId });
  document.head.appendChild(link);
}

And then when user wants to navigate to next unit we could send a postMessage again to ask the iframe to change location e.g.

if (event.data.type === 'relocate') {
  location.href = event.data.nextUnitId;
}

However, this would break some existing code, because at that point the iframe's document.referrer value is no longer the MFE (in particular, it would break the communication between the MFE and the iframe regarding loading, scrolling, etc.).
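One possible way around losing document.referrer (a sketch under assumptions; the allow-list value and message handling are hypothetical) is for the iframe to validate event.origin, which postMessage delivers reliably regardless of how the frame navigated, against an explicit allow-list instead of trusting the referrer:

```javascript
// Hypothetical alternative to referrer-based checks: validate the sender's
// origin (provided reliably by postMessage) against an explicit allow-list,
// instead of relying on document.referrer.
function isTrustedOrigin(origin, allowedOrigins) {
  return allowedOrigins.includes(origin);
}

// Inside the iframe, roughly:
//   window.addEventListener('message', (event) => {
//     if (!isTrustedOrigin(event.origin, ['https://apps.example.com'])) return;
//     // ...handle 'prerender' / 'relocate' messages as above...
//   });
```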

Anyway, before investing more time into this, I would like your opinions. For example, is there a way to enhance the experience without asking the iframe to pre-render? Or is changing the location of the iframe by the iframe itself a security concern that has no way around it?

Note: this implementation might be easier with legacy_learning, because then both the iframe and the learning experience would share the same origin. However, I don't think it's worth implementing for the legacy course experience, since it will be fully deprecated soon, I guess.

@dave replied:

This kind of discussion may be better suited for the forums, but a few quick thoughts:

  1. I think the fundamental idea of making unit transition cheaper makes a lot of sense.
  2. Pre-rendering would have to be done in a way that doesn’t break researcher analytics assumptions.
  3. Understanding where the major bottlenecks are would help us better evaluate possible solutions.

For instance, if the server side of it is relatively quick for a Unit, but we’re wasting lots of time loading the DOM and re-fetching all the MathJax resources and such, we might be able to use a library like Turbo to have the Unit iframe do a DOM-diff style of loading. I did a proof-of-concept with this a long time ago, and the main issue was that some XBlocks rely on page-load events to trigger initialization, and those wouldn’t properly fire in this case (ORA2, for example).


@arbrandes replied:

IMHO this flow is where the problem lies: rendering a whole unit in the iframe. If we instead rendered each block in its own iframe, we’d be able to optimize/pre-render to our heart’s content.

The server-side APIs to do this already exist, if I’m not mistaken. But it would likely require significant changes to the learning MFE.

@dave replied:

Loading each block separately is certainly possible, but that has its own performance downsides.

This is an optimization problem, and I think it’s worth examining exactly what the bottlenecks are. Very broadly, we’ve got at least the following:


  1. Course-level overhead to render the XBlocks.
    This is stuff like the high overhead of pulling back the structural metadata for a course so that inheritance can be computed, and typically grows with the size of the course as a whole.
  2. Feature-specific overhead.
    For various reasons, some site-enabled or course-enabled features cause significant overhead. Some examples are the Completion API or CCX courses.
  3. Content-specific overhead.
    There are particular content-types that can be slow to render. Running a ProblemBlock that is using a custom response type requires running a separate, sandboxed Python process before rendering something to the user. That can be really expensive, particularly if there are multiple problems in a Unit. Using content from content libraries also introduces a lot of overhead because of the data representation used.

Sometimes #2 and #3 interact in funny ways. The last time I looked at it, the inline DiscussionBlock paired horribly with CCX courses because DiscussionBlocks reached up to the root CourseBlock node in order to grab some configuration values, and the mechanism used to cache that result is not working properly for CCX.


Then there’s client-side overhead once the response arrives:

  1. Lots of JavaScript, much of it likely unnecessary.
  2. Plenty of CSS, probably mostly unnecessary?
  3. Post-load rendering, e.g. MathJax processing (even if your thing doesn’t need MathJax).

@arbrandes: So going back to the idea of loading each block individually: Many of these costs aren’t that different whether you’re loading a single block or a whole unit (which is typically only 3-4 blocks on average). Splitting up a Unit into three sub-blocks may do little more than triple the work being done by the servers and browser.

Now, @ghassan’s suggestion side-steps the bottleneck question somewhat because no matter what the bottleneck is, pre-fetching should improve things. My concerns about this are:

  1. As I mentioned before, it can cause confusion in analyzing learner behavior, and would need to be carefully tested and messaged.
  2. Depending on how it’s implemented, it may actually slow things down.
    Some of these issues are CPU-bound because of some of the crazy JS that’s getting loaded, and pre-loading something might harm the performance of the thing you’re actually trying to look at right now. This may also be difficult to properly time–e.g. maybe the current unit has loaded, but MathJax is still processing the visible content while it’s simultaneously trying to prefetch. This sort of thing is particularly bad on Android browsers.
  3. It will create additional weird edge cases.
    Will resizing work properly for complex XBlocks like ORA2 that do things on page load? Will it handle cases where we replace Unit content with top level message notifications? Caching like this will add complexity, and will likely cause at least some regressions. Will it handle race conditions properly when people flip back and forth quickly? How should it behave when the next unit is in the following sequence?
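For the back-and-forth race in particular, a minimal sketch of one mitigation (all names here are hypothetical) is a navigation token that invalidates stale loads:

```javascript
// Hypothetical guard for the back-and-forth race: each navigation takes a
// fresh token, and only the latest token is allowed to apply its result.
function makeNavigationGuard() {
  let latest = 0;
  return {
    begin() { latest += 1; return latest; },       // start a new navigation
    isCurrent(token) { return token === latest; }, // is this still the newest?
  };
}

// Usage sketch: a stale prefetch that completes late is simply ignored.
//   const guard = makeNavigationGuard();
//   const token = guard.begin();
//   loadUnit(nextUnitId).then(() => { if (guard.isCurrent(token)) show(); });
```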

This is one of those things where I suspect that the initial implementation will actually be pretty straightforward, but the long term maintenance costs will be substantially higher.

Now, to be clear: None of these are blockers. I think that the kind of pre-load optimization listed here is feasible if we’re careful, but I would strongly urge that anyone who wants to work on this first precisely measure what the bottlenecks are–where in the code they occur, what situations/data cause it, etc. It may be possible that there are other, simpler optimizations that can be just as effective without introducing these risks.


@arbrandes replied:

Thanks for the in-depth and, as usual, insightful reply, Dave.

To clarify, I’m coming at this with the Libraries v2 (and also LabXchange) design in mind. I admit I neglected to take into account that grabbing individual blocks from a course context (versus the less rigid implementation of “learning contexts”, which we currently back with Blockstore) carries its own performance penalties. Plus, even in the library authoring MFE, it’s already clear that rendering multiple blocks at once quickly bottlenecks the browser.

Which is to say: agreed. This direction is likely not worth exploring for the learning MFE as it stands. We’d probably arrive at the same conclusion that led to learning contexts and Libraries v2 in the first place. (And as far as I know, reworking courses to be backed by learning contexts is already being considered for the roadmap, if not already in.)

Still, I’m not entirely convinced prefetching will be worth the effort, either. It would be nice to have some rough numbers from a proof-of-concept, however hacked together. As in, “yeah, a naive prefetch implementation breaks analytics and ORA2, but makes unit transitions imperceptible 80% of the time”. Is that something you’d be interested in trying, @ghassan?

@ghassan replied:

Alright, thank you both, @dave and @arbrandes, for your valuable input; I got a lot out of this discussion.

There are various things we could discuss, but I guess we can all agree that we need numbers about the bottlenecks, i.e. knowing the problem is half the solution. That being said, my action point is to run some tests, get some numbers, and then report back.

Action point

Running tests to measure server-side response time

First, I will investigate server-side performance.

I will re-run the tests while tweaking the following variables:

  1. Making an instance of a course that is large. I'm not sure what "large" means here: should it be the number of sequences in the course, the number of units in a sequence, the number of blocks in a unit, or all of them?
  2. Creating a course that is a CCX instance, with completion enabled. Regarding enabling completion, I am not aware of what this exactly means, is it the same as this? Isn't it a specific block type, so that we could consider it content-specific?
  3. Trying units that include one or more problems, and seeing how the number of problems in a unit affects loading time.

Since variables 1 to 3 are server-related, it should be sufficient to test them without running or using a browser: I can just make a request and measure the response time. I will also keep an eye on network speed (if not testing locally between the client and the server).
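A minimal sketch of that measurement, runnable from Node without a browser (the helper name is mine; requestFn would wrap whatever HTTP call hits the chromeless endpoint):

```javascript
// Time any promise-returning request function, e.g.
//   timeRequest(() => fetch(unitUrl))
// Returns the resolved value along with the elapsed wall-clock time.
async function timeRequest(requestFn) {
  const start = Date.now();
  const result = await requestFn();
  return { result, elapsedMs: Date.now() - start };
}
```

Running it repeatedly against the same unit and averaging would smooth out network jitter.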

That's not to say these variables should be ignored when running rendering tests; it's just that I don't think we need a browser at this first stage, since all we're concerned with is server response time.

Testing render-time:

I will give that some thought once we have numbers from the server-side tests; I assume the results of the first stage might influence the client-side environment and/or test circumstances.

Regarding the concern of analytics

Analytics: I have given this some thought, and I think the analytics events the learning MFE triggers should be trivial to handle, for two reasons:

  • All the code responsible for firing analytics events is contained in the learning MFE, and is easy to reason about and predict.
  • Since most of the events related to this experience are fired based on user interaction, this shouldn't influence them (if done correctly); the iframe events I showed above aren't related to any analytics events.

However, regarding events that might be fired server-side, I am not really sure. While going through the chromeless flow, I found an event that would be fired if the request comes from the mobile native apps, and there are also some bits related to completion that need to be accounted for.

Other concerns might be worth discussing in depth once we are sure of the root cause of the problem. One thing I want to add is that the question of when to prefetch or pre-render isn't a trivial one*. In an ideal world this would depend on many variables. One of them is making sure the current unit has loaded (which the learning MFE makes frequent calls to the iframe to determine). Other parameters could be: the length of the current unit's content, how long other learners spend on this unit on average, how good the internet speed is, how free the server is, how good the client's resources are, etc. All those parameters would then be fed into an ML model that calculates a time delta, and once the unit has loaded, we trigger prefetch/preload after that time delta. Of course, the time delta would best be calculated on a cron-job basis. But then, do we really need to calculate a time delta for each unit in each course? Maybe it's better to have one per sequence. It's probably worth bringing this to #working-groups:data.

* For example, there is a dedicated library called Guess.js, which is dedicated to answering a very similar question using data from Google Analytics.
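As a toy stand-in for the model described above (all parameter names and weights here are invented), the time delta could start out as a simple heuristic rather than a full ML model:

```javascript
// Hypothetical heuristic for when to trigger prefetch after the current
// unit loads: delay longer when learners typically linger on the unit or
// when its content is large, clamped to a sane maximum. Weights are made up.
function prefetchDelayMs({ avgSecondsOnUnit, contentLength }) {
  const base = 1000;                           // never fire immediately
  const lingerBonus = avgSecondsOnUnit * 10;   // longer stays => later prefetch
  const sizePenalty = contentLength / 100;     // bigger pages => later prefetch
  return Math.min(base + lingerBonus + sizePenalty, 30000);
}
```

Starting with something this simple would also make it easy to A/B test against the current behavior before investing in anything model-driven.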