"Live" Development: Learner Backpack / Data Storage (done!)

For a long time I’ve wanted to have some sort of data store for edX learners. A small amount of information that we could access in a manner similar to HTML5 Local Storage, but that would be stored server-side and not tied to their browser. It was proposed (more than once) but never implemented, and I eventually gave up on it.

Last week I figured out how to do it.

I need to talk myself through it anyway, so I’m going to go through the development process here in the open for those who might be interested.

Requirements:

  • Store data.
  • Accessible via JS.
  • Readable by other JS on the page.
  • Data is tied to the learner, not to browser or session.

Constraints:

  • It has to be a relatively small amount of data. You’ll see why. I’m aiming for 100k per course.
  • It has to be basically invisible to the learners. If they can find it through the browser’s javascript console, that’s fine, but it shouldn’t make them wonder if something’s wrong with their course.
  • My courses run on edx.org. I can’t get XBlocks into the edX codebase - Harvard does not have a secret “in” for this - so I can’t do this by enhancing the vertical XBlock as I might want to. It all has to be with existing tools.

Method:

  1. Create a Custom Javascript Problem. These problems already store learner data in the “problem state”, which is stored on edX’s servers.
  2. Connect that problem with my existing javascript package, hx-js, so that I can create a few functions to more easily read and write to the problem state.
  3. Have hx-js iframe that problem into every page, using its XBlock URL. Give it a tiny little iframe way out of view, and make it invisible to screen readers. Because the iframe is coming from the same domain (indeed, the same course), communication between the frame and the page should be relatively straightforward.
  4. Hit “Submit” on that problem every time I want to store data.
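Step 3 could be sketched roughly like this. This is a minimal illustration, not the actual hx-js code: the function name, the frame id, the title text, and the exact styling are all assumptions, and the document object is passed in so the helper can be tried outside a browser.

```javascript
// Hypothetical helper (not from hx-js): adds one hidden iframe pointing at
// the backpack problem's XBlock URL.
function createBackpackFrame(doc, xblockUrl) {
  const FRAME_ID = "hx-backpack-frame"; // assumed id, purely illustrative

  // Only ever create one frame per page.
  const existing = doc.getElementById(FRAME_ID);
  if (existing) {
    return existing;
  }

  const frame = doc.createElement("iframe");
  frame.id = FRAME_ID;
  frame.src = xblockUrl;
  frame.title = "Learner backpack storage"; // assumed label

  // Keep it away from screen readers and out of the tab order.
  frame.setAttribute("aria-hidden", "true");
  frame.setAttribute("tabindex", "-1");

  // Tiny and far out of view, as described in step 3.
  frame.style.cssText =
    "position:absolute;left:-9999px;top:-9999px;width:1px;height:1px;border:0;";

  doc.body.appendChild(frame);
  return frame;
}
```

In a page this would be called as `createBackpackFrame(document, url)`, where `url` is the XBlock URL of the backpack problem in that course.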

Limitations and Potential Issues:

  1. Problem submission takes a second or two. The data storage rate will be slow.
  2. It will take a moment for the iframe to load. Student data will not be available until it does.
  3. The iframe should only be created once. If going to a new unit in the same subsection creates a second one, that’s a problem.
  4. XBlocks are not designed to store large amounts of data this way (as per a conversation I had with @dave). He mentioned that 2MB per learner was causing service outages. I’m hoping that 100k will be suitable.
  5. The learner will be able to see the problem in its regular location on the page, as well as on the Progress page.
  6. XBlock URLs are based on the problem ID. The less we have to put that in by hand, the better.

Here is the wonderful solution to #5 above. The storage/problem will be hidden in our syllabus quiz, as a “Click here to indicate that you have read and agree to the honor code” button. A tiny part of the problem state will be reserved for that part of the problem, and the other 99.9k will be available for other storage.

Q&A:
Q: Is this a horrible Frankenstein of a solution?
A: Yes! Yes it is.

Q: Will this work on mobile?
A: Mobile app, no. Mobile browsers, yes.

Q: Is it particularly brittle?
A: Strangely enough, probably not!

  • XBlock URLs are (to my understanding) the way the mobile app works, so they’re not going away any time soon. Hopefully @antoviaque and other folks working on Blockstore will tell me if there’s any potential interference there.
  • Custom JS Problems are listed as “full support”, so they’re not going away.

Q: Where are you going to deploy this?
A: First, on an internal test course. After that, it’s going into our standard boilerplate course so that we can use it in all our new courses.

Q: What are you actually going to do with this?
A: I’m glad you asked! We could…

  • Use it to show and hide content later in the course based on learner activity early on.
  • Keep short essays or goal-setting documents to show to our learners, so they can revise them.
  • Enable Custom JS problems to keep state across multiple instances, or to inherit state from a previous assignment.
  • Do lots of fun game-oriented stuff.

Progress!

This is the JSInput problem that will serve as our data store. Right now it doesn’t have anything fancy in it, it’s just a “do you agree” checkbox.

The next step will be making it easy for javascript developers to access the data, and creating a way to check the amount that’s stored before we send it to the server. No reason to let malicious people store enormous files, after all.

I feel like we’ll probably have to store two versions of the problem state in javascript: one that reflects the “original” state when the problem was opened, and one that reflects the “current” state after changes have been made. (Original and current are within the current session of use, not global.) That way when someone tries to store something that pushes the data over the size limit, we can reject it without clearing everything that’s already there. Edit: Admittedly, someone malicious could go in and alter the “original” state too, so I’ll add a fallback that reverts to a blank problem state if the “original” is too big.

We can’t go back to the edX data store to check the “original”. It’s only passed once, when the problem is instantiated.
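Here is a rough sketch of that original/current idea. Everything in it is an assumption for illustration: the names, the choice to measure size as the UTF-8 byte length of the JSON text, and the exact limit (taken from the 100k target above).

```javascript
const MAX_BYTES = 100 * 1024; // the 100k-per-course target

// Approximate the stored size as the UTF-8 byte length of the JSON text.
function sizeOf(state) {
  return new TextEncoder().encode(JSON.stringify(state)).length;
}

// Hypothetical guard: keeps the "original" state and a separate "current" copy.
function createStateGuard(originalState) {
  // Fallback: if the state we were handed is missing or already too big,
  // start from a blank state instead.
  const usable = originalState && sizeOf(originalState) <= MAX_BYTES;
  const original = usable ? originalState : {};

  // Working copy, so changes never touch the original.
  let current = JSON.parse(JSON.stringify(original));

  return {
    get: (key) => current[key],
    set(key, value) {
      const candidate = { ...current, [key]: value };
      if (sizeOf(candidate) > MAX_BYTES) {
        return false; // reject the write, keep everything already stored
      }
      current = candidate;
      return true;
    },
    getCurrent: () => current,
    getOriginal: () => original,
  };
}
```

The point of building a candidate first is that an oversized write gets rejected without disturbing what is already there.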

Further progress!

  • Get, Set, and Clear commands have been implemented.
  • The size filter is working, at least inasmuch as it’s rejecting things that are clearly enormous. Tested with Array(1000000).
  • The setter successfully clicks the “Submit” button and stores things in the problem state.

What’s next?

  • HX-JS now needs to create the iframe.
  • To make that easier, I need to edit the XML and give this thing a better filename than f83995f72c9f4e9b95e40802473b4fac.
  • Possibly an autosave function. Compare the current state to the original state every…minute? And if they’re different, save it.
  • Might throw some LZ compression at this as well. When the data is less than 2-5k, it might make it larger, but we might save some on the higher end.

It just occurred to me that you could get data from an entirely different course with this. As long as you’re still registered for that course (and there are no visibility blockers), XBlock URLs will still show it to you. Neat. If the course were running on a different Open edX server, we’d have to do this with postMessage() instead of just looking at parent.document, but it could still work within certain parameters.

Wow, this is weird and cool. Thanks for sharing your process here!


Yet Further Progress!

That commit does the following:

  • HX-JS now creates an iframe to the backpack problem.
  • If it loads properly, it makes the Get/Set/Clear functions available to the page in general.
  • If not, it still creates those functions but they just return null.
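The “real functions or null-returning stand-ins” behaviour might look something like the sketch below. It is not the code from the commit: `publishBackpackApi`, the `backpack` object with its `get`/`set`/`clear` methods, and the name `hxGetData` are assumptions made for the example.

```javascript
// Hypothetical wiring: `backpack` stands in for whatever the loaded problem
// exposes; pass null or undefined when the iframe failed to load.
function publishBackpackApi(target, backpack) {
  const ready = backpack !== null && backpack !== undefined;

  // Always define the functions so callers never hit "not a function".
  // When the backpack is missing, every call simply returns null.
  target.hxGetData = ready ? (key) => backpack.get(key) : () => null;
  target.hxSetData = ready ? (key, value) => backpack.set(key, value) : () => null;
  target.hxClearData = ready ? (key) => backpack.clear(key) : () => null;

  return ready;
}
```

In a page, `target` would be `window`, so other scripts can call the functions without knowing whether the backpack actually loaded.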

I also switched up the Boilerplate Course so that the problem has the filename backpack.xml. The Course Template Builder now includes the backpack, so anything you make starting from that base will have it.

Programming is the art of reducing the number of bugs in your code.

Previous bug: calling parent.whatever within a function invokes that at the same level at which the function is called. When you boost a function up through some iframes, it looks for the current parent. So all the functions I brought up from the innermost frame (Get/Set/Clear) need to refer back to the original parent frame. I think I’ve got that fixed now.

Current bug: I can only set the state once. Later calls to hxSetState do submit the problem, but the state is not properly updated. Hmm.

I am now in a fascinating state where the first time I call a function, it runs normally, and the second time I call it, I’m getting an older version. I know because I added a ton of console.log() calls to the function, and those all fire the first time but not the second time.

Honestly I just want to know where it’s getting the old version.

Clearing my cache didn’t fix the problem.

Calling hxClearData runs into the same issue. In fact, for all I can tell it’s calling the same function. Or just, like, only returning “true” and then moving on with other stuff.

Testing out the function calls at the innermost iframe shows me they’re still working there. It’s the connection between them that’s the problem.

I think it’s between hx-js in the outermost window (the edX page) and the iframe in which I load the problem via XBlock URL. hx-js publishes the functions, but it only does that once, when the iframe initially loads. I need it to happen every time.

I should probably have the inner window fire an onload thing, huh. Ok. Let’s see if I can get that working. I’ve got some pieces from the Qualtrics Grader Problem that I think I can use for this.

Oh, postMessage(), why didn’t I use you in the first place?

Now instead of listening for the (first, duh on me) time the backpack loads, I’m picking up every time it loads. The functions are getting reconnected properly and I can store things appropriately.
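The pattern is roughly the one sketched here. The message shape (`{ type: "backpack-ready" }`) and the function names are assumptions for illustration, not what hx-js actually sends.

```javascript
// Hypothetical listener factory: calls onReady every time the backpack frame
// announces itself, not just the first time.
function makeReadyListener(expectedOrigin, onReady) {
  return function handleMessage(event) {
    // Ignore messages from other origins or with an unexpected shape.
    if (event.origin !== expectedOrigin) return false;
    if (!event.data || event.data.type !== "backpack-ready") return false;

    // event.source is the window that sent the message (the backpack frame),
    // so the page can reconnect its functions to the freshly loaded copy.
    onReady(event.source);
    return true;
  };
}

// In the outer page (browser only):
//   window.addEventListener(
//     "message",
//     makeReadyListener(window.location.origin, reconnectFunctions)
//   );
//
// In the backpack frame, each time it loads (browser only):
//   window.parent.postMessage({ type: "backpack-ready" }, window.location.origin);
```

Because the listener stays attached, every reload of the frame after a submission triggers a fresh reconnect.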

Huzzah!

More things to do:

  • Do I want other functions? Should I have an hxClearAllData() and an hxGetAllData()?
  • There’s a potential “simultaneous editing” issue for someone with two browser windows open on two different pages. Do I care?
  • Updating the ReadMe files.
  • Probably an autosave.
  • That LZ compression I mentioned.

Added a GetAllData function.

Updated the ReadMe.

Added compression. It seemed a little broken at one point, but I think that’s because there were old cached versions of some files lying around. Because things are loaded within iframes within iframes, that’s going to be a major concern any time this is updated. We’ll want to update as rarely as possible so learners don’t have to clear their cache.

I realized as I was doing this that auto-saving wasn’t actually necessary, because any use of hxSetData() saves. It’ll be on developers who hook into this to keep a good cadence with their storage - not too often, but not too rarely.

I think we may be done! Here are the links to the pieces of this:

Backpack problem and instructions:

HX-JS - get your updated version:

There are still some oddities and potential issues. Some pages send a “Backpack ready” message twice, and I don’t know whether that’s actually problematic. I don’t know whether certain data might be stored in a way that breaks the state - I vaguely remember needing to escape percent signs when storing problem states, and with the compression you can’t even tell whether you’re storing a percent sign. Gonna have to watch for glitches.

Comments and questions are welcome.

HX-JS and the Learner Backpack problem are available under the MIT license. Feel free to use them in whatever way you want.


Addenda:

  • One issue with this solution comes if someone does not submit the assignment. With no state stored for the learner, edX never calls the JSInput setState() function, which is where our code makes the get/set/clear functions available to the world. This may not be the worst thing. Preventing learners from accessing parts of the course until they agree to the honor code is something I’m ok with.
  • I had some issues with the compression, which I think were related to Javascript storing all strings in UTF-16 and whatever’s storing it on the edX side using something else. I switched the compression to compressToEncodedURIComponent() instead, which is not nearly as compressy but which should be more reliable.
  • Added the ability to input a whole object worth of data at once, rather than just one key/value pair at a time. This makes it possible to “batch” data input so that the problem doesn’t have to refresh as often.
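The batching in the last bullet can be pictured with the sketch below. The names and the `submit` callback are assumptions for illustration; the real function lives in hx-js and may well be shaped differently.

```javascript
// Turn either ("key", value) or ({ key: value, ... }) into one plain object.
function normalizeEntries(keyOrObject, value) {
  const isPlainObject =
    keyOrObject !== null &&
    typeof keyOrObject === "object" &&
    !Array.isArray(keyOrObject);
  return isPlainObject ? { ...keyOrObject } : { [String(keyOrObject)]: value };
}

// Hypothetical setter: merges the entries into the state, then submits once.
// `submit` stands in for whatever clicks the problem's Submit button.
function setData(state, keyOrObject, value, submit) {
  Object.assign(state, normalizeEntries(keyOrObject, value));
  submit(state); // one submission no matter how many keys were passed
  return state;
}
```

Storing three values as one object costs a single submission (and a single problem refresh) instead of three.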

I’m working on the first application of this. I’ll post here when it’s ready. Who knows, maybe we’ll even fold it into the next version of Studio Advanced.

After some substantial work, I have created the first application for this: the Journaling Assignment!

Learners can write in rich text (thanks to the summernote editor), it gets saved in the backpack, and they can submit it for a grade. The current version just checks submission length, so this is “required activity” rather than “graded essay” but we’ve already got several courses where this would be useful.
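A length check on rich text could be as simple as the sketch below. This is an illustration under assumptions, not the grader we actually use: the function names are made up, the minimum is whatever the course sets, and stripping tags with a regular expression is a crude approximation that is only good enough for counting characters.

```javascript
// Roughly count the visible characters in an HTML string.
function plainTextLength(html) {
  return html
    .replace(/<[^>]*>/g, " ") // drop tags
    .replace(/&nbsp;/g, " ") // treat non-breaking spaces as spaces
    .replace(/\s+/g, " ") // collapse runs of whitespace
    .trim().length;
}

// Hypothetical "required activity" check: long enough counts as done.
function meetsMinimumLength(html, minChars) {
  return plainTextLength(html) >= minChars;
}
```

An editor that contains only empty markup, such as `<p><br></p>`, counts as zero characters here, so an untouched journal would not pass.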

There’s one real weird thing about this: you can’t have the summernote editor in the same component as the jsinput problem. If you do, the getState and setState functions for the jsinput problem don’t fire. If you have any idea why, suggestions are definitely welcome - we’re stumped over here. Gonna try to fix this one, so if you want to hold off on implementing this I don’t blame you.

However, I also got something nice on the way to this! If you use hx-js, you can also use the editor on its own without the assignment. Just make a div with the class hx-editor and it’ll automatically get converted into a rich text editor. The data-saveslot attribute will tell hx-js where to store things, so you can show students their own writing again later on.

Quick update: I’m just about done with the journaling assignment. My last hurdle is a focus issue in Safari that comes up while auto-saving. I think I know what’s causing it, but I’m still working on fixing it.

Welp, it’s broken.

No idea what happened. It just stopped working on edx.org one day. Even Partner Support didn’t know what happened, because no one keeps a changelog, even internally. My best guess is that some iframe security setting changed, but who knows.

It was in a course that was going live in a week. We had to strip it out.

Immensely frustrating.

Dang. I’m new to edX and am trying to figure out how to access learner data in a question, so I’m very interested in this.

I don’t understand why you’ve had to use the iframe.

Since you don’t have to deal with CORS, could you use JS to get and set the data on the server using an API call?

Or fill localStorage if it’s empty or replace it if the remote “latest update” timestamp is < the locally stored “latest update” timestamp?

To my knowledge there are no API calls I can make to an edX instance to store data in the server, or to retrieve a learner’s answer to a question.

JSInput questions are different: there’s a Javascript object (part of the “problem state”) that is created when the problem loads, and sent to the server when the problem is submitted. Whatever we store in the problem state for this particular kind of problem, we can retrieve later.
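As a rough picture of what that looks like inside the problem’s own page, here is a minimal sketch. It assumes the state travels as a single JSON string, and the shape of `problemState` is made up for the example; the real backpack problem has more going on.

```javascript
// Assumed shape: one flag for the honor-code checkbox, plus the stored data.
let problemState = { agreed: false, backpack: {} };

// Called when the problem is submitted: hand the state over as a string.
function getState() {
  return JSON.stringify(problemState);
}

// Called when the problem loads with a previously saved state.
function setState(savedState) {
  problemState = JSON.parse(savedState);
}
```

Anything placed in `problemState` before a submission comes back through `setState` the next time the problem loads for that learner.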

Does that help?

Yes that probably helps. Seems like I can’t readily do what I want. I was hoping to be able to change or add to the question content e.g.

"Good job Paul, after a rough start you’ve gotten the past 4 questions right. "

For the JSInput question is the “problem state” object empty or is it initialized with any data?