Howdy! I've looked into this a fair amount, from my PhD thesis to more recent research, so I feel like I can answer pretty authoritatively. If you want to skip the reasoning, scroll down to the bold part.
You are correct that completion events cannot be relied on.
The amount of time between when someone loads one page and when they load the next page is not something you want to rely on either. You don't know whether someone has read a page for 20 minutes, or whether they read it for 10 minutes and then visited the restroom. When they loaded a problem and answered it an hour later, were they working on the problem or did they watch YouTube?
You can remove some obvious outliers (27 hours between page loads means they went away and came back the next day), but what amount of each outlier do you want to count? 10 minutes of it? 5% of it? None of it? Are you going to keep track of the actual length of the content, the grade level of the writing, and the learner's language proficiency when determining what's an outlier?
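To make the arbitrariness concrete, here is a minimal sketch of one such proxy: sum the gaps between consecutive page loads, dropping any gap over a cutoff. The 30-minute cap and the sample timestamps are hypothetical; picking that number is exactly the kind of arbitrary choice described above.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: gaps longer than this are treated as "away"
# and none of that time is counted. The value is arbitrary.
GAP_CAP = timedelta(minutes=30)

def proxy_time_on_task(page_load_times):
    """Sum gaps between consecutive page loads, dropping gaps over GAP_CAP."""
    loads = sorted(page_load_times)
    total = timedelta()
    for prev, cur in zip(loads, loads[1:]):
        gap = cur - prev
        if gap <= GAP_CAP:
            total += gap
    return total

# Illustrative session with an overnight outlier, like the 27-hour example.
loads = [
    datetime(2024, 1, 1, 9, 0),
    datetime(2024, 1, 1, 9, 10),  # 10-minute gap: counted
    datetime(2024, 1, 2, 12, 0),  # ~27-hour gap: dropped entirely
    datetime(2024, 1, 2, 12, 5),  # 5-minute gap: counted
]
print(proxy_time_on_task(loads))  # 0:15:00
```

Change `GAP_CAP`, or count a fixed slice of each outlier instead of none of it, and the same learner's "time on task" changes, which is the point.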
Video metrics also cannot be relied on. You do not know that the learner actually watched the video, only that the video played. Someone might have pressed "play", made a sandwich, and skipped to the next page. There are learners who do this for a variety of reasons, most of them involving the Treasured Green Checkmark of Warm Fuzzy Feelings. Other motivations for not actually watching a video include a disinterest in that particular topic, or the fact that they already know it and just want to get the green check and move on.
If it sounds like I'm saying that you can't measure time on task, that is correct. You cannot measure time on task.
You can measure proxies for time on task, but the relationship between those proxies and actual time is dependent on the individual and on their out-of-platform activities that day. If you tell me that one individual learner spent more time in a course than another learner, I'm not going to believe you, because you cannot measure that.
Now, if you tell me that the time on task for an entire course is, on average, longer than another course, then I'll believe you. We can make proxy measurements, and the huge variability in individual measurements can be averaged out pretty well at the whole-course level. I might believe it for cohorts within a course, depending on how large those cohorts are. "How large does it have to be?" is probably a research question. "How close is the average of a particular proxy to the average actual time?" is another good research question, one that will require tracking actual time spent in some reliable manner (not eye tracking or online proctoring).
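A toy simulation shows why averaging works at the course level but not the individual level. All the numbers here are made up: each learner's proxy is their true time plus large individual noise, so any two individuals are incomparable, yet the course means still come out in the right order.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the toy example is reproducible

def noisy_proxy(true_minutes):
    """A proxy measurement: true time distorted by large per-learner noise
    (hypothetical noise scale of 30 minutes, std. dev.)."""
    return max(0.0, random.gauss(true_minutes, 30))

# Hypothetical courses: true average time on task of 60 vs. 75 minutes.
course_a = [noisy_proxy(60) for _ in range(500)]
course_b = [noisy_proxy(75) for _ in range(500)]

# Individually, the 30-minute noise dwarfs the 15-minute difference,
# so comparing two learners is meaningless. The means recover the ordering.
print(mean(course_a) < mean(course_b))  # True
```

How large a cohort has to be before its mean is trustworthy is, as noted above, a research question; here the noise scale was simply invented.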
I'm not against having a proxy measure in Aspects, but it would need to be labeled as a proxy and not counted in hours or minutes. It is better to make an arbitrary decision and A/B test it than it is to make a decision on the basis of faulty data, and I would rather not have us mislead people.
Long answer, I know.