Dear Open edX users and developers,
tl;dr: The tracking module collects a lot of personal data that we do not need. Can Open edX turn it off by default?
We have run an Open edX instance for a Digital Security training platform for a few years now. Our target audience includes people with elevated security risks, like journalists and activists.
After upgrading to Juniper, we investigated what personal data we gather in the log files, and we were shocked. The tracking module collects the following data almost every time a user clicks anything in Open edX:
{"accept_language": "fr,fr-FR;q=0.8,en-US;q=0.5,en;q=0.3", "page": null, "event": "{\"GET\": {\"videoId\": [\"xxx\"]}, \"POST\": {}}", "context": {"org_id": "x", "course_id": "x", "course_user_tags": {}, "path": "/courses/course-v1:x", "user_id": xxxx}, "host": "learn.totem-project.org", "time": "2021-02-22T10:47:47.576859+00:00", "event_source": "server", "event_type": "/courses/course-v1:x", "ip": "xxx.xxx.xxx.xxx", "username": "x", "referer": "x", "agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:85.0) Gecko/20100101 Firefox/85.0"}
So here we find:
- IP address
- Browser
- Operating system
- Username
- Page the user came from (the referer)
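To make the list above concrete, here is a minimal Python sketch that blanks those fields out of a single tracking log line. It is only an illustration of which keys carry personal data: the function name `redact_event` and the sample values are made up, and it post-processes a log line rather than changing what the tracking module collects.

```python
import json

# Top-level keys from the event above that hold personal data.
PERSONAL_KEYS = ("ip", "agent", "username", "referer")


def redact_event(line: str) -> dict:
    """Parse one JSON log line and blank the personal fields (hypothetical helper)."""
    event = json.loads(line)
    for key in PERSONAL_KEYS:
        if key in event:
            event[key] = None
    # In the sample event, the numeric user ID sits under "context".
    context = event.get("context")
    if isinstance(context, dict) and "user_id" in context:
        context["user_id"] = None
    return event


# Made-up sample line with the same shape as the event above.
sample = json.dumps({
    "ip": "203.0.113.7",
    "agent": "Mozilla/5.0",
    "username": "alice",
    "referer": "https://example.org/",
    "context": {"user_id": 42, "course_id": "x"},
    "event_type": "/courses/course-v1:x",
})

redacted = redact_event(sample)
print(json.dumps(redacted))
```

Note that the browser and operating system both come from the single `agent` field, so blanking it removes both.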
Although it is possible to disable the tracking module (we wrote this Tutor plugin to do so), I think it should be disabled by default. Apart from the fact that our target audience is particularly vulnerable, in Europe we need to consider the General Data Protection Regulation (GDPR), specifically Article 6. This information is more than enough to identify a person, so it is covered by that law.
As far as we can tell, the tracking module is only necessary if you enable an optional component such as Insights. Many organizations do not seem to use Insights, and are thus gathering information they do not use. If you do not have an explicit use for the data and you don’t have explicit consent, you’re basically breaking the law by gathering and storing it.
I am posting this here because I am wondering if others have considered their compliance with the GDPR as well. I am also wondering if we might have missed some reason why the tracking module needs all this information (even if you don’t use Insights). I would like to propose that the tracking module be disabled by default for new Open edX installations, to prevent European organizations from accidentally breaking the law.