Open edX’s analytics pipeline was built around processing logs.
The tech stack was Hadoop, Hive, Luigi tasks, etc.
Setting up analytics is so complex, and has to be done perfectly to get it working, so that is one area where we could work towards simplifying the installation and setup.
As for the data, I think people just want to see some basic things:
How many users are enrolled in a course?
Who are the users?
How are they engaging with the content?
How are they engaging with the videos?
Which countries are they coming from?
What is the age group of the users?
What is their progress? And how are they performing on the questions and answers?
That sort of thing. I also think there should be an easier way to develop on top of it. For example, if we want to develop something in edX Insights, it requires knowledge of various things (hell, even its setup requires knowledge of Ansible, Ubuntu, and various other things).
Some of that data is found only in the logs, and rightly so: those events would be huge if stored in MySQL.
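To give an idea of how little is needed for a couple of the basic questions above, here is a rough Python sketch that reads tracking log lines (one JSON event per line) and counts enrolled users and video plays per course. It is only an illustration: the `summarize` function is hypothetical, and the event names (`edx.course.enrollment.activated`, `edx.course.enrollment.deactivated`, `play_video`) and the `context.course_id` / `context.user_id` fields are what I would expect to find in the tracking logs, so check them against your own logs first.

```python
import json
from collections import defaultdict


def summarize(lines):
    """Count enrolled users and video plays per course from tracking log lines.

    Assumes one JSON event per line, with `event_type` and a `context`
    object holding `course_id` and `user_id` (an assumption, verify locally).
    """
    enrolled = defaultdict(set)      # course_id -> set of enrolled user ids
    video_plays = defaultdict(int)   # course_id -> number of play_video events

    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of failing the whole run
        if not isinstance(record, dict):
            continue

        context = record.get("context") or {}
        course_id = context.get("course_id")
        user_id = context.get("user_id")
        if not course_id:
            continue  # event is not tied to a course

        event_type = record.get("event_type")
        if event_type == "edx.course.enrollment.activated":
            enrolled[course_id].add(user_id)
        elif event_type == "edx.course.enrollment.deactivated":
            enrolled[course_id].discard(user_id)
        elif event_type == "play_video":
            video_plays[course_id] += 1

    courses = set(enrolled) | set(video_plays)
    return {
        course: {
            "enrolled": len(enrolled[course]),
            "video_plays": video_plays[course],
        }
        for course in courses
    }
```

You could feed it an open log file directly, e.g. `summarize(open("tracking.log"))` (the file name is just a placeholder). Anything beyond simple counts like these (country, age group, progress) needs data from outside the logs, which is where the real complexity starts.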
I don’t have any special requirements or anything in particular that I use. I set up Insights and use it, and I have come to appreciate the complex piece of software that is available. These are just my 2 cents, based on what I’ve felt while using Insights.
I am thinking of developing a custom solution for it, but I don’t have the bandwidth to do so. I think there are already some talks on Tutor logs in this direction, but again, I haven’t had time to look into it deeply.
Yes, Figures is being updated to Maple, and we’re making changes so that Figures is easier to upgrade in the future by removing some fork dependencies we had in version 4 (Juniper) and earlier versions.