Cairn: scalable, real-time analytics for Open edX, now free for everyone!

I am pleased to announce that Cairn, the scalable, real-time analytics solution for Open edX, is now free for everyone. You can get started today with:

tutor plugins install cairn
tutor plugins enable cairn
tutor local launch

For further instructions, please check the project README: GitHub - overhangio/tutor-cairn: Scalable, real-time analytics for Open edX

Cairn used to be a commercial-only plugin that was part of the Tutor Wizard Edition. This subscription model has supported me for the past few years and allowed me to focus on maintaining the free parts of Tutor. With the recent acquisition of Overhang.IO by Edly, it no longer makes sense to keep this plugin behind closed doors. If you were a subscriber of the Tutor Wizard Edition, you should know that I am extremely grateful for your support, and I hope we can keep working together :slight_smile:

This is a change that I’ve been looking forward to ever since I created the Wizard Edition… It’s always been a bit awkward to publish open source plugins but make them only available to commercial subscribers. I am exceptionally proud of Cairn and the possibilities that it creates for the Open edX platform, and it feels great to finally share the source code with the rest of the world.

Since Cairn was created, the OARS initiative was kickstarted by Axim and community members to build an Open edX analytics stack on the same technical foundations as Cairn. Our hope is that we can work together to avoid the duplication of efforts. There are some components of OARS that Cairn will re-use, such as the authentication system. Others, such as the learning record store, introduce extra complexity without immediate benefits to the end user and should probably be kept as optional components (IMHO). In any case, I am super excited by the possibilities that a powerful, free analytics stack offers for Open edX.


Happy and excited to read this. I remember a group meeting where an open question was floated about the various analytics tools available in Open edX. Cairn was one of them, and you were asked whether you were willing to make it open source. You said no, but I was sure you would make it open source at some point. This is really good news for the community: Cairn is open source now! But it certainly puts a few clouds in the sky, like:

  1. What will happen to OARS?
  2. Is the Wizard Edition commercial or open source?

Congratulations @regis on making Cairn available.

I have a few questions and maybe you can shed light on what is possible in our situation.

We have been using Insights for a long time. Our current Tutor installation converts the tracking logs into the Raw format expected by Insights. We also rotate and gzip our tracking logs each hour when they reach a certain size. Finally, we transfer those tracking logs to S3.

If I understand the instructions in the Cairn README correctly, all tracking log events must be in a single file, $(tutor config printroot)/data/lms/logs/tracking.log. Am I right?

I can always bring our tracking logs back from S3 to our local server, that’s not an issue. But is there a tool or a script to convert these tracking logs into the format expected by Cairn?

Obviously, this is a big deal for us if we were to switch from Insights to Cairn because we would like to keep our historical data from the last 7 or 8 years. I would have the same questions for OARS to be honest. But I know they plan to have a tool to convert the event logs to xAPI.

Any suggestions?

@regis THANK YOU for sharing the great news with us.
We use Cairn and have been able to produce very effective data results for various clients.


Right now our focus is on supporting existing Cairn users, so this is where we are going to invest our efforts. But from discussions with @BrianMesick, @jill and @e0d there is a strong consensus that both projects need to merge. How exactly this will happen in the future is still an open question at this point.

For all intents and purposes, you can now consider that the Wizard Edition no longer exists. But when it existed, the plugins that it included were open source – yes, turns out you can sell open source software :slight_smile:

Not exactly @sambapete. In Cairn, logs are collected (by Vector) in real time from the Docker daemon, which outputs logs to stdout.

Yes. You want to read this part of the Readme: GitHub - overhangio/tutor-cairn: Scalable, real-time analytics for Open edX

When Cairn is launched for the first time, past events that were triggered prior to the plugin installation will not be loaded in the data lake. If you are interested in loading past events, you should load them manually by running:

tutor local start -d cairn-clickhouse
tutor local run \
  --volume="$(tutor config printroot)/data/lms/logs/:/var/log/openedx/:ro" \
  --volume="$(tutor config printroot)/env/plugins/cairn/apps/vector/file.toml:/etc/vector/file.toml:ro" \
  -e VECTOR_CONFIG=/etc/vector/file.toml cairn-vector

The latter command will parse tracking log events from the $(tutor config printroot)/data/lms/logs/tracking.log file that contains all the tracking logs since the creation of your platform. The command will take a while to run if you have a large platform that has been running for a long time. It can be interrupted at any time and started again, as the log collector keeps track of its position within the tracking log file.

To ingest a very large amount of logs, I suspect that you will have to massage this CLI a bit, and you might want to adjust the log parsing format. But I’m sure you get the idea :slight_smile:
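For logs that were rotated, gzipped and shipped to S3, one possible first step is to rebuild a single file from the downloaded archives. The following is only a rough sketch, not something that ships with Cairn: the function name `merge_tracking_logs` and the archive naming are hypothetical, and it assumes that the archive file names sort chronologically and that each archive holds one event per line. It does not convert events between formats.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cairn or Tutor): rebuild one tracking log
# from rotated, gzipped archives that were downloaded from S3 beforehand.
# Assumption: file names sort chronologically, e.g. tracking.log-2023010100.gz.
merge_tracking_logs() {
    src_dir="$1"    # directory that holds the downloaded *.gz archives
    out_file="$2"   # merged file to create (any existing content is replaced)

    : > "$out_file" || return 1
    # Globs expand in sorted order, so archives are appended oldest first.
    for archive in "$src_dir"/*.gz; do
        [ -e "$archive" ] || continue   # no archive found: keep the output empty
        gzip -dc "$archive" >> "$out_file" || return 1
    done
}

# Example call (paths are placeholders):
#   merge_tracking_logs /data/s3-tracking-archives /data/cairn-import/tracking.log
```

Writing the merged file to a separate directory avoids overwriting the live tracking.log. Assuming the Vector file configuration reads tracking.log from the mounted directory, as the README excerpt suggests, the first --volume of the command above could then point at that directory instead of $(tutor config printroot)/data/lms/logs/.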


What is the minimum required Tutor version? I went to test it on a Tutor 14.0.3 system and got "Error: No plugin found at cairn" (and tutor plugins update doesn’t work on this version either).

Cairn works on Nutmeg, but you should then install it with pip: pip install 'tutor-cairn<15.0.0'. The tutor plugins install command was only introduced in Olive.
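To make the version split concrete, here is a small sketch that picks the install command from a Tutor version string. The function name `cairn_install_command` is hypothetical, and the mapping it encodes (Tutor 14 for Nutmeg, Tutor 15 and later for Olive onwards) is an assumption inferred from the 14.0.3 report and the `<15.0.0` pin above. Nothing was said in this thread about releases older than Nutmeg, so the sketch refuses to guess for those.

```shell
#!/bin/sh
# Hypothetical helper: print the Cairn install command for a Tutor version
# such as "14.0.3". Assumes Tutor 14 = Nutmeg and Tutor 15+ = Olive or later.
cairn_install_command() {
    major="${1%%.*}"   # keep the text before the first dot
    case "$major" in
        ''|*[!0-9]*)
            echo "unrecognized version: $1" >&2
            return 1
            ;;
    esac
    if [ "$major" -ge 15 ]; then
        # "tutor plugins install" exists from Olive onwards
        echo "tutor plugins install cairn"
    elif [ "$major" -eq 14 ]; then
        # Nutmeg: install the pinned package with pip instead
        echo "pip install 'tutor-cairn<15.0.0'"
    else
        echo "no install command was given in this thread for Tutor $1" >&2
        return 1
    fi
}

# Example: cairn_install_command "$(tutor --version | awk '{print $NF}')"
```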

@regis We’re interested in having our openedx tracking logs pushed out to S3. Can Cairn read from S3 or does the tracking log file need to be local to the Vector container?

If yes, could you point us to any documentation concerning this S3 configuration for the platform and Cairn?

We too are interested in migrating all our data since 2013 to Cairn. If you hear any update on the merging of OARS and Cairn would you post it here? We want to make sure that we use the latest analytics open-source solution for Open edX. Thanks for all your work in this effort Regis!

Do we need to focus on using MinIO for storing tracking logs?
Kubernetes deployment - Tutor documentation (overhang.io)

cc @sambapete @traek728 @becdavid

8 posts were split to a new topic: “Error: plugin ‘cairn’ is not installed.”

Is there any updated plan and timeline for the future of Cairn and Aspects/OARS?

@Jeff_Cohen
I’ve been talking with @BrianMesick about this. It appears that the new analytics system coming out is Aspects (previously named OARS) and documentation can be found here.
https://docs.openedx.org/projects/openedx-aspects/en/latest/

Version 1 Product Requirements are here. I think they were targeting an October or November release but that may change.
https://openedx.atlassian.net/wiki/spaces/OEPM/pages/3834281985/DRAFT+Aspects+V1+Product+Requirements

Thank you @Zachary_Trabookis. Does this mean Cairn will no longer be developed? Or will that continue to receive investment and give folks an option between the two?

cc @regis

@Jeff_Cohen
Aspects has integrated components of Cairn like ClickHouse, Vector, and Superset and added xAPI events. Not sure if Cairn would be supported after Aspects is released.

We considered using Cairn now but I believe V1 for Aspects will be available this Fall (October-November). I’m not on the development team for Aspects but rather a consumer.

Cairn is currently used in production by many Open edX platforms, so we are very much committed to its maintenance and improvement. At this point, the fact that Aspects cannot access stateful data from MySQL/MongoDB remains an issue for our users. We are still committed to bringing Aspects to feature parity, but we haven’t invested in Aspects yet for lack of time.