Is Meilisearch a viable upgrade alternative to OpenSearch?

Please note that the following is a request for comments/input. This is not a decision or definitive recommendation on Axim’s part.

DEPR-170 covers a move from Elasticsearch to OpenSearch, which was also discussed in these forums:

@regis suggested that we might use this opportunity to remove Elasticsearch (and by extension OpenSearch) altogether, pointing out their extremely high memory requirements:

Unfortunately, MySQL full text performance is lackluster, both in terms of result quality and performance when combining a field to be searched with other indexes (like a course). So we still need some search engine to fill that gap to maintain feature parity.

In a recent comment, @jmbowman stated his belief that Meilisearch would be a more promising long-term alternative:

From a future-looking perspective, I feel that Meilisearch would be a better search engine to integrate with. It’s MIT-licensed, blazing fast (implemented in Rust), much less resource-intensive than Elasticsearch, already fairly competitive with Algolia in many respects, has solid commercial support, and has pretty good Python support. There isn’t an authoritative Django package for it yet, but there are several packages and blog posts outlining how other people have used them together. It would be a gamble, but frankly it feels like it has more momentum than OpenSearch.

I’m unfortunately not likely to be able to help much with this for a while, so it’s going to be up to other people to pick a path forward. I just wanted to articulate that while OpenSearch looks at first like the easiest/safest path forward to solve the licensing problem, it’s actually harder than it looks and may not really set up Open edX for success in future search improvements.

I have not tested anything locally, but based on various blog posts, that resource usage difference is massive. This one shows Meilisearch using 1/10th the memory of Elasticsearch, this one at around 1/5th. (There’s even one claiming a 1/50th memory usage, though I suspect misconfiguration in that one.)

Elasticsearch to OpenSearch isn’t a drop-in replacement, and we will have to modify some code to use newer libraries. If we’re going through that effort anyway, should we consider a wholesale move to Meilisearch? It’s a less mature codebase, but it seems to be friendlier out of the box, it has compelling performance characteristics, and it has a much smaller memory footprint. It seems to be actively maintained and developed by an open-source friendly company. By the very unscientific measure that is GitHub stars, it also seems to have more developer excitement than OpenSearch.

The question is this: Is it worth delaying the Elasticsearch → OpenSearch migration in order to do discovery work around Meilisearch? Doing so might give us a more compelling long term search engine for the project, but it would further delay this long-running DEPR effort, and possibly threaten the timeline for new Studio course content search functionality currently scheduled for Redwood.

FYI to @feanil, @Diana_Huang, @braden, and @jristau1984 who have all recently commented on that DEPR.

We haven’t done a full rollout yet, but tests of an older OpenSearch version with the existing Elasticsearch libraries work fine. So it seems possible to stop running Elasticsearch servers before having to change the library, which could ease the transition. To do the library transition, we’d want to introduce another search engine and also overhaul the weird way search engines are chosen from settings, to allow an index-by-index changeover.
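To make the index-by-index idea concrete, here is a minimal sketch of what per-index engine selection could look like. It is purely illustrative: the `SEARCH_ENGINES` setting, the `engine_for_index` helper, and the index names are all hypothetical and do not correspond to anything that exists in edx-search or edx-platform today.

```python
from typing import Dict

# Hypothetical settings shape (an assumption, not an existing Open edX setting):
# one default engine plus optional per-index overrides.
SEARCH_ENGINES: Dict[str, str] = {
    "default": "elasticsearch",
    "library_content": "meilisearch",  # hypothetical index moved over early
}


def engine_for_index(index_name: str, engines: Dict[str, str] = SEARCH_ENGINES) -> str:
    """Return the engine configured for an index, falling back to the default."""
    return engines.get(index_name, engines["default"])


# One index has been switched; everything else still uses the default engine.
print(engine_for_index("library_content"))
print(engine_for_index("courseware_content"))
```

With something along these lines, each index could be rebuilt on the new engine and switched over independently, instead of flipping every search at once.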

Different searches in the code are tied to ES to different degrees: course content search is completely tied to the ES format, and that tie is scattered across multiple codebases. Other, more modern searches at least keep all their assumptions together in one place.

Fixing courseware search would be sufficiently annoying that I think it would be better to just make a separate library for Meilisearch (or whatever) and then redo courseware search on it, rather than carefully juggle the way it is split between the library, LMS, and CMS now.

See https://openedx.atlassian.net/wiki/spaces/AC/pages/3884744738/State+of+edx-search+2023 for various search stuff dug out last year.

On an operational note it only took a day to rebuild the course content index for all active edx courses using the management command, so I think we should go with the “rebuild your indices using these commands” route for any migrations rather than worry about some exotic index to index magic.

Personally, I have not liked Elasticsearch any time I have had to use it, though it is relatively harmless here. I’m sure OpenSearch is about the same. :laughing: But that certainly makes me inclined to like a different search solution.

I believe it is! I’ve been looking at Meilisearch myself as a candidate replacement for Elasticsearch in Open edX, and I’m excited by this prospect. Elasticsearch is the biggest memory hog (and climate changer) in Open edX. If we were to switch to OpenSearch, I believe we would never invest the energy to migrate again to Meilisearch.

That being said, it’s not a trivial task to isolate the Elasticsearch clients in Open edX. ES is used in multiple apps, including edx-search and the forum. So it’s a complex project, but I think the reward would be well worth the effort.

If anyone wants to explore this in detail or hack on it with me:

  • I just created tutor-contrib-meilisearch, which will add Meilisearch to your Tutor Nightly devstack and configure edx-platform to connect to it.
    • With only a few small documents in each index (i.e. the typical devstack case), it uses < 20 MB of memory, whereas Elasticsearch uses 1,380 MB :joy:
  • This PR demonstrates indexing library content in Studio. No search nor courseware functionality tested yet. You can do basic searches using the Meilisearch UI though - see screenshots.

I don’t have much to add, but I’m thrilled that this discussion is happening. I’ve used Meilisearch in a Rust-based project before; although the number of use cases was limited, I had a good experience working with it. I believe that performance is definitely Meilisearch’s strongest point, and it’s worth comparing it against ES and OpenSearch during the discovery.

Braden’s POC looks very promising, great work!

Additionally, Rust-based solutions mature every year and continue to surprise with their performance characteristics compared to more popular choices. They also provide safety and easy integration with other technologies, including Python, for example through PyO3, by building native Python wheels. In my biased opinion, the community could benefit from including such technologies in the core offering, potentially making OeX technologies appear more prestigious.
I would be happy to join and participate in this Meilisearch initiative.

@ashultz: Thank you for the background info! That wiki page is amazing.

@braden: That proof-of-concept looks great!

We have a couple of in-Studio content search features that will be in development very shortly–one for courses and one for the new libraries experience. I’d like to propose that we implement these using Meilisearch, to try it out on something real. Some rough thoughts for an ADR:

  1. We keep the Meilisearch-specific code isolated to a single module, so it’s relatively easy to swap out later if this experiment doesn’t pan out.
  2. All use of Meilisearch would be off by default in Redwood, giving folks until at least Sumac to plan for the additional infrastructure required.

Once we try it out and have some experience with it, we can decide whether to alter the DEPR to convert to Meilisearch. We don’t want to end up in a final state where we’re running both indefinitely, so if things don’t go well with Meilisearch, we’ll convert the Studio content search functionality over to OpenSearch for consistency with the DEPR.

Does that sound reasonable to folks?

The plan to experiment with Meilisearch sounds very reasonable. The most critical point for me is

We don’t want to end up in a final state where we’re running both indefinitely

Which you cover in the case that the experiments with Meilisearch go wrong. Should it go well and be as good as it looks, what could be a rough plan for moving all search to it?

@dave The plan sounds great. I will continue to develop my prototype along those lines. I already expanded it to index courseware (in Studio).

Shortly after Redwood, we should have enough information to know how we want to proceed. If Meilisearch pans out, we make a new DEPR, and make Meilisearch a baseline requirement for Sumac. We start porting over existing Python code that currently uses Elasticsearch to use Meilisearch instead in the run up to Sumac. Elasticsearch is likely still around as an option for the Teak release, but is dropped entirely after Teak is cut.

The most annoying sticking point is likely to be the forums–particularly the cs_comments_service written in Ruby. Next week, I’ll work on a long overdue ADR for re-implementing that service’s functionality in Django. The search part of that re-implementation would likely not start until after Redwood is cut, so we should have a direction by then.

One thing that is worth noting about Meilisearch is that it seems to only allow high availability mode in their hosted cloud service. In their comparison matrix, under deployment it shows high availability as “available with Meilisearch Cloud”.

Overall it seems like a promising product, but for anyone running a non-trivial deployment of Open edX it would force them into using their hosted product. In general that may not be a show-stopper, but I can imagine there are cases where that would prevent someone from using Open edX.

@blarghmatey: Thank you for the info! That’s definitely concerning. :slightly_frowning_face:

The most recent ticket I can find about it is here:

The most recent activity was in November though. They have a prototype, but it’s rough and has some big flaws, the most obvious of which is: “If the leader crashes, there is no re-election, the cluster no longer works, but the followers can still answer search requests. We are still thinking about what we could do about this.”

I believe this is the draft PR for the prototype:

I wrote a comment on their Discord channel for this topic.

Good point. One thing that’s important to distinguish is whether they are purposely keeping HA out of the open source product as a business strategy (as many “open source” database/search vendors do these days), or they just haven’t developed it. From what I can tell, it’s the latter - they fully intend to support this feature in the open source project and would welcome contributions to do that, but it has been repeatedly delayed / de-prioritized. (The PR that Dave linked to is from a Meilisearch employee.)

So I am optimistic that this could be resolved in the future, but it seems like nobody should count on that anytime soon.

It’s also worth noting that (if I understand correctly) the nominal “high availability” that they advertise on their cloud offering is not replication-based but instead “we ensure the high availability of your project with Kubernetes technology, redundant volumes, and regular backups. In the event of an error, a Meilisearch server takes only a few milliseconds to restart” (source). So Open edX operators may not be able to use replication in the immediate future, but can certainly use those strategies. What’s more, because Meilisearch is so lightweight, you can deploy a separate instance per index, so that (for example) your Studio courseware search doesn’t go down at the same time as your forum search.

Q: Would this be a deal-breaker?

Q: Is ElasticSearch ever on the “critical path” for learning? i.e. learner account creation, logging in, course purchasing/enrollment, viewing courseware, submitting assignments/exams/problems, posting in the forum, viewing grades, etc.

I have expanded my prototype so it can demonstrate full end-to-end search functionality from backend to frontend. It also includes courseware now, not just v2 libraries. It also includes tag data from the new tagging system. Pardon the ugly UI.

For those following this thread, I am planning to proceed with developing the new Studio search functionality using Meilisearch, as discussed (as an experiment - the feature will be off by default, and so Meilisearch won’t be required unless you choose to opt in and help test it out; later we will evaluate it and make a decision about what path to take for Sumac).

I’ve added an ADR to the PR and it’s ready for review/merge: Index Studio content using Meilisearch [FC-0040] by bradenmacdonald · Pull Request #34310 · openedx/edx-platform · GitHub

Recognizing that there has already been substantial investment in the adoption of Meilisearch as the de facto search backend for edx-platform, I wanted to follow up with this topic.

I had not been following this work closely, but after revisiting the conversations around high availability/redundancy/failover in Meilisearch, it seems that there has still been no real progress in that direction. All of the GitHub issue and discussion threads peter out in the same way, noting the challenge of implementing distributed consensus (e.g. Paxos, Raft) and the lack of high-quality libraries in Rust to handle it. All of the recommended methods of handling failures rely on persistent disk and restarting the process, which fundamentally fails to address high availability and shared-nothing architectures. Instead, it forces you to have a distributed storage layer (e.g. NFS, GlusterFS, Ceph) or some other means of data replication to be able to handle server failures, disk corruption, etc.

I understand that the majority of edX installations, and the primary mode of operation supported by e.g. Tutor, run on a single server or virtual machine, but for anyone not operating in that fashion Meilisearch continues to pose an operational risk. Granted, the search functionality is not mission critical for the use of edx-platform, but for anyone who operates the system, a failure in any element of it can still degrade trust in the system or in the operator’s ability.

I recognize that there is no perfect answer to the challenge of search, and I do like the performance promises that Meilisearch offers. That being said, it seems that Typesense would be a more appropriate alternative? It offers similar performance benefits, a longer development history with wider adoption, and an out-of-the-box HA story (Algolia vs Elasticsearch vs Meilisearch vs Typesense Comparison). Looking at the comparison document it seems that the primary downside is the in-memory nature of the engine?

@blarghmatey What we’ve been discussing in other threads is to implement an abstraction layer, so that anyone who really cares about HA for search can use Algolia (or perhaps Typesense if someone wants to implement that). Note that Elasticsearch will likely not be an option, as it’s very different from these more modern search engines and probably not worth including under the same abstraction.
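As a rough illustration of what such an abstraction layer could look like (this is not the interface actually being built; every name below is hypothetical), the calling code would depend only on a small protocol, and each engine would get its own wrapper. The in-memory backend here is a toy stand-in so that the sketch runs without any search server.

```python
from typing import Dict, List, Protocol


class SearchBackend(Protocol):
    """Hypothetical minimal interface; the real abstraction may look different."""

    def index_document(self, index: str, doc_id: str, text: str) -> None: ...

    def search(self, index: str, query: str) -> List[str]: ...


class InMemoryBackend:
    """Toy stand-in. A real backend would wrap a Meilisearch, Algolia, or Typesense client."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, str]] = {}

    def index_document(self, index: str, doc_id: str, text: str) -> None:
        self._docs.setdefault(index, {})[doc_id] = text

    def search(self, index: str, query: str) -> List[str]:
        # Naive case-insensitive substring match, returning matching doc IDs.
        needle = query.lower()
        docs = self._docs.get(index, {})
        return sorted(doc_id for doc_id, text in docs.items() if needle in text.lower())


def find_content(backend: SearchBackend, query: str) -> List[str]:
    """Calling code only sees the protocol, never an engine-specific client."""
    return backend.search("course_content", query)  # index name is an assumption


backend = InMemoryBackend()
backend.index_document("course_content", "block-1", "Introduction to Linear Algebra")
backend.index_document("course_content", "block-2", "Advanced Calculus")
print(find_content(backend, "algebra"))
```

Swapping engines would then mean writing another class with the same two methods, without touching callers like `find_content`.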

Typesense is nice and I’ve used it before, but one of the big drawbacks is exactly what you noted: that it can require a lot of memory because it stores the entire index in RAM.

As of yet however, nobody other than you has actually said that they need something other than Meilisearch, and nobody has volunteered to implement wrappers for other search engines. @qasimgulzar is working on the abstraction layer in general and Meilisearch in particular.

@braden: We haven’t had that much feedback from elsewhere in the community, but I’m inclined to believe that others will share @blarghmatey’s concerns about HA. It’s unlikely to change what goes out in the Sumac timeframe, but I think it’s worth Axim’s while to fund an investigation into the memory usage issues around Typesense.

I agree with both of you that the memory usage is the biggest potential drawback of Typesense, but I’m not sure how that will actually play out in practice. My intuition is that because Meilisearch uses memory-mapped files, and because relatively few parts of the index will be “hot” at any given time, it will effectively require less memory for comparable performance. But there are some huge caveats with that:

  1. My intuition is purely guesswork–it could be that Meilisearch requires comparable RAM to Typesense to give acceptable performance in practice on large datasets.
  2. Running in clustered mode will increase indexing write latencies. By how much?
  3. Are there significant differences in how compactly they represent their indexes?

The two biggest things that I’m aware of are course content data storage and forums post storage (the catalog-related metadata that I know of is orders of magnitude smaller).

@blarghmatey: If someone coded a minimal Typesense integration (using the interface that @qasimgulzar is making), would you have the time/capacity to be able to run both Typesense and Meilisearch indexing against the data on your production site? So that we can get a better understanding on how memory and latency compare across the two using real data?
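For the latency half of that comparison, a small harness along these lines might be enough to get started. It is only a sketch: `measure_latency` is a hypothetical helper, and the `search` argument stands in for whatever call the Typesense or Meilisearch integration ends up exposing. Memory would need to be measured separately, from outside the process (for example, from container or host metrics), which this code does not attempt.

```python
import math
import statistics
import time
from typing import Callable, Dict, Iterable, List


def measure_latency(
    search: Callable[[str], object],
    queries: Iterable[str],
    repeats: int = 3,
) -> Dict[str, float]:
    """Time each query `repeats` times and summarize the latencies in milliseconds."""
    samples: List[float] = []
    for query in queries:
        for _ in range(repeats):
            start = time.perf_counter()
            search(query)
            samples.append((time.perf_counter() - start) * 1000.0)
    if not samples:
        raise ValueError("no queries were provided")
    samples.sort()
    # Nearest-rank 95th percentile over the sorted samples.
    p95_index = max(0, math.ceil(0.95 * len(samples)) - 1)
    return {
        "samples": float(len(samples)),
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


# Stand-in search function; a real run would call each engine's client instead.
def fake_search(query: str) -> List[str]:
    return [word for word in ("algebra", "calculus", "biology") if query in word]


print(measure_latency(fake_search, ["alg", "bio", "xyz"]))
```

Running the same query list against both engines, using the same production data, would give directly comparable median and tail latencies.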

Thank you for that suggestion. I can plan to set aside some time for that testing to help move the conversation forward with a bit of concrete evidence. I agree that getting some real-world data around the operational overhead of each solution would be useful.