### Proposal Date
2024-02-01
### Target Ticket Acceptance Date
2024-02-…15
### Earliest Open edX Named Release Without This Functionality
Redwood - 2024-04
### Rationale
If you’ve been following Open edX core development for a while, you might be surprised to see a deprecation notice for Blockstore, which we’d long thought of as the future of Open edX content storage. Rest assured, we’re still actively working on the original goals of Blockstore, but they’re now part of a more fleshed-out package: the [“Learning Core”](https://github.com/openedx/openedx-learning/tree/main/openedx_learning/core).
Some context: Blockstore was developed in ~2018 as a content storage service, enabling courses.edx.org to serve content for the [LabXchange](https://www.labxchange.org/) project. The broader vision was to replace the Open edX platform’s current storage backend (Modulestore) with something simpler and more flexible, supporting a paradigm shift towards modular, non-linear, and adaptive learning content. The key tenets of Blockstore were that:
* It was **isolated from the complexities** of edx-platform. This made the system easier to comprehend and maintain.
* Its data model was **authoring-first**: its primitives were “bundles” and “assets”, which get versioned and published together. In contrast, Modulestore focuses on content blocks and structures.
* It was **agnostic to the structure of the content** it managed: bundles might be libraries, units, individual components; bundles may contain XBlocks, or something different. In contrast, Modulestore assumes that everything is an XBlock and that everything is part of a Course.
You can read more in Blockstore’s [DESIGN doc](https://github.com/openedx/blockstore/blob/master/DESIGN.rst). Blockstore was selected as the basis for the “Content Libraries V2” initiative, which aimed to rebuild the legacy Content Libraries feature to be more robust and broadly useful. It was envisioned that all Open edX content, including traditional courses, would eventually be migrated to Blockstore, and that Modulestore would be deprecated.
However, in the intervening years, LabXchange moved off of the Open edX platform, and the Content Libraries V2 project was delayed several times due to organizational shifts and competing priorities. As of 2023, we are not aware of any production users of Blockstore (if you are aware of any, please comment).
Since 2018, and especially since the creation of Axim in 2021, we have had time to learn lessons from Blockstore’s original design and think deeply about the needs of the Open edX project going forward. We still believe in Blockstore’s original key tenets, but we've also learned more:
* Whereas Blockstore stores almost all of its data on object storage (e.g. S3), we should store data in a **relational database** like MySQL and connect it to LMS/CMS via an **in-process library**. One of the assumptions in the Blockstore design was that the Open edX CMS (running in the same deployment) would be able to query this data very quickly, but in practice we found that reading data from S3 had significantly higher latency than anticipated, causing XBlocks to load very slowly and making additional complex layers of caching absolutely necessary. The vast majority of production issues with Blockstore were either object storage errors (expired signed URLs) or cache invalidation errors.
* Whereas Blockstore (being authoring-first) assumed that some other system would consume and cache its content for **efficient loading/filtering/searching/sorting in LMS**, we’re better off just building the basis of that “some other system” into the same package as the storage backend. edx-platform developers need a good, consistent pattern for representing learning content; there is [guidance for doing so today](https://github.com/openedx/edx-platform/blob/master/docs/decisions/0011-limit-modulestore-use-in-lms.rst), but it’s better for everyone if that pattern is provided as a core capability.
* Whereas Blockstore managed versions at the abstract “bundle” level, we’re better off managing **versions for individual learning components**. That way, edx-platform and other clients do not need to “choose” what a bundle means in any given context: everything is versioned, so clients just need to choose how to helpfully present that to authors. This still lets us be agnostic as to the shape and structure of the content, as we’re not making assumptions about how the components fit together. It also lets us support extremely large libraries with many thousands of components, which caused performance issues during writes under Blockstore’s bundle-versioning system.
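
To make the first and third lessons concrete, here is a minimal sketch of per-component versioning on top of a relational database accessed in-process. It is an illustration only: the table names, column names, and helper functions are all hypothetical and are not the Learning Core's actual models or API, and it uses SQLite from the Python standard library purely so that it runs without any setup.

```python
import sqlite3

# Hypothetical schema for illustration only; NOT the Learning Core's real models.
SCHEMA = """
CREATE TABLE component (
    id INTEGER PRIMARY KEY,
    key TEXT NOT NULL UNIQUE
);
CREATE TABLE component_version (
    id INTEGER PRIMARY KEY,
    component_id INTEGER NOT NULL REFERENCES component(id),
    version_num INTEGER NOT NULL,
    content TEXT NOT NULL,
    UNIQUE (component_id, version_num)
);
CREATE TABLE published (
    component_id INTEGER PRIMARY KEY REFERENCES component(id),
    version_id INTEGER NOT NULL REFERENCES component_version(id)
);
"""


def open_store():
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def create_version(conn, key, content):
    """Add a new draft version of one component; no other component is touched."""
    conn.execute("INSERT OR IGNORE INTO component (key) VALUES (?)", (key,))
    (component_id,) = conn.execute(
        "SELECT id FROM component WHERE key = ?", (key,)
    ).fetchone()
    (latest,) = conn.execute(
        "SELECT COALESCE(MAX(version_num), 0) FROM component_version "
        "WHERE component_id = ?",
        (component_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO component_version (component_id, version_num, content) "
        "VALUES (?, ?, ?)",
        (component_id, latest + 1, content),
    )
    conn.commit()
    return latest + 1


def publish(conn, key):
    """Mark the latest version of a single component as published."""
    row = conn.execute(
        "SELECT c.id, v.id FROM component c "
        "JOIN component_version v ON v.component_id = c.id "
        "WHERE c.key = ? ORDER BY v.version_num DESC LIMIT 1",
        (key,),
    ).fetchone()
    if row is None:
        raise KeyError(key)
    conn.execute(
        "INSERT OR REPLACE INTO published (component_id, version_id) VALUES (?, ?)",
        row,
    )
    conn.commit()


def get_published(conn, key):
    """Return (version_num, content) for the published version, or None."""
    return conn.execute(
        "SELECT v.version_num, v.content FROM component c "
        "JOIN published p ON p.component_id = c.id "
        "JOIN component_version v ON v.id = p.version_id "
        "WHERE c.key = ?",
        (key,),
    ).fetchone()
```

In this sketch, writing a new version of one component inserts a single row, regardless of how many other components the library holds. Under a bundle-level scheme, the same edit would produce a new version of the whole bundle.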
### Removal
* Blockstore references need to be removed from edx-platform.
* Docs referencing Blockstore concepts should be updated or archived.
* Blockstore itself needs to be archived.
### Replacement
We've developed the [**Learning Core**](https://github.com/openedx/openedx-learning/tree/main/openedx_learning/core), which replaces Blockstore and incorporates what we've learned over the past few years. The insights of Blockstore live on mostly in the `openedx_learning.core.publishing` sub-package. You can read the various [decisions we’ve made](https://github.com/openedx/openedx-learning/tree/main/docs/decisions) and [are making](https://github.com/openedx/openedx-learning/issues) for this new system.
### Deprecation
We will add a deprecation notice to Blockstore's README. We'll also push a final Blockstore release to PyPI so that the deprecation notice shows up there.
### Migration
Because Blockstore isn’t currently deployed to production anywhere (again, please comment if you disagree), we have the great opportunity to jump right to Learning Core rather than foisting a two-step Modulestore->Blockstore->LC migration upon site operators. So, we are currently migrating the Content Libraries Relaunch project from Blockstore to the Learning Core, which we are aiming to make experimentally available as early as Redwood (June 2024) and properly available by Sumac (December 2024). We plan to remove Blockstore from the release starting with Redwood. Going forward, we plan to incrementally migrate parts of edx-platform to the Learning Core, with the long-term goal of either replacing Modulestore, or reducing Modulestore to a compatibility layer resting on top of a backfilled Learning Core.
### Additional Info
N/A
### Task List
- [ ] Add DEPR notice to README of https://github.com/openedx/blockstore. Push a final release to PyPI.
- [ ] https://github.com/openedx/edx-platform/pull/34066
- [ ] Remove any remaining Blockstore references in edx-platform, including requirements and settings.
- [ ] Archive https://github.com/openedx/blockstore
- [ ] Search and update/archive docs containing the words "blockstore", "bundleversion", "snapshot", etc.