Meet this week to align on Doc efforts on Toggles/Settings/etc

Hello everyone,

(Apologies for this late post. I’d been meaning to send this out sooner - was delayed by Covid-19 adjustments.)

I’d like to meet this week to align on multiple documentation efforts that are happening in parallel and collectively hash out blocking open questions so we can make faster decisions moving forward.

Please read/edit the proposed agenda at Documentation of Toggles/Settings/etc.

I hope y’all can make it. When is a good time for you?

  • Wed Mar 18, 10am-11:30am EST (2pm-3:30pm UTC)
  • Wed Mar 18, 11am-12:30pm EST (3pm-4:30pm UTC)
  • Fri Mar 20, 10am-11:30am EST (2pm-3:30pm UTC)
  • Fri Mar 20, 11am-12:30pm EST (3pm-4:30pm UTC)

0 voters

Thanks,
Nimisha

It’s looking like Friday at 11-12:30pm EST (3pm-4:30pm UTC) is working for all so far. We’ll wait to hear from a few more today. In the meantime, I’ve set a tentative calendar invite:

Google calendar invite

with the following Google Hangouts meet:

Hi all,

Reminder that we’ll meet tomorrow at 11-12:30pm EST (3pm-4:30pm UTC). I have sent an invite to each of you who responded. I’ve also posted links above in case I missed anyone or if there are any latecomers.

Thanks,
Nimisha

Hi Nimisha,

In preparation for the meeting, and since I had expressed our interest in participate I dedicated some time to look into OEP-17 and I wanted to share here my notes. I’ll also be at the meeting tomorrow and we’ll be talking about some of the same points.

  • For open source installs is very important to know the maturity of the toggles.

  • One entry point for reading all the toggles and many places to write them (eg: DB, settings, …). This will also help with the work of migrating all flags to the new-entry-point ways.

  • Opt out and open edX option are critical for edunext. I can imagine they are also critical for many other open source instances.

  • Even for toggles than can be refactored as pluggable, the rest of the conditions apply: opt-out, beta, incremental…

  • We are all in for a common framework on top of django-waffle. I’d like to consider also endpoints for pluggins altering the framework options.

  • Can we extract the waffle_utils (common framework) in a dependency, so that we can install it in more services? Eg: ecommerce, discovery …

  • Any thoughs about the path of moving things over to the new framework. My take on this, anything that is already a settings.FEATURE flag should still read from the settings dict.

  • Can https://openedx.atlassian.net/wiki/spaces/OpenDev/pages/40862688/Feature+Flags+and+Settings+on+edx-platform be made public?

Some on point comments about the requirements of the common framework

  • Non-collision:: we have already some concept of namespace when using waffle? how would the framework make it different
  • Multi-tenancy:: plugins can extend this beyond django-sites. Consider a plugin entry point
  • Discoverability:: this should be very tightly coupled with documentation and thoughtful definition of the defaults
  • Report:: love this. We would require that is also multi-tenant and will be happy to work in it. I would love both an API and a html view for humans.
  • Auditability:: would it work to have a model that stores all the representations of previous instances of the XxWaffleFlagClass?

Hi all,

Thank you @nimisha for organizing this meeting. I wish I could have attended, but I have been stuck in bed with a fever since Thursday and it’s only yesterday that I started to get better and managed to stare at a screen again. It’s not the covid-19, although it’s related to the anxiety caused by the outbreak.

Anyway, I really enjoyed watching the recording of the meeting, where everyone was very insightful. It really helped clarify some of the technical decisions that were made, for instance on feature toggle annotations. I was interested to learn about the rejected alternatives in OEP30.

I still don’t have a very clear view of what kind of tooling the feature toggle annotations are supposed to enable. Also, I don’t really have a dog in that fight. I mean: yes, it’s important to document/annotate the feature toggles, but it’s less of a priority to me, right now, than producing a reference manual for Open edX. So I don’t have much interesting to say about how edX should implement feature toggles annotations.

Concerning the reference manual, I noted that the documentation we produce right now should be more or less independent of the technical implementation we choose to actually generate the docs. So I can happily keep writing unstructured docs for the settings and operations, and it shouldn’t affect the strategy we choose later.

Thus, I am not too concerned right now by the tool we will use to parse the feature toggle annotations and generating the reference manuals. However, this does not prevent me from having a strong opinion on it :yum: It’s one of those subjects where I’m unlikely to change my mind, but it’s OK if things don’t go the way I think they should. (I realize more and more questions fall into that category as I am getting older…)

Basically, we need to decide between two options:

  1. Use standard Python/Django tooling, such as Sphinx
  2. Use tooling that is custom-built for edx-platform/Open edX

This is not a binary choice though, and a few alternatives have been mentioned during the talk that borrow from both options.

My “not too concerned but very strong” opinion is that we should rely more on standard than custom-built tooling. This is what I tried to achieve by relying on autodoc-supported docstrings in my reference documentation pull request. Let me defend this proposition against the arguments raised in the meeting:

Sphinx requires module imports

@jmbowman You mentioned that the problem with docstrings is that sphinx needs to import the python modules in order to parse the docstrings. As such, a complete virtual environment with python dependencies is needed to generate the docs. I don’t see why this is a problem, in particular with Devstack or other Open edX docker images? After all, we all have a working development environment on our laptops and and CI machines. It’s relatively easy to pip install all the dependencies from an Open edX project, even on a host machine (provided it’s UNIX-based and recent).

On the other hand, if we need to switch to a different, custom kind of tooling, then the environment used to generate the docs will have to be the same for all Open edX projects: we will have to share the same “docs” virtualenv across all Open edX repos. How will that work with the devstack? Does that mean we need to create a new virtualenv inside each image, or create a dedicated image that mounts all repos inside the container at runtime?

Module imports cause bad code to break

It was also mentioned that importing some Python modules may cause issues because they might be doing some ungoldy things, such as parsing sys.argv and whatnot. This is true: bad code will cause the documentation generation to fail. And this is a good thing. It gives us an opportunity to detect and improve bad code in the Open edX codebase.

Call me an idealist: I believe that collective effort can help improve the overall quality of the Open edX codebase.

And anyway, what’s the alternative? Developing a custom, complex tool to work around the difficulties of the edx-platform code base? The code-annotations repo currently clocs at 2k+ lines of code. It’s probably all very good code, but it’s yet another tool that Open edX developers need to master in order to understand what goes wrong when building docs. The effort that have gone in developing that tool could have been directed toward improving the rest of the Open edX codebase.

This is a missed opportunity.

This documentation effort could be the right excuse to make edx-platform more like other Django projects, which is a good thing because it makes the project more approachable and understandable by the rest of the Django community. The way to do this is to remove one by one the weird bits of code that lie around and which force us to jump through non-convex-shaped hoops all the time.

Some modules are only importable under certain conditions

Importing certain modules may break because these modules will only work either with the LMS or the CMS, or when certain stars are aligned. I don’t have an example in mind of when that might occur but that’s ok. @robrap provided an excellent answer to that problem: refactor the problematic module such that the feature toggle annotation is moved to an independent module. I.e:

myweirdmodule.py:

if os.environ["cute_flag"] == "cutifyallthethings":
    # Feature flag to accelerate overall cuteness.
    # .. toggle_name: overall.cuteness
    # ...
    MY_CUTE_FEATURE_IS_ENABLED = True

becomes:

if os.environ["cute_flag"] == "yesiwanttoenablethis":
    from .cutefeature import *

with cutefeature.py:

# Feature flag to accelerate overall cuteness.
# .. toggle_name: overall.cuteness
# ...
MY_CUTE_FEATURE_IS_ENABLED = True

In a way, this is the same issue as the previous one: writing the docs gives us the opportunity to refactor the code in a cleaner, more modular way.

Docstrings cannot document blocks of multiple settings

This is something that @nedbat mentioned. The given example was the documentation of the following AWS_* settings https://github.com/edx/edx-platform/blob/fdc98696b15f8f2e595c056db76526e208e37140/lms/envs/common.py#L1337:

#: S3BotoStorage insists on a timeout for uploaded assets. We should make it
#: permanent instead, and yada yada yada.
AWS_QUERYSTRING_EXPIRE = 10 * 365 * 24 * 60 * 60  # 10 years
AWS_SES_REGION_NAME = 'us-east-1'
AWS_SES_REGION_ENDPOINT = 'email.us-east-1.amazonaws.com'
...

Here the docstring will be used to generate the docs only for the AWS_QUERYSTRING_EXPIRE and not the other settings. This might look like a problem, but it isn’t. And we might be tempted to solve it by bundling these settings in a dict, but we shouldn’t.

To understand why, let’s look at how other people in the world address this problem. For instance, Django has many EMAIL_* settings: https://docs.djangoproject.com/en/3.0/ref/settings/#std:setting-EMAIL_BACKEND

Because Django does not bundle these settings in a dict, that means that we can’t do it either. (and by the way, we have the same problem with boto3 and social-auth)

But that’s OK! It’s actually better if we document each setting individually. Then, we can link the docs for each setting to a reference setting where the bulk of the explanations will be given, using standard Sphinx tooling. Something like:

#: Activate the yadayadiness for students. See :ref:`lms.envs.ENABLE_YADA` for details.
ENABLE_YADA_FOR_STUDENTS = True

Conclusions?

As I (and others) have mentioned, code annotations and docstrings are not incompatible. Also, it shouldn’t be difficult to switch from one approach to another further down the road. That means we don’t have to make a hard technical decision right now. And whatever technical decision we end up making, I’ll be happy to go along with it. However, I do not want to invest my time in the BTR working group developing a custom tool for generating docs or annotations – at least, not right now. That means that I’ll keep working on producing unstructured docs for settings, operations, and maybe API endpoints, knowing that we may have to migrate them in the future.

@regis thanks for the thoughtful response, and glad you are back to full duty cycle!

About the difficulty of getting all the modules to import: the problem was on ReadTheDocs. It was a real challenge getting an environment there that could successfully run Sphinx. Of course, we don’t have to run Sphinx there, but it is the standard way it’s usually done. So if we choose to build and host docs some other way, we are doing a custom thing for Open edX. I’m afraid it’s difficult to truly do everything “the usual way” in a project this size.

I had not realized that docs were built by ReadTheDocs. If this remains a hard requirement, it will severely limit our ability to document code with docstrings. If I understand correctly, currently ReadTheDocs does not install any Open edX-specific-related requirement? Can you share the curated ReadTheDocs configuration for the edx-developer-docs project?

I tried to find the old configuration that we used on ReadTheDocs. It looks like there’s no current ReadTheDocs project that builds from edx-platform (or if there is, I couldn’t find it). We had written a complex Python file to mock out the worst of the third-party packages, just to get everything to import properly. It was a fragile mess: https://github.com/edx/edx-platform/blob/named-release/dogwood.rc/docs/en_us/platform_api/source/conf.py (I found it in Dogwood, not sure how long it persisted.)

Do you think it would be possible for edX to host the edX-platform docs instead of readthedocs? I can also offer to self-host the docs and then you would configure a domain name to point to that server. Unless deploying to readthedocs is a hard requirement, in which case we will have to reconsider how the docs are generated.

It’s not a hard requirement, it was just the easiest default thing to do, until it became a nightmare :slight_smile: https://docs.edx.org is a GitHub pages site. There’s no reason we couldn’t do something else for hosting.

1 Like