Documenting repo best practices and requirements

Context: A team was having trouble getting pylint working correctly for an Open edX repo. After some time trying to get it work, the owning team decided it was not a high enough priority, and left it disabled.

After some discussion, some of us think that using linting (edx-lint for example) is currently an undocumented requirement for repos. There may be other undocumented requirements (or general best practices). Or, maybe not everyone would agree on what is required.

Proposal: Create an OEP on Repo Health Best Practices (name could change) that documents requirements at a minimum, and potentially some optional recommended best practices. Anything large enough could be its own OEP. Existing examples, like conventional commits and dependency management, could be listed and cross-referenced.

Additionally, the existing cookie-cutter and repo health tooling might provide useful background for writing this OEP.

Does anyone think this OEP should appear in a different form or place? Then we just need to find someone to put it together, or we (or I) could create a very light-weight one that starts with these few details and can be iterated on over time.

2 Likes

I think it would be great to codify these expectations. I had started a wiki page some time ago, with expectations, but at a slightly different level than “pylint”: https://openedx.atlassian.net/wiki/spaces/COMM/pages/1475084709/Requirements+for+public+repos

One of the challenges here is balancing team autonomy with community expectations. Do we say you must having linting, or could there be a reason why a repo has none? If using linting in a Python repo, must it be pylint, or would pyflakes be OK? If it’s pylint, must you use the shared pylintrc or can you have your own rules?

Thanks @nedbat.

Your Confluence page is a good start. Do you agree that this should be an OEP, or are you proposing that the Confluence page is the right solution and just needs to be updated?

For linting, here is a potential outline which represents my (our?) thinking:

  • Linting helps automate certain type of feedback to get consistency in code and to reduce the burden of reviewers. For this reason, it should be used where possible.
  • For certain languages, we have required linting:
    • Python: Pylint is required, using the shared pylintrc. Local tweaks are acceptable. (Available in cookie-cutter.)
    • Javascript: …
  • Experimenting with alternative tools is acceptable where required, but should ultimately lead to updates to this OEP explaining when and why to use each tool.

I strongly agree that this codification belongs in an OEP. This strikes me as particularly the type of standard an OEP is meant to codify: something that affects the community at large that we want to record and get wide consensus on.

I support this. Another area that I think could benefit from standardization is a minimum set of make targets. Being able to trust that repos expose basic targets like make push_translations, make test, make quality, etc would open up a lot of power for cross-repo tools.

In addition to being useful background, I could see the OEP making directives towards those tools, a la “the cookie-cutters must generate repositories that follow all of these best practices” and “these best practices must be evaluated using the repo health tools, when possible”.

I do think we should leave room for team autonomy. I think it’s worth having levels of flexibility here depending on how critical we find each guideline. Hypothetically, the OEP could say something like:

  • All Python repos in the openedx GitHub organization must run a static analysis tool against pull requests.
    • By default, pylint in conjunction with edx-lint should be used.
    • A repository may choose drop pylint + edx-lint in favor of another static analysis setup if it provides an ADR documenting the decision.

That way we’d establish a baseline (you need linting) but allow for flexibility provided that the decision is recorded.

It is worth noting that this OEP would be directed against repos in the openedx GitHub org (I assume). Unless edX wanted to declare that the OEP applied to their own org too, edX teams would still have the autonomy to work however they wanted within the edx GitHub org.

Thanks Kyle. Agree with most of what you wrote.

In terms of scope, I’d rather use the more vague language of “Open edX repos”, rather than the more specific language of “repos in the openedx org”. Why would you want to shrink the scope?

  1. Hopefully it is obvious that the “openedx” org contains Open edX repos.
  2. There is still work to be done to determine where and when a repo is public and used for Open edX, but lives in a different org. The repo eduNEXT/openedx-events is a good example. Although it is likely to move at some point, I wouldn’t want linting to have to be added at the time of moving.
  3. It is more likely that this OEP would be enforced for repos in the “openedx” org, but if I were using a repo (like eduNEXT/openedx-events) that didn’t yet follow this OEP, I’d like to be able to point to this OEP.

Regarding standard make targets, I think it is a fine idea. However, from a process perspective, I think it might make sense to try to land this OEP more quickly with anything that is less hotly debated. Maybe that makes the cut, but if not, should move to a separate OEP.

For make targets, or linting, we could also consider making ADRs underneath this OEP. We’ve done that in the past. I don’t think we need each mini-topic to bubble up to a new top-level OEP, and this would still enable some organization to the OEP. This is a minor idea that I am not wed to at all, but just throwing it out there.

1 Like

Oops, I meant to say: an OEP is the right place for this, yes.