Pull Requests Review Delays

@ghassan Thank you for raising that issue - the time it takes to get reviews on PRs has long been a core issue of the Open edX project, and there are several initiatives that aim to solve it. Aside from the core contributor program itself, which increases the pool of reviewers (and aims to reduce the workload of individual reviewers), there is also on the coordination/planning side the initiative from @Michelle_Philbrick and @itsjeyd to follow more closely the progression of PR reviews, and help accelerate/unblock those.

@Michelle_Philbrick @itsjeyd Would you be able to give an update on how things stand, in that matter? And more specifically, about the point @ghassan brings up, which is that it is important for a contributor to know when to expect a review?

I would actually add that not only it’s important to know when – but that when needs to be quick. Open source projects who are good at getting contributors usually provide reviews within hours, or day (rather than weeks or months like us – or even years in some cases!).

@antoviaque @ghassan Yes, the purpose of OSPR management is to follow and help guide the PR process from start to finish. At each step of the process we aim to make it clear to authors and product/engineering reviewers what the current status of a PR is, and whether it is waiting on some action from them. Michelle and I have access to resources that map repos to principal owners (= Axim teams, 2U teams, etc.) so we know who to ping when a PR is ready for review.

:bulb: However, we don’t have control over when reviews are actually going to happen.

If a PR has been waiting for review for a while without any feedback from the team that owns the parent repo, we can ask for help from @Kelly_Buchanan and @nedbat to get them unstuck, but this is where our power ends.

So the work that Michelle and I are doing as OSPR managers cannot guarantee that PRs will always get reviewed and merged quickly – for that, the maintainership program, currently in pilot phase, will need to fully take off and expand to cover the majority of the Open edX code base (including edx-platform).

(For reference, the list of repos that are participating in the maintainership pilot is here.)

CC @Michelle_Philbrick @gabrieldamours @e0d

@itsjeyd I apperciate that you are doing the best that you can (along with @Michelle_Philbrick)., my reasoning of providing a specific date for a review, is that then (I or any PR author), would set the PR state to draft, create a reminder just a day or so, before the specified review date. where in that time I would ensure it’s updated with master, tests passed…etc, just before a reviewer reach it.

Doing that would save us both unnecessary work, like if the PR was label as ready, and just at the time reviewer land on it, it needed a change, so then a reviewer ask to change, the author reply, and so on. and meanwhile OSPR manager would need to keep looking at PRs to update labels.

I understand the maintainership touches on these topics, however to my knoweldge not everything expected to be Open edX core product is in maintainership.

I indeed agree with @antoviaque the sooner the better, I suggest to set a date, would be best probably used as a fallback if a quick review is not guaranteed. My end goal is to save everyone’s involved time.

One thing that we have discussed in the past was to have a proper fallback in these cases - the review first goes to the assigned maintainer or team that manages a repository, but if it doesn’t happen within a reasonable timeframe (2 weeks?), then it could be reassigned to any code core contributor. This way there is a clear incentive to review work quickly for the maintainer, even if it’s just to say “no”.

Ie to keep control over a repository of the official project, one has to fulfill their committed duties – and conversely, when a maintainer or team is over-committed, there is an option that allows to delegate to others in the community.

Would it be a good time to consider this approach?

Thank you for raising this issue @ghassan ! It’s a totally reasonable expectation of the CC program.

…with merge rights on that repository. :+1:

And for repos that don’t have CC’s assigned, it can be an opportunity to find and fill that gap.

Imho, if nobody with rights on a specific repo is available to review, other code core contributors should be able to review and give a :+1: . Otherwise the same blockers will persist.

Speaking for myself, I’d be happy to incorporate something like this into the OSPR management process.

@e0d @nedbat @Kelly_Buchanan @Michelle_Philbrick What do you think?

And CC @gabrieldamours, since this relates to the maintainership program.

2 Likes

I agree, but we’d need to make the following much more clear…

  • I have hundreds of PRs that I comb through every few weeks and it’s very time consuming. Making sure every two weeks that maintainers have responded to specific ones might be a bit of a challenge to keep track of, especially since we’re aiming to get edx-platform as a maintained repo. There’s so many OSPRs in that repo.

  • For some repos, like edx-platform, it would be good to nail down exactly which CCs have rights to review/merge in what specific areas. For instance, all CCs might be able to review/merge PRs on edx-platform, but what if only a few CCs have expertise in the area you’re looking for? Can we make this list more explicit about who can handle what?

  • Also, 2U has a basic breakdown of which 2U teams own certain projects on edx-platform, but it’s not as specific as we need it to be. It’s also not always clear what project a certain PR is related to.

  • There may be certain PRs that 2U would want to review that they don’t get to within that 2-week period. How do we flag these?

  • We’d need to figure out a cadence for getting stalled PRs to the CCs. Maybe every few weeks Tim and I could post stalled ones in the CC slack channel and have people self-assign?

cc: @itsjeyd

2 Likes

I’m moving this topic to a dedicated thread, as the discussion is getting longer, and this is an important topic.

Some of what has been mentioned, to also help with the current situation, we should:

  • Provide better information about the current situation on the different repos, since that can be different

  • Advertise that:

  • Provide a way for people with less time to take on a role with smaller commitments than core contributors to do so, and still get some commit rights – some names that are used for this type of intermediary roles: “Collaborator”, “Contributing Member”, or maybe simply “Contributor”?

@Felipe @lpm0073 @sarina Would you like to describe your points from the discussion here further, so we can continue the discussion async here?

2 Likes

2 posts were split to a new topic: Training courses & videos

@Michelle_Philbrick Thanks for giving this option some thoughts! It’s definitely helpful to know what we would need to prepare well, if we were to implement this. This is probably a topic on which @itsjeyd will likely have better ideas than me since he’s closer to it, but giving it a shot:

That does seem like a lot of PRs to go through! I wonder how much of the hundreds of PRs you have to comb through are recurring/long to merge, vs new ones? How many new PRs do you get each time, vs old ones that you have to keep going back to because they don’t get merged or closed? Speeding th would decrease the number of PRs that you have to go back to.

We might need to find a good way to quickly identify which PRs would qualify for review by any developer - ideally this wouldn’t require any manual combing, but could be the contents of a specific board. What if it were simply “any PR which isn’t either rejected/closed or merged after X days out of the draft stage?” Then a filter can display them on a board, and core contributors can comb through those directly? And you then don’t necessarily have to go on all the PRs to remind the maintainers - the escalation is simply a tool added to your toolbox when trying to help PRs get reviewed, if some languish and you think it’s useful, you can escalate them yourself to specific core contributors.

For the systematic part, maybe a bot could help? Eg. X days after the PR goes out of draft, a bot posts an automated message on the PR comments, saying that it can now be reviewed and approved or refused by any code core contributor. The bot could do the merge / close of the PR based on a list of core contributors.

  • For some repos, like edx-platform, it would be good to nail down exactly which CCs have rights to review/merge in what specific areas. For instance, all CCs might be able to review/merge PRs on edx-platform, but what if only a few CCs have expertise in the area you’re looking for? Can we make this list more explicit about who can handle what?

That would be helpful to figure out in any case – it’s important to maximize the quality of the reviews, and making it clear for each part of the code who is responsible for it is a good way to ensure that. There are also bots that automatically use this kind of information to ping the right people for the review.

The information would also be a good way to identify where we don’t have that expertise available, either temporarily, or at all. These are probably areas where we should probably escalate it immediately to other core contributors – who might not be familiar with this part of the codebase, but if contributions are blocked because nobody with the expertise is available to review them, that problem perpetuates itself. Doing reviews and merging code in is a good way to build expertise - we know we can trust core contributors to care about the code quality and project’s best practices.

  • Also, 2U has a basic breakdown of which 2U teams own certain projects on edx-platform, but it’s not as specific as we need it to be. It’s also not always clear what project a certain PR is related to.

@Kelly_Buchanan @nedbat What do you think on this? ^

I also know that @jmakowski is also building a list of relationships between features ↔ repositories, which might maybe be able to help here?

  • There may be certain PRs that 2U would want to review that they don’t get to within that 2-week period. How do we flag these?

2 weeks is already a long time for getting a PR review, in an open source project. For Open edX that timeframe would be a great improvement compared to the current situation – but that’s part of the problem we have as a project, and that we need to solve. Often large open source projects try to provide reviews within a day or two - because this is when contributors are most likely to keep working on it, and have more latitude to change their code. If we are not able to provide a review within 2 weeks, maybe it means that the changes to that code base are not a priority, and then it’s fair to let someone else take the decisions, since they are doing the work?

Advertising them this way regularly would likely be a nice way to get more people’s attention on them. Especially while working on catching up with the reviews backlog?

Most projects I contribute to take a month or more - a recent PR merged after 11 months.

Anyway, this thread is really long (and goes in a bunch of directions), so I’m having trouble figuring out where my input is needed. I don’t personally feel I’m in charge of the Maintainer’s Program or making SLAs there. Michelle is able to give better feedback in terms of current PR management. The one point I made that I think you already made is how do we get current CCs more rights on more repos? We have an email list - should we just email?

I like the idea of giving more people limited rights. What might that look like?

I wonder if the email list is better, so people can better look async? Things in Slack can quickly disappear - at least, I think Slack is much more ephemeral than email - and everyone checks their email, not everyone keeps up to date on Slack or forums.

2 Likes

Hey all, just an apology that I don’t have time to read and absorb the thread, but wanted to note a few things:

  1. For repos without a named maintainer that originated from 2U, Ned and I are working with @Michelle_Philbrick and @itsjeyd to facilitate reviews. Sometimes we do request a CC or community reviewer. It’s not a fast or efficient process and it’s temporary until named maintainers are identified.
  2. A few of us recently met and discussed code review workflows and came up with cross-project labels that will allow us to indicate a) that it’s time for code review, and b) that the maintainer needs to look at it. PRs that fall under (a) and not (b) would be candidates for community / CC review.

I think just having these labels will unblock progress on a few fronts. Happy to discuss with anyone on Slack. I’m not able to keep up with Discourse.

From our last Contributors Meetup there were two ideas that I’d like to write here so that they can be discussed further.

First is the something that could be considered a lightweight core contributor role. Currently we have core-contributors in the code contributor role and we have people in the community that have no core contributor status. We talked about adding a simple-contributor category.
This category could serve as a stepping stone for people in the community that want to contribute, but are not yet ready for the commitment of the core-contributors. Normally, the CC role would entail 20h/month of dedication. The new lightweight-contributor role could be removed from this or have a lower count. Probably some limitations would be needed to be placed in the role to mark the point where the person should move into a full core-commiter role instead of skipping the CC process forever. One such limitation could be the ammount of repos that can be granted access through the lightweight-contributor role. The process to grant lightweight-contributor status could as well be simpler and more inviting for people that may not know 5 more CCs in the community to get started.

Talking only from my experience at edunext and even though we support every person that wants to be a core contributor in the current form, some people find it difficult. They would benefit from having an easier to access position.

The second idea is that we acknowledge the differences about the contributing process in different parts of the ecosystem and write about it. It is very different to try and find a reviewer or expert for edx-platform than it is to find it for a MFE that is only used by 2U and is not supported by the BTR. The same if is is a well maintained (OEP55) repo vs one whose last line was the python3 migration made by someone who has long left the community. If we are clear about what the expectations might be for certain parts of our codebase, contributors, new and old, will have a better understanding of how things might work and the process of contributing different things.

2 Likes

In edx-platform, it can be hard to decide what team should review a pull request. Often it requires technical understanding of the changes. We have a CODEOWERS, but it’s incomplete, and GitHub points out that it has errors. I don’t know if CODEOWNERS could do the job even if it was fully fleshed out.

I don’t think it’s productive to make sweeping generalizations about how open source projects work. For example, Python has 1522 open pull requests, only 50 of which are less than two weeks old. Many have comments on them, but 791 have the label “awaiting review.”

We all want to reduce the time it takes to review pull requests. We want to enable and speed contribution. But we have to solve our problems, whatever they are. There isn’t just one way that open source projects work. Making unfair comparisons to unnamed projects doesn’t help us find solutions.

I don’t understand the logic of this. If pull requests need review by the maintainer, then they need review, no matter how old they are. If pull requests can instead be reviewed by core contributors, we should let those reviews happen no matter how new the pull request is.

This sounds roughly like, “review the PR, or we’re going to do it whether you like it or not.” You say, “even if it’s just to say no”, but that could turn into auto-closing pull requests once they are two weeks old. I don’t think anyone would like that.

I would like to understand more about which repos are experiencing the longest delays, and try to understand why that is.

2 Likes

To get from the review to the merge it can take more time, but in the contribution-friendly projects I have looked at, the first review comes in much faster than it does for us. Gitlab is a good example of this – and they manage this even with an open core strategy and a company focused on a specific instance, so with more structural obstacles than we have!

Also, to be specific about what I said here, the aim is to review PRs as they come in. While for us a goal of 2 weeks is currently perceived as something more or less unattainable in the current schema – it’s not so much the specific time that I want to compare, but the fact that we have room to improve our goals in the first place.

We could try! :+1:

Maybe if we helped prepare a bit the work so it’s easy to do - possibly with some categories or repos we would like to assign? @itsjeyd @Michelle_Philbrick are there some categories of PRs or areas of code that would benefit the most from having more core contributors with rights? Where are we getting OSPRs blocked the most?

+1, that could be a nice stepping stone – and also maybe something some of the core contributors would even want to switch to, if they find the core contributor role too heavy, but still want to contribute. We should just be careful to keep the role/responsibility is proportionate to the commitment and contributions, to still encourage organizations to invest in contributing more.

+1 – that also goes in the direction of further refining the list of who supports what, which @Michelle_Philbrick mentioned would help to improve the triage process.

@nedbat Interesting, thanks for mentioning it. Do you know if the file currently maintained? And how does that relate (if it does) with the process to identify maintainers in 0001 Use Backstage to Support Maintainers — Open edX Proposals 1.0 documentation ?

@nedbat The core contributors can already review any pull request anywhere - the issue is with the ability to give a binding :+1: that allows a PR to be merged, which is currently limited for each core contributor to a few repositories that that person has access to. It does makes sense to try to get the review by the people with the most intimate knowledge of a given repo first – the solution of escalating a review to all core contributors would only be a backup, for when those people are not available, and while we ramp up the number of maintainers. If we enlarge the group who are allowed to give a binding :+1: then, we increase the chances of the PR being reviewed in a timely manner (or at all).

It comes down to trust I think – a core contributor will know if they can reasonably safely give a :+1: on such a PR, or if they can’t. We choose core contributors for trustworthiness and care, as well as conformity to the project’s best practices, and all have plenty of experience with Open edX in general. So when maintainers are not available for reviews, imho we should give the option for other core contributors to unblock the PR, if they judge they can do so.

Now you’re right that the time delay might not be necessary in cases where the maintainer is unavailable and says so (or just doesn’t exist) - so the escalation could also be initiated immediately by the maintainer, it could be a nice way for maintainers to delegate their work when they are overloaded,. But in other cases it might be worth giving the time for the maintainer to see the PR and get a chance to react, before another core contributor can merge it?

That’s too bad that it sounds like that, because my goal here is more to find better ways to work together on this. Getting help from others to get work that we, as a community, need done is a good thing, and part of the charm of working on open source, no? We assign areas of responsibility on the code base based on skills and contribution track history, and because it’s an efficient way to divide responsibility - but ultimately, none of us “own” the code, it’s a community effort. When we don’t have time to do something, someone else can pick it up.

That would mean that the maintainers would trust so little the rest of the core contributors that they would prefer to block all contributions on a repo rather than allow another core contributor to handle reviews? That would be pretty disappointing! Imho this would be more something to reserve for cases where it’s either an obvious “no” (unwanted feature, horrendous code), or code/areas so tricky that it would be impossible for a core contributor to do a proper review (though, here again, the maintainer could also trust that core contributors will do the right thing - if a maintainer comments “this is tricky code, I would recommend against merging without a review from me”, I’m pretty certain that core contributors would heed the warning.

Again - the key ingredient is trust. And the more opportunities we have for interactions and collaborations like that, the more trust can develop.

+1 - it could help in multiple ways to know that. @itsjeyd would you have some time to devote on your core contributor time to help figuring this out?

For me, it would definitely be edx-platform. Right now there’s no good way to determine 1. Who should review, and 2. Can a CC review and merge on edx-platform (if we know a CC has expertise in a particular area).

@nedbat currently I run the OSPRs that need review on edx-platform by Ed, and he helps to guide me to someone to review/merge. Would it be helpful to post these in the Maintainer channel instead so you can weigh in too? I typically do them in batches every couple of weeks.

1 Like

In general I would say that:

  • The more PRs a repository gets, the more it would benefit from having a larger number of core contributors with merge rights.
  • Smaller repositories (with fewer PRs) that haven’t joined the maintainership program yet would benefit from getting some dedicated CCs with merge rights.
    • By “dedicated CCs” I mean people who have a lot of experience with the code on these smaller repositories, and time available to regularly review incoming PRs. (On paper, we already have a bunch of CCs with write access to these repos – for example, everyone with “Axim On-Call” access listed on the core contributors page would be able to review and merge PRs there. But it seems like these CCs are often spread thin between many different roles and responsibilities, and their review queues can get very long.)

However, based on what I’ve experienced so far, OSPRs mostly get stuck on repos that aren’t part of the maintainership program yet:

As mentioned previously, these repos still have owning teams, and @Michelle_Philbrick and I can look up who they are and ping them on PRs. But there is no guaranteed response time, and I’ve had to use the escalation path to @Kelly_Buchanan and @nedbat in a number of cases to ultimately get the engineering review process started. :bulb: At this point in the process, CCs aren’t involved at all yet, so having more of them wouldn’t help with this particular bottleneck.

So, along the lines of what @Michelle_Philbrick said above about edx-platform, it would be really helpful to figure out how we can unblock CC reviews early on in the OSPR process. Especially (but perhaps not exclusively) for unmaintained repositories, how can we get eyes :eyes: from someone in the owning team quickly, so they can say:

  1. This is good for CC-only review, or
  2. This is good for CC review, but someone from my team will need to give final :+1:, or
  3. This is not a good candidate for CC review; my team will need to review this.

Unmaintained repositories (including edx-platform, which @Michelle_Philbrick already mentioned) seem to be experiencing the longest delays.

I can put together some more detailed stats for the repositories that I’m assigned to in an upcoming sprint (most likely the one after next).

In the meantime it might be helpful to have a look at stats from Maintainer Pull Request Reporting. That board isn’t limited to OSPRs but it can be used to identify long-running PRs and corresponding repositories.

1 Like

That would be ideal yes – the issue is with cases where there are no people with a lot of experience on the repo and which also have time available, no? If we limit the reviewers to those criteria when nobody matches them, then nobody else can gain experience on these repos… Doing reviews (and contributing PRs, when they can be reviewed) are good ways to acquire experience with a repository – when nobody else is available to review, getting core contributors to gain that experience by reviewing would allow to break the vicious circle, and build review capacity.

My understanding is that there are many cases where it’s hard to even get an answer at all on this, no? A solution for this would be to default option (1), unless there is an answer asking for (2) and (3). It’s not quite to the level of trust I would like to see given to core contributors – which should also be capable to choose between the 3 options, by giving or withholding a binding :+1: or :-1: in any case. But it would be a step in the right direction already, end ensure we are able to unblock situations like this quickly even with unresponsive reviewers. It would also allow to start gathering stats about the proportion of PRs that fall in the different buckets.