Compute_all_grades_for_course is executed in cms-worker instead of lms-worker

When an instructor changes the grade policy, the compute_all_grades_for_course task (task name lms.djangoapps.grades.tasks.compute_all_grades_for_course) is triggered. It is supposed to be executed in the lms-worker, but it is executed in the cms-worker instead, causing it to fail. I discovered this issue while checking /admin/celery_utils/failedtask/.

I tried setting POLICY_CHANGE_GRADES_ROUTING_KEY = "edx.lms.core.default" for both LMS and CMS, but the task still ran in the cms-worker.

Is it supposed to run in the cms-worker?

My Environment:
tutor v20

Related code:

@feanil do you know who I might tag for help on this one?

Sorry if you already know some of this but here is what I think is happening.

  • The platform runs four different processes from the edx-platform repo (in different containers in Tutor):
    • The LMS (learner facing website and APIs for learner MFEs)
    • The CMS/Studio (author facing website and APIs for the authoring MFE)
    • The LMS worker (a series of celery workers running in the LMS context)
    • The CMS workers (a series of workers running in the CMS context)
  • The queue definitions in the LMS and CMS contexts use different queue variants to prefix all the queue names.
  • The function you pointed to is a signal receiver that kicks off a celery task.
    • The signal receiver is in the CMS code so it is fired in the CMS/Studio process and creates a celery task in that context and queues it.
    • This means that there are no LMS named queues defined when this task fires.
      • Naming the queue as a destination does not create it on the CMS.
      • When a queue doesn’t exist, this custom routing logic is executed, putting things back on the default priority queue in the CMS.
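To make the fallback described above concrete, here is a minimal sketch in plain Python. It is not the actual edx-platform routing code: `resolve_queue`, `CMS_QUEUES`, and `CMS_DEFAULT_QUEUE` are hypothetical names, and the CMS queue names are assumed for illustration only.

```python
# Hypothetical model of the fallback: the CMS context only knows its own queues.
# These queue names are assumptions, not copied from any settings file.
CMS_QUEUES = {"edx.cms.core.default", "edx.cms.core.high", "edx.cms.core.low"}
CMS_DEFAULT_QUEUE = "edx.cms.core.default"


def resolve_queue(requested_queue, known_queues=CMS_QUEUES, default_queue=CMS_DEFAULT_QUEUE):
    """Return the queue a task lands on in this simplified model."""
    if requested_queue in known_queues:
        return requested_queue
    # Naming an unknown queue does not create it, so fall back to the default.
    return default_queue


# An LMS queue name requested from the CMS context falls back to the CMS default.
print(resolve_queue("edx.lms.core.default"))
# A queue the CMS context does define is used as requested.
print(resolve_queue("edx.cms.core.high"))
```

In this model, the task requested for an LMS queue ends up on the CMS default queue, which would match the behavior reported in the first post.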

Thanks!

I apply a plugin as follows:

"openedx-cms-production-settings",
"""
EXPLICIT_QUEUES['lms.djangoapps.grades.tasks.compute_all_grades_for_course']['queue'] = 'edx.lms.core.default'
"""

Afterwards, the task is queued in the lms-worker and executes successfully without errors.
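One caveat for anyone reusing this patch: the assignment only works if the task already has an entry in EXPLICIT_QUEUES. Below is a small sketch of a more defensive form, assuming EXPLICIT_QUEUES is a plain dict mapping task names to routing options, as the patch implies. The helper `route_to_queue` is a hypothetical name, and the starting queue value in the example is assumed.

```python
TASK = "lms.djangoapps.grades.tasks.compute_all_grades_for_course"


def route_to_queue(explicit_queues, task_name, queue):
    """Set the queue for task_name, creating the entry if it is missing."""
    explicit_queues.setdefault(task_name, {})["queue"] = queue
    return explicit_queues


# Case 1: the task already has an entry (the starting value here is assumed).
existing = {TASK: {"queue": "edx.cms.core.default"}}
route_to_queue(existing, TASK, "edx.lms.core.default")

# Case 2: no entry yet; a direct ['queue'] assignment would raise KeyError here.
empty = {}
route_to_queue(empty, TASK, "edx.lms.core.default")

print(existing[TASK]["queue"], empty[TASK]["queue"])
```

Inside a settings patch, the same idea reduces to a single `setdefault(...)['queue'] = ...` line on EXPLICIT_QUEUES.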

I wonder if this is a bug and, if so, whether it originates in edx-platform or Tutor.


Hmm, that’s a good question, I’m not sure. The celery queue setup is fairly convoluted and something that would be nice to clean up, but it has been low on the priority list. Hopefully some of the recent settings cleanups will make it easier to understand.

I also found something similar. When you modify any content in Studio, the task update_course_in_cache_v2 is executed in the LMS worker. I’m using Tutor v19.

I’m curious to know: if an operation is executed in CMS/Studio, shouldn’t the related task go to the CMS worker? If update_course_in_cache_v2 is going to the LMS worker instead, what could be the reason?

@Mahendra I think Feanil explained in the post above that task scheduling depends on the design. What matters is that the task is executed in the right environment without errors, so it is not wrong for CMS to schedule tasks on the lms-worker.