Recalculate_subsection_grade_v3 is submitted with the wrong user_id

Problem:

Recently, we started getting learner issues where a learner submits a correct answer, and the progress page shows 0/1 for some of the problems.

Upon investigating the logs, we found that when the learner (User A) submits a response to a problem, the recalculate_subsection_grade_v3 task is submitted with the wrong user_id (User B's). User A's submission history shows the submission, but the score reads None/None. User B, on the other hand, has no submission but a score of 1/1. It is important to note that User B has never enrolled in the course.

Another thing we noticed is that, most of the time, a task with the wrong user_id is submitted within a minute after User B has submitted an answer in one of their enrolled courses. This suggests that the runtime or some other service is not picking the correct user when submitting the recalculate_subsection_grade_v3 task.

Submission History:

User A:

#2: 2025-11-28 09:33:40 UTC

Score: None / None
{
  "attempts": 1,
  "correct_map": {
    "cbf22dccc3e24561bdd95280f9110196_2_1": {
      "answervariable": null,
      "correctness": "correct",
      "hint": "",
      "hintmode": null,
      "msg": "",
      "npoints": null,
      "queuestate": null
    }
  },
  "correct_map_history": [
    {
      "cbf22dccc3e24561bdd95280f9110196_2_1": {
        "answervariable": null,
        "correctness": "correct",
        "hint": "",
        "hintmode": null,
        "msg": "",
        "npoints": null,
        "queuestate": null
      }
    }
  ],
  "done": true,
  "input_state": {
    "cbf22dccc3e24561bdd95280f9110196_2_1": {}
  },
  "last_submission_time": "2025-11-28T09:33:38Z",
  "score": {
    "raw_earned": 1,
    "raw_possible": 1
  },
  "score_history": [
    {
      "raw_earned": 1,
      "raw_possible": 1
    }
  ],
  "seed": 1,
  "student_answers": {
    "cbf22dccc3e24561bdd95280f9110196_2_1": "choice_2"
  },
  "student_answers_history": [
    {
      "cbf22dccc3e24561bdd95280f9110196_2_1": "choice_2"
    }
  ]
}

#1: 2025-11-28 09:08:38 UTC

Score: None / None
{
  "input_state": {
    "cbf22dccc3e24561bdd95280f9110196_2_1": {}
  },
  "score": {
    "raw_earned": 0,
    "raw_possible": 1
  },
  "seed": 1
}

User B:

#1: 2025-11-28 09:33:38 UTC

Score: 1.0 / 1.0
null

Responsible Code Flow

/openedx/edx-platform/xmodule/capa_block.py(1893)submit_problem()
   1892             raise
-> 1893         published_grade = self.publish_grade()
   1894 

  /openedx/edx-platform/xmodule/capa_block.py(1758)publish_grade()
   1757 
-> 1758         self.runtime.publish(self, 'grade', event)
   1759 

  /openedx/edx-platform/xmodule/x_module.py(1504)publish()
   1503         if publish_service := self._services.get('publish'):
-> 1504             publish_service.publish(block, event_type, event)
   1505 

  /openedx/edx-platform/xmodule/services.py(240)publish()
    239         if handle_event and not is_masquerading_as_specific_student(self.user, self.course_id):
--> 240             handle_event(block, event)
    241         else:

  /openedx/edx-platform/xmodule/services.py(280)_handle_grade_event()
    279         if not self.user.is_anonymous:
--> 280             grades_signals.SCORE_PUBLISHED.send(
    281                 sender=None,

  /openedx/venv/lib/python3.11/site-packages/django/dispatch/dispatcher.py(189)send()
    188         for receiver in sync_receivers:
--> 189             response = receiver(signal=self, sender=sender, **named)
    190             responses.append((receiver, response))

  /openedx/edx-platform/lms/djangoapps/grades/signals/handlers.py(178)score_published_handler()
    177         # Fire a signal (consumed by enqueue_subsection_update, below)
--> 178         PROBLEM_RAW_SCORE_CHANGED.send(
    179             sender=None,

  /openedx/venv/lib/python3.11/site-packages/django/dispatch/dispatcher.py(189)send()
    188         for receiver in sync_receivers:
--> 189             response = receiver(signal=self, sender=sender, **named)
    190             responses.append((receiver, response))

  /openedx/edx-platform/lms/djangoapps/grades/signals/handlers.py(210)problem_raw_score_changed_handler()
    209 
--> 210     PROBLEM_WEIGHTED_SCORE_CHANGED.send(
    211         sender=None,

  /openedx/venv/lib/python3.11/site-packages/django/dispatch/dispatcher.py(189)send()
    188         for receiver in sync_receivers:
--> 189             response = receiver(signal=self, sender=sender, **named)
    190             responses.append((receiver, response))

  /openedx/edx-platform/lms/djangoapps/grades/signals/handlers.py(236)enqueue_subsection_update()
    235         return  # If it's not a course, it has no subsections, so skip the subsection grading update
--> 236     recalculate_subsection_grade_v3.apply_async(
    237         kwargs=dict(

  /openedx/venv/lib/python3.11/site-packages/celery_utils/logged_task.py(24)apply_async()
     23         """
---> 24         result = super().apply_async(args=args, kwargs=kwargs, **options)
     25         log.info('Task {}[{}] submitted with arguments {}, {}'.format(  # pylint: disable=consider-using-f-string

  /openedx/venv/lib/python3.11/site-packages/celery/app/task.py(598)apply_async()
    597             with denied_join_result():
--> 598                 return self.apply(args, kwargs, task_id=task_id or uuid(),
    599                                   link=link, link_error=link_error, **options)

  /openedx/venv/lib/python3.11/site-packages/celery/app/task.py(826)apply()
    825         )
--> 826         ret = tracer(task_id, args, kwargs, request)
    827         retval = ret.retval

  /openedx/venv/lib/python3.11/site-packages/celery/app/trace.py(453)trace_task()
    452 
--> 453                     R = retval = fun(*args, **kwargs)
    454                     state = SUCCESS

  /openedx/venv/lib/python3.11/site-packages/edx_django_utils/monitoring/internal/code_owner/utils.py(195)new_function()
    194         set_code_owner_attribute_from_module(wrapped_function.__module__)
--> 195         return wrapped_function(*args, **kwargs)
    196     return new_function

> /openedx/edx-platform/lms/djangoapps/grades/tasks.py(191)recalculate_subsection_grade_v3()
    190     breakpoint()
--> 191     _recalculate_subsection_grade(self, **kwargs)
    192 

Suspected Service

EventPublishingService is the one that passes the user_id to different signals that eventually get passed to the recalculate_subsection_grade_v3.
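
To make the suspicion concrete, here is a rough, non-authoritative sketch of the flow in the traceback above, written in plain Python rather than Django. Every name in it (`Signal`, `PublishService`, `enqueue_subsection_update`, the event keys) is a hypothetical stand-in, and the three chained signals from the traceback are collapsed into one. The only point it illustrates is that the task receives whatever user the publishing service happens to hold, so a service bound to the wrong user produces a task with the wrong user_id.

```python
# Hypothetical stand-ins for illustration; not the edx-platform implementation.
from dataclasses import dataclass


class Signal:
    """Tiny synchronous signal, loosely modelled on a Django-style dispatcher."""

    def __init__(self):
        self._receivers = []

    def connect(self, receiver):
        self._receivers.append(receiver)

    def send(self, **named):
        # Call every receiver in registration order and collect the results.
        return [(receiver, receiver(**named)) for receiver in self._receivers]


@dataclass
class User:
    id: int
    is_anonymous: bool = False


SCORE_PUBLISHED = Signal()
enqueued_tasks = []  # stands in for the Celery queue


def enqueue_subsection_update(user_id, raw_earned, raw_possible):
    # The task only ever sees the user_id handed to it by the signal.
    enqueued_tasks.append(
        {"user_id": user_id, "raw_earned": raw_earned, "raw_possible": raw_possible}
    )


SCORE_PUBLISHED.connect(enqueue_subsection_update)


class PublishService:
    """Bound to one user at construction time, like a per-request service."""

    def __init__(self, user):
        self.user = user

    def publish(self, event_type, event):
        if event_type == "grade" and not self.user.is_anonymous:
            SCORE_PUBLISHED.send(
                user_id=self.user.id,
                raw_earned=event["value"],
                raw_possible=event["max_value"],
            )


PublishService(User(id=101)).publish("grade", {"value": 1, "max_value": 1})
print(enqueued_tasks)
```

In this sketch nothing downstream of the service re-checks who made the request, which is why the service (or whatever supplies its user) is the first place to look.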

We are looking for feedback/suggestions on what could be the possible issue with the scoring.

PS: We are running the master branch of edx-platform at MIT.

@dave @kmccormick Could you please take a look at this post? Maybe you can point out any changes that could affect the related parts of the platform. Looking forward to getting some input from you.

@Asad_Ali when making a post, could you please indicate what version of the software you’re running?

@sarina sorry for not adding that info. At MIT, we have multiple deployments of Open edX, but the one that's having issues is running the master branch of edx-platform.

That is very bizarre. I wonder if it’s some kind of caching issue. Is there any obvious pattern mapping between user IDs (e.g. truncation)? How often does it happen? Does it happen only with capa problems? Does it happen for XQueue-related requests to remote graders?

Do you have a timeline for about when these issues started happening? There have been some recent changes related to XQueue deprecation that might have impacted this, though I’m not clear on how. FYI @Abdul_Rehman, @UsamaSadiq

@Asad_Ali could you verify whether the waffle flag send_to_submission_course.enable is enabled on your setup? The changes merged as part of the XQueue deprecation process shouldn’t take effect if the waffle flag is disabled.
My team will take a detailed look to try to reproduce the issue and find a fix.

Thanks for checking. We are not using the send_to_submission_course.enable waffle flag.

Hi @dave

Is there any obvious pattern mapping between user IDs (e.g. truncation)?

No, there is no obvious pattern in the mapping between user IDs.

How often does it happen?

It's happening quite often. What we have observed is that within a minute of User B submitting an answer in one of their enrolled courses, the same user_id is used when a submission comes in from any other user in any other course.

Does it happen only with capa problems?

Not sure if it's only affecting capa problems, but so far it has only been reported for capa problems.

Do you have a timeline for about when these issues started happening?

The user reported the issue on 19 Nov 2025, and the related problem submission was on 13 Nov 2025.
Our release around that time went out on 12 Nov 2025 (master branch).

We have also confirmed that a BlockCompletion exists for User B, so it's quite possible the issue is not limited to recalculate_subsection_grade_v3.

We are also looking into the submission history to pinpoint the start date of this problem.

Based on the timing, my suspicion would have been https://github.com/openedx/edx-platform/pull/37122, but it looks like you folks reverted that shortly afterwards. The instance you’re talking about has the revert on it, right? (I know you folks have multiple instances, and I’m never sure exactly which code is running on what instance.)

@dave That was my suspicion as well, and we reverted the PR. Yes, we have that revert deployed, and we are on the latest master.

The reason I put up this post was to get some info like this from your side so that we can narrow down the issue.

Thanks!

The first such submission dates back to 7 November 2025. Our successful production release went out on 28 October 2025.
Here are the numbers. There are a total of ~273k problem submissions matching the query below:

  AND grade IS NOT NULL
  AND event_json LIKE '%"score"%'
  AND event_json LIKE '%"raw_earned"%1%'
  AND event_json LIKE '%"raw_possible"%1%'
  AND event_timestamp > TIMESTAMP '2025-11-07 00:00:00';

Note that this only counts problems with weight 1.

We have 3545 entries with the query:

  AND grade IS NULL
  AND event_json LIKE '%"score"%'
  AND event_json LIKE '%"raw_earned"%1%'
  AND event_json LIKE '%"raw_possible"%1%'
  AND event_timestamp > TIMESTAMP '2025-11-07 00:00:00';

That is slightly more than 1% of all submissions, which means grading works correctly about 99% of the time.
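
As a quick sanity check on that percentage, the arithmetic can be sketched as follows. This assumes my reading of the two queries: ~273k is the count of rows with a non-NULL grade and 3545 is the count with a NULL grade. The 273_000 figure is the rounded value from the post, so the result is approximate.

```python
# Counts quoted in the post; 273_000 is the rounded "~273k" figure.
graded_ok = 273_000    # rows matching grade IS NOT NULL
grade_missing = 3_545  # rows matching grade IS NULL

# Share relative to the correctly graded rows only.
share_of_graded = grade_missing / graded_ok
# Share relative to all matching rows (graded plus ungraded).
share_of_all = grade_missing / (graded_ok + grade_missing)

print(f"relative to graded rows: {share_of_graded:.2%}")
print(f"relative to all rows:    {share_of_all:.2%}")
```

Either denominator gives a figure a little above 1%, so the conclusion does not depend on which one is meant.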

@Asad_Ali has reproduced the issue locally, and by following the same steps I was also able to reproduce it. Here are the steps:

  1. Create 2 courses, each with a simple problem, and 2 learners. Enroll one learner in one course and the other learner in the other course.
  2. To make sure both problems are submitted at the same time, add a breakpoint after the instance creation in block_render.py.
  3. Open 2 browsers/windows and log in as each learner.
  4. Open the courses and problems.
  5. Attach to the LMS container so you can step past the breakpoints when the answers are submitted, e.g. tutor dev start lms
  6. Submit the answers as both users, then continue past the breakpoint by entering c twice.
  7. That's it: one learner is scored correctly in their enrolled course, and the same learner is also scored in the other course. It is always the 2nd request’s user.

Additional Information:
I am able to reproduce this issue in release/teak as well by following the same steps.

@dave

I have found the root cause. Our modulestore follows the singleton pattern, and these 2 requests were using the same modulestore, which in our case is SplitMongoModuleStore.
When we call modulestore().get_item, it also calls create_runtime, which uses the services dict stored on the modulestore object itself (SplitMongoModuleStore), and attaches the runtime to the block.

As a result, the SplitMongoModuleStore and all SplitModuleStoreRuntime instances share the exact same services dictionary object.

Reason:
They share the same object because, in Python, assigning a mutable object (like a dictionary) does not create a copy; it just creates a new reference to the same object in memory.

Consequently:

  • Any change (add, remove, or update) to the services dict in one place (either the modulestore or any runtime) will be immediately visible in all others.
  • This means mutating the dict (e.g., adding or removing a key) or mutating any mutable value inside it will affect all users of that dict.
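
The aliasing described above can be shown with a minimal sketch. The class names (`SharedStore`, `CopyingStore`, `Runtime`) and the `"user"` key are hypothetical stand-ins, not the edx-platform classes, and the copying variant is only one possible mitigation; it is not necessarily what the fixing PR does. The two calls in `two_interleaved_requests` mimic the reproduction steps, where both requests build their block before either one publishes a grade.

```python
# Hypothetical stand-ins for illustration; not the edx-platform classes.
import copy


class Runtime:
    def __init__(self, services):
        self.services = services  # keeps whatever dict object it is given

    def publish_grade(self):
        # Reports the grade for whichever user the services dict holds now.
        return self.services["user"]


class SharedStore:
    """Singleton-style store that hands its own services dict to every runtime."""

    def __init__(self):
        self.services = {}

    def get_runtime(self, user):
        self.services["user"] = user   # mutates the one shared dict
        return Runtime(self.services)  # an alias, not a copy


class CopyingStore(SharedStore):
    """One possible mitigation (an assumption): a shallow copy per runtime."""

    def get_runtime(self, user):
        services = copy.copy(self.services)
        services["user"] = user
        return Runtime(services)


def two_interleaved_requests(store):
    runtime_a = store.get_runtime("user_a")  # request 1 builds its block
    runtime_b = store.get_runtime("user_b")  # request 2 builds its block
    # Both requests now continue past the "breakpoint" and publish.
    return runtime_a.publish_grade(), runtime_b.publish_grade()


shared = two_interleaved_requests(SharedStore())
copied = two_interleaved_requests(CopyingStore())
print(shared)  # ('user_b', 'user_b')
print(copied)  # ('user_a', 'user_b')
```

With the shared dict, both grades are attributed to the second request's user, matching the observation in the reproduction steps. A shallow copy is enough in this sketch because only the top-level key is replaced; mutable values stored inside the dict would still be shared.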

Fixing PR: fix: runtime services are fixed by marslanabdulrauf · Pull Request #37825 · openedx/edx-platform · GitHub

Is this issue reproducible in Ulmo? We are considering whether this is a release blocker for Ulmo.

Answering my own question, I see that @Muhammad_Arslan was able to reproduce the issue in Teak.

That said, there is still an open question about whether this bug will manifest in a typical install of Open edX. As we were just discussing in the BTR meeting, we are currently using Granian instead of uWSGI at MIT, and it may be that uWSGI’s limitations prevent this bug from manifesting. We’ll report back here after further investigation.