While I was running the transform_tracking_logs command in the LMS, as suggested here, I ran into intermittent errors in the event_routing_backends.tasks.dispatch_bulk_events Celery tasks (see the Flower monitor screenshot below):
Traceback (most recent call last):
File "/openedx/venv/lib/python3.11/site-packages/celery/app/trace.py", line 453, in trace_task
R = retval = fun(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^
File "/openedx/venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1816, in runner
return sentry_patched_function(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/openedx/venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 416, in _inner
reraise(*exc_info)
File "/openedx/venv/lib/python3.11/site-packages/sentry_sdk/utils.py", line 1751, in reraise
raise value
File "/openedx/venv/lib/python3.11/site-packages/sentry_sdk/integrations/celery/__init__.py", line 411, in _inner
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/openedx/venv/lib/python3.11/site-packages/celery/app/trace.py", line 736, in __protected_call__
return self.run(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/openedx/venv/lib/python3.11/site-packages/event_routing_backends/tasks.py", line 106, in dispatch_bulk_events
bulk_send_events(self, events, router_type, host_config)
File "/openedx/venv/lib/python3.11/site-packages/event_routing_backends/tasks.py", line 147, in bulk_send_events
raise task.retry(exc=exc, countdown=getattr(settings, 'EVENT_ROUTING_BACKEND_COUNTDOWN', 30),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/openedx/venv/lib/python3.11/site-packages/celery/app/task.py", line 736, in retry
raise_with_context(exc)
File "/openedx/venv/lib/python3.11/site-packages/event_routing_backends/tasks.py", line 127, in bulk_send_events
client.bulk_send(events)
File "/openedx/venv/lib/python3.11/site-packages/event_routing_backends/utils/xapi_lrs_client.py", line 103, in bulk_send
raise EventNotDispatched
event_routing_backends.processors.transformer_utils.exceptions.EventNotDispatched
The thing is, I need to run this command over a huge amount of data. How can I do it?
Thanks in advance!
PS: this error only occurs when running the transform_tracking_logs command over a huge amount of data. When Aspects runs this Celery task automatically, the error never appears.
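For reference, the retry in the traceback above reads its back-off from a Django setting. A minimal sketch of raising it in the LMS settings (the setting name and the 30 s default come from the traceback; the value of 60 is just illustrative):

```python
# LMS settings fragment (e.g. applied via a Tutor settings patch).
# event_routing_backends/tasks.py falls back to 30 s when this setting is
# absent; a larger value gives the LRS more breathing room between retries.
EVENT_ROUTING_BACKEND_COUNTDOWN = 60  # seconds to wait before retrying a failed bulk send
```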
Hi @Yago, are you seeing anything in the Ralph logs, or slowness on the Ralph side? This looks like some kind of issue getting the events to Ralph, possibly sending events faster than it’s configured to handle.
Hi @TyHob! Yes, you are right, Ralph is showing errors like these:
158434 ralph-1 | 2025-10-07 11:14:32,148 INFO: 172.18.0.13:56252 - "POST /xAPI/statements HTTP/1.1" 200 OK
158435 ralph-1 | Poco::Exception. Code: 1000, e.code() = 0, HTML Form Exception: Field value too long (version 25.3.6.56 (official build))
158436 ralph-1 |
158437 ralph-1 | 2025-10-07 11:14:36,075 ERROR: Failed to read documents: :HTTPDriver for http://db-openedx.ti.uam.es:8123 returned response code 500)
158438 ralph-1 | Poco::Exception. Code: 1000, e.code() = 0, HTML Form Exception: Field value too long (version 25.3.6.56 (official build))
158439 ralph-1 |
158440 ralph-1 | 2025-10-07 11:14:36,075 ERROR: Failed to read from ClickHouse
158441 ralph-1 | 2025-10-07 11:14:36,075 INFO: 172.18.0.13:42270 - "POST /xAPI/statements HTTP/1.1" 500 Internal Server Error
Even though the error suggests that some value is too long, the real problem may be that Ralph cannot handle that amount of data at once. I will try running the command over small batches of data instead of a large number of files at the same time.
I finally managed to run python manage.py lms transform_tracking_logs successfully. Monitoring both Ralph and Redis gave me the clue:
First of all, I split the log into multiple files, because the tracking.logs file was >5 GB and was saturating memory.
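The splitting step can be sketched with coreutils split (the sample file, chunk size, and output prefix here are illustrative, not the exact ones from my setup):

```shell
# Sketch of splitting a large tracking log into fixed-size chunks.
# The real input was a >5 GB tracking log; here we fabricate a small
# stand-in file so the commands are runnable anywhere.
seq 1 2500 > tracking.log                  # stand-in for the real log file
split -l 1000 tracking.log tracking_part_  # 1000-line chunks: _aa, _ab, _ac
ls tracking_part_*                         # lists the generated chunk files
```

Each chunk keeps whole lines, which matters here because every tracking-log line is a self-contained JSON event.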
I tried to go one by one, but files longer than about 1,000 lines triggered errors, and I had ended up with more than 10,000 files, which was too many to handle individually.
So I used the prefix parameter to process batches of 500 files at a time, and added the following parameters for better performance: --sleep_between_batches_secs 10 --batch_size 500
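Put together, the invocation looked roughly like this. The source/destination config JSON is a sketch based on the event-routing-backends documentation, and the paths, prefix, and providers are assumptions — adjust them to your own deployment:

```shell
# Hedged sketch: the exact providers and config depend on your deployment.
# The JSON keys (key/prefix/container) follow the event-routing-backends
# docs for a LOCAL source, but verify them against your installed version.
python manage.py lms transform_tracking_logs \
  --transformer_type xapi \
  --source_provider LOCAL \
  --source_config '{"key": "/openedx/data", "prefix": "tracking_part_00", "container": "logs"}' \
  --destination_provider LRS \
  --batch_size 500 \
  --sleep_between_batches_secs 10
```

Narrowing the prefix (e.g. tracking_part_00, then tracking_part_01, and so on) is what limits each run to a manageable bunch of chunk files.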