DocumentTooLarge: BSON document too large

Has anyone received this error when saving a course within the CMS? How do I resolve it? One of the content developers copied and pasted an image from Word directly into the WYSIWYG HTML component editor. They also copied and pasted Word text, so the CSS styling associated with Word was included in the HTML as well. I’ve told them not to copy and paste from Word into the WYSIWYG editor going forward.

I’ve removed the course from the platform and then re-added it as a new course with no content. After creating a new section I’m still getting this error. So it seems like something from this course is still held up in a MongoDB insert.

Any advice on how to resolve this issue?

2020-11-09 15:37:25,746 ERROR 16086 [django.request] exception.py:135 - Internal Server Error: /xblock/
Traceback (most recent call last):
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/exception.py", line 41, in inner
    response = get_response(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
    response = self._get_response(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py", line 185, in inner
    return func(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/views/decorators/http.py", line 40, in inner
    return func(request, *args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/djangoapps/util/json_request.py", line 53, in parse_json_into_request
    return view_function(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/item.py", line 229, in xblock_handler
    return _create_item(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/djangoapps/util/json_request.py", line 53, in parse_json_into_request
    return view_function(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/item.py", line 681, in _create_item
    boilerplate=request.json.get('boilerplate'),
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/helpers.py", line 290, in create_xblock
    return created_block
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/mixed.py", line 1016, in bulk_operations
    yield
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 192, in bulk_operations
    self._end_bulk_operation(course_id, emit_signals, ignore_case)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 286, in _end_bulk_operation
    dirty = self._end_outermost_bulk_operation(bulk_ops_record, structure_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 265, in _end_outermost_bulk_operation
    self.db_connection.insert_definition(bulk_write_record.definitions[_id], bulk_write_record.course_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/mongo_connection.py", line 579, in insert_definition
    self.definitions.insert(definition)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/mongodb_proxy.py", line 115, in __call__
    return self.proxied_object(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/collection.py", line 540, in insert
    gen(), check_keys, self.uuid_subtype, client)
DocumentTooLarge: BSON document too large (28164091 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.
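
For scale, the numbers in the last line of the traceback work out as shown here. This is only a rough sketch in Python 3: `exceeds_limit` is a hypothetical helper, not part of edx-platform or pymongo, and the UTF-8 length of the HTML string is a lower bound on the size of the stored BSON document, so it can only flag content that is certainly too large.

```python
# Figures copied from the DocumentTooLarge message above.
MAX_BSON_SIZE = 16777216   # server limit in bytes (16 MiB)
REJECTED_SIZE = 28164091   # size of the definition that failed to insert

MIB = 1024 * 1024
overshoot = REJECTED_SIZE - MAX_BSON_SIZE


def exceeds_limit(html, limit=MAX_BSON_SIZE):
    """Hypothetical pre-save check: True if the HTML alone is already too big.

    The encoded string is smaller than the full BSON document that wraps it,
    so a False result does not guarantee that the insert will succeed.
    """
    return len(html.encode("utf-8")) >= limit


if __name__ == "__main__":
    print("limit:     %.2f MiB" % (MAX_BSON_SIZE / MIB))
    print("rejected:  %.2f MiB" % (REJECTED_SIZE / MIB))
    print("overshoot: %d bytes" % overshoot)
```

The rejected definition is roughly 26.9 MiB against a 16 MiB limit, which fits a single large base64-encoded image pasted into one HTML component.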

I don’t see this error on import, but rather when I create a new section or go to a page with content.

Resolution
I searched for HTML components modified on or after ‘2020-11-01’ that contained raw PNG data or WordDocument markup pasted into the WYSIWYG HTML component editor. Then, while logged into the MongoDB client, I removed the offending text from every document that matched.

edxapp.modulestore.definitions

// Search for raw PNG
{block_type: 'html', 'fields.data': {$regex: /src=\"data:image\/png;base64/ }, 'edit_info.edited_on': { $gte: ISODate('2020-11-01') }}

// Search for WordDocument formatting
{block_type: 'html', 'fields.data': {$regex: /WordDocument/ }, 'edit_info.edited_on': { $gte: ISODate('2020-11-01') }}
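
The manual cleanup of the pasted images could also be scripted against the HTML string itself. This is a minimal sketch, assuming the images appear as `<img>` tags with a base64 data URI in `src` (the pattern the first query searches for, widened here to any image type). `strip_inline_images` is a hypothetical helper that works on a plain string and does not talk to MongoDB, so reading and writing `fields.data` is left out.

```python
import re

# Matches an <img> tag whose src attribute is an inline base64 data URI.
DATA_URI_IMG = re.compile(
    r"""<img\b[^>]*?\bsrc\s*=\s*(["'])data:image/[\w.+-]+;base64,[^"']*\1[^>]*>""",
    re.IGNORECASE,
)


def strip_inline_images(html, placeholder=""):
    """Replace inline base64 <img> tags with `placeholder`.

    Returns (cleaned_html, tags_removed, bytes_saved), where bytes_saved is
    the difference in UTF-8 length before and after the replacement.
    """
    cleaned, tags_removed = DATA_URI_IMG.subn(placeholder, html)
    bytes_saved = len(html.encode("utf-8")) - len(cleaned.encode("utf-8"))
    return cleaned, tags_removed, bytes_saved


if __name__ == "__main__":
    sample = '<p>a</p><img src="data:image/png;base64,AAAA"><p>b</p>'
    print(strip_inline_images(sample))
```

Images referenced by a normal URL or static path are left untouched, since only `src` values starting with `data:image/` match. It would be wise to back up the collection before applying anything like this to real course data.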

I then restarted the CMS, and this seemed to make everything work. I’m guessing that memcached is involved in storing the course components and that it got refreshed on restart. Can anyone confirm this?

ubuntu@hawthorn-app:~$ 
sudo /edx/bin/supervisorctl restart cms
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_low_1
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_high_1
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_default_1

Anyway, let me know what you think. This seems like a bug in the WYSIWYG editor. We’re currently running open-release/hawthorn.master in production, and I haven’t tested this with a newer open release.

Should we file a bug report for this?

@braden @jill @sambapete @DanielMcQ @regis In case this ever happens to you, please read the above. Copying and pasting from Word into the WYSIWYG HTML component can break your course, causing it to not load or work at all. Hope this helps.
