DocumentTooLarge: BSON document too large

Has anyone received this error when saving a course in the CMS? How do I resolve it? One of our content developers copied and pasted an image from Word directly into the WYSIWYG HTML component editor. They also copied and pasted Word text, so the CSS styling that Word attaches came along in the HTML. I have told them not to paste from Word into the WYSIWYG editor going forward.

I removed the course from the platform and re-added it as a new course with no content. After creating a new section I still get this error, so it seems something from this course is still held up in the MongoDB insert.

Any advice on how to resolve this issue?

2020-11-09 15:37:25,746 ERROR 16086 [django.request] exception.py:135 - Internal Server Error: /xblock/
Traceback (most recent call last):
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/exception.py", line 41, in inner
    response = get_response(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 249, in _legacy_get_response
    response = self._get_response(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 187, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/core/handlers/base.py", line 185, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/utils/decorators.py", line 185, in inner
    return func(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/views/decorators/http.py", line 40, in inner
    return func(request, *args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/djangoapps/util/json_request.py", line 53, in parse_json_into_request
    return view_function(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/item.py", line 229, in xblock_handler
    return _create_item(request)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py", line 23, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/djangoapps/util/json_request.py", line 53, in parse_json_into_request
    return view_function(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/item.py", line 681, in _create_item
    boilerplate=request.json.get('boilerplate'),
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/helpers.py", line 290, in create_xblock
    return created_block
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/mixed.py", line 1016, in bulk_operations
    yield
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 192, in bulk_operations
    self._end_bulk_operation(course_id, emit_signals, ignore_case)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 286, in _end_bulk_operation
    dirty = self._end_outermost_bulk_operation(bulk_ops_record, structure_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 265, in _end_outermost_bulk_operation
    self.db_connection.insert_definition(bulk_write_record.definitions[_id], bulk_write_record.course_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/mongo_connection.py", line 579, in insert_definition
    self.definitions.insert(definition)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/mongodb_proxy.py", line 115, in __call__
    return self.proxied_object(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/collection.py", line 540, in insert
    gen(), check_keys, self.uuid_subtype, client)
DocumentTooLarge: BSON document too large (28164091 bytes) - the connected server supports BSON document sizes up to 16777216 bytes.
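For scale, the limit in that last line is 16 MiB (16 * 1024 * 1024 bytes), and base64 encoding turns every 3 bytes of image data into 4 characters. Here is a rough sketch of that arithmetic in Python. The helper names are made up for illustration, and the size check only looks at the HTML string, so it underestimates the real BSON document, which carries other fields too.

```python
MAX_BSON_BYTES = 16 * 1024 * 1024  # 16777216, the limit quoted in the error


def base64_length(raw_bytes: int) -> int:
    """Length in characters of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


def html_exceeds_limit(html: str, limit: int = MAX_BSON_BYTES) -> bool:
    """Rough check of the HTML's UTF-8 size alone against the limit.

    Hypothetical helper: the stored definition has extra fields, so a
    document can still be too large when this returns False.
    """
    return len(html.encode("utf-8")) > limit


if __name__ == "__main__":
    reported = 28164091  # size reported in the traceback
    print(reported - MAX_BSON_BYTES)         # 11386875 bytes over the limit
    print(base64_length(12 * 1024 * 1024))   # 16777216
```

So a single pasted image of about 12 MiB is already enough to hit the limit once it is inlined as base64, before counting any Word markup around it.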

I don’t see this error on import, but rather when I create a new section or go to a page with content.

Resolution
I searched for HTML components modified on or after 2020-11-01 that contained raw PNG data or WordDocument markup pasted into the WYSIWYG HTML component editor. While logged into the MongoDB client, I removed that text from every document the searches returned.

edxapp.modulestore.definitions

// Search for raw PNG
{block_type: 'html', 'fields.data': {$regex: /src=\"data:image\/png;base64/ }, "edit_info.edited_on": { $gte: ISODate('2020-11-01') }}

// Search for WordDocument formatting
{block_type: 'html', 'fields.data': {$regex: /WordDocument/ }, "edit_info.edited_on": { $gte: ISODate('2020-11-01') }}
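In case it helps anyone who would rather script this, here is a minimal Python sketch of the same idea. It is not what I ran: the function names are hypothetical, the filter is just the PNG search expressed as a dict that pymongo's `find()` would accept, and the regular expression assumes the image was stored as `src="data:image/...;base64,..."` with plain double quotes. Back up the collection before changing anything.

```python
import re
from datetime import datetime

# Matches an inline base64 image inside a src attribute (any image subtype).
INLINE_IMAGE = re.compile(r'src="data:image/[a-zA-Z0-9.+-]+;base64,[^"]*"')


def recent_inline_png_filter(since: datetime) -> dict:
    """The PNG search above as a pymongo-style filter dict."""
    return {
        "block_type": "html",
        "fields.data": {"$regex": r'src="data:image/png;base64'},
        "edit_info.edited_on": {"$gte": since},
    }


def strip_inline_images(html: str) -> str:
    """Replace each inline base64 payload with an empty src attribute."""
    return INLINE_IMAGE.sub('src=""', html)


if __name__ == "__main__":
    sample = '<p>hi</p><img src="data:image/png;base64,AAAA" alt="x">'
    print(strip_inline_images(sample))  # <p>hi</p><img src="" alt="x">
```

With a pymongo collection handle for `edxapp.modulestore.definitions`, the filter could be passed to `find()` and the cleaned string written back to `fields.data`, but I have only sketched the pure-Python parts here.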

I then restarted the CMS, which seemed to make everything work. My guess is that memcached is somehow involved in storing the course component and that it was refreshed on restart. Can anyone confirm this?

ubuntu@hawthorn-app:~$ 
sudo /edx/bin/supervisorctl restart cms
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_low_1
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_high_1
sudo /edx/bin/supervisorctl restart edxapp_worker:cms_default_1

Anyway, let me know what you think. This seems like a bug in the WYSIWYG editor. We’re currently running open-release/hawthorn.master in production, and I haven’t tested this with a newer open release.

Should we create a bug for this?

@braden @jill @sambapete @DanielMcQ @regis In case this ever happens to you, please read the above. Copying and pasting from Word into the WYSIWYG HTML component can break your course so that it does not load or work at all. Hope this helps.

Hi @Zachary_Trabookis,

We also faced this issue in a Hawthorn installation. We tried restoring the course to a previous version using the draft version, but it didn’t work.

The solution was to restart the cms service. We think it might be related to memcached, as you can see in the traceback:

Traceback (most recent call last):
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/django/core/handlers/base.py", line 115, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/django/core/handlers/base.py", line 113, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/python3.5/contextlib.py", line 30, in inner
    return func(*args, **kwds)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/newrelic/hooks/framework_django.py", line 540, in wrapper
    return wrapped(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/django/contrib/auth/decorators.py", line 21, in _wrapped_view
    return view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/django/utils/decorators.py", line 142, in _wrapped_view
    response = view_func(request, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/assets.py", line 86, in assets_handler
    return _asset_index(request, course_key)
  File "/edx/app/edxapp/edx-platform/cms/djangoapps/contentstore/views/assets.py", line 105, in _asset_index
    course_module = modulestore().get_course(course_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/mixed.py", line 95, in inner
    retval = func(field_decorator=strip_key_collection, *args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/mixed.py", line 400, in get_course
    return store.get_course(course_key, depth=depth, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split_draft.py", line 62, in get_course
    return super(DraftVersioningModuleStore, self).get_course(course_id, depth=depth, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 1153, in get_course
    return self._get_structure(course_id, depth, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 1143, in _get_structure
    result = self._load_items(structure_entry, [root], depth, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 837, in _load_items
    self.cache_items(runtime, block_keys, course_entry.course_key, depth, lazy)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 817, in cache_items
    return system.module_data
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 197, in bulk_operations
    self._end_bulk_operation(course_id, emit_signals, ignore_case)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/__init__.py", line 291, in _end_bulk_operation
    dirty = self._end_outermost_bulk_operation(bulk_ops_record, structure_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/split.py", line 281, in _end_outermost_bulk_operation
    self.db_connection.insert_definition(bulk_write_record.definitions[_id], bulk_write_record.course_key)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/modulestore/split_mongo/mongo_connection.py", line 569, in insert_definition
    self.definitions.insert_one(definition)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/mongodb_proxy.py", line 117, in __call__
    return self.proxied_object(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/newrelic/api/datastore_trace.py", line 182, in _nr_datastore_trace_wrapper_
    return wrapped(*args, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/collection.py", line 698, in insert_one
    session=session),
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/collection.py", line 612, in _insert
    bypass_doc_val, session)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/collection.py", line 600, in _insert_one
    acknowledged, _insert_command, session)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/mongo_client.py", line 1492, in _retryable_write
    return self._retry_with_session(retryable, func, s, None)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/mongo_client.py", line 1385, in _retry_with_session
    return func(session, sock_info, retryable)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/collection.py", line 595, in _insert_command
    retryable_write=retryable_write)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/pool.py", line 618, in command
    self._raise_connection_failure(error)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/pool.py", line 613, in command
    user_fields=user_fields)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/network.py", line 143, in command
    name, size, max_bson_size + message._COMMAND_OVERHEAD)
  File "/edx/app/edxapp/venvs/edxapp/lib/python3.5/site-packages/pymongo/message.py", line 1066, in _raise_document_too_large
    " bytes." % (doc_size, max_size))
pymongo.errors.DocumentTooLarge: BSON document too large (17101064 bytes) - the connected server supports BSON document sizes up to 16793598 bytes.