Discussion forum not working

kstefan · March 30, 2020, 7:23pm

Dear Sir/Madam,

I am facing the following issue.

I have an instance of OpenEDX Hawthorn.2 running on Ubuntu 16.04 (hosted on AWS).

The problem is that the discussion forum is not working properly, everything else is OK.

The browser shows me the following:
‘There has been an error on the … servers
We’re sorry, this module is temporarily unavailable. Our staff is working to fix it as soon as possible. Please email us at … to report any problems or downtime.’

And the console shows:
‘Uncaught ReferenceError: Courseware is not defined
at HTMLDocument.<anonymous> (lms-application.e7bd4b65d083.js:1)
at fire (lms-main_vendor.a04b73033169.js:2)
at Object.fireWith [as resolveWith] (lms-main_vendor.a04b73033169.js:2)
at Function.ready (lms-main_vendor.a04b73033169.js:2)
at HTMLDocument.completed (lms-main_vendor.a04b73033169.js:2)’

The ‘/edx/var/log/lms/edx.log’ shows:
'Mar 30 14:08:27 ubuntu [service_variant=lms][edx.courseware][env:sandbox] ERROR [ubuntu 423] [views.py:575] - Error in /courses/course-v1:AUTH+Prog1+2018_T2/discussion/forum/: user=XXX, effective_user=XXX, course=course-v1:XXX+Prog1+2018_T2
Traceback (most recent call last):
File “/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/views/views.py”, line 506, in get
return super(CourseTabView, self).get(request, course=course, page_context=page_context, **kwargs)
File “/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/web_fragments/views.py”, line 26, in get
fragment = self.render_to_fragment(request, **kwargs)
File “/edx/app/edxapp/edx-platform/lms/djangoapps/courseware/views/views.py”, line 640, in render_to_fragment
return tab.render_to_fragment(request, course, **kwargs)
File “/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/tabs.py”, line 294, in render_to_fragment
return self.fragment_view.render_to_fragment(request, course_id=unicode(course.id), **kwargs)
File “/edx/app/edxapp/edx-platform/lms/djangoapps/discussion/views.py”, line 697, in render_to_fragment
base_context = _create_base_discussion_view_context(request, course_key)
File “/edx/app/edxapp/edx-platform/lms/djangoapps/discussion/views.py”, line 404, in _create_base_discussion_view_context
user_info = cc_user.to_dict()
File “/edx/app/edxapp/edx-platform/lms/lib/comment_client/models.py”, line 59, in to_dict
self.retrieve()
File “/edx/app/edxapp/edx-platform/lms/lib/comment_client/models.py”, line 64, in retrieve
self._retrieve(*args, **kwargs)
File “/edx/app/edxapp/edx-platform/lms/lib/comment_client/user.py”, line 152, in _retrieve
metric_tags=self._metric_tags,
File “/edx/app/edxapp/edx-platform/lms/lib/comment_client/utils.py”, line 119, in perform_request
content=response.text[:100]
CommentClientError: Invalid JSON response for request 7fd630b4-1f6b-42fc-8e88-1910a3725030; first 100 characters: ’

502 Bad Gateway

502 Bad Gat'
Mar 30 14:08:46 ubuntu [service_variant=lms][openedx.core.djangoapps.catalog.utils][env:sandbox] WARNING [ubuntu 423] [utils.py:66] - Failed to get program UUIDs from the cache.'

The ‘/edx/var/log/supervisor/forum-stderr.log’ shows:
‘/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:52: warning: constant ::Fixnum is deprecated
/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:54: warning: constant ::Fixnum is deprecated
ERROR: ElasticSearch configuration validation failed. “rake search:validate_index” failed with the following message: Alias ‘content’ does not exist.’

The ‘/edx/var/log/supervisor/forum-stdout.log’ shows:
‘W, [2020-03-30T14:09:19.789651 #16255] WARN – : NewRelic agent library not installed
W, [2020-03-30T14:09:19.954294 #16255] WARN – : Overwriting existing field _id in class User.
W, [2020-03-30T14:09:19.983714 #16255] WARN – : MONGODB | Unsupported client option ‘max_retries’. It will be ignored.
W, [2020-03-30T14:09:19.983783 #16255] WARN – : MONGODB | Unsupported client option ‘retry_interval’. It will be ignored.
W, [2020-03-30T14:09:19.983813 #16255] WARN – : MONGODB | Unsupported client option ‘timeout’. It will be ignored.
W, [2020-03-30T14:09:19.997302 #16255] WARN – : NewRelic agent library not installed’

The ‘/edx/var/log/nginx/error.log’ shows:
‘2020/03/30 14:08:27 [error] 23624#23624: 3164 connect() to unix:/edx/var/forum/forum.sock failed (111: Connection refused) while connecting to upstream, client: 127.0.0.1, server: forum., request: “GET /api/v1/users/3?complete=True&request_id=7fd630b4-1f6b-42fc-8e88-1910a3725030 HTTP/1.1”, upstream: “http://unix:/edx/var/forum/forum.sock:/api/v1/users/3?complete=True&request_id=7fd630b4-1f6b-42fc-8e88-1910a3725030”, host: “localhost:18080”’

I don’t know if anything else is needed for more information.

I would be grateful if someone could assist me on this in order to get the discussion forum up and running.

Best Regards,

Kostas.

Mickiyas_Ephrem · March 30, 2020, 7:45pm

Try checking COMMENTS_SERVICE_KEY in the lms.env.json file and FORUM_API_KEY in the my-passwords.yml file. These two keys must be the same.
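A minimal sketch of that comparison, assuming both values sit on a single line as plain strings. The function names are hypothetical, the parsing is deliberately naive (line-based sed, not a real JSON or YAML parser), and the file locations are passed in as arguments because they differ between installations:

```shell
#!/usr/bin/env bash

# Print the value of a "KEY": "value" pair found in a JSON file (first match only).
json_value() {
  local key="$1" file="$2"
  sed -n "s/.*\"$key\"[[:space:]]*:[[:space:]]*\"\([^\"]*\)\".*/\1/p" "$file" | head -n 1
}

# Print the value of a top-level "KEY: value" line in a YAML file,
# with surrounding quotes and trailing whitespace removed.
yaml_value() {
  local key="$1" file="$2"
  sed -n "s/^$key:[[:space:]]*//p" "$file" | head -n 1 | sed 's/[[:space:]]*$//' | tr -d "\"'"
}

# Print "match" if both keys are non-empty and identical, otherwise "mismatch".
compare_forum_keys() {
  local lms_json="$1" passwords_yml="$2"
  local lms_key forum_key
  lms_key="$(json_value COMMENTS_SERVICE_KEY "$lms_json")"
  forum_key="$(yaml_value FORUM_API_KEY "$passwords_yml")"
  if [ -n "$lms_key" ] && [ "$lms_key" = "$forum_key" ]; then
    echo "match"
  else
    echo "mismatch"
    return 1
  fi
}
```

It could be called as compare_forum_keys /edx/app/edxapp/lms.env.json /path/to/my-passwords.yml, where the first path is an assumption about a native installation and the second depends on where you keep your passwords file.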

kstefan · March 30, 2020, 8:27pm

Dear @Mickiyas_Ephrem,

thank you for your quick reply.

The 2 keys are the same.

So, probably something else is wrong.

Best Regards,

Kostas.

andrey.kryachko · March 31, 2020, 8:02am

Hi @kstefan,

Please make sure that the correct version of cs_comments_service was deployed (the master branch is compatible only with the Ironwood release).

~$ sudo git -C ~forum/cs_comments_service/ status
HEAD detached at open-release/hawthorn.1

open-release/hawthorn.2 is also correct.

kstefan · March 31, 2020, 8:10am

Hi @andrey.kryachko,

thank you for your reply.

I ran the command and I got the following:

‘HEAD detached at open-release/hawthorn.2
nothing to commit, working directory clean’

So, I suppose my version is OK.

Something else seems to be the problem.

Best Regards,

Kostas.

kstefan · April 1, 2020, 8:04am

I also see the following:

When I run the ‘/edx/bin/supervisorctl status’ command, the forum seems to be restarting every minute.

Maybe this could be something that can assist me in the right direction if anyone has faced the same problem before.

UPDATE:
I saw in my ‘/edx/app/forum/forum-supervisor.sh’ file, the following:

#!/bin/bash

source /edx/app/forum/forum_env
cd /edx/app/forum/cs_comments_service

/edx/app/forum/cs_comments_service/bin/unicorn -c config/unicorn.rb -I '.'

# If forums fails to start because elasticsearch isn't migrated, sleep so supervisord
# doesn't attempt to restart it immediately.
# 101 is the magic exit code forums uses to mean "rake search:validate_index failed"
exit_code="$?"
[ "$exit_code" -eq 101 ] && sleep 60 && exit "$exit_code"

Could this be the reason why the forum service is restarting every 60 seconds?

Could this be related to the ‘/edx/var/log/supervisor/forum-stderr.log’ file error that I mentioned in my first post which follows below?

/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:52: warning: constant ::Fixnum is deprecated
/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:54: warning: constant ::Fixnum is deprecated
ERROR: ElasticSearch configuration validation failed. "rake search:validate_index" failed with the following message: Alias 'content' does not exist.

Finally, the ‘/edx/var/forum/forum_unicorn.pid’ and ‘/edx/var/forum/forum.sock’ files are not updated (judging by their modification times) when I execute the ‘/edx/bin/supervisorctl restart forum’ command. I do not know whether they should be, but they are updated when I run the same command on my localhost installation.
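For reference, the wrapper logic quoted above can be tried in isolation. The sketch below is not the Open edX script: the function name run_with_backoff is hypothetical, and the start command and the delay are parameters so it can run outside supervisor. It pauses only when the command exits with 101, the code the script's own comment ties to a failed "rake search:validate_index". A 60-second pause before every exit would fit a restart roughly once a minute.

```shell
#!/usr/bin/env bash

# Run a command; if it exits with code 101, wait before returning,
# so that a supervising process does not restart it immediately.
# Unlike the quoted script, this always returns the command's own exit code.
run_with_backoff() {
  local delay="$1"
  shift
  local exit_code=0
  "$@" || exit_code="$?"
  if [ "$exit_code" -eq 101 ]; then
    sleep "$delay"
  fi
  return "$exit_code"
}
```

For example, run_with_backoff 60 sh -c 'exit 101' waits a minute before returning 101, while any other exit code is returned straight away.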

Best Regards,

Kostas.

amit · April 1, 2020, 9:37am

@kstefan it looks like an index issue. If the Elasticsearch connection is ready, you need to initialize the index with this command: bin/rake search:initialize. Make sure you run it from the forum environment.
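One way to see whether the ‘content’ alias from the error message already exists is to list the aliases Elasticsearch knows about. This is a minimal sketch: the function names are hypothetical, the default URL is an assumption about a single-machine installation, and it relies on the _cat/aliases endpoint printing the alias name in the first column, which is worth verifying against your Elasticsearch version.

```shell
#!/usr/bin/env bash

# Read `_cat/aliases` output on stdin and succeed only if
# an alias with the given name appears in the first column.
alias_listed() {
  local name="$1"
  awk -v name="$name" '$1 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# Hypothetical wrapper: query Elasticsearch and look for the "content" alias.
check_forum_alias() {
  local es_url="${1:-http://localhost:9200}"
  curl -s "$es_url/_cat/aliases" | alias_listed content
}
```

If check_forum_alias fails while Elasticsearch itself answers, the alias is missing, which matches the validation error in the forum log.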

kstefan · April 1, 2020, 9:42am

Hi @amit,

thank you for your reply.

How can I check if elasticsearch is ready before I run the command?

When I run ‘curl -XGET http://localhost:9200/’ from the command line, I get:

{
  "status" : 200,
  "name" : "Bug",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.5.2",
    "build_hash" : "62ff9868b4c8a0c45860bebb259e21980778ab1c",
    "build_timestamp" : "2015-04-27T09:21:06Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

Is this enough and OK?

Also, in order to run the command, do I do the following:

sudo -H -u forum bash
source forum_env
bin/rake search:initialize

Or should I do the following:

sudo -sHu forum bash
cd ~/cs_comments_service/
source ~/forum_env
rake search:initialize

Which would be a more appropriate approach?

Thank you very much in advance.

Best Regards,

Kostas.

amit · April 1, 2020, 10:26am

This one looks good to me.

kstefan · April 1, 2020, 11:11am

Hi @amit,

I ran the commands and I got:

rake aborted!
Gem::LoadError: You have already activated rake 10.4.2, but your Gemfile requires rake 12.0.0. Prepending `bundle exec` to your command may solve this.
/edx/app/forum/cs_comments_service/Rakefile:4:in `<top (required)>'
(See full trace by running task with --trace)

I ran the command with --trace and I got:

rake aborted!
Gem::LoadError: You have already activated rake 10.4.2, but your Gemfile requires rake 12.0.0. Prepending `bundle exec` to your command may solve this.
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/site_ruby/2.4.0/bundler/runtime.rb:319:in `check_for_activated_spec!'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/site_ruby/2.4.0/bundler/runtime.rb:31:in `block in setup'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/2.4.0/forwardable.rb:229:in `each'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/2.4.0/forwardable.rb:229:in `each'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/site_ruby/2.4.0/bundler/runtime.rb:26:in `map'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/site_ruby/2.4.0/bundler/runtime.rb:26:in `setup'
/edx/app/forum/.rbenv/versions/2.4.1/lib/ruby/site_ruby/2.4.0/bundler.rb:107:in `setup'
/edx/app/forum/cs_comments_service/Rakefile:4:in `<top (required)>'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/rake_module.rb:28:in `load'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/rake_module.rb:28:in `load_rakefile'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:689:in `raw_load_rakefile'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:94:in `block in load_rakefile'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:176:in `standard_exception_handling'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:93:in `load_rakefile'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:77:in `block in run'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:176:in `standard_exception_handling'
/edx/app/forum/.gem/gems/rake-10.4.2/lib/rake/application.rb:75:in `run'
/edx/app/forum/.gem/gems/rake-10.4.2/bin/rake:33:in `<top (required)>'
/edx/app/forum/.rbenv/versions/2.4.1/bin/rake:23:in `load'
/edx/app/forum/.rbenv/versions/2.4.1/bin/rake:23:in `<main>'

Should I run

bundle exec rake search:initialize

instead or should I check anything else before that?

The rake --version gives: rake, version 10.4.2

The bundle exec rake --version gives: rake, version 12.0.0

The ‘/edx/app/forum/cs_comments_service/Gemfile.lock’ file has this line: rake (12.0.0)

Best Regards,

Kostas.

kstefan · April 1, 2020, 12:52pm

Dear @amit and everyone else,

thank you very much for your assistance.

I ran bundle exec rake search:initialize and the forum is up and running again!
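The difference is visible in the version output above: a bare rake resolves to 10.4.2, while bundle exec runs the 12.0.0 release pinned in Gemfile.lock. A small sketch of a helper that always goes through Bundler follows. The function name forum_rake and the FORUM_APP_DIR variable are hypothetical, and the default directory is the one used in this thread:

```shell
#!/usr/bin/env bash

# Run a rake task for the forum through Bundler so that the rake
# version from Gemfile.lock is used instead of the first one on PATH.
forum_rake() {
  local app_dir="${FORUM_APP_DIR:-/edx/app/forum/cs_comments_service}"
  # Subshell so the caller's working directory is left unchanged.
  ( cd "$app_dir" && bundle exec rake "$@" )
}
```

It is meant to be run as the forum user after sourcing ~/forum_env, for example forum_rake search:initialize.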

I hope this thread is helpful to anyone else that faces the same problem.

Best Regards,

Kostas.

Isanka · October 12, 2020, 9:05am

In my experience,
Error:

ERROR: ElasticSearch configuration validation failed. "rake search:validate_index" failed with the following message: Alias 'content' does not exist.

/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:52: warning: constant ::Fixnum is deprecated

/edx/app/forum/.gem/ruby/2.4.0/gems/elasticsearch-transport-1.1.2/lib/elasticsearch/transport/transport/base.rb:54: warning: constant ::Fixnum is deprecated

If you have a forum instance with data, be careful when running rake search:initialize. This will reset your forum (if I’m wrong, please correct me).

This fixed my issue:

Stop the forum service:
sudo /edx/bin/supervisorctl stop forum
Log in to the forum environment:
sudo -sHu forum bash
cd ~/cs_comments_service/
source ~/forum_env
Rebuild the index:
bin/rake search:rebuild_index
Start the forum service again:
sudo /edx/bin/supervisorctl start forum

Anil_Mallampati · February 1, 2022, 7:44am

I am facing an issue with

forum-9b44cb895-9mct4 0/1 CrashLoopBackOff

logs :

Waiting for mongodb/elasticsearch…
2022/02/01 07:22:20 Waiting for: tcp://mongodb:27017
2022/02/01 07:22:20 Waiting for: http://elasticsearch:9200
2022/02/01 07:22:20 Connected to tcp://mongodb:27017
2022/02/01 07:22:20 Received 200 from http://elasticsearch:9200
W, [2022-02-01T07:22:24.337270 #17] WARN – : Overwriting existing field _id in class User.
W, [2022-02-01T07:22:24.362896 #17] WARN – : MONGODB | Unsupported client option ‘max_retries’. It will be ignored.
W, [2022-02-01T07:22:24.362960 #17] WARN – : MONGODB | Unsupported client option ‘retry_interval’. It will be ignored.
W, [2022-02-01T07:22:24.362985 #17] WARN – : MONGODB | Unsupported client option ‘timeout’. It will be ignored.
ERROR: ElasticSearch configuration validation failed. “rake search:validate_indices” failed with the following message: [404] {“error”:{“root_cause”:[{“type”:“index_not_found_exception”,“reason”:“no such index [comments]”,“resource.type”:“index_or_alias”,“resource.id”:“comments”,“index_uuid”:“na”,“index”:“comments”}],“type”:“index_not_found_exception”,“reason”:“no such index [comments]”,“resource.type”:“index_or_alias”,“resource.id”:“comments”,“index_uuid”:“na”,“index”:“comments”},“status”:404}

Anil_Mallampati · February 1, 2022, 7:45am

Status of Elasticsearch:

sh: rake: command not found
sh-4.2$ curl -XGET http://localhost:9200/
{
“name” : “elasticsearch-945f8d677-9mjr2”,
“cluster_name” : “openedx”,
“cluster_uuid” : “xubCgVyoT8OW3V8QogBYvQ”,
“version” : {
“number” : “7.8.1”,
“build_flavor” : “default”,
“build_type” : “docker”,
“build_hash” : “b5ca9c58fb664ca8bf9e4057fc229b3396bf3a89”,
“build_date” : “2020-07-21T16:40:44.668009Z”,
“build_snapshot” : false,
“lucene_version” : “8.5.1”,
“minimum_wire_compatibility_version” : “6.8.0”,
“minimum_index_compatibility_version” : “6.0.0-beta1”
},
“tagline” : “You Know, for Search”
}

BbrSofiane · February 3, 2022, 11:22am

@Anil_Mallampati to maximise your chance of getting help, you should probably open a separate thread for your issue. Also, include which version of Open edX you are running, how you deployed it, and the steps that led to this error.

Topas · March 11, 2022, 2:39pm

Hi, I have the same problem, but your solution is not working for me.
I see the same error in the console.

JQMIGRATE: Migrate is installed with logging active, version 1.4.1
lms-main_vendor.a04b73033169.js:5 JQMIGRATE: jQuery.browser is deprecated
migrateWarn @ lms-main_vendor.a04b73033169.js:5
lms-main_vendor.a04b73033169.js:5 console.trace
migrateWarn @ lms-main_vendor.a04b73033169.js:5
lms-main_vendor.a04b73033169.js:5 JQMIGRATE: jQuery.fn.toggle(handler, handler…) is deprecated
migrateWarn @ lms-main_vendor.a04b73033169.js:5
lms-main_vendor.a04b73033169.js:5 console.trace
migrateWarn @ lms-main_vendor.a04b73033169.js:5
lms-application.ecc588966829.js:1 Uncaught ReferenceError: Courseware is not defined
at HTMLDocument.<anonymous> (lms-application.ecc588966829.js:1:24824)
at fire (lms-main_vendor.a04b73033169.js:2:11339)
at Object.fireWith [as resolveWith] (lms-main_vendor.a04b73033169.js:2:12513)
at Function.ready (lms-main_vendor.a04b73033169.js:2:15403)
at HTMLDocument.completed (lms-main_vendor.a04b73033169.js:2:15670)
eduskop.cz/:1 Failed to load resource: the server responded with a status of 500 (INTERNAL SERVER ERROR)
DevTools failed to load source map: Could not load content for https://eduskop.cz/static/common/js/vendor/hls.js.map: HTTP error: status code 404, net::ERR_HTTP_RESPONSE_CODE_FAILURE

Can anyone advise me please?

Thanks

Jan

lpm0073 · March 23, 2023, 5:52pm

I ran into this yesterday on an installation of Nutmeg running on Kubernetes. Interestingly, this problem has occasionally surfaced since at least as far back as Ginkgo in 2018. My guess is that this is caused by a race condition inside of ElasticSearch itself.

Confirming that in my case, the following solved the problem:

running tutor init. In my case this simply meant re-running my automated deployment, which eventually leads to it running tutor init at the appropriate time.
deleting the existing failed forum pod, and allowing the existing forum kubernetes deployment to automatically recreate it. This is necessary because redeploying to k8s will not by itself lead to the forum pod being replaced. Moreover, there apparently exist settings in the forum pod that are persisted at the point of pod creation, and at least one of these is directly related to the ElasticSearch service.
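The second step can be sketched roughly as follows. The helper name crashlooping_pods is hypothetical, and it assumes the default kubectl get pods column order (name, ready, status). The namespace openedx in the commented usage is also an assumption, so substitute your own:

```shell
#!/usr/bin/env bash

# Read `kubectl get pods --no-headers` output on stdin and print the
# names of pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Possible usage (namespace is an assumption):
#   kubectl get pods -n openedx --no-headers | crashlooping_pods
#   kubectl delete pod -n openedx <pod-name-from-the-list>
```

Once the failed pod is deleted, the existing deployment should create a replacement on its own, as described above.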
