Hi guys,
I am working on the Insights feature. I have also set up the analytics pipeline, and most of the Ansible commands ran successfully.
I am struggling to execute some tasks on a separate machine (Hadoop, Hive, and Sqoop). I am following the document "Tasks to Run to Update Insights" in the EdX Analytics Pipeline Reference Guide 1.0 documentation.
The first task is Performance (Graded and Ungraded), which I run with this command line:
launch-task AnswerDistributionToMySQLTaskWorkflow --remote-log-level DEBUG --local-scheduler --include '"[\"*tracking.log*\"]"' --src '"[\"hdfs://localhost:9000/data\"]"' --dest '"[\"/tmp/answer_dist\"]"' --mapreduce-engine local --name test_task
Most of the output looks fine.
However, I noticed the error message below:
WARNING:luigi-interface:Will not run AnswerDistributionToMySQLTaskWorkflow(database=reports, credentials=/edx/etc/edx-analytics-pipeline/output.json, name=test_task, src=["[", "\"", "h", "d", "f", "s", ":", "/", "/", "l", "o", "c", "a", "l", "h", "o", "s", "t", ":", "9", "0", "0", "0", "/", "d", "a", "t", "a", "\"", "]"], dest="[\"/tmp/answer_dist\"]", include=["[", "\"", "*", "t", "r", "a", "c", "k", "i", "n", "g", ".", "l", "o", "g", "*", "\"", "]"], manifest=None, answer_metadata=None, base_input_format=None) or any dependencies due to error in deps() method:
Traceback (most recent call last):
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/worker.py", line 743, in _add
    deps = task.deps()
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/task.py", line 630, in deps
    return flatten(self._requires())
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/task.py", line 602, in _requires
    return flatten(self.requires())  # base impl
  File "/var/lib/analytics-tasks/analyticstack/edx-analytics-pipeline/edx/analytics/tasks/common/mysql_load.py", line 64, in requires
    self.required_tasks['insert_source'] = self.insert_source_task
  File "/var/lib/analytics-tasks/analyticstack/edx-analytics-pipeline/edx/analytics/tasks/insights/answer_dist.py", line 861, in insert_source_task
    manifest=self.manifest,
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/task_register.py", line 88, in __call__
    param_values = cls.get_param_values(params, args, kwargs)
  File "/var/lib/analytics-tasks/analyticstack/venv/src/luigi/luigi/task.py", line 411, in get_param_values
    raise parameter.MissingParameterException("%s: requires the '%s' parameter to be set" % (exc_desc, param_name))
MissingParameterException: AnswerDistributionPerCourse[args=(), kwargs={'src': (u'[', u'"', u'h', u'd', u'f', u's', u':', u'/', u'/', u'l', u'o', u'c', u'a', u'l', u'h', u'o', u's', u't', u':', u'9', u'0', u'0', u'0', u'/', u'd', u'a', u't', u'a', u'"', u']'), 'name': 'test_task', 'dest': '"[\\"/tmp/answer_dist\\"]"', 'answer_metadata': None, 'n_reduce_tasks': 25, 'manifest': None, 'base_input_format': None, 'lib_jar': (), 'include': (u'[', u'"', u'*', u't', u'r', u'a', u'c', u'k', u'i', u'n', u'g', u'.', u'l', u'o', u'g', u'*', u'"', u']'), 'mapreduce_engine': 'local'}]: requires the 'remote_log_level' parameter to be set
I did pass --remote-log-level in the command, but the error still says the parameter is not set.
Can anyone please help me with this?
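One more thing I noticed in the log: the src and include values show up as one character per element instead of as a single path or pattern. I do not know how luigi or the pipeline parses these parameters internally, so the following is only a minimal Python sketch of my guess, assuming the value arrives as a plain string and is turned into a tuple without being JSON-decoded first. The variable names are made up for illustration.

```python
import json

# The string as it appears to reach the task, judging by the log
# (this is my reading of the log, not confirmed pipeline behaviour).
raw = '["hdfs://localhost:9000/data"]'

# Iterating over a string yields its characters, which matches the
# shape of the src value in the warning.
as_chars = tuple(raw)

# Decoding the string as JSON first yields the single path I intended.
as_paths = tuple(json.loads(raw))

print(as_chars[:4])
print(as_paths)
```

If this guess is right, the quoting of --src and --include in my command may be part of the problem, but I have not been able to confirm that.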
Thank you,
Thin Nguyen