Hello, I am running Koa, but for some reason when I add a video component it never shows the “Import YouTube Transcript” option, even though the auto-generated transcript exists, as shown in the attached image… I can download the transcript and import it into the component just fine. But I’m going to have a lot of videos, so it would be much better if I and other instructors could just click that button. Is there anything special that needs to be done to ensure that Open edX can recognize the presence of a transcript in a video? (E.g. do public vs. unlisted videos make a difference? How about short link names vs. long ones?)
Bumping this, since it’s going to be a lot of work to download and re-upload all the transcripts for my hundreds of videos…
Is there some logging I could collect which would shed light on this issue?
I’m interested in knowing this too…
Hi @oedx !
Have you provided a YOUTUBE_API_KEY?
I believe this is required in order to load the youtube metadata for a video.
See Setting Up the YouTube API Key for details on how to get one.
Thanks for the help! I wasn’t using YT API keys, but I am now. It sounds like the ReadTheDocs documentation is out of date, because they told me on the Tutor forums that studio.yml changed to cms.yml.
I created a plugin which set my YOUTUBE_API_KEY, and I confirmed with grep that it landed in both the cms and lms.yml files. But the video component still can’t import transcripts from YouTube.
What’s the next step to debugging this?
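One way to narrow this down is to check the key itself, outside of Studio. Below is a minimal sketch, assuming the key is meant for the YouTube Data API v3 `videos` endpoint (the same URL as the `METADATA_URL` setting quoted later in this thread). The function names `build_metadata_url` and `check_key` are made up for this example, and this only tells you whether Google accepts the key, not what Studio does with it.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import urlencode

# Same endpoint as the METADATA_URL setting quoted later in this thread.
METADATA_URL = "https://www.googleapis.com/youtube/v3/videos"


def build_metadata_url(video_id, api_key, part="contentDetails"):
    """Return the YouTube Data API v3 URL that looks up a single video."""
    query = urlencode({"part": part, "id": video_id, "key": api_key})
    return f"{METADATA_URL}?{query}"


def check_key(video_id, api_key):
    """Hypothetical helper: call the API once and report what came back."""
    try:
        with urllib.request.urlopen(build_metadata_url(video_id, api_key), timeout=10) as resp:
            data = json.load(resp)
    except urllib.error.HTTPError as err:
        # A rejected or restricted key typically surfaces as an HTTP error status.
        return f"request rejected with status {err.code}"
    return f"{len(data.get('items', []))} item(s) returned"
```

Calling `check_key("<your video id>", "<your key>")` from a Python shell should show quickly whether the key is accepted; if it is, the problem is more likely on the platform side.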
Hi,
I had some issues with YouTube video transcripts a while ago. Check your edx.log file for messages such as:
PermissionError: [Errno 13] Permission denied:
Sometimes these issues are related to the existence, location or permissions of the “media/video-transcripts/” directory.
I grepped through the Tutor logs (I’m not 100% sure whether they include the edx.log output, but I would tend to assume so), and there are no hits for “Permission”.
@jill thanks for that link to the source code. It’s useful because I don’t see any of these errors in my log:
YouTube API request failed with status code
YouTube API request failed because of connection time out or connection error
YouTube API key or video id is None. Please make sure API key and video id is not None
This seems to suggest that the video mechanism isn’t even trying to download the transcript, doesn’t it? Or do I need to do something to crank up my log verbosity to see warning messages instead of error messages?
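If it helps anyone repeating this search, here is a rough sketch of the same check in Python: it scans log lines for the messages quoted in this thread, case-insensitively. The function name `find_known_messages` and the list of needles are just for illustration; feed it lines from wherever you normally dump your logs.

```python
# Messages quoted earlier in this thread; extend the tuple as needed.
KNOWN_MESSAGES = (
    "YouTube API request failed with status code",
    "YouTube API request failed because of connection time out or connection error",
    "YouTube API key or video id is None",
    "Permission denied",
)


def find_known_messages(lines, needles=KNOWN_MESSAGES):
    """Return (line_number, needle, line) for every log line containing a needle."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        for needle in needles:
            if needle.lower() in lowered:
                hits.append((number, needle, line.rstrip("\n")))
    return hits
```

For example, `find_known_messages(open("logs.txt"))` would work on a saved log dump (`logs.txt` being a placeholder file name). An empty result only means none of these exact messages were logged, not that nothing went wrong.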
After digging around in the logs for a while, I eventually found this error:
cms_1 | Traceback (most recent call last):
cms_1 |   File "/openedx/edx-platform/common/lib/xmodule/xmodule/video_module/video_handlers.py", line 337, in transcript
cms_1 |     content, filename, mimetype = get_transcript(
cms_1 |   File "/openedx/edx-platform/common/lib/xmodule/xmodule/video_module/transcripts_utils.py", line 1060, in get_transcript
cms_1 |     return get_transcript_from_contentstore(
cms_1 |   File "/openedx/edx-platform/common/lib/xmodule/xmodule/video_module/transcripts_utils.py", line 72, in wrapper
cms_1 |     return func(*args, **kwds)
cms_1 |   File "/openedx/edx-platform/common/lib/xmodule/xmodule/video_module/transcripts_utils.py", line 940, in get_transcript_from_contentstore
cms_1 |     raise NotFoundError('No transcript for {lang} language'.format(
cms_1 | xmodule.exceptions.NotFoundError: No transcript for en language
I then manually added a transcript to a video as “English” instead of “English (United States)”. The transcript import then worked for that video!
Tutor doesn’t seem to accept a language of “en-US”, only “en”, and “en-uk”, so I think this probably means I need to reconfigure all my transcripts on YouTube to be specified as just plain English.
Which raises an interesting question: After I download the transcript the first time, there’s no button to re-download it when I change it on YouTube as it gets fixed up over time. Is the best practice to just delete the video, re-add it, and then re-download the transcript? Will this lead to an ever-growing pile of unused transcript files hiding somewhere in the background?
Recently I have been dealing with an issue where the transcript from my YouTube video was not getting loaded: every time I entered the video URL, the “Import from YouTube” button did not show up. Later I realized that there is a language dependency for importing transcripts in the platform. The transcript I was planning to import was in Arabic, and the platform by default supports only English.
As seen here and here. As seen in the code, a transcript in only one language can be imported from YouTube.
Now in order to get Arabic Transcript to be imported, I had to make the following changes in the settings.py.
YOUTUBE:
  # YouTube JavaScript API
  API: 'https://www.youtube.com/iframe_api'
  TEST_TIMEOUT: 1500
  # URL to get YouTube metadata
  METADATA_URL: 'https://www.googleapis.com/youtube/v3/videos'
  # Current youtube api for requesting transcripts.
  # For example: http://video.google.com/timedtext?lang=en&v=j_jEn79vS3g.
  TEXT_API: {
    'url': 'video.google.com/timedtext',
    'params': {
      'lang': 'ar',
      'v': 'set_youtube_id_of_11_symbols_here',
    },
  }
  IMAGE_API: 'http://img.youtube.com/vi/{youtube_id}/0.jpg'
And voilà! I could see the “Import From YouTube” button.
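To make it clearer why the `lang` value matters, here is a small sketch of how a request URL could be assembled from the `TEXT_API` setting above, following the example URL in its comment. This is not the platform's own code; `build_timedtext_url` is a hypothetical helper, and the video ID is simply the one from that comment.

```python
from urllib.parse import urlencode

# Mirrors the TEXT_API setting shown above, with 'lang' switched to Arabic.
TEXT_API = {
    "url": "video.google.com/timedtext",
    "params": {
        "lang": "ar",
        "v": "set_youtube_id_of_11_symbols_here",
    },
}


def build_timedtext_url(text_api, youtube_id, scheme="http"):
    """Fill in the video ID and return the full transcript request URL."""
    # Copy the params so the settings dict itself is left untouched.
    params = dict(text_api["params"], v=youtube_id)
    return f"{scheme}://{text_api['url']}?{urlencode(params)}"


print(build_timedtext_url(TEXT_API, "j_jEn79vS3g"))
```

Since only one `lang` value goes into the request, a video whose only transcript is in another language would come back empty, which would fit the missing button described above.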
Did you by any chance find a way to force-re-import a transcript when it’s been updated on YT? That’s the thing I currently need…
Hey @oedx ,
This is probably just a hack, but have you tried clearing the video ID?
That forces the “Import from YouTube” button to appear again, which I assume would let you re-download the transcript. Let me know if this helped.
Just for the record, yes, clearing the Video ID did allow me to re-import transcripts at the time. I haven’t been able to test this lately, though, as apparently Lilac broke this functionality and it’s still broken in Maple (see “Lilac broke Youtube Transcript import?”).