Getting video upload to MinIO working in Studio

After a lot of source code digging and test setups, I got video upload in Studio working!

Here is a step-by-step recipe for trying it out:

1. Introduction
The relevant video upload code can be found in the file edx-platform/videos.py at master · openedx/edx-platform · GitHub

This code, with a small patch and the right configuration, will generate a presigned PUT url of the form:

https://files.lmshost.tld:443/videos/07613d52-4cda-4cbf-bf7d-4a64da41e7d0?Signature=dogBMJ%2Bc6umxbvOPJ5Sw4V2qFwo%3D&Expires=1659073179&AWSAccessKeyId=openedx&x-amz-meta-client_video_id=testvideo.mp4&x-amz-meta-course_key=course-1234

where lmshost.tld stands for your LMS host domain {{ LMS_HOST }}.

Note that the upload URL domain is the same as the standard MINIO_HOST.

Further, in this example URL:

videos/07613d52-4cda-4cbf-bf7d-4a64da41e7d0 is the path to your uploaded file in the bucket
x-amz-meta-client_video_id=testvideo.mp4 contains the original file name of the video (url escaped)
x-amz-meta-course_key=course-1234 contains the course ID (url escaped)
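
To make the breakdown above concrete, here is a minimal Python sketch that takes the example URL apart using only the standard library. The helper name dissect_presigned_url is my own invention for illustration, it is not part of edx-platform, and the signature in the URL is of course no longer valid.

```python
from urllib.parse import urlsplit, parse_qs

# Example presigned URL copied from this post (the signature has long expired).
PRESIGNED_URL = (
    "https://files.lmshost.tld:443/videos/07613d52-4cda-4cbf-bf7d-4a64da41e7d0"
    "?Signature=dogBMJ%2Bc6umxbvOPJ5Sw4V2qFwo%3D&Expires=1659073179"
    "&AWSAccessKeyId=openedx"
    "&x-amz-meta-client_video_id=testvideo.mp4"
    "&x-amz-meta-course_key=course-1234"
)


def dissect_presigned_url(url):
    """Hypothetical helper: split a presigned URL into host, object path and metadata."""
    parts = urlsplit(url)
    # parse_qs URL-unescapes the values (and turns a literal '+' into a space).
    query = parse_qs(parts.query)
    metadata = {
        name: values[0]
        for name, values in query.items()
        if name.startswith("x-amz-meta-")
    }
    return {
        "host": parts.hostname,                  # upload domain, without the port
        "object_path": parts.path.lstrip("/"),   # path of the file inside the bucket
        "metadata": metadata,                    # original file name and course ID
    }


info = dissect_presigned_url(PRESIGNED_URL)
print(info["host"])
print(info["object_path"])
print(info["metadata"])
```

The host printed here is the one that has to match your MINIO_HOST, and the two metadata entries are the values the reverse proxy later has to turn into headers.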

2. Patch tutor-minio
The current tutor-minio plugin must be patched to include a MINIO_DOMAIN environment parameter (see MinIO | Learn how to configure your MinIO server, at the very bottom of that page).
This MINIO_DOMAIN parameter must be set to {{ LMS_HOST }}.

Here is the long explanation:

AWS S3 uses “virtual” endpoint bucket URLs, and bucket names are globally unique. From a presigned upload URL like the one above and the MINIO_DOMAIN setting, AWS / MinIO derives the name of the bucket by matching the host name against (.+).{{MINIO_DOMAIN}}.

So in our example, the presigned upload url is https://files.lmshost.tld:443/......, MINIO_DOMAIN = lmshost.tld, and so MinIO figures out a bucket name of “files”.

A bit weird, but since we have a fixed MINIO_HOST endpoint, we can’t use virtual endpoints. Thus, and this is very important, our video upload bucket must have the name “files”!
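
The bucket lookup described above can be sketched in a few lines of Python. This is only an illustration of the (.+).{{MINIO_DOMAIN}} pattern as I described it, not MinIO's actual implementation, and the function name bucket_from_host is made up for this example.

```python
import re


def bucket_from_host(host, minio_domain):
    """Return the bucket name inferred from the Host header, or None if there is no match.

    Illustrative only: mimics the "(.+).MINIO_DOMAIN" matching described in this post.
    """
    hostname = host.split(":", 1)[0]  # drop an optional port such as ":443"
    match = re.fullmatch(r"(.+)\." + re.escape(minio_domain), hostname)
    return match.group(1) if match else None


# With MINIO_DOMAIN = lmshost.tld, the upload host resolves to the bucket "files".
print(bucket_from_host("files.lmshost.tld:443", "lmshost.tld"))
# The bare domain has no leading label, so no bucket can be inferred from it.
print(bucket_from_host("lmshost.tld", "lmshost.tld"))
```

This also shows why the bucket name is forced on us here: whatever sits in front of the MINIO_DOMAIN in the upload host is taken as the bucket name.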

The patch

I have an open pull request here: Minio domain by insad · Pull Request #19 · overhangio/tutor-minio · GitHub

Until this code has been merged into master, you’ll need to uninstall tutor-minio and install the version from my fork:

pip uninstall tutor-minio
pip install git+https://github.com/insad/tutor-minio.git@minio_domain

My patch adds the MINIO_DOMAIN parameter (initial value {{ LMS_HOST }}), documents the previously unused and hidden MINIO_VIDEO_UPLOAD_BUCKET_NAME parameter, and changes its value from ‘openedxvideos’ to ‘files’ (which is the name we must use, see above).

3. Patch, configure and rebuild the openedx image

  1. Create a plugin (e.g. “video_upload.py”) in your plugins folder (tutor plugins printroot), with the following content:
from tutor import hooks

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-dockerfile-post-git-checkout",
        "RUN curl -fsSL https://github.com/insad/edx-platform/commit/1c9d7b0595f96b693ed5973b75de7980a7bdee86.patch | git am"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "FEATURES['ENABLE_VIDEO_UPLOAD_PIPELINE'] = True"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "VIDEO_UPLOAD_PIPELINE['VEM_S3_BUCKET'] = 'files'"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "VIDEO_UPLOAD_PIPELINE['ROOT_PATH'] = 'videos'"
    )
)
  2. Enable the plugin:
tutor plugins enable video_upload
tutor config save
  3. Rebuild your openedx image:

tutor images build openedx

  4. Stop and start tutor:
tutor local stop
tutor local start -d

Explanation

I first tried to configure boto in cms with a ~/.boto file, with content:

[s3]
host = lmshost.tld

Sadly, the old boto code has a bug whereby this “host” parameter is not read when the boto library is used with Python 3.8.

So I had to go another way and patch the edx-platform/videos.py at master · openedx/edx-platform · GitHub file directly.

The only patch needed there is in function storage_service_bucket() - adding the host to the connection parameters:

    if ENABLE_DEVSTACK_VIDEO_UPLOADS.is_enabled():
        params = {
            'aws_access_key_id': settings.AWS_ACCESS_KEY_ID,
            'aws_secret_access_key': settings.AWS_SECRET_ACCESS_KEY,
            'security_token': settings.AWS_SECURITY_TOKEN

        }
    else:
        params = {
            'host': settings.LMS_BASE,  ### THIS LINE ADDED ###
            'aws_access_key_id': settings.AWS_ACCESS_KEY_ID,
            'aws_secret_access_key': settings.AWS_SECRET_ACCESS_KEY
        }

4. Check that you have a “files” bucket in MinIO

You should check that you have a “files” bucket in MinIO. If not, create a bucket with that name.

5. Patch your reverse proxy code and restart your webserver

The proxy server must unescape both x-amz-meta-* parameters from the presigned upload url query string, and add them to the header.
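
As a language-neutral illustration of what the proxy has to do, here is a minimal Python sketch of the same transformation: take the two x-amz-meta-* query parameters, unescape them, and expose them as headers. The function name and the shortened URL path videos/abc are hypothetical, and this is not a replacement for the web server configuration.

```python
from urllib.parse import urlsplit, unquote


def meta_headers_from_url(url):
    """Hypothetical helper: build the x-amz-meta-* headers a proxy should add."""
    headers = {}
    for pair in urlsplit(url).query.split("&"):
        name, sep, value = pair.partition("=")
        if sep and name.startswith("x-amz-meta-"):
            # unquote() only decodes %XX escapes; a literal '+' is left untouched.
            headers[name] = unquote(value)
    return headers


# Shortened, made-up upload URL with an escaped course key.
url = (
    "https://files.lmshost.tld/videos/abc"
    "?Expires=1659073179"
    "&x-amz-meta-client_video_id=testvideo.mp4"
    "&x-amz-meta-course_key=course-v1%3ATestX%2BCourse%2B1"
)
headers = meta_headers_from_url(url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

HTTP header names are case-insensitive, so the lower-case names used in this sketch are equivalent to the X-Amz-Meta-... spelling.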

I can only show how to do this in Apache2. If you use NGINX as a proxy, please post the relevant configuration for your web server below.

The relevant part in my Apache2 configuration is:

    RewriteEngine On

    RewriteMap ue int:unescape

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-client_video_id=([^&]+)
    RewriteRule (.*) - [E=VIDEO_ID:${ue:%1}]
    RequestHeader set X-Amz-Meta-Client_video_id %{VIDEO_ID}e env=VIDEO_ID

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-course_key=([^&]+)
    RewriteRule (.*) - [E=COURSE_KEY:${ue:%1}]
    RequestHeader set X-Amz-Meta-Course_key %{COURSE_KEY}e env=COURSE_KEY

For completeness, my full virtual host configuration for Apache2 (with my domain replaced by “lmshost.tld”) is:

<VirtualHost *:80>
    ServerName lmshost.tld
    Redirect / https://lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName apps.lmshost.tld
    Redirect / https://apps.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName courses.lmshost.tld
    Redirect / https://courses.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName discovery.lmshost.tld
    Redirect / https://discovery.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName ecommerce.lmshost.tld
    Redirect / https://ecommerce.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName files.lmshost.tld
    Redirect / https://files.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName grades.lmshost.tld
    Redirect / https://grades.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName mail.lmshost.tld
    Redirect / https://mail.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName minio.lmshost.tld
    Redirect / https://minio.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName mobile.lmshost.tld
    Redirect / https://mobile.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName notes.lmshost.tld
    Redirect / https://notes.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName preview.lmshost.tld
    Redirect / https://preview.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName studio.lmshost.tld
    Redirect / https://studio.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName xqueue.lmshost.tld
    Redirect / https://xqueue.lmshost.tld/
</VirtualHost>

<VirtualHost *:443>
    ServerName lmshost.tld
    ServerAlias *.lmshost.tld
    SSLEngine on

    LogLevel info

    RequestHeader set X-Forwarded-Proto https
    RequestHeader set X-Forwarded-SSL on

    RewriteEngine On

    RewriteMap ue int:unescape

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-client_video_id=([^&]+)
    RewriteRule (.*) - [E=VIDEO_ID:${ue:%1}]
    RequestHeader set X-Amz-Meta-Client_video_id %{VIDEO_ID}e env=VIDEO_ID

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-course_key=([^&]+)
    RewriteRule (.*) - [E=COURSE_KEY:${ue:%1}]
    RequestHeader set X-Amz-Meta-Course_key %{COURSE_KEY}e env=COURSE_KEY

    ProxyPreserveHost On
    ProxyRequests Off
    ProxyVia Block

    <Proxy *>
        Require all granted
    </Proxy>

    ProxyPass / http://localhost:444/
    ProxyPassReverse / http://localhost:444/

    ErrorLog /var/log/apache2/openedx_error.log
    CustomLog /var/log/apache2/openedx_access.log combined

    Include /etc/letsencrypt/options-ssl-apache.conf
    SSLCertificateFile /etc/letsencrypt/live/lmshost.tld/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/lmshost.tld/privkey.pem
</VirtualHost>

Don’t forget to restart your webserver:

systemctl restart apache2

6. Test video upload

7. Troubleshooting

Install the MinIO client:

sudo -i
cd /usr/local/bin
wget https://dl.min.io/client/mc/release/linux-amd64/mc
chmod +x mc

Open a new terminal window, create an alias for your minio in mc, and start watching what goes on:

mc alias set minio https://files.lmshost.tld openedx {{ MINIO_AWS_SECRET_ACCESS_KEY }} --api S3v4
mc admin trace --verbose minio

Double-check that the bucket is identified as “files”, and especially that the X-Amz-Meta-Course_key header has an unescaped value:
thus not
course-v1%3ATestX%2BCourse%2B1
but instead
course-v1:TestX+Course+1
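
If you want to check a header value copied from the trace output, a quick sketch like the following can help. The function name is made up, and the check is only a heuristic: a file name that legitimately contains something like “%20” would also be flagged.

```python
from urllib.parse import unquote


def is_still_escaped(header_value):
    """Heuristic: True if the value still contains %XX escapes that decode to something else."""
    return unquote(header_value) != header_value


print(is_still_escaped("course-v1%3ATestX%2BCourse%2B1"))  # proxy did not unescape
print(is_still_escaped("course-v1:TestX+Course+1"))        # value as MinIO should see it
```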

8. And then…
I’ll start working on the rest of the video pipeline, things like:

  • generating copies of the video file in distinct formats / resolutions
  • generating HLS (m3u8) video snippets and manifest
  • generating a poster image for the video player
  • uploading everything generated back to MinIO, and updating the corresponding edxval record via the Open edX API
  • maybe generating transcripts using speech recognition (for Spanish as a starter)

    many nice goodies for this to be found at GitHub - EsupPortail/Esup-Pod at dev3

IMPORTANT UPDATE

Patching MinIO with the MINIO_DOMAIN parameter as described has a side effect: none of the other assets (images, translations, avatars, etc.) will be available, and new uploads will go to the wrong buckets.

There is a much simpler solution without changing tutor-minio, please follow these new instructions:

1. Patch, configure and rebuild the openedx image

  1. Create a plugin (e.g. “video_upload.py”) in your plugins folder (tutor plugins printroot), with the following content:
from tutor import hooks

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-dockerfile-post-git-checkout",
        "RUN curl -fsSL https://github.com/insad/edx-platform/commit/74f6ef4efbe06839cb574479166ec8c9fb20cad8.patch | git am"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "FEATURES['ENABLE_VIDEO_UPLOAD_PIPELINE'] = True"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "VIDEO_UPLOAD_PIPELINE['VEM_S3_BUCKET'] = 'openedxvideos'"
    )
)

hooks.Filters.ENV_PATCHES.add_item(
    (
        "openedx-cms-common-settings",
        "VIDEO_UPLOAD_PIPELINE['ROOT_PATH'] = 'upload'"
    )
)
  2. Enable the plugin:
tutor plugins enable video_upload
tutor config save
  3. Rebuild your openedx image:
tutor images build openedx
  4. Stop and start tutor:
tutor local stop
tutor local start -d

Explanation

The file edx-platform/videos.py at master · openedx/edx-platform · GitHub is being patched in function storage_service_bucket() as follows:

    if waffle_flags()[ENABLE_DEVSTACK_VIDEO_UPLOADS].is_enabled():
        params = {
            'aws_access_key_id': settings.AWS_ACCESS_KEY_ID,
            'aws_secret_access_key': settings.AWS_SECRET_ACCESS_KEY,
            'security_token': settings.AWS_SECURITY_TOKEN

        }
    else:
        params = {
            'host': settings.AWS_S3_ENDPOINT_URL.replace('https://', ''),
            'calling_format': s3.connection.OrdinaryCallingFormat(),
            'aws_access_key_id': settings.AWS_ACCESS_KEY_ID,
            'aws_secret_access_key': settings.AWS_SECRET_ACCESS_KEY
        }

‘host’ will be equal to {{ MINIO_HOST }} (without https, i.e. “files.lmshost.tld”), and the connection will use the path-style calling format instead of the default virtual-host-style calling format (ref. Making requests using the REST API - Amazon Simple Storage Service).
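
To illustrate the difference between the two calling formats, here is a small Python sketch that builds both URL shapes for the same object. The function names and the example object key are my own placeholders, and the real URLs are generated by boto (with signature parameters appended), not by this code.

```python
def virtual_host_style_url(endpoint_host, bucket, key):
    """Default boto behaviour: the bucket name becomes a subdomain of the endpoint."""
    return f"https://{bucket}.{endpoint_host}/{key}"


def path_style_url(endpoint_host, bucket, key):
    """OrdinaryCallingFormat: the bucket name is the first path segment."""
    return f"https://{endpoint_host}/{bucket}/{key}"


# Same derivation as in the patch: strip the scheme from the endpoint URL.
endpoint_url = "https://files.lmshost.tld"      # stands in for settings.AWS_S3_ENDPOINT_URL
host = endpoint_url.replace("https://", "")

bucket = "openedxvideos"                        # VEM_S3_BUCKET
key = "upload/07613d52-4cda-4cbf-bf7d-4a64da41e7d0"  # ROOT_PATH plus an example video ID

print(virtual_host_style_url(host, bucket, key))
print(path_style_url(host, bucket, key))
```

With the path-style format the host stays at “files.lmshost.tld”, so no extra subdomain, certificate or MINIO_DOMAIN setting is needed, and the bucket name can stay ‘openedxvideos’.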

2. Patch your reverse proxy configuration and restart your webserver

The proxy server must unescape both x-amz-meta-* parameters from the presigned upload url query string, and add them to the header.

I can only show how to do this in Apache2. If you use NGINX as a proxy, please post the relevant configuration for your web server below.

The relevant part in my Apache2 configuration is:

    RewriteEngine On

    RewriteMap ue int:unescape

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-client_video_id=([^&]+)
    RewriteRule (.*) - [E=VIDEO_ID:${ue:%1}]
    RequestHeader set X-Amz-Meta-Client_video_id %{VIDEO_ID}e env=VIDEO_ID

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-course_key=([^&]+)
    RewriteRule (.*) - [E=COURSE_KEY:${ue:%1}]
    RequestHeader set X-Amz-Meta-Course_key %{COURSE_KEY}e env=COURSE_KEY

For completeness, my full virtual host configuration for Apache2 (with my domain replaced by “lmshost.tld”) is:

<VirtualHost *:80>
    ServerName lmshost.tld
    Redirect / https://lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName apps.lmshost.tld
    Redirect / https://apps.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName courses.lmshost.tld
    Redirect / https://courses.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName discovery.lmshost.tld
    Redirect / https://discovery.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName ecommerce.lmshost.tld
    Redirect / https://ecommerce.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName files.lmshost.tld
    Redirect / https://files.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName grades.lmshost.tld
    Redirect / https://grades.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName mail.lmshost.tld
    Redirect / https://mail.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName minio.lmshost.tld
    Redirect / https://minio.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName mobile.lmshost.tld
    Redirect / https://mobile.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName notes.lmshost.tld
    Redirect / https://notes.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName preview.lmshost.tld
    Redirect / https://preview.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName studio.lmshost.tld
    Redirect / https://studio.lmshost.tld/
</VirtualHost>
<VirtualHost *:80>
    ServerName xqueue.lmshost.tld
    Redirect / https://xqueue.lmshost.tld/
</VirtualHost>

<VirtualHost *:443>
    ServerName lmshost.tld
    ServerAlias *.lmshost.tld
    SSLEngine on

    LogLevel info

    RequestHeader set X-Forwarded-Proto https
    RequestHeader set X-Forwarded-SSL on

    RewriteEngine On

    RewriteMap ue int:unescape

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-client_video_id=([^&]+)
    RewriteRule (.*) - [E=VIDEO_ID:${ue:%1}]
    RequestHeader set X-Amz-Meta-Client_video_id %{VIDEO_ID}e env=VIDEO_ID

    RewriteCond %{QUERY_STRING} (?:^|&)x-amz-meta-course_key=([^&]+)
    RewriteRule (.*) - [E=COURSE_KEY:${ue:%1}]
    RequestHeader set X-Amz-Meta-Course_key %{COURSE_KEY}e env=COURSE_KEY

    ProxyPreserveHost On
    ProxyRequests Off
    ProxyVia Block

    <Proxy *>
        Require all granted
    </Proxy>

    ProxyPass / http://localhost:444/
    ProxyPassReverse / http://localhost:444/

    ErrorLog /var/log/apache2/openedx_error.log
    CustomLog /var/log/apache2/openedx_access.log combined

    Include /etc/letsencrypt/options-ssl-apache.conf
    SSLCertificateFile /etc/letsencrypt/live/lmshost.tld/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/lmshost.tld/privkey.pem
</VirtualHost>

Don’t forget to restart your webserver:

systemctl restart apache2

3. Test video upload

Your file should be in the bucket “openedxvideos”, inside the folder “upload”.


And a small tip (I found this undocumented flag in the videos.py source code):

If you add a Django waffle switch “videos.video_image_upload_enabled”, you can even upload poster images.


If you add a Django waffle switch “videos.video_image_upload_enabled”

You can add this to the plugin to make it complete :)

How can I add waffle switches in the plugin?

If you can help me with that, I’ll make a “nice” plugin from this, even no problem to publish it under overhang.io

You can add an init hook

tutor_hooks.Filters.COMMANDS_INIT.add_item

And set the waffle switch like this.

It will mean that users will have to run tutor local init -l video_upload but in my opinion that’s better than having to manually set the waffle switch.

Thanks… If some NGINX guru could still figure out how to convert the query parameters into header parameters and share it here, that would be fantastic. Then we could have a plugin that solves the video upload part, and after that put energy into the video processing part.
