Help making server decisions

Hi everyone

I have been experimenting with Open edX for quite some time, and one thing that has always frustrated me is making server-related decisions, clearly because there are too many things I don’t understand. I have been stuck, and if I get some clarification on these issues I can make some progress.

So here is my question that will involve lots of ifs and buts.

Open edX provides a single-server installation, which is great for understanding the overall platform.

Going to production with it would be a big NO if one expects a growing user base.

Having said that, I am planning to move MySQL, MongoDB and memcached to separate instances. What would be the ideal CPU for each? A ballpark figure would do too. How do I monitor these systems if I have to do it all by myself?

I was also considering RDS for MySQL, eliminating the need to monitor it.

Could I similarly use DynamoDB out of the box? Or how do I migrate from MongoDB to DynamoDB?

Regarding cache, I have a question before even making a sizing decision:

  • If my target audience is from one region only and my servers are sitting close enough, do I really need a CDN, or is memcached fine?

I will have about 8 courses in my LMS. How much can memcached retain? I am afraid that the more requests I let through to my server, the more resources it will need, and I am doing this whole project on my own without any financial support. In fact, I am not even a developer: I don’t understand Python, Django, MySQL or anything. Everything I know comes from resources I have been reading, and with those I have been able to get the platform running. The rationale behind this is that I struggled to acquire some skills and I don’t want others to go through the same. Because these are communication skills (learning languages), it will attract a large audience.

What I need is a way for most content to be delivered without hitting the server (maybe through cache).

Is there a way to increase the concurrent user capacity without investing too much in the server, if I know my content, once published, may never change?

Is serving from cache also a load on the server?

Can a CDN help my situation? Is content delivered through it without making requests to my server, or do requests always hit the server?

I used to face this quite often: whenever I spun up the AMI, I got a lot of RabbitMQ errors. Now I use Lilac, and I heard RabbitMQ is not there in Lilac.

Is that true?

Or do I need to migrate that as well?

Finally, when I deploy a server, how is the amount of power the LMS or the CMS gets determined? If I increase the server size, do I need to configure anything, or will the LMS and CMS take up their share of the CPU on their own? Also, I will be the only one authoring courses. Can I manually give the least resources to the CMS and the most to the LMS? That would help too, I guess.

I have heard many people advise against using a load balancer. Can someone explain why? How are we supposed to scale application servers without a load balancer?

Thanks Everyone

Hope to put these questions to rest

Hi @Selli!

There is no simple, easy answer to that question, except to say that the more load you face, the more resources you’ll need. If I were you, I’d look outside the Open edX community for articles on performance of each of the individual services, and ways to scale them up as needed.

You’re right that the first logical step is to factor them out of the single-server installation. The next step would be to make the services highly-available, such as creating a 3-node Galera cluster and similarly sized MongoDB/memcached/Redis ones.

Again, there is no one-size-fits-all answer. I’ve done it many different ways in the past, each with its advantages and disadvantages, from configuring Postfix to send alerts from each VM, to the likes of CloudWatch and New Relic.
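
As a rough illustration of the do-it-yourself end of that range, here is a minimal Python sketch of a disk-usage check that could run from cron on each VM. The function names and the 85% threshold are assumptions made up for this example, and how the message is delivered (Postfix or a hosted service) is left out.

```python
import shutil


def check_usage(used, total, threshold, label):
    """Return a warning string if used/total reaches the threshold, else None."""
    fraction = used / total
    if fraction >= threshold:
        return f"WARNING: {label} is {fraction:.0%} full"
    return None


def disk_alert(path="/", threshold=0.85):
    """Check the filesystem that holds `path` (hypothetical 85% default)."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return check_usage(usage.used, usage.total, threshold, path)


if __name__ == "__main__":
    # A cron job could pipe this output to whatever alerting is in place.
    print(disk_alert("/") or "disk usage OK")
```

The same idea applies to memory, CPU and database connections. A hosted monitoring service mostly saves you from writing and maintaining checks like this yourself.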

Like above, this is not an Open edX-specific problem.

This is possible, and I believe many in the community do it. Same goes for hosted MongoDB services.

Yes, deploying Open edX is complicated, but not a whole lot more than any modern web application. For better or worse, deploying it effectively will require devops knowledge.

There have been attempts to make it easier, though, such as Tutor. While it successfully provides a turn-key way of getting a deployment, you still need to know what you’re doing when scaling it up. We’re trying to tool and document how we’ll do it at OpenCraft via the Grove project, but it’s still early days.

I think we all want that, but it’s hard to do it and still get a good interactive experience as a learner. The good news is that memcached goes a long way in reducing load - requests still always hit the Python processes, but they respond rather quickly.

I’m not aware of any way to scale Open edX up without adding more servers, unfortunately.

As of a few releases ago, the only supported messaging backend is Redis.

The recommended way to scale edx-platform app servers is to create more of them as needed, with a load balancer in front of them. This way you can turn them on and off, thus saving money for periods with less traffic.
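
A back-of-the-envelope sketch of that idea in Python follows. The per-server capacity is a number you would have to measure on your own deployment, and the figures in the example are made up.

```python
import math


def servers_needed(concurrent_users, users_per_server, min_servers=1):
    """How many app servers to keep behind the load balancer.

    users_per_server is an assumed, measured capacity of one server;
    min_servers keeps at least one server running when traffic is low.
    """
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    return max(min_servers, math.ceil(concurrent_users / users_per_server))


if __name__ == "__main__":
    # Hypothetical traffic over a day, with one server handling 200 users.
    for users in (0, 150, 450, 1000):
        print(users, "users ->", servers_needed(users, 200), "server(s)")
```

Autoscaling services apply rules of this kind automatically, usually driven by CPU or request metrics instead of a user count.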

Adding to what @arbrandes mentioned…

RDS works fine, we use it at OpenCraft in some deployments. However you still need to monitor it (disk space, performance).

I don’t know whether it works well with DynamoDB; we usually use other MongoDB hosting providers.
You also have the option to create an EC2 server and install MongoDB or MySQL there and manage it yourself. I have seen this used to host Mongo.

I haven’t seen any criticism. You’ll need some LB to scale to many servers. We use ELB with many clients and it works. However, in small setups 1 server may be enough, so I’d recommend starting with 1 server, seeing how it works, and then deciding how to scale. Well, 1 LMS + 1 CMS; see below.

If you’re thinking about scaling to >1 servers, then instead of deciding how to split the CPU between LMS+CMS in each server, you can use different servers for LMS and for CMS. Later when you need scaling, you want to scale only the LMS servers.
It’s also typical to have different servers for "worker" processes, e.g. to grade exams. Each type of instance (LMS/CMS/workers) can have a different size.

Hosting Open edX involves many components. There are companies that host it and hide the complexity. Plus what @arbrandes mentioned (Tutor, Grove, …)