Categories
Hosting Security

Log4j and global panic

Nowadays, the world is getting used to having new things thrown at it to worry about, and we all hope that some smart cookies in a lab somewhere will find a cure. Well, a couple of days ago some boffins found a new computer bug that has been given hazard level 10, and I can assure you that gets us geeks all rather excited.

CVE-2021-44228, better known as the Log4j bug, was first published (along with a patch) on the 9th / 10th of December. The vulnerability, discovered by Chen Zhaojun of Alibaba's Cloud Security Team, affects Apache Log4j.

Yip – that's all foreign language to most humans, but the long and short of it is this: a fresh vulnerability has been found in a piece of software very commonly used around the world for recording software activity logs. It allows anyone, without needing any permissions, to hijack the affected computer system and effectively run their own commands, from launching a ransomware attack on the host through to compromising supposedly secure user records.

The vulnerability also affects software that builds on Log4j, including some very well known names: Apple's iCloud services (yep, the ones your mobile phone / tablet talks to), VMWare, Discord, Ubiquiti and many more. A list is being collected at https://github.com/YfryTchsGD/Log4jAttackSurface. A patch has been released to counter the attack, but the slower people are to apply it, the longer systems stay exposed, and the more havoc can be wrought globally.

So what can we do?

  • Check for, and apply, any updates from your software vendors. Always make sure you are running the latest versions of everything. This is paramount for both your security and your peace of mind. A simple scan of your logs for exploit attempts (see the sketch below this list) can also help you gauge whether you are being targeted.
  • Consider putting a strong, well configured firewall in front of your systems to block potential threat traffic before it reaches them
  • Contact any providers you use that could be storing sensitive information and seek assurances that they have taken appropriate measures to counter the risk associated with this new threat
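
If you manage your own servers and want a rough idea of whether anyone has been poking at you, a very simple first step is to scan your application logs for the `${jndi:` lookup string that the published exploit attempts embed in request data. Here's a minimal Python sketch of that idea (the log directory is an assumption, and attackers do obfuscate the string, so treat this as a starting point, not a guarantee):

```python
# Minimal sketch: scan log files for the "${jndi:" lookup pattern that
# Log4Shell exploit attempts embed in request data. The log directory is
# an assumption - point it at wherever your applications write logs.
import pathlib

LOG_DIR = pathlib.Path("/var/log")   # assumption: adjust for your servers
PATTERN = "${jndi:"                  # the tell-tale Log4j lookup prefix

for log_file in LOG_DIR.rglob("*.log"):
    try:
        with open(log_file, errors="ignore") as handle:
            for line_number, line in enumerate(handle, start=1):
                if PATTERN in line.lower():
                    print(f"{log_file}:{line_number}: possible exploit attempt")
    except OSError:
        continue  # unreadable file (permissions etc) - skip it
```

Finding a match doesn't necessarily mean you've been compromised – it means someone has at least tried, which is a very good prompt to double check your patch levels.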

Here at Webmad, all of our hosting systems have been secured against this threat, simply because we do not use any services that rely on Log4j, and our upstream providers have been quick off the mark to get this resolved. Should you have any concerns though, by all means get in contact.

Categories
Hosting Security

Never trust an email

Over the last week, some of our shared hosting clients have been targeted by a rather sophisticated email attack that focuses on clients using cPanel based hosting, like we use at Webmad.

The attack first detects whether the website's hosting is cPanel based, and if it can locate a contact email address from the website, it emails that contact with a message that looks like a legitimate cPanel disk space usage warning, requesting that you take various actions to protect your website from downtime.

This typically looks like the following: [screenshot of the fake cPanel disk usage warning email]

So the key components of the email to look out for are:

  • If you hover your mouse over the links in the email, they are not the same as the link text. This is a huge red flag, as it is misleading you about where you are being directed (a quick automated way to check for this is sketched after this list).
  • The From address always has ‘no-reply@’ at the start – most hosting providers will customise this so it comes from them, not from your own domain name
  • The disk usage percentage is always over 95%
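
For the technically minded, that first red flag (link text that doesn't match where the link actually goes) is easy to check automatically. Here's a small illustrative Python sketch using only the standard library – the email body shown is made up for the example:

```python
# Sketch: flag links in an HTML email where the visible text looks like a URL
# but points somewhere different to the actual href - the classic phishing tell.
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current_href = None
        self.current_text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.current_href = dict(attrs).get("href", "")
            self.current_text = ""

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text += data

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            shown = self.current_text.strip()
            href_host = urlparse(self.current_href).netloc
            text_host = urlparse(shown).netloc if shown.startswith("http") else ""
            if text_host and text_host != href_host:
                print(f"Suspicious link: text says {shown} but goes to {self.current_href}")
            self.current_href = None


# Hypothetical email body, purely for illustration:
email_html = '<p>Log in here: <a href="https://evil.example/cpanel">https://cpanel.yourdomain.nz</a></p>'
LinkChecker().feed(email_html)
```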

Please ignore these emails, and if you have followed any of the links, let your hosting provider know as soon as possible. Any details you enter on those pages could compromise your website's hosting security, and it's best to work through the right course of action with your hosting provider from there.

For Webmad hosted clients: we don't actually set disk quotas on our hosting, so we can assure you that you will never receive a legitimate email like this from us. We prefer to contact you directly, using humans, not automation. Contact us if you ever have any concerns.

Stay safe out there everyone!

Categories
Hosting Technology

Aaargh! Facebook is down!

What a shock to many – their worlds come crashing down as the need for social interaction goes unmet by the world's most commonly used social networks, all owned by Facebook. Today (5 Oct 2021) many here in New Zealand have woken to a worldwide outage: visiting the sites produces a DNS / domain error and a white screen that no doubt has some rather highly paid network engineers at Facebook having kittens.

So why is it down? Well, that's the question everyone is speculating on, and much of it comes down to the core structure of the internet and how companies like Facebook harness tools to give us all the best experience possible. The most likely explanation is what I'll be walking us through today.

How does the internet work?

Well, it all starts off in your internet browser – Google Chrome, Mozilla Firefox, Internet Explorer / Edge / whatever Microsoft are calling it now, Apple Safari. Lots of options, and they all work the same way. You type a web address (URL) into the address bar, hit enter, and within seconds the page you want renders in the browser and we carry on our merry way. But there is a bunch of communication that goes on within those few seconds to make this all work.

The first part of this is the address translation. There is a global system called DNS (the Domain Name System) which translates what you have typed in (ie https://webdeveloper.nz/ ) into a series of numbers called an IP address. The servers that store the website data each have an IP address that they respond on, and they deliver the web pages back to you. It's a bit like your phonebook: I want to call someone by this name, so please give me their phone number so I can do so.
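
If you want to see that phonebook lookup for yourself, here's a tiny Python sketch using the standard library – swap the domain for any site you like:

```python
# Sketch of the "phonebook lookup" step: ask DNS for the IP address(es)
# behind a domain name using Python's standard library.
import socket

domain = "webdeveloper.nz"  # the address you typed into the browser
addresses = {info[4][0] for info in socket.getaddrinfo(domain, 443)}
print(f"{domain} resolves to: {', '.join(sorted(addresses))}")
```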

Once the address translation has happened, you can talk directly to the servers and get the data you need to render the web page. The faster this translation happens, the faster the website loads for the end user. And this is where the problem is believed to have happened for Facebook today.

Where has it all gone wrong?

The way normal IP addressing works is that one server typically has one IP address. It is unique, and you can get a bunch of details from it (check out https://ip-api.com for some of this info). The downside is that a single IP address typically translates to one server, which may be on the other side of the world from you. And because light can only travel so fast through the fibre optic cables that make up the internet backbones linking us all together, there is a delay talking from little old NZ through to the big datacenters in the USA or Europe.

What some clever clogs worked out, though, is that you can use Content Delivery Networks to reduce the physical distance between your web servers and your customers around the world, making websites load so much quicker. Yay! But that is only part of the equation. It works for website content, but it doesn't help with the DNS lookup / translation step. And this is where we get to BGP routing, which is where we believe today's outage was caused.

You’re getting technical…

BGP routing, or Border Gateway Protocol routing, is how networks on the internet tell each other which IP addresses they can reach. Combined with a technique called anycast, it allows one single advertised IP address to be served by multiple servers around the globe, so clients are answered from the closest possible geographic location. Because lots of servers can answer for that one IP address, the setup is very fault tolerant, and it speeds up translating website addresses into IP addresses so that traffic is routed to the right places and websites just work.

In today's outage, the configuration that handles this BGP routing globally for Facebook, giving them those high website speeds, has apparently been botched or lost. The result is that anyone trying to look up / translate any of the Facebook operated web addresses gets a blank screen, with their browser telling them it can't find the domain name.

As I write this, it looks like things are slowly starting to return to normal after four and a bit hours. There is a Facebook branded error page now, so we are at least seeing Facebook servers again, but I suspect the next issue they will face as they slowly bring the site back online is the large influx of people accessing the sites after their drought and trying to catch up, effectively swamping the servers.

What can we learn from this?

  • Firstly – in the internet world, you are never too big to fail.
  • Secondly – the world is still ok without social networks.
  • All the geekery in the world (CDNs, BGP routing etc etc) won't necessarily save you from good old fashioned human error, although it does help reduce its occurrence.

Here at Webmad we are well versed in using these various tools to get the best outcomes and speed for your website, using trusted providers and offering proven results. We've run sites using BGP failover routing to offer high availability, geolocation-aware systems within NZ, we use CDNs all the time, and we can quickly pinpoint where issues might be and how to fix them. Could we fix Facebook's troubles? That's a bit above our pay grade, but we can definitely put our knowledge to great use as part of your web team. Drop us a line to get the best results for your online assets.

Categories
Hosting Technology

What is a Content Delivery Network (CDN)?

This past week the buzzword floating about internet-related conversations has been the dropout of a huge chunk of the internet thanks to an outage at the CDN provider Fastly. A good number of websites went down worldwide, and high traffic sites experienced either a total outage or had parts of their networks unreachable. It felt like a digital apocalypse for many. For some of our clients there was glee as their competition was taken offline by the outage. In the end it lasted only about an hour, late in the evening New Zealand time, but it still caused panic.

So how did an outage at a company almost no-one in the general public had heard of cause such a ruckus? Well, to get to the bottom of that we need a better understanding of how the internet functions, and of some of the tips and tricks that webmasters employ to get content in front of their users as quickly as possible.

When someone goes to a website on the internet there is a flurry of communication between their device and various internet services before the web page is served. Here is a rough guide to what happens:

Once the user has told their web browser what website they want to view, requests are fired off to Domain Name Service (DNS) servers to translate the address entered into an address that computers understand (an Internet Protocol (IP) address). That information is then used to talk to the appropriate server (or load balancer, if the website is big enough, which then directs traffic to an available web server) to return the web page you have requested. That page may link to a number of images, fonts and scripts that all need to be downloaded in order to display the website on the device you are requesting it from.

That's a bit of the background behind how the internet works for websites. But where do CDNs fit into this mix?

Ever called someone overseas and noticed the delay between what you say and their response? That effect is called latency: the delay between your initial request and you getting a response. Even on a global network of fibre connections, where signals travel at close to the speed of light, if I request a website on my device here in New Zealand and it is hosted in the UK, every request to the web server takes roughly a quarter to half a second just to get from my device to the server and back, and that does not account for any processing time on the web server slowing things down as well. If a web page has 30+ media assets, which is very common nowadays, the website will feel almost unusable. The further away a server is from its users, the slower it can respond to their requests.
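
You can get a feel for this latency yourself. The Python sketch below times a single TCP connection (one round trip, plus the DNS lookup) to a server – the hostname is just a placeholder, so point it at something you know is hosted far away from you:

```python
# Rough latency check: time how long it takes to open a TCP connection
# (one round trip plus DNS) to a remote server.
import socket
import time

host = "example.com"  # placeholder: swap in a site hosted far away from you
start = time.perf_counter()
with socket.create_connection((host, 443), timeout=5):
    elapsed_ms = (time.perf_counter() - start) * 1000
print(f"TCP connection to {host} took {elapsed_ms:.0f} ms")
```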

This is where CDNs come in. A global Content Delivery Network is a network of computers located around the world, set up as a cache for the websites you are visiting. Website owners point their domain names at the servers of the CDN instead of the origin servers, and the CDN is configured to know how to get the requested content from the origin server where the content is actually hosted. So, the first time you visit the website, the CDN server which is geographically closest to you fetches your content from the origin host. It also keeps a copy of the content the origin server returned, so if anyone else needs that content it can serve it directly instead of routing the request to the other end of the globe. The end effect is that the website appears to be served from the CDN location closest to you, so each request to the web server now takes maybe 50ms instead of 500ms+. The more 'edge' locations a CDN has, the better the chances of it having a server close to you.
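
Conceptually, an edge server is just a cache sitting between you and the origin. Here's a deliberately simplified Python sketch of that idea (real CDNs also handle expiry, cache invalidation and much more – none of that is shown here):

```python
# Toy illustration of what a CDN edge node does: serve from its local cache
# if it can, otherwise fetch from the (slow, far-away) origin and remember it.
import urllib.request

ORIGIN = "https://example.com"   # stand-in for the origin server
cache = {}                       # path -> response body


def serve(path: str) -> bytes:
    if path in cache:
        print(f"cache HIT for {path} - served from the edge")
        return cache[path]
    print(f"cache MISS for {path} - fetching from origin")
    with urllib.request.urlopen(ORIGIN + path) as response:
        body = response.read()
    cache[path] = body
    return body


serve("/")   # first request: goes all the way to the origin
serve("/")   # second request: answered instantly from the cache
```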

The other advantage of CDNs is that you now have a pool of servers serving your website traffic, so if one edge location drops into an error state, other servers can take up the slack without a huge amount of traffic heading back to the origin server and adding load.

CDNs also get around a bit of a flaw in the way internet browsers load media assets from web servers. Browsers limit the number of simultaneous connections to any one web server / domain (typically only 4-6 without tweaking), so assets end up queuing behind one another: you have to wait for one to finish downloading before the next can start. By serving assets from a CDN on a separate domain, more assets can be downloaded in parallel, so page load speeds are vastly improved here too.

Due to all of these advantages, it makes a lot of sense for websites serving a global audience to use a CDN to make their sites quicker for end users wherever they are in the world. There are a number of providers offering this service, some of which you may have heard of, like Cloudflare, Akamai, and Amazon's CloudFront. Fastly is another provider in this space with a huge number of servers scattered around the globe and very impressive latency figures worldwide, which is how it has become popular with a number of larger websites.

Knowing what we now know about CDNs, it becomes easier to understand how a big chunk of the world's websites dropped out. The official line from Fastly is that a configuration error caused ALL of their CDN servers to refuse to serve any website content, and it took an hour to resolve. If it had been just one or two servers, the CDN would have healed itself nicely and no-one would have been the wiser – sites might have been a little slower for some locations, but generally it would have been fine. But if you push out a global configuration that wipes out the function of all your servers, there is no saving that until you push out a revised configuration that undoes the breaking change. The more clients you have, the more websites are affected. From this outage it's easy to see that Fastly has a large client base around the world, and no doubt many of those clients are now contemplating their options for reliable CDN providers.

If you need help getting your websites working at optimal speed in front of a global audience, using trusted CDN partners, get in touch with Webmad and we’ll help you plan and implement solutions for optimal performance.

Categories
Hosting Security

Why do I need an SSL certificate on my website?

Here's the thing… many websites don't need one. Will the world break? Nope. Will you be putting your best face out to the world if you don't have one? Well… not really. And this is the tricky bit.

Most browsers nowadays will mark your website as not secure if you don't have an SSL certificate, and you will be penalised in search rankings by the big search players like Google for not having one. Seems a bit unfair really… but let's take a look at why we have SSL certificates, and then it might be easier to see why they are actually a good thing to have.

So – what on earth is this SSL thing anyways?

SSL stands for Secure Sockets Layer. It's not a physical thing, it's a protocol (these days the protocol actually in use is its successor, TLS, but everyone still says SSL). Don't zone out – this bit's important. SSL is a method of communicating from one device to another, typically from your computer / laptop / mobile phone / tablet / whatever, to the server which hosts your website.

Normal website traffic is sent in plain text. It uses the HTML coding language to make it look pretty when you see it, but anyone could read the content, and if you understand HTML even just a little, you can probably get the gist of what is happening on the page. If anyone were to get a copy of the communications between your device and the server (which can potentially happen at internet routers etc), they could see what you are up to, and potentially take over your communications, impersonate you to the server, and do things you probably didn't intend.

A huge majority of websites are the equivalent of an online brochure on the internet. Who would care if anyone saw the content of people's interactions with such a site? Well, yeah, you wouldn't really, and it's not compulsory for this type of website to have an SSL certificate. But where this falls over is if your website has a contact form, or asks for any sort of user input. If people could intercept that information, that's not ideal for your clients, and likewise not ideal for you.

This is where SSL comes in. It’s a protocol that defines a method of secure communication between your device and the website server. By securing the communication, no one can listen in on what you send to the server, or what the server sends back. Woo!

Jolly good… So why do I need an SSL certificate? Can I put it on the wall? Frame it? Is there a ceremony?

Yeah nah. What an SSL certificate does is prove the server is who it claims to be. When your device sets up an SSL connection, the server presents its certificate (a signed bundle of information that includes the server's public key), and your device checks that it is valid, issued by someone trustworthy, and matches the domain you asked for. The two ends then agree on encryption keys for that session, and everything sent afterwards is encrypted. If any part of the communication has been tampered with along the way, your device can easily pick that up and fail the connection, and anyone watching the traffic in the middle just sees scrambled data they can't decode. Only the devices that set up the connection can decrypt the communications.

An SSL certificate is locked to a particular domain name, so if someone copied your website they could not use your SSL certificate, because it wouldn't match their domain. Some SSL certificates allow multiple domain names (known as SANs, or Subject Alternative Names) to be serviced by the one certificate – say you have a website with multiple domain names pointed at it, all served by the same server. You can also get what are known as wildcard SSL certs, which are valid for any subdomain of your primary domain name, eg shop.example.com and web.example.com.
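
If you're curious what your own certificate says, here's a short Python sketch that connects to a site and prints who the certificate was issued to and which SANs it covers – swap in any domain you like:

```python
# Sketch: connect to a site over SSL/TLS and print who the certificate was
# issued to, plus any extra domain names (SANs) it covers.
import socket
import ssl

host = "webdeveloper.nz"  # swap in any domain you want to inspect
context = ssl.create_default_context()
with socket.create_connection((host, 443), timeout=5) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
        cert = tls_sock.getpeercert()

subject = dict(item for rdn in cert["subject"] for item in rdn)
sans = [value for kind, value in cert.get("subjectAltName", []) if kind == "DNS"]
print(f"Issued to: {subject.get('commonName')}")
print(f"Also valid for (SANs): {', '.join(sans)}")
```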

Certificates also come in different strengths, measured by the number of bits in the key that backs the certificate. For the common RSA keys, 2048 bit is the current industry standard (older 1024 bit keys are no longer considered safe), and you will also see 256 bit ECC keys, which offer comparable strength from a much smaller key. The stronger the key, the harder it is for someone to forge the certificate or decrypt anything secured with it.

The third parameter you deal with when purchasing your SSL certificate is that you need to verify that you are who you say you are. This can be done in 2 ways. Either domain verified or organisation verified.

  • Domain verified: This is the easiest form of certificate to get. All you need to do to prove ownership is either verify you have access to an email address linked to the ownership of the domain name you are trying to protect, or place a file at a particular location on the website hosting for that domain so that the issuing authority can visit it to prove it's you (a simple sketch of that check follows this list). Some issuing authorities also allow for DNS based verification, where you add or alter a DNS record on your domain. This is by far the quickest option, and can be completed in minutes.
  • Organisation verified: This is harder and takes quite a bit longer. You have to verify the domain name as above, but you also need to verify that the company or organisation purchasing the certificate is a valid company or organisation, with a physical address and phone number verified by a 3rd party like the Yellow Pages. This process can take days or weeks.
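
To make the 'file on your hosting' check above a bit more concrete, here's a hedged Python sketch of the sort of check an issuing authority performs – the path and token are invented for illustration, as each authority specifies its own:

```python
# Sketch of the "file on your hosting" style of domain validation: the issuing
# authority asks you to publish a token at an agreed path, then fetches it.
# The path and token below are made up for illustration.
import urllib.request

domain = "example.com"
token_path = "/.well-known/pki-validation/example-token.txt"  # hypothetical path
expected = "abc123-validation-token"                          # hypothetical token

try:
    with urllib.request.urlopen(f"http://{domain}{token_path}", timeout=10) as resp:
        body = resp.read().decode().strip()
    print("Validated!" if body == expected else "File found but token does not match")
except OSError as err:
    print(f"Could not fetch validation file: {err}")
```
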
Who gives these certificates out, and why can't I just invent my own?

Well – you can generate your own certificates – these are called self signed certificates. But because you make them yourself, no-one trusts them, cos you could say anything about yourself and no-one else can verify your statement. I mean, I'm actually the world's best chef… I could generate a certificate to tell you this. But if you asked my wife or kids…

Because of this, we need certificate authorities that are globally trusted, who can verify that anyone looking to get an SSL certificate is who they say they are, courtesy of the domain checks or organisation tests above. Examples are Sectigo and GeoTrust. Different providers offer different services and levels of insurance against your communications being decrypted, and these come at different costs.

What do they cost?

Depends. There are providers like Let's Encrypt which provide free, domain verified SSL certificates. These are great for most of the brochure websites mentioned above: they give you enough security for web browsers to call your website secure, and give your customers peace of mind. If you are offering e-commerce on your website, or any form of access to potentially sensitive data, then it is strongly recommended to purchase an SSL certificate from a provider that offers insurance, as these providers have high trust relationships with web browsers and give you support with installation and the ongoing security of your setup. Purchased SSL certificates typically start from around NZ$10 per year plus installation, and range through to multiple thousands of dollars per year (bank level) – it really depends on what you need the certificate to do.

Do I need it?

Nowadays, yip you really do. You need some form of SSL certificate, be it free or paid, just so your website looks safe out there on the internet. This is even more critical if you are wanting to attract visitors using search engines (you are penalised in ranking if you don’t have one) or you offer online products for purchase (e-commerce). Because you will be accepting user credentials or contact details etc, and in some cases accepting payment details, it is imperative for user security that all communications are secured.

There are also newer web technologies that will only work with SSL connections – things like websockets.

If you need assistance with getting your website secured, or have any issues with SSL certificates, contact the team at Webmad and they can get you all set up.

Categories
Hosting

What happens when a domain name expires?

[ Disclaimer: this is primarily written for the New Zealand context, so anything ending in .nz, but some parts are generally applicable ]

Oh dear. Your invoice for domain renewal has landed at the wrong email address, or your existing domain name registrar has gone quiet. This is definitely less than ideal, and can leave you in the position of having a domain name that has expired. Let's explore what that means, and what your options are.

So. Domain names expire. 'Owning' a domain name is really more like a subscription: you subscribe to the domain name, you pay for it each year, and you get full rights to it. When you stop paying, the domain moves into an expiry process.

The domain name is first placed into 'Pending Release' status for a period of 90 days. In this state the domain name is inactive (mail and websites won't work) but it is still registered to you. You can renew at any stage during this 90 day period (some registrars charge more to renew the closer you get to the 90 day mark), which reactivates the domain name. You can also transfer your domain name to another registrar during this period if you want, but only some registrars allow this incoming transfer, or will let you get the domain ownership code while the domain is expired, so it can pay to check first. If the registrant of the domain (you) fails to renew by the 90th day, the domain name is released by the .nz Registry and becomes available for registration on a first in, first served basis.

Ideally you'll catch your name back within that 90 day period. As the domain gets closer to the 90 day mark, it will get listed on services like https://www.expireddomains.co.nz/ so people can bid on it – the highest bidder wins the domain, provided that service manages to catch it when it becomes available. This is where it gets interesting.

On the day the domain name is set to 'drop' and becomes available for anyone to register, there is a set sequence that isn't very well documented out there, but here is the process:

The domain gets queued up by the registry for the next domain release window (this is documented at https://docs.internetnz.nz/faq/general/ ). The release maintenance window runs from 00:29:00 to 00:34:00, and all queued domain names should be released during that window. So, at some point in those five minutes, your domain name is going to become available. You are able to send up to 15 requests per second to try to be the first one to catch the domain when it becomes available. It's really a gamble as to whether you will land it or not.
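
To make that concrete, here's a rough Python sketch of polling whois to see when a .nz name becomes available. Be aware the whois server name and the 'available' status line are assumptions on our part, and serious drop-catching services talk to the registry via registrar APIs (that's where the 15 requests per second limit applies), not public whois:

```python
# Sketch: poll the .nz whois service during the release window to see whether
# a domain has become available. The whois server name and the "available"
# marker are assumptions - check the .nz registry docs / your registrar.
import socket
import time

WHOIS_SERVER = "whois.srs.net.nz"   # assumed .nz whois host
DOMAIN = "example.co.nz"            # the name you are trying to catch


def whois_lookup(domain: str) -> str:
    with socket.create_connection((WHOIS_SERVER, 43), timeout=10) as sock:
        sock.sendall(f"{domain}\r\n".encode())
        chunks = []
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks).decode(errors="ignore")


while True:
    response = whois_lookup(DOMAIN)
    if "220 Available" in response:   # assumed .nz status line for a free name
        print(f"{DOMAIN} looks available - register it NOW via your registrar")
        break
    time.sleep(2)  # be gentle: real drop-catching uses registrar APIs, not whois
```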

The downside of this process is that once the domain has gone through release to the public, you really have no say in getting it back. You've had your chances. That's it. It's painful, but unfortunately the domain is completely out of your hands.

Domains can be confusing at the best of times. If you are having issues, or need a hand, get in contact and we’ll do our best to get you the best outcome.

Categories
Hosting

Re-streaming video from webcams to websites

What’s the problem?

One of the powerful things you can do with the internet nowadays is access web cameras and video sources from around the globe. Over the last 5+ years the team here at Webmad have been hosting web camera re-streaming services for https://taylorssurf.co.nz. The site runs a few IP cameras based at a local surf and recreation beach here in Christchurch, New Zealand. Since starting with this site, we’ve managed a number of different methods of getting the video from the various cameras, and for various clients as well.

Late evening view from one of the cameras streaming from the top of our office building

So the problem we are trying to solve: how do we get the video from the cameras out to viewers on the internet so that hundreds of people can view the streams at once? Typically a camera has a limit of around 20 connected users at a time if you try to access it directly, and if your camera is on a fairly limited bandwidth internet connection (at Taylors Mistake we can only get VDSL speeds at best), multiple people trying to access the cameras at once will kill the streams pretty quickly. The other issue is that the default streams from the cameras we've got there are not in overly web-friendly formats (RTSP etc), meaning you'd have to use Flash based video players, which pretty much all web browsers have abandoned these days.

So… What can we do?

In New Zealand, ISPs don't charge for internet traffic between two endpoints within their own networks. This is fantastic, as it means there is no bandwidth charge between the cameras at remote locations and the re-streaming server we host locally, provided we use the same internet provider. This lets us re-stream the video cost effectively: there is only one connection to each camera pulling in the video feed, and we can then re-broadcast the camera feeds on a high capacity internet connection, allowing thousands of end users to connect and view the video.

Solution 1: The mjpeg streamer

On cameras that only output an MJPEG stream, we developed the mjpeg-streamer. Basically, it connects to the camera source and feeds the stream into a memory buffer. Any subsequent requests to the script, instead of fetching the feed from the camera again, connect to that same memory buffer and return the feed to the end user as an MJPEG stream. Using PHP tools like ImageMagick you can add image overlays onto the stream as needed. This system works really well so long as you have different memory locations for each camera you are looking to re-stream, and it requires little in the way of server resources, so it can easily be used on shared hosting. The downsides of this type of streaming are that there is no easy way to alter resolution, and MJPEG streams have questionable compatibility with most modern browsers, and have been known to crash browsers that can't clear older frames of the video from their memory.

Solution 2: The RTSP / RTMP re-streamer

A camera upgrade eliminated our ability to use MJPEG streaming, so we were forced to update our streaming strategy. The best tools for the job came in the form of the open source Red5 Flash streaming server and the open source FFmpeg application. The aim here is to use ffmpeg to pull the video from the camera and feed it to the Red5 streaming server; clients then use a Flash based player to connect to Red5 and play the video stream. This works well, and any overlays can be injected during the ffmpeg based ingest process. The code is at https://github.com/stephen-webmad/rtsp-restream
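
For the curious, the heart of the setup is just an ffmpeg command that pulls RTSP in and pushes RTMP out. Here's a hedged sketch of it wrapped in Python so it can be supervised and restarted – the camera and Red5 URLs are placeholders:

```python
# Sketch: pull an RTSP camera feed with ffmpeg and push it to an RTMP endpoint
# (such as Red5). Camera and RTMP URLs are placeholders for illustration.
import subprocess
import time

CAMERA_URL = "rtsp://camera.local:554/stream1"   # hypothetical camera feed
RTMP_URL = "rtmp://localhost/live/beachcam"      # hypothetical Red5 endpoint

command = [
    "ffmpeg",
    "-rtsp_transport", "tcp",   # pull the camera feed over TCP for reliability
    "-i", CAMERA_URL,           # input: the camera's RTSP stream
    "-c:v", "copy",             # pass the video through without re-encoding
    "-an",                      # drop audio
    "-f", "flv",                # FLV container, which RTMP expects
    RTMP_URL,
]

while True:
    subprocess.run(command)     # if the camera drops out, ffmpeg exits...
    time.sleep(5)               # ...so wait a moment and start it again
```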

Where this falls over is that modern browsers no longer support Flash based players. So, we had to move to something else.

Solution 3: HTTP live streaming

Remember how we are using ffmpeg in the solution above? Well, it turns out there is another format it can output: HLS (HTTP Live Streaming). What is HLS? It's a sequence of bite-sized chunks of the video stream, all tied together using an index file. Where HLS comes out tops for live streaming is that it lets you pause and rewind the live stream, going back as many chunks as are stored in the index, which can be really handy. The index file can be as big as you want. The player just polls the index file (checks in on it every few seconds) to see if there are any new video chunks to download, and grabs them if there are. You can see this in action at https://canview.nz

HLS is super handy for web browsers and mobile as it's easily playable, and you only need to fetch small chunks of video, meaning page load times as measured by Google etc are typically much better. The downside of HLS streaming is that your live stream will be delayed a little, as it needs to build up the video chunks for the players to download, but for most things that is an acceptable compromise. For most of our video streams the delay is around the 30 second mark, but it can be shortened by tweaking the settings.
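
To give you an idea of what those settings look like, here's a hedged sketch of an ffmpeg invocation that produces an HLS stream – the camera URL and output folder are placeholders, and the chunk length / index size are the main knobs for trading delay against smoothness:

```python
# Sketch of an ffmpeg command that turns an RTSP camera feed into an HLS
# stream (an index file plus a rolling set of small video chunks).
import subprocess

CAMERA_URL = "rtsp://camera.local:554/stream1"   # hypothetical camera feed
OUTPUT_DIR = "/var/www/html/stream"              # hypothetical web-served folder

command = [
    "ffmpeg",
    "-rtsp_transport", "tcp",
    "-i", CAMERA_URL,
    "-c:v", "libx264",                 # encode to H.264 so browsers can play it
    "-preset", "veryfast",
    "-g", "50",                        # keyframe interval so chunks cut cleanly
    "-f", "hls",
    "-hls_time", "4",                  # each chunk is ~4 seconds of video
    "-hls_list_size", "10",            # keep the last 10 chunks in the index
    "-hls_flags", "delete_segments",   # tidy up old chunks from disk
    f"{OUTPUT_DIR}/index.m3u8",
]

subprocess.run(command)
```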

By using the wonderful ffmpeg software, which runs nicely on Linux servers, you gain the ability to overlay imagery, alter framerates, handle almost any form of IP camera, output multiple resolutions, and write snapshot images at whatever time interval you wish, if you are keen to serve a static image fallback or splash screen to your viewers as well. HLS is also relatively easy to embed into an HTML5 <video> element (there are lots of JavaScript libraries to assist with this task), allowing things like fullscreen and picture-in-picture.

We've not yet created a git repository outlining how we operate this form of live streaming – that is likely to come in the future. For now, if you are looking to live stream a web camera from anywhere in New Zealand (or the world, if you have friendly international traffic allowances), do please contact us and we can help make that happen; or if you've got the resources, we can assist with getting you set up to run it yourself.

Categories
Hosting

High availability website hosting on Amazon Web Services

Tolerance is not a virtue that feels like it is growing in the world at the moment, and that rings true on the internet more than anywhere. Outages of web services, or websites going down, erode confidence in a business or organisation, so being able to offer services that are robust and highly tolerant of failure is increasingly a must. Web systems need to be able to handle sudden spikes in traffic, failure of servers, and anything else that can be thrown at them, and still serve customers reliably.

Here at Webmad we run a number of high availability systems, so in this post we are going to outline the basic concept we use to run PHP based web systems like WordPress and Moodle / Totara. We tend to use Amazon Web Services for setups like this, as it has a tonne of tools we are very familiar with that get the job done nicely, no matter how big the deployment.

Here's the graphic that outlines our usual setup:

The general concept for a High Availability hosting environment on Amazon Web Services

So – let's go through the setup, what does what, and how we go about making it happen.

One of the key considerations of a high availability setup is to minimise any single points of failure within the system. Any point where the failure of one component means an outage shouldn't be acceptable; there should be redundancies to cater for failures. With this in mind, we select tools within the Amazon suite that cater for this.

First we start with the EFS (Elastic File System) service. This is selected because one of the important things to bear in mind is that we can't guarantee any two page loads on the website will use the same web server. Relying on that would be tragic if the server a user was interacting with needed to be taken down or had a fault. By using a shared file system, replicated across multiple data centers, uploads and data that need to be persisted across all user facing servers can be shared effectively: each server mounts the filesystem and can access any files as needed. Our standard setups only use the shared filesystem for user contributed data, not for plugin or core system files. Shared network filesystems like EFS do not have the speed required for web systems, especially PHP systems, which include a multitude of files just to return one page load. By keeping those files on each server's EBS (Elastic Block Storage) volume (the equivalent of the server's hard drive), speed stays optimal for a fast user experience. User uploaded content typically does not need high performance, so a network based filesystem is just fine for it.

The next service we make use of is the Relational Database Service (RDS). This allows you to set up replicated database servers of any size you need, for MySQL or PostgreSQL based databases. There is also Amazon Aurora, a high efficiency, cloud optimised, MySQL compatible service that allows for multiple replicas across multiple data centers and regions. These services let you scale your database servers vertically (ie increase the power of the servers) and horizontally (more servers), and used with tools like ProxySQL to spread load, you can build very flexible setups.

The core of many of our setups is a service called Elastic Beanstalk (EB). Elastic Beanstalk is a powerful set of services that lets you operate and monitor a self healing, high availability setup. It sets up the load balancer used to route incoming web traffic to the web servers depending on the load on each server, and also provides a firewall restricting public access to just the ports your application needs. Elastic Beanstalk tracks how many virtual servers in the Elastic Compute Cloud (EC2) you should be running at any one time and maintains that number, and it lets you define triggers to add or remove servers depending on the shared load across all servers, or any other metrics you choose (see the sketch below for the sort of settings involved).
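
As an illustration of those scaling triggers, here's a hedged boto3 (Python) sketch of the sort of settings we'd apply to an existing environment – the environment name, region and thresholds are placeholders, not a recommendation for your particular workload:

```python
# Sketch: nudge the auto-scaling settings on an existing Elastic Beanstalk
# environment using boto3. Environment name, region and thresholds are
# placeholders for illustration only.
import boto3

eb = boto3.client("elasticbeanstalk", region_name="ap-southeast-2")

eb.update_environment(
    EnvironmentName="my-moodle-prod",   # hypothetical environment name
    OptionSettings=[
        # Keep at least two web servers running at all times.
        {"Namespace": "aws:autoscaling:asg", "OptionName": "MinSize", "Value": "2"},
        {"Namespace": "aws:autoscaling:asg", "OptionName": "MaxSize", "Value": "8"},
        # Scale on CPU: add a server early (60%) rather than waiting for trouble.
        {"Namespace": "aws:autoscaling:trigger", "OptionName": "MeasureName", "Value": "CPUUtilization"},
        {"Namespace": "aws:autoscaling:trigger", "OptionName": "Unit", "Value": "Percent"},
        {"Namespace": "aws:autoscaling:trigger", "OptionName": "UpperThreshold", "Value": "60"},
        {"Namespace": "aws:autoscaling:trigger", "OptionName": "LowerThreshold", "Value": "25"},
    ],
)
```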

One of the key considerations with Elastic Beanstalk is that you can only restore backups into the system using server images taken from that same Elastic Beanstalk environment. So what we would normally do is fire up the Elastic Beanstalk environment, then take an AMI image from one of the running servers and create a new EC2 server from that image. This server is used as a seed for the system. By a seed, I mean we can make any changes to the system, set up the application, mount filesystems, connect to databases and so on, all on this one server, then take an image of it once we are happy, and within Elastic Beanstalk update the base AMI id (the image all servers started within the Elastic Beanstalk environment use) to the image from the seed.

The other advantage of running a seed server is that it can be used as a semi-staging server, so you can test code changes before they are rolled out to full production, whilst still being in the production environment. The seed can also run the cron tasks for the system, keeping them away from user facing servers so that the extra load does not impact user experience. This is very useful for systems like Moodle / Totara, which can run some rather large data collection / processing cron tasks, and it also ensures those tasks only run on a single server rather than all user facing nodes trying to run the same cron tasks at once.

With this setup, Elastic Beanstalk monitors server loading, automatically cycles out replacement servers when anything goes wrong, and adds or removes servers as needed to handle incoming load. There will always be a small period while new servers launch where load may exceed capacity, but this can be minimised by having sensible, early trigger levels for scaling and suitably sized servers for the typical load on the system. Running at least two web servers is ideal.

To add capability to the system for speed or stability reasons, other fun things to try are adding AWS's Memcached or Redis services to your application to cache session data or pre-compiled code and speed up operations – this is highly recommended for Moodle / Totara setups. You can also look at tools like s3fs as alternatives to Amazon's EFS; these can perform better, but come with additional risks around synchronisation settings. You can also investigate rsyncing files between the shared filesystem and local on-server drives, to maintain optimised end user performance whilst keeping the ability to update files across all servers relatively easily.

That’s a brief run-down of some of the high availability systems that the team at Webmad operate for various clients that need to factor for variable traffic loads without end user experience failure. If you’d like to get into more details, feel free to contact us to discuss your requirements, and how we can customise a system that will work for your needs.