Category: Hosting

What happens when a domain name expires?

[ Disclaimer: this is primarily written for the New Zealand context, so anything ending in .nz, but some parts are generally applicable ]

Oh dear. Your invoice for domain renewal has landed at the wrong email address, or your existing domain name registrar has gone quiet. Either way, you can end up holding a domain name that has expired. Let’s explore what that means, and what your options are.

So. Domain names expire. ‘Owning’ a domain name is really more like holding a subscription: you subscribe to the domain name, you pay for it each year, and you get full rights to it. When the subscription lapses, the domain moves into an expiry process.

The domain name is placed into ‘Pending Release’ status for a period of 90 days. In this state the domain name is inactive (mail and websites won’t work) but it is still registered to you. You can renew at any stage during this 90 day period (some registrars charge more to renew the closer you get to the 90 day mark), and doing so reactivates the domain name. You can also transfer your domain name to another registrar during this period if you want – only some registrars allow this incoming transfer, or will let you get the domain ownership code while the domain is expired, so it can pay to check first. If the registrant of the domain (you) fails to renew by the 90th day, the domain name is released by the .nz Registry and becomes available for registration on a first in, first served basis.

Ideally you’ll catch your name back within that 90 day period. As the domain gets closer to the 90 day mark, it’ll get listed on services like https://www.expireddomains.co.nz/ so people can bid on it – the highest bidder wins the domain, provided that service catches it when it becomes available. This is where it gets interesting.

On the day that the domain name is set to ‘drop’ and become available for anyone to register, there is a set sequence that isn’t very well documented out there, but here is the process:

The domain gets queued up by the .nz Registry for the next domain release window (this is documented at https://docs.internetnz.nz/faq/general/ ). The release maintenance window runs from 00:29:00 to 00:34:00, and all domain names due to drop should be released during this window. So – at some point in that window, your domain name is going to become available. You can send up to 15 requests per second within this window, to try to be the first one to catch the domain when it becomes available. It’s really a gamble as to whether you will land it or not.
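
To make that 15 requests per second limit concrete, here is a rough sketch (in PHP, since that’s the stack we usually work in) of what a drop-catching loop looks like. The attempt_register() function is hypothetical – it stands in for whatever registration call your registrar’s API gives you – so treat this as an illustration of the timing, not a ready-to-run tool.

<?php
// dropcatch.php – illustration only: attempt_register() is a hypothetical
// stand-in for whatever registration call your registrar's API provides.
$domain = 'example.co.nz';
$start  = strtotime('today 00:29:00');
$end    = strtotime('today 00:34:00');

// wait for the release maintenance window to open
while (time() < $start) {
    sleep(1);
}

// keep trying, but stay within the 15 requests per second limit
while (time() <= $end) {
    $second = time();
    for ($i = 0; $i < 15 && time() === $second; $i++) {
        if (attempt_register($domain)) {
            echo "Caught {$domain}!\n";
            exit(0);
        }
    }
    if (microtime(true) < $second + 1) {
        time_sleep_until($second + 1);   // wait out the rest of this second
    }
}

echo "Missed it - the domain went to someone quicker.\n";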

The downside of this process is that once the domain has been released to the public, you really have no say in getting it back. You’ve had your chances. That’s it. It’s painful, but unfortunately the domain is completely out of your hands.

Domains can be confusing at the best of times. If you are having issues, or need a hand, get in contact and we’ll do our best to get you the best outcome.

Category: Security

How to prevent email spam from my website

“Get a website” they said. “It’ll get you heaps of new clients” they said. You’ve invested in a website that acts as an online brochure, with the aim of bringing in clients and potential sales. It’s got a contact form, and maybe a blog on there to show you are still relevant… Isn’t it disheartening when what feels like the only contact you get through the site is spam? It plagues your inbox, it gets filtered to your spam folder, and then you never know what is legitimate or not… Aaaargh!

We hear it a lot: “I’ve started getting a lot of spam from my website…”. First we’ll go through how on earth all the spam gets there in the first place, and then we’ll run through a list of preventive tools that can help you avoid getting bogged down in the ‘noise’, allowing you to focus on your real clients – ideally without forcing them to jump through hoops to prove they are legitimate.

So… Where is all this spam coming from?

Nowadays, most spam generated on websites comes from automated processes, often referred to as bots. Basically, some clown somewhere decides it’d be great to get their message in front of the website owner – or, in the case of blogs, potentially in front of your target audience by getting their comments published on your website. Most of the time you wouldn’t be daft enough to publish their comments, but if someone does, and their message / website link ends up hosted on your site, then your SEO helps promote their SEO and they win the battle of Google-sberg. Not ideal for a clean internet. But that little bit of cheap bot code can be run against hundreds of websites, and keep on trying at no further cost to the people who developed it, with the potential for payoff – so the spam keeps rolling in. Small tweaks to the bot code get around little changes made to try to prevent its effect. So, we’ve got to get smart.

Basic workarounds:

Your standard bot simply reads the code that is used to display a form on your page. It then plucks out all the input fields, populates them with some form of content, and fires them back at your website, which then emails the submitted data to you / someone. One of the simplest methods of detecting bogus entries is to add an extra field into your forms that is hidden from normal users (e.g. using the CSS property “display:none” or similar, ideally applied to a class name so that it’s harder for the bot to recognise it as a hidden field). If content is submitted in the hidden field – content that no normal user would have been able to fill in – then we can pretty reliably say that the submission is bogus. This type of spam rejection is sometimes called a ‘honeypot’ – the bot sees the lure of another input to fill in, gets its hand in the jar, and is consequently found with honey stuck to it. Poor thing.
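
Here’s a minimal sketch of the idea in plain PHP – the field name, class name and the ‘pretend it worked’ response are all just examples you’d adapt to your own form handling:

<?php
// contact.php – handles the form below. If the hidden 'website_url' field
// comes back with anything in it, a bot filled it in, so we bail out.
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    if (!empty($_POST['website_url'])) {
        // quietly pretend it worked so the bot moves on
        echo 'Thanks for your message!';
        exit;
    }
    // ...a real person submitted this – validate and email it as usual
}
?>
<style>
    /* vague class name on purpose – harder for a bot to spot the trap */
    .form-extra { display: none; }
</style>
<form method="post" action="contact.php">
    <input type="text" name="name" placeholder="Your name">
    <input type="email" name="email" placeholder="Your email">
    <textarea name="message" placeholder="Your message"></textarea>

    <!-- the honeypot: invisible to humans, irresistible to bots -->
    <p class="form-extra"><label>Website <input type="text" name="website_url"></label></p>

    <button type="submit">Send</button>
</form>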

Many form plugins for popular web systems – like Gravity Forms for WordPress – have honeypot style traps built in that can be enabled on the forms you create with their tools. When evaluating form plugins, this is a quick win worth factoring into your selection of the best fit.

Captcha, Recaptcha, and annoying your users.

Sounds sinister eh? Don’t Captcha me! But what is a Captcha? You’ve likely seen them. It’s those funny wee ‘Type the text you see in the image’ questions that you get on some forms, where half of them aren’t even readable, and you get that little bit frustrated ‘cos it’s effort. It’s not even for your benefit!

So Captcha is the term for those image recognition challenges. Why do we have them? Well – because they are hard. Not just for humans – they are really hard for computers to figure out. How do I tell a line or shape from a letter of the alphabet? Humans are great at pattern recognition, especially having been trained to do it since around the age of five. Computers? The harder the image (warped text, lots of foreign objects, characters without solid borders, etc.), the less likely the computer / bot will be able to resolve it to a satisfactory, correct answer. This method works well at preventing spam… but also at putting off legitimate clients, unless they have good enough reason to contact you to push past the hurdles you put in front of them.

So then Google put some weight behind ReCaptcha – a similar concept, but with some extra smarts behind it. Instead of just throwing an image onto the page, it runs some code in your visitor’s web browser to add the verification image, and outsources validation of the answer to the ReCaptcha service. Pretty cool stuff. Still a pain for your users to fill in, but doing it this way gives the same powerful tools to many forms systems out there on the web, in a consistent way, and has good rejection rates.

The latest version of ReCaptcha doesn’t show images any more – either a wee tickbox to tick to show you are a human, or an option to not show anything at all and just rely on neat detection algorithms. Many websites rely on this method – it’s not perfect, but it does a pretty good job against most incoming spam.
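
If you’re wiring ReCaptcha up yourself rather than through a plugin, the server side boils down to posting the token from the form back to Google’s siteverify endpoint. A minimal sketch (your own secret key and error handling assumed):

<?php
// verify the token the ReCaptcha widget added to the form as
// 'g-recaptcha-response' against Google's siteverify endpoint
function recaptcha_passed(string $token, string $secret): bool
{
    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => http_build_query(['secret' => $secret, 'response' => $token]),
    ]]);
    $response = file_get_contents('https://www.google.com/recaptcha/api/siteverify', false, $context);
    $result   = json_decode((string) $response, true);
    return !empty($result['success']);
}

if (!recaptcha_passed($_POST['g-recaptcha-response'] ?? '', 'YOUR_SECRET_KEY')) {
    http_response_code(400);
    exit('Sorry, the spam check was not happy with that submission.');
}
// ...otherwise carry on and process the form as normal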

Are there ways to weed out spam without relying on user entry / client side tricks?

How good of you to ask. Why yes. Yes there are. There are a number of services out there that you can forward the content of your submitted data to; they run filters over it and can detect if the content is obvious spam (anyone wanna buy some viagra or cialis?). In the WordPress world the most obvious one is Akismet. The great thing about these tools is that they can be run retroactively over previous comments in your system to weed out spam from them as well. Very helpful. Another we have had great success with, and which integrates with a variety of web systems, is CleanTalk.

These third party filtering systems use learning filters to target the ‘in season’ spam content trends and block them, so you don’t need to stay on top of them yourself. They aren’t perfect – you may get some false positives (legitimate messages flagged as spam on content analysis alone) – but they typically give good interfaces for whitelisting content or users so the systems can learn from their mistakes.
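
To give a feel for how these services hook in, here is a hedged sketch of a call to Akismet’s comment-check endpoint. The field names follow their developer docs, but do check the current API, and swap in your own key and site URL:

<?php
// ask Akismet whether a submission looks like spam. The endpoint returns
// the literal string "true" for spam and "false" for ham.
function looks_like_spam(string $apiKey, string $siteUrl, array $submission): bool
{
    $payload = http_build_query([
        'blog'                 => $siteUrl,
        'user_ip'              => $_SERVER['REMOTE_ADDR'] ?? '',
        'user_agent'           => $_SERVER['HTTP_USER_AGENT'] ?? '',
        'comment_type'         => 'contact-form',
        'comment_author'       => $submission['name'] ?? '',
        'comment_author_email' => $submission['email'] ?? '',
        'comment_content'      => $submission['message'] ?? '',
    ]);

    $context = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
        'content' => $payload,
    ]]);

    $verdict = file_get_contents("https://{$apiKey}.rest.akismet.com/1.1/comment-check", false, $context);
    return trim((string) $verdict) === 'true';
}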

So… What should we do?

The best approach to most problems is multi-faceted. The options presented above all attack spam submissions in different ways: traps, challenges, and filters. We have found our most reliable setups are mixtures of each, depending on the context of what we are looking to protect. To prevent spam in blog comments and contact forms / calls to action, a honeypot to catch most of the bots plus CleanTalk to catch the ones that get through is a good fit. For user registration forms or login protection, ReCaptcha works well – your client already knows they have some work to do to get at the goodies in store, so they’ll put in the effort to get past your hurdles.

Find what won’t annoy your users, and use that. There are plenty of options out there. Still stuck, or not sure how to implement your changes? I know some people who could help.

Category: Hosting

Re-streaming video from webcams to websites

What’s the problem?

One of the powerful things you can do with the internet nowadays is access web cameras and video sources from around the globe. Over the last 5+ years the team here at Webmad has been hosting web camera re-streaming services for https://taylorssurf.co.nz. The site runs a few IP cameras at a local surf and recreation beach here in Christchurch, New Zealand. Since starting with this site, we’ve used a number of different methods of getting the video from the various cameras, for various clients as well.

Late evening view from one of the cameras streaming from the top of our office building

So the problem we are trying to solve is: how do we get the video from the cameras out to viewers on the internet, so that hundreds of people can view the video streams at once? Typically a camera has a limit of around 20 connected users at a time if you are accessing it directly, and if your camera is on a fairly limited bandwidth internet connection (at Taylors Mistake we can only get VDSL speeds at best), then multiple people trying to access the cameras at once will kill the streams pretty quickly. The other issue is that the default streams from the cameras we’ve got there are not in overly friendly formats for websites (RTSP etc.), meaning you’d have to use Flash based video players, which pretty much all web browsers look down on these days.

So… What can we do?

In New Zealand, ISPs don’t charge for internet traffic between two endpoints within their own networks. This is fantastic, as it means there is no charge for bandwidth between the cameras at remote locations and the re-streaming server we host locally, provided we use the same internet provider. This allows us to re-stream the video cost effectively: there is only one connection to each camera pulling in video feeds, and the server can then re-broadcast the camera feeds on a high capacity internet connection, allowing thousands of end users to connect and view the video.

Solution 1: The mjpeg streamer

On cameras that only output an MJPEG stream, we developed the mjpeg-streamer. Basically, it connects to the camera source and feeds the frames into a memory buffer. Any subsequent requests to the script, instead of fetching the feed from the camera, connect to that same memory buffer and return the camera feed to the end user as an MJPEG stream. Using PHP tools like ImageMagick you can add image overlays onto the stream as needed. This system works really well so long as you have different memory locations for each camera you are looking to re-stream, and it requires little in the way of server resources, so it can easily be used on shared hosting. The downsides of this type of streaming are that there is no easy way to alter resolution, and MJPEG streams have questionable compatibility with modern browsers – they have been known to crash browsers that can’t clear older frames of the video from their memory.
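
The actual mjpeg-streamer is a bit more involved, but a stripped down sketch of the buffering idea – here using APCu as the shared memory store, and a hypothetical snapshot URL for the camera – looks something like this:

<?php
// stream.php – every viewer requests this script instead of the camera.
// Frames are shared between requests via APCu, so only one request at a
// time ever has to go back to the camera itself.
$cameraUrl = 'http://192.168.1.20/snapshot.jpg';   // hypothetical camera endpoint
$cacheKey  = 'cam_taylors_1';                      // one key per camera
$boundary  = 'mjpegframe';

set_time_limit(0);
header("Content-Type: multipart/x-mixed-replace; boundary={$boundary}");

while (!connection_aborted()) {
    $frame = apcu_fetch($cacheKey);
    if ($frame === false) {
        // buffer is empty or has expired – this request refreshes it for everyone
        $frame = @file_get_contents($cameraUrl);
        if ($frame !== false) {
            apcu_store($cacheKey, $frame, 1);      // share this frame for ~1 second
        }
    }
    if ($frame !== false) {
        echo "--{$boundary}\r\nContent-Type: image/jpeg\r\n";
        echo 'Content-Length: ' . strlen($frame) . "\r\n\r\n{$frame}\r\n";
        @ob_flush();
        flush();
    }
    usleep(200000);                                // roughly five frames per second
}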

Solution 2: The RTSP / RTMP re-streamer

A camera upgrade eliminated our ability to use MJPEG streaming, so we were forced to update our streaming strategy. The best tools for the job came in the form of the open source Red5 Flash streaming server and the open source FFmpeg application. The aim here is to use FFmpeg to pull the video from the camera and feed it to the Red5 streaming server. Clients then use a Flash based player to connect to Red5 and play the video stream. This works well, and any overlays can be injected during the FFmpeg based ingest process. The code is at https://github.com/stephen-webmad/rtsp-restream

Where this falls over is that modern browsers no longer support flash based players. So, we had to move to something else.

Solution 3: HTTP live streaming

Remember how we were using FFmpeg in the solution above? Well – it turns out there is another format it can output: HLS (HTTP Live Streaming). What is HLS? It’s a sequence of bite sized chunks of the video stream, all tied together using an index file. Where HLS comes out tops for live streaming is that it enables you to pause and rewind the live stream, going back as many chunks as are stored in the index, which can be really handy. The index file can be as big as you want. The player just polls the index file (checks in on it every few seconds) to see if there are any new video chunks to download, and grabs them if there are. You can see this in action at https://canview.nz

HLS is super handy for web browsers and mobile as it’s easily playable, and you only need to fetch small chunks of video, meaning page load times as measured by Google etc. are typically much better. The downside of HLS streaming is that your live stream will be delayed a little, as it needs to build up the video chunks for the players to download – but for most things that is an acceptable compromise. For most of our video streams the delay is around the 30 second mark, but it can be shortened by tweaking the settings.
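
As an indication of the sort of settings involved, here is a sketch of an FFmpeg invocation, wrapped in PHP only because that’s how we tend to glue things together – the interesting part is the command itself. The camera URL and paths are placeholders; hls_time and hls_list_size are the knobs that trade delay against how far back viewers can rewind:

<?php
// restream-hls.php – keep one of these running per camera (under a process
// supervisor) and point your web player at stream.m3u8 in $outputDir.
$source    = 'rtsp://user:pass@192.168.1.20:554/stream1';  // placeholder camera URL
$outputDir = '/var/www/html/hls/cam1';

$command = 'ffmpeg -rtsp_transport tcp -i ' . escapeshellarg($source)
         . ' -c:v copy -an'                    // pass video through untouched, drop audio
         . ' -f hls'
         . ' -hls_time 4'                      // four second chunks
         . ' -hls_list_size 8'                 // keep eight chunks in the index (~30s of rewind)
         . ' -hls_flags delete_segments'       // tidy up chunks that fall off the index
         . ' ' . escapeshellarg($outputDir . '/stream.m3u8');

passthru($command);   // runs until the camera feed drops, then exits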

By using the wonderful FFmpeg software, which runs nicely on Linux servers, you gain the ability to overlay imagery, alter resolutions and framerates, handle almost any form of IP camera, output multiple resolutions, and write snapshot images at whatever interval you wish if you are keen to serve a static image fallback or splash screen to your viewers as well. HLS is also relatively easy to embed into an HTML5 <video> element (there are lots of JavaScript libraries to assist with this task), allowing things like fullscreen and picture in picture.

We’ve not yet created a git repository outlining how we operate this form of live streaming – this is likely to come in the future. But for now, if you are looking to live stream a web camera from anywhere in New Zealand (or the world, if you have friendly international traffic allowances), do please contact us and we can help make that happen – or, if you’ve got the resources, we can assist with getting you set up to run it yourself.

Category: Hosting

High availability website hosting on Amazon Web Services

Tolerance is not a virtue that feels like it is growing in the world at the moment, and this rings true on the internet more than anywhere. Outages of web services, or websites going down, erode confidence in a business or organisation, so being able to offer services that are robust and highly tolerant of failure is increasingly a must. Web systems need to be able to handle sudden spikes in traffic, failure of servers, and anything else that can be thrown at them, and still serve customers reliably.

Here at Webmad we run a number of high availability systems, so in this post we are going to outline the basic concept we use to run PHP based web systems like WordPress and Moodle / Totara. We tend to use Amazon Web Services for setups like this, as it has a tonne of tools we are very familiar with that get the job done nicely, no matter how big the deployment.

Here’s the graphic that outlines our usual setup:

The general concept for a High Availability hosting environment on Amazon Web Services

So – let’s go through the setup, what does what, and how we go about making it happen.

One of the key considerations of a high availability setup is to minimise any single points of failure within the system. Any point where the failure of one component means an outage shouldn’t be acceptable – there should be redundancies to cater for failures. With this in mind, we select tools within the Amazon suite that factor for this.

First we start with the EFS (Elastic File System) service. This is selected because one of the important things to bear in mind is that we can’t guarantee page loads on the website will hit the same web server each time. Relying on that would be tragic if the server the user was interacting with needed to be taken down or had a fault. By using a shared file system, replicated across multiple data centres (availability zones), uploads and data that need to persist across all user facing servers can be shared effectively. Each server mounts the filesystem and can access any files as needed. Our standard setups only use the shared filesystem for user contributed data, not for plugin or core system files. Shared network filesystems like EFS do not have the speed required for web systems – especially PHP systems – to include the multitude of files that typically get pulled in just to return one page load. By keeping these files on each server’s EBS (Elastic Block Store) storage (the equivalent of the server’s hard drive), speed is optimal for a fast user experience. User uploaded content typically doesn’t need high performance, so a network based filesystem is just fine for it.

The next service we make use of is the Relational Database Service (RDS). This allows you to set up replicated database servers of whatever size you need, for MySQL or PostgreSQL based databases (among others). There is also Amazon Aurora, a high efficiency, cloud optimised, MySQL and PostgreSQL compatible service that allows replication across multiple data centres and multiple regions. These services let you scale your database servers vertically (increase the power of the servers) and horizontally (more servers). Combined with tools like ProxySQL to spread load, you can get very flexible setups.

The core of many of our setups is a service called Elastic Beanstalk (EB). Elastic Beanstalk is a powerful set of services that lets you operate and monitor a self healing, high availability setup. It sets up the load balancer used to route incoming web traffic to the web servers depending on the load on each server, and also provides a firewall to restrict public access to just the ports your application needs. Elastic Beanstalk tracks how many virtual servers in the Elastic Compute Cloud (EC2) you should be running at any one time, and maintains that number. It also allows you to define triggers to add or remove servers depending on the load shared across all servers, or any other metrics you define.

One of the key considerations with Elastic Beanstalk is that you can only restore backups into the system using server images from that same Elastic Beanstalk environment. So what we normally do is fire up the Elastic Beanstalk environment and take an AMI image from one of the running servers. We then create a new EC2 server from that image. This server is used as a seed for the system: we can make any changes to the system, set up the application, mount filesystems, connect to databases and so on, then take an image of that seed once we are happy with it, and update the base AMI ID in Elastic Beanstalk (the image that all servers started within the environment use) with the image from the seed.

The other advantage of running a seed server is that it can be used as a semi-staging server, so you can test code changes before they are rolled out to full production whilst still being in the production environment. The seed can also run the system’s cron tasks, keeping them away from user facing servers so that the extra load does not impact user experience. This is very useful for systems like Moodle / Totara, which can run some rather large data collection / processing cron tasks. It also ensures these tasks are only run on a single server, rather than all user facing nodes trying to run the same cron tasks at once.
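
One simple way to keep scheduled tasks on the seed only is a guard at the top of your cron entry point. The CRON_HOST environment variable here is our own convention for this sketch, not an AWS or Moodle setting – you’d set it on the seed instance yourself:

<?php
// cron-runner.php – called from the crontab. Only the seed instance has
// CRON_HOST=yes in its environment, so user facing nodes exit immediately.
if (getenv('CRON_HOST') !== 'yes') {
    exit(0);
}

// on the seed, run the application's scheduled tasks, e.g. Moodle's CLI cron:
passthru('php /var/www/html/admin/cli/cron.php');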

With this setup, Elastic Beanstalk monitors server loading and automatically cycles out replacement servers when anything goes wrong, or adds and removes servers as needed to handle incoming load. There will always be a small period while new servers are launched where load may exceed capacity, but this can be minimised by setting sensible early trigger levels for scaling, and by sizing servers suitably for the typical load on the system. Running at least two web servers is ideal.

To add speed or stability to the system, other fun things to try are adding AWS’s Memcached or Redis services (ElastiCache) into your application to cache session data or pre-compiled code in order to speed up operations – this is highly recommended for Moodle / Totara setups. You can also look at tools like s3fs as alternatives to Amazon’s EFS. These can perform better, but come with additional risks around synchronisation settings. You can also investigate rsyncing files between the shared filesystem and local (on-server) drives, to maintain optimised end user performance whilst keeping the ability to update files across all servers relatively easily.
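
As an example of the caching side, this is roughly what pointing Moodle / Totara session handling at a Redis (ElastiCache) endpoint looks like in config.php – the hostname is a placeholder, and the exact option names can vary between versions, so check the docs for your release:

<?php
// excerpt from a Moodle / Totara config.php – session handling via Redis
$CFG->session_handler_class = '\core\session\redis';
$CFG->session_redis_host    = 'my-cache.abc123.apse2.cache.amazonaws.com';  // placeholder ElastiCache endpoint
$CFG->session_redis_port    = 6379;
$CFG->session_redis_prefix  = 'mdl_session_';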

That’s a brief run-down of some of the high availability systems the team at Webmad operate for clients that need to handle variable traffic loads without degrading the end user experience. If you’d like to get into more detail, feel free to contact us to discuss your requirements and how we can customise a system that will work for your needs.