Categories: Interaction, Security, Technology

Cookies – trick or treat?

One of many annoyances of the internet these days is the dreaded ‘Please accept our cookies’ popup you see on a great number of websites, warning you of the intention of the site you are visiting to give you things called cookies. They sound so sweet, digestible, and innocent. But how many of us actually know what they are, how they are used, and whether they are dangerous or not?

So – what are cookies and why are they on the internet?

A cookie, in the internet sense, is a wee fragment of data that a website can store in your web browser for a defined period of time. This can be until you close your browser, or it can be days or weeks. Once a cookie is stored in the end user’s browser, that fragment of information is sent back to the server with every new page request or interaction with that website’s server. Cookies are restricted to only send data back to the domain name that set them. A cookie is unique to each user – two cookies may store the same information, but because they are stored on the end user’s device, each is unique to that user.

Where they get powerful is that website developers can store data in a cookie that enables them to customise our browsing experience on their website. Typically this means that when a user has logged in to a website, a token is stored in a cookie for that user’s session, so that every subsequent request to the server can prove it comes from the logged-in user, and the server can customise its response according to your profile and stored settings. This is really useful.
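
As a rough sketch of that login flow (assuming a PHP backend – the cookie name, lifetime, and domain here are invented for the example, not from any particular system):

<?php
// After verifying the user's credentials, hand the browser an opaque session token.
// The token is also stored server-side against the user, so later requests can be matched up.
$token = bin2hex(random_bytes(32));

setcookie('session_token', $token, [
    'expires'  => time() + 60 * 60 * 24 * 7, // illustrative: lives for a week
    'path'     => '/',
    'domain'   => 'example.com', // only ever sent back to this domain
    'secure'   => true,          // only sent over HTTPS
    'httponly' => true,          // not readable from JavaScript
    'samesite' => 'Lax',
]);

Every request after this carries the cookie, so the server can match the token to the logged-in user.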

Where this can get risky, though, is when you visit websites that use advertising networks. Advertising networks can set cookies on your computer to track what websites you have visited and your preferences, so they can target you with ads for things they think you need. This is seen as predatory, and can give these networks a huge wealth of information about you and your online habits. The more websites an advertising network is used on, the more data they can collect.

It’s this predatory use of cookies that has given them their bad name. Cookies as an object are quite harmless – they do not contain code that gets executed or anything dangerous, but they can store information that can be used to identify individual users and ‘follow’ them around. To break up the amount of data that can be used to identify you, it is recommended to use a cookie blocker in your browser that can determine whether or not a cookie comes from an advertising network.

While cookies are generally safe to accept, websites in many geographic locations nowadays need to request the user’s permission before they can store cookies in their web browser. Lawmakers in these regions have made this mandatory for sites doing business there, so that their people can make informed decisions about what information can follow them around on the internet.

If you visit a website that you know you won’t be logging in to or signing up for, then there is no need to accept the cookies on that site. If you are keen to interact with the site and have a customised experience, then accepting cookies is quite fine. You can always clear cookies from your browser at any stage – the process varies depending on which web browser you are using, but you can view the contents of any of your cookies and delete whichever ones you prefer.

Categories: Hosting, Technology

What is a Content Delivery Network (CDN)?

This past week the buzzword floating about internet-related conversations has been the outage at the CDN provider Fastly that took out a huge chunk of the internet. A good number of websites went down worldwide, and high-traffic sites experienced either total outage or parts of their networks being unreachable. It felt like a digital apocalypse for many. For some of our clients there was glee as their competition was taken offline. In the end it lasted only an hour, late in the evening New Zealand time, but it still caused panic.

So how did an outage at a company no one in the general public has really heard of cause such a ruckus? To get to the bottom of that, we need a better understanding of how the internet functions, and of some of the tips and tricks that webmasters employ to get their content in front of their users as quickly as possible so as not to lose them.

When someone goes to a website, there is a flurry of communication between their device and various internet services before the web page is served. Here is a rough guide to what happens:

Once the user has told their web browser what website they want to view, requests are fired at Domain Name Service (DNS) servers to translate the address entered into an address that computers understand (an Internet Protocol (IP) address). That information is then used to talk to the appropriate server (or load balancer, if the website is big enough, which then directs traffic to an available web server) to return the page you have requested. That page may link to a number of images, fonts, and scripts that all need to be downloaded in order to display the website on your device.
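
To make the DNS step concrete, here’s a quick illustration using PHP’s built-in dns_get_record function – example.com is just a placeholder for whatever site you’re visiting:

<?php
// Look up the A records that map a hostname to IP addresses -
// the same translation step a browser performs before connecting.
$records = dns_get_record('example.com', DNS_A);

foreach ($records as $record) {
    echo "{$record['host']} resolves to {$record['ip']}\n";
}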

That’s a bit of the background behind how the internet works for websites. But where do CDNs fit into this mix?

Ever called someone overseas and noticed the delay between what you say and their response? This effect is called latency. It’s the delay between your initial request and you getting a response. Even on a global network of fibre connections, which carry data at a decent fraction of the speed of light, if I request a website on my device here in New Zealand and it is hosted in the UK, every request can take a quarter of a second or more just to travel from my device to the server and back (roughly 19,000 km each way at around 200,000 km/s in fibre works out close to 200ms for the round trip, before any routing or processing overhead on the web server slows things down further). If a web page has 30+ media assets, which is very common nowadays, the website will feel almost unusable. The further away a server is from its users, the slower it can respond to their requests.

This is where CDNs come in. A global Content Delivery Network is a network of computers located around the world, set up as caches for the websites you visit. Website owners point their domain names at the servers of the CDN instead of at the origin servers, and the CDN is configured to know how to get the requested content from the origin server where it is hosted. The first time you visit the website, the CDN server which is geographically closest to you fetches your content from the origin host. It also keeps a copy of what the origin served, so if anyone else needs that content, it can return it directly instead of routing the request to the other end of the globe. The end effect is that the website appears to be served from the location of the CDN server closest to you, so each request now takes 50ms instead of 500ms+. The more ‘edge’ locations the CDN has, the better the chance of there being a server close to you.
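
The logic at each edge node boils down to something like this toy sketch (in PHP for illustration – the cache path, origin URL, and five-minute expiry are all invented for the example):

<?php
// Serve a request from the local cache if we have a fresh copy;
// otherwise fetch it once from the origin and keep it for the next person.
function serve(string $path): string
{
    $cacheFile = '/var/cache/edge/' . md5($path); // illustrative cache location

    // Cache hit: answer locally, no trip to the other side of the globe.
    if (file_exists($cacheFile) && filemtime($cacheFile) > time() - 300) {
        return file_get_contents($cacheFile);
    }

    // Cache miss: one request goes to the origin, then everyone nearby
    // is served from this copy until it expires.
    $content = file_get_contents('https://origin.example.com' . $path);
    file_put_contents($cacheFile, $content);

    return $content;
}

echo serve('/index.html');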

The other advantage of CDNs is that you now have a pool of servers serving your website traffic, so if one edge location drops into an error state, other servers can take up the slack without sending a huge amount of extra traffic (and load) back to the origin server.

CDNs also work around a limitation in the way that browsers load media assets from web servers. Most browsers will only open a limited number of simultaneous connections to any one web server / domain – typically 4-6 without tweaking, 10 at most. Once those connections are busy, the remaining assets have to wait their turn, so you end up waiting for one asset to finish downloading before the next can start. With assets served from a fast, nearby CDN, far more of them can be fetched in quick succession, so page load speeds are vastly improved here too.

Given all of these advantages, it makes a lot of sense for websites serving a global audience to use a CDN to make their sites quicker for end users wherever they are in the world. There are a number of providers offering this service, some of which you may have heard of: Cloudflare, Akamai, and Amazon’s CloudFront. Fastly is another provider in this space, with a huge number of servers scattered around the globe and very impressive latency figures worldwide, which is how it has become popular with a number of larger websites.

Knowing what we now know about CDNs, it becomes easier to understand how half the world’s websites dropped out. The official line from Fastly is that a configuration error caused ALL of their CDN servers to refuse to serve any website content, and it took an hour to resolve. If it had been one or two servers, the CDN would have healed itself nicely and no one would have been the wiser – sites might have been a little slower for some locations, but generally it’d have been fine. But if you push out a global configuration that wipes out the function of all your servers, there is no saving that until you push out a revised configuration that undoes the breaking change. The more clients you have, the more websites are affected. From this outage it’s easy to see that Fastly has a large client base around the world – and no doubt many of those clients are now contemplating their options for reliable CDN providers.

If you need help getting your websites working at optimal speed in front of a global audience, using trusted CDN partners, get in touch with Webmad and we’ll help you plan and implement solutions for optimal performance.

Categories: Technology

What is a Progressive Web App?

For a long time, mobile apps have been the in thing. Businesses needed mobile apps to engage customers and to get their brand onto people’s phones. But mobile apps have long been expensive, and you need to develop an app for each of the various mobile environments – Apple’s iOS and Google’s Android.

The problem with a lot of these apps is that they typically don’t need to be traditional apps at all. The only reason to have a proprietary app developed for each mobile environment is to enable interaction with hardware on the device – things like Bluetooth, audio, or customised use of the device’s camera. Most apps that get built don’t need this, and this is where progressive web apps (PWAs) can offer a cost-effective solution.

Most of the functionality these apps need can easily be covered with a web page. Doing this gives universal compatibility across mobile devices, desktop computers – basically anything with a web browser. That means developing for one environment and knowing it will work everywhere. It takes much less time, and because it uses standard web technology, there is a much wider pool of developers who can assist.

The biggest hurdle to using web technology on mobile devices has always been that it doesn’t work when there is no connectivity to the internet. Thankfully, this is where progressive web apps come into their own. PWAs add a layer of functionality that allows offline caching of data, both through databases embedded in the web browsers themselves and through tools that detect whether there is connectivity to the source web servers, so the app knows when to fall back to the local (on-device) storage.

The other advantage of progressive web apps is that they are now accepted in both of the mobile environments’ application stores – standard web pages don’t get that luxury. Standard apps go through a long approval process for each and every update you release through the app stores, whereas PWAs can be updated on the fly whenever you need, so any security or bug fixes are on-device the next time the user’s device has internet connectivity. This is a major improvement, especially if you release to production with any issues – waiting a week or so to get an update approved can be fatal to your brand.

So – PWAs are cost-effective, have wide compatibility across devices and platforms, and are easier to maintain long term. If you don’t need any hardware integration beyond what a standard web browser can do, then they make a lot of sense. If you are in need of an application for mobile devices, get in contact and we can talk through the various options and what will suit your needs best.

Categories: Security

What is 2 Factor Authentication (2FA)?

It’s become increasingly common for websites these days to ask you to add two factor authentication to your login for extra security. This is a good thing… but why? And what is 2FA?

There are lots of different ways to authenticate yourself. These get lumped into 3 main groups, called factors:

  • Something you know (e.g. a password or phrase you can remember)
  • Something you have (e.g. a device you carry that can give a code to assist with authentication, or something like a credit card)
  • Something you are (e.g. a fingerprint, facial recognition, or an iris scan like in the movies)

So – knowing there are 3 possible factors that can be used in authentication, 2 factor authentication is simply authentication that uses methods from 2 of those groups. Generally the ‘something you are’ type of verification is tricky to implement – some phones and laptops have fingerprint verification, and some phones boast facial recognition as well… but in practice this is fairly hit and miss… you burn or cut your finger and you are locked out, or you wake up in the morning looking a bit rough, and you are locked out.

Typically, 2 factor authentication in the real world is done using a password or PIN (something you know) and something you have (either a mobile phone with an app on it, or something like a credit card). Your EFTPOS card has had 2FA since waaaay back; the internet is just catching up. It’s coming from a place where all you had to know was a password – a password for your email, a password for your banking, a password for your computer login… and they all must be unique and 8+ characters long with a capital and a number and a symbol and your first pet’s name and… well, the list goes on. All things from the ‘stuff you know’ pile.

So, to bring in the ‘something you have’ group, what most places now do is rely on your smartphone to provide that component – most people live with them on their hip, and they are easy to code for. Either send the user a text message (SMS), or have a mobile phone app generate a code that the web server can independently compute and check, proving the request comes from the person holding that particular device.
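
As a rough illustration of how those app-generated codes work, here is a minimal sketch of a time-based one-time password (TOTP, the sort of scheme most authenticator apps use) in PHP. In a real system you’d use a vetted library, and the shared secret would be random bytes provisioned via a QR code – the one here is a placeholder:

<?php
// Generate a 6-digit code from a shared secret for the current 30-second window.
function totp(string $secret, int $timestamp, int $period = 30, int $digits = 6): string
{
    // Counter = number of periods since the Unix epoch, as a 64-bit big-endian int.
    $counter = pack('J', intdiv($timestamp, $period));

    // HMAC-SHA1 of the counter, keyed with the shared secret (per RFC 6238).
    $hash = hash_hmac('sha1', $counter, $secret, true);

    // Dynamic truncation (RFC 4226): pick 4 bytes based on the last nibble.
    $offset = ord($hash[19]) & 0x0f;
    $code = (
        ((ord($hash[$offset]) & 0x7f) << 24) |
        (ord($hash[$offset + 1]) << 16) |
        (ord($hash[$offset + 2]) << 8) |
        ord($hash[$offset + 3])
    ) % (10 ** $digits);

    return str_pad((string)$code, $digits, '0', STR_PAD_LEFT);
}

// Both the server and the phone app hold the same secret, so both can compute
// the same code for the current time window - that's the 'something you have'.
$secret = 'placeholder-secret'; // illustrative; real secrets are random bytes
echo totp($secret, time());     // e.g. "492039"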

Why is this so much better than single factor authentication?
It is becoming increasingly easy to brute force passwords. Heck – some poorly written websites have even been known to store passwords in plain text, so they are human-readable if you get access to the storage that holds them. By adding 2FA, even if someone does manage to work out the password, they won’t have access to the device that completes the authentication, so whatever it is you are protecting is still safe, as long as it requires both the password and the second factor.

If you’ve got a website that you need to secure, we strongly recommend 2FA if possible. I know some people who can help make this happen.

Categories: Projects

So… It’s turning to crap, huh?

Likely you don’t have time to read this then… but in case this is pre-emptive and not reactionary – here are some tips for when you inevitably find yourself at the wrong end of the shovel, trying to work your way back to breathing space…

Firstly: Stay calm!
You’re a web developer. Yeah, people get frustrated, and there’ll be a bit of puffery, but think through what is the worst that could happen. 99% of the time it’ll just be some sites being unavailable for a bit, maybe some reputation damage, or, if there’s been a mistake with some calculations, some financial repercussions. Not ideal at all – totally – but getting stuck on this will not help get the problem solved.

Secondly: Start from the top.
In my spare time I enjoy being an audio tech around the place. Troubleshooting is very similar in that world – a mic or speaker or something isn’t working, and you work out how to fix it. The process is the same: start at the top of the signal chain (i.e. the microphone), and check you have signal at each point in the chain, right through to the speakers. In development land, the input into the system is the end user – what does their browser hit first? Throw in some logging at that point and make sure the browser gets there. For many PHP frameworks that means starting at the index.php file and then following through the series of includes, narrowing down until you find the culprit code / system / resource failure. Don’t be afraid to throw exit statements around – the system is already broken, right? If you can do this in a development environment, ideal – else do what you can.

Thirdly: Bring your toolbox.
A tradesperson is no good without their tools. Ours is a bit more virtual than most, but the theory is the same. Know how to get log information, and get familiar with Linux tools like:
grep -rnw ./ -e "text to find"                 # recursively find text
tail -n50 filename                             # show the last 50 lines of a file
tail -n1000 filename | grep "text to find"     # find text in the last 1000 lines of a file


For PHP – learn how to turn debugging on:
ini_set('display_errors', 1);  // print errors to the browser rather than hiding them
error_reporting(E_ALL);        // report every class of error, notices included

For production environments, investigate silently logging to a file – many frameworks support this, so it has minimal client impact. Otherwise, send error information via email when it happens, or use services like Rollbar or Sentry to track issues.
Don’t be afraid to echo and exit. Dump what data you do have to the browser so you know your script is on the right track. Break your code down into components in order to rule out chunks of code as the issue. Once you know what isn’t broken, you can quickly home in on what is.
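
Something as blunt as this can be enough of a checkpoint (the variable name here is just for illustration):

<?php
// Quick-and-dirty checkpoint while hunting a fault: log, dump state, and stop
// so nothing further down muddies the water. Dev / already-broken systems only.
error_log('checkpoint: reached order processing');
var_dump($order ?? 'no $order variable set'); // inspect whatever you have at this point
exit;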

Lastly: Communicate.
Tell people what is going on. Often talking through a problem is key to solving it – whether or not the other person understands what you are on about. Just by replaying it in your head, trying to package the problem up for others to understand, you will think through potential areas you have missed checking. Keep communication with the key stakeholders strong. If people understand what is going on and what you are doing to fix the problems, they are likely to be more lenient, and they are also likely to feel part of the team and somewhat in control, rather than on the outside questioning what on earth is going on. Open and honest is the best policy. Don’t hide things. If you’ve cocked up – own it. If a member of your team has dropped the ball, don’t throw them under the bus; the whole team wears the responsibility of getting things solved and communicating with the client / stakeholders.

We’re all human. Crap happens. It’s how you approach it that makes the difference.

Categories: Interaction, Projects

Architectural Lighting with website control

Yep – that’s right – controlling building lighting from a website. I mean, how hard could it be, right? Nowadays there are plenty of apps for your phone to control all manner of LED-based lights, but we were keen to take this concept a bit further. So we built https://webmad.co.nz/tower/

The concept:

Our office is in an old air traffic control tower. We thought it’d be pretty neat to light up the cabinet at the top (the bit with all the windows that the controllers would have sat in) for Christmas, and give people the opportunity to request whatever colour they want the lights to display, through a public web page. Bit of a marketing ploy, but also a fun wee project to flex some of the many skills within our team.

The gear:

  • SP108E LED Wifi Magic Controller (we are using 1 of these to control all the LEDs)
  • WS2815 DC12V RGB LED Strip Light (we are using 4 of these – each 5m strip has 60 LEDs per metre, and pulls a total of 90 watts)
  • DC12V LED Power Supply 10A Switch Mode Transformer
  • Wifi access point connected to a network with DHCP (most home wifi routers would do the job fine here)
  • A computer (a headless Raspberry Pi would be more than fine here)
  • A smartphone for initial configuration

The LEDs, power supplies, and controller were all sourced from AliExpress relatively cheaply. The rest we “had lying around”…

The hardware setup:

So – one power supply per 5m LED strip. The strips have connectors at each end for the data connection, plus tails (leads allowing power connection), and we connected all 4 LED strips into one long strip. We’ve been careful with the wiring so that no single power supply ends up trying to power all of the lights at once. Ways to achieve that are to switch everything on or off all together, or to keep the power separate for each LED strip so that each strip can only draw from its own power supply, with the only linking wires between strips being the data wires, not V+ or V-.

On one end of the LED strip, attach the SP108E controller. Once it is powered up, you’ll need to connect to it using the instructions supplied in the box and their smartphone app, which lets you set the wifi network the controller connects to. On your router you should be able to tell the DHCP server to assign a static IP address to the controller, so that you can consistently connect to the one IP address that controls the LEDs.

Once we can connect to the controller consistently, we can put https://github.com/Lehkeda/SP108E_controller onto the Raspberry Pi, set the IP address to point at the controller, and start playing. You will likely need to change the ‘LEDs per segment’ and ‘number of segments’ settings to ensure all the LEDs are getting signal and behaving as expected.

How we’ve tweaked it up from there:

We’ve set up a database table to store a colour change queue, and told the PHP script above to poll that table for changes. If a change is noticed, we fade between colours (the fade is a custom function we have written). We then have a public-facing website that allows people to load colours into the queue, and the building changes colour every minute to reflect the next queued colour. No changes in the queue? It’ll just show the last colour until something new comes along.
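
The polling loop boils down to something like this sketch – the table and column names are invented for illustration, and fade_to() is a hypothetical stand-in for our custom fade function wrapping the SP108E controller library:

<?php
// Run from the command line: poll the colour queue once a minute
// and play the oldest unplayed colour.
$db = new PDO('mysql:host=localhost;dbname=tower', 'user', 'pass');

while (true) {
    $row = $db->query(
        "SELECT id, hex_colour FROM colour_queue
         WHERE played_at IS NULL ORDER BY id LIMIT 1"
    )->fetch(PDO::FETCH_ASSOC);

    if ($row) {
        fade_to($row['hex_colour']); // hypothetical helper driving the LEDs
        $db->prepare("UPDATE colour_queue SET played_at = NOW() WHERE id = ?")
           ->execute([$row['id']]);
    }

    sleep(60); // the building changes colour at most once a minute
}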

Keen to enable this level of control in your building or office space? It’s quite neat allowing your people / clients to interact with your building, and it’s also a draw-card that can be promoted widely – people love getting a reaction from actions they have taken.

If you’d like to set up something similar, large project or small, get in contact – there are lots of really neat ways to make this happen; this is just one. We can assist with your next project.

Categories: Hosting, Security

Why do I need an SSL certificate on my website?

Here’s the thing… many websites don’t need one. Will the world break? Nope. Will you be putting your best face out to the world if you don’t have one? Well… not really. And this is the tricky bit.

Most browsers nowadays will mark your website as not secure if you don’t have an SSL certificate, and you will be penalised in search rankings by the big players like Google for not having one. Seems a bit unfair really… but let’s take a look at why we have SSL certificates, and then it might be easier to see why they are actually a good thing to have.

So – what on earth is this SSL thing anyways?

SSL stands for Secure Sockets Layer. It’s not a physical thing; it’s a protocol (the modern versions are technically called TLS, but everyone still says SSL). Don’t zone out – this bit’s important. SSL is a method of communicating securely from one device to another, typically from your computer / laptop / mobile phone / tablet / whatever, to the server which hosts your website.

Normal website traffic is sent in plain text. It’s marked up with HTML to make it look pretty when rendered, but anyone could read the content, and if you understand HTML even just a little, you can probably get the gist of what is happening on the page. If anyone were to get a copy of the communications between your device and the server (this can potentially happen at internet routers etc.), they could see what you are up to, and potentially take over your communications, impersonate you to the server, and do things you probably didn’t intend.

A huge majority of websites are the equivalent of an online brochure. Who would care if anyone saw the content of people’s interactions with your site? Well, you wouldn’t really, and it’s not compulsory for this type of website to have an SSL certificate. But where this falls over is if your website has a contact form, or asks for any sort of user input. If people could intercept that information, that’s not ideal for your clients, and likewise not ideal for you.

This is where SSL comes in. It’s a protocol that defines a method of secure communication between your device and the website server. By securing the communication, no one can listen in on what you send to the server, or what the server sends back. Woo!

Jolly good… So why do I need an SSL certificate? Can I put it on the wall? Frame it? Is there a ceremony?

Yeah, nah. What an SSL certificate does is prove the server is who it claims to be. The certificate ties the server’s domain name to an encryption key (a long string of numbers and letters that is mathematically tied to its owner), vouched for by a trusted authority. When your device sets up an SSL connection, it uses the certificate to verify the server and to agree on keys that encrypt everything that follows. If any part of the communication can’t be decrypted properly – let’s say part of it has been changed in transit – the client device can easily pick that up and fail the communication. And because the communication is encrypted, anyone watching the traffic can’t decode it: only the devices that set up the communication channel hold the keys.

An SSL certificate is locked to a particular domain name, so if someone were to copy your website, they could not use your SSL certificate because it wouldn’t match their domain. Some SSL certificates allow multiple domain names (known as SANs – Subject Alternative Names) to be serviced by the one certificate (let’s say you have a website with multiple domain names pointed at it, all served by the same server). You can also get what are known as wildcard SSL certs, which are valid for any subdomain of your primary domain name, e.g. shop.example.com and web.example.com.
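
If you’re curious what a certificate actually says about a site, you can pull one apart with PHP’s built-in OpenSSL functions – a quick sketch, with example.com standing in for any site:

<?php
// Connect to a site, capture its SSL certificate, and read out who
// it is issued to and when it expires.
$ctx = stream_context_create(['ssl' => ['capture_peer_cert' => true]]);
$client = stream_socket_client('ssl://example.com:443', $errno, $errstr, 10,
    STREAM_CLIENT_CONNECT, $ctx);

$params = stream_context_get_params($client);
$cert = openssl_x509_parse($params['options']['ssl']['peer_certificate']);

echo 'Issued to: ', $cert['subject']['CN'], "\n";
echo 'Expires:   ', date('Y-m-d', $cert['validTo_time_t']), "\n";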

You can also get stronger SSL certificates. Strength is measured by the number of bits in the key the certificate is built on. For the common RSA type, 2048-bit keys are the industry standard at the moment, with older 1024-bit keys now considered too weak. The more bits in your key, the harder it is for someone to decrypt anything signed with it.

The third parameter you deal with when purchasing your SSL certificate is verification that you are who you say you are. This can be done in 2 ways: domain verified or organisation verified.

  • Domain verified: This is the easiest form of certificate to get. All you need to do to prove ownership is either verify you have access to an email address linked to the ownership of the domain name you are trying to protect, or place a file at a particular location on the website hosting for that domain so that the issuing authority can visit it to prove it’s you. Some issuing authorities also allow DNS-based verification, where you alter a DNS record on your domain. This is by far the quickest option, and can be completed in minutes.
  • Organisation verified: This is harder and takes quite a bit longer. You have to verify the domain name as above, but you also need to verify that the company or organisation purchasing the certificate is a valid company or organisation, with a physical address and phone number verified by a 3rd party like the Yellow Pages etc. This process can take days or weeks.

Who gives these certificates out, and why can’t I just invent my own?

Well – you can generate your own certificates – these are called self-signed certificates. But because you make it yourself, no one trusts them, ‘cos you could say anything about yourself and no one else can verify your statement. I mean, I’m actually the world’s best chef… I could generate a certificate to tell you this. But if you asked my wife or kids…

Because of this, we need certification authorities who are globally trusted, and who can verify that anyone looking to get an SSL certificate is who they say they are, courtesy of the domain checks or organisation tests above. Examples are Sectigo and GeoTrust. Different providers offer different services and levels of insurance against your communications being decryptable, and these come at different costs.

What do they cost?

It depends. There are providers like Let’s Encrypt which provide free domain-verified SSL certificates. These are great for most of the brochure websites mentioned above, and give you enough security for web browsers to call your website secure and give your customers peace of mind. If you are offering e-commerce on your website, or any form of access to potentially sensitive data, then it is strongly recommended to purchase an SSL certificate from a provider that offers insurance, as these providers have high-trust relationships with web browsers, and give you support with installation and the ongoing security of your setup. Purchased SSL certificates typically start from around NZD $10 per year + installation, through to multiple thousands of dollars per year (bank level) – it really depends on what you need the certificate to do.

Do I need it?

Nowadays, yip, you really do. You need some form of SSL certificate, be it free or paid, just so your website looks safe out there on the internet. This is even more critical if you want to attract visitors via search engines (you are penalised in rankings if you don’t have one) or you offer products for purchase online (e-commerce). Because you will be accepting user credentials or contact details, and in some cases payment details, it is imperative for user security that all communications are secured.

There are also newer web technologies that will generally only work over secure connections – things like WebSockets.

If you need assistance with getting your website secured, or have any issues with SSL certificates, contact the team at Webmad and they can get you all set up.

Categories: Hosting

What happens when a domain name expires?

[ Disclaimer: this is primarily written for the New Zealand context, so anything ending in .nz, but some parts are generally applicable ]

Oh dear. Your invoice for domain renewal has landed at the wrong email address, or your existing domain name registrar has gone quiet. This is definitely less than ideal, and can leave you in the position of having a domain name that has expired. Let’s explore what that means, and what your options are.

So. Domain names expire. ‘Owning’ a domain name is really more like a subscription: you subscribe to the domain name, you pay for it each year, and you get full rights to it. When the subscription ends (the domain expires), the domain moves into a process of being released.

The domain name is placed into ‘Pending Release’ status for a period of 90 days. In this state the domain name is inactive (mail and websites won’t work), but it is still registered to you. You can renew at any stage during this 90-day period (some registrars charge more to renew your domain the closer you get to the 90-day mark), and doing so reactivates the domain name. You can also transfer your domain name to another registrar during this period if you want – though only some registrars allow this incoming transfer, or allow you to get the domain ownership code while the domain is expired, so it can pay to check first. If the registrant of the domain (you) fails to renew by the 90th day, the domain name is released by the .nz Registry, available for registration on a first in, first served basis.

Ideally you’ll catch your name back within that 90-day period. As the domain gets closer to the 90-day mark, it’ll get listed on services like https://www.expireddomains.co.nz/ so people can bid on it – the highest bidder wins the domain, provided that service catches the domain when it becomes available. This is where it gets interesting.

On the day that the domain name is set to ‘drop’ and becomes available for anyone to register, there is a set sequence that isn’t very well documented out there, so here is the process:

The domain gets queued up by the Domain Name Commission for the next domain release window (this is documented at https://docs.internetnz.nz/faq/general/ ). The release maintenance window runs from 00:29:00 to 00:34:00, and all domain names should be released during this window. So, at some point in those five minutes, your domain name is going to become available. You are allowed to send up to 15 requests per second to try to catch the domain within this window, to be the first one in when it becomes available. It’s really a gamble as to whether you will land it or not.
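
In code, ‘drop catching’ amounts to a paced polling loop, something like this sketch – attempt_registration() is a hypothetical stand-in for whatever your registrar’s API actually offers:

<?php
// Poll the release window (politely) at the permitted rate of roughly
// 15 requests per second until the domain lands or the window closes.
$windowEnd = strtotime('today 00:34:00');

while (time() < $windowEnd) {
    if (attempt_registration('example.co.nz')) { // hypothetical registrar API call
        echo "Caught it!\n";
        break;
    }
    usleep(67000); // ~15 attempts per second, the stated cap
}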

The downside of this process is that once the domain has gone through the process of being released to the public, you really have no say in getting it back. You’ve had your chances. That’s it. It’s painful, but unfortunately the domain is completely out of your hands.

Domains can be confusing at the best of times. If you are having issues, or need a hand, get in contact and we’ll do our best to get you the best outcome.

Categories: Security

How to prevent email spam from my website

“Get a website” they said. “It’ll get you heaps of new clients” they said. You’ve invested in a website that acts as an online brochure, with the aim of bringing in clients and potential sales. It’s got a contact form, and maybe you have a blog on there to try to show you are still relevant… Isn’t it disheartening when what feels like the only contact you get through the site is spam? It plagues your inbox, it gets filtered to your spam folder, and then you never know what is legitimate or not… Aaaargh!

We hear it a lot: “I’ve started getting a lot of spam from my website…”. First we’ll go through how on earth all the spam is getting there in the first place, and then we’ll go through a list of preventive tools you can use to avoid getting bogged down in the ‘noise’, allowing you to focus on your real clients – ideally without forcing them to jump through hoops to prove they are legitimate.

So… Where is all this spam coming from?

Nowadays, most spam generated through websites comes from automated processes, often referred to as bots. Basically, some clown somewhere decides it’d be great to get their message in front of the website owner – or, in the case of blogs, even in front of your target audience by getting their comments published on your website. Most of the time you wouldn’t be daft enough to publish their comments, but if someone does, and their message / website link gets hosted on your site, then your SEO helps promote their SEO and they win the battle of Google-sberg. Not ideal for a clean internet. But that little chunk of cheap bot code can be run against hundreds of websites and keep on trying at no further cost to the people who developed it, with potential for payoff, so the spam keeps rolling in. Small tweaks to the bot code get around little changes made to try to prevent its effect. So, we’ve got to get smart.

Basic workarounds:

Your standard bot simply reads the code used to display a form on your page. It plucks out all the input fields, populates them with some form of content, and fires them back at your website, which then emails the submitted data to you / someone. One of the simplest methods of detecting bogus submissions is to add an extra field to your forms that is hidden from normal users (i.e. using the CSS property “display:none” or similar, ideally applied via a class name so that it’s harder for the bot to recognise it as a hidden field). If you detect content submitted in the hidden field – content that no normal user would have been able to fill in – then you can pretty reliably say the submission is bogus. This type of spam rejection is sometimes called a ‘honeypot’: the bot sees the lure of another input to fill in, gets its hand in the jar, and is consequently found with honey stuck to it. Poor thing.
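
Here’s a minimal sketch of the idea in PHP – the field name and class here are invented for the example (in practice, pick names that don’t scream ‘honeypot’):

<?php
// Reject any submission where the hidden field came back filled in.
if ($_SERVER['REQUEST_METHOD'] === 'POST') {
    if (!empty($_POST['website_url'])) { // a real visitor never sees this field
        http_response_code(400);         // quietly drop the bot's submission
        exit;
    }
    // ... process the legitimate submission here ...
}
?>
<style>.form-extra { display: none; }</style>
<form method="post">
  <input type="text" name="name" placeholder="Your name">
  <!-- the honeypot, hidden via the class rather than an obvious inline style -->
  <div class="form-extra">
    <input type="text" name="website_url" tabindex="-1" autocomplete="off">
  </div>
  <button type="submit">Send</button>
</form>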

Many form plugins for popular web systems have honeypot-style traps built in, ready to be enabled on forms you create with their tools – plugins like Gravity Forms for WordPress. When evaluating form plugins, I’d recommend treating this as a quick win that helps sway selection of the best fit.

Captcha, Recaptcha, and annoying your users.

Sounds sinister, eh? Don’t Captcha me! But what is a Captcha? You’ve likely seen them. It’s those funny wee ‘Type the text you see in the image’ questions you get on some forms, where half of them aren’t even readable, and you get that little bit frustrated ‘cos it’s effort. It’s not even for your benefit!

So Captcha is the term for those image recognition questions. Why do we have them? Well – because they are hard. Not just for humans – they are really hard for computers to figure out. How do I tell a line or shape from a letter of the alphabet? Humans are great at pattern recognition, especially when trained to do it since around the age of 5. Computers? The harder the image (i.e. warped text, lots of foreign objects, characters without solid borders, etc.), the less likely the computer / bot will be able to resolve it to a satisfactory, correct answer. This method works well at preventing spam… but also at putting off legitimate clients, unless they have good enough reason to contact you to push past the hurdles you put in front of them.

So then Google put some weight behind ReCaptcha – a similar concept, but with some extra smarts behind it. Instead of just throwing an image onto the page, it uses some code that is only rendered in the web browser of your visitor, adds the verification image from there, and outsources validation of the answer to the ReCaptcha service. Pretty cool stuff. Still a pain for your users to fill in, but doing it this way gives the same powerful tools to many more forms systems out there on the web, in a consistent way, and has good rejection rates.

The latest version of ReCaptcha doesn’t show images any more – either a wee tickbox to show you are human, or the option to show nothing at all and just rely on neat detection algorithms. Many websites rely on this method – it’s not perfect, but it does a pretty good job against most incoming spam.
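
Server-side, verifying a ReCaptcha response comes down to one call to Google’s siteverify endpoint. A rough sketch in PHP – YOUR_SECRET_KEY is a placeholder for the key you get when registering your site:

<?php
// The ReCaptcha widget adds a g-recaptcha-response field to the form;
// we pass it to Google's verification endpoint and check the verdict.
$context = stream_context_create(['http' => [
    'method'  => 'POST',
    'header'  => 'Content-Type: application/x-www-form-urlencoded',
    'content' => http_build_query([
        'secret'   => 'YOUR_SECRET_KEY', // placeholder
        'response' => $_POST['g-recaptcha-response'] ?? '',
        'remoteip' => $_SERVER['REMOTE_ADDR'],
    ]),
]]);
$result = json_decode(
    file_get_contents('https://www.google.com/recaptcha/api/siteverify', false, $context),
    true
);

if (empty($result['success'])) {
    exit('Captcha failed - treating this submission as a bot.');
}
// (ReCaptcha v3 also returns a score from 0.0 (bot) to 1.0 (human) to threshold on.)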

Are there ways to weed out spam without relying on user entry / client-side tricks?

How good of you to ask. Why, yes. Yes, there are. There are a number of services out there that you can forward the content of your submitted data to; they run filters on it and can detect if the content is obvious spam (anyone wanna buy some viagra or cialis?). In the WordPress world the most obvious one is Akismet. The great thing about these tools is that they can be run retroactively over previous comments in your system to weed out spam from them as well. Very helpful. Another we have had great success with, which integrates with a variety of web systems, is CleanTalk.

These third party filtering systems use learning filters to target the ‘in season’ spam content trends and block them, so you don’t need to stay on top of them yourself. They aren’t perfect – you may get some false positives (legitimate messages flagged as spam on content analysis alone) – but they typically give good interfaces for whitelisting content or users, so the systems can learn from their mistakes.

So… What should we do?

The best approach to most problems is multi-faceted. The options presented above all attack spam submissions in different ways: traps, challenges, and filters. We have found our most reliable setups have been mixtures of each, depending on the context of what we are looking to protect. To prevent spam in blog comments and contact forms / calls to action, a honeypot to catch most of the bots, plus CleanTalk to catch the ones that get through, is a good fit. For user registration forms or login protection, ReCaptcha works well – your client already knows they have work to do to get at the goodies in store, so they’ll put in the effort to get past your hurdles.

Find what won’t annoy your users, and use that. There are plenty of options out there. Still stuck, or not sure how to implement your changes? I know some people who could help.

Categories: Hosting

Re-streaming video from webcams to websites

What’s the problem?

One of the powerful things you can do with the internet nowadays is access web cameras and video sources from around the globe. For the last 5+ years, the team here at Webmad has been hosting web camera re-streaming services for https://taylorssurf.co.nz. The site runs a few IP cameras based at a local surf and recreation beach here in Christchurch, New Zealand. Since starting with this site, we’ve worked through a number of different methods of getting the video from the various cameras, for various clients as well.

[Image: Late evening view from one of the cameras streaming from the top of our office building]

So the problem we are trying to solve is: how do we get the video from the cameras out to viewers on the internet so that hundreds of people can view the streams at once? Typically a camera has a limit of around 20 connected users at a time if you access it directly, and if your camera is on a fairly limited-bandwidth internet connection (at Taylors Mistake we can only get VDSL speeds at best), then multiple people trying to access the cameras at once will kill the streams pretty quickly. The other issue is that the default streams from the cameras we’ve got there are not in overly web-friendly formats (RTSP etc.), meaning you’d have to use Flash-based video players, which pretty much all web browsers reject these days.

So… What can we do?

In New Zealand, ISPs don’t charge for internet traffic between two endpoints within their own networks. This is fantastic, as it means there is no bandwidth charge between the cameras at remote locations and the re-streaming server we host locally, provided we use the same internet provider. This allows us to re-stream the video cost-effectively: there is only one connection to each camera pulling in video, and the feeds can then be re-broadcast on a high-capacity internet connection, allowing thousands of end users to connect and view the video.

Solution 1: The mjpeg streamer

On cameras that only output an MJPEG stream, we developed the mjpeg-streamer. Basically, it connects to the camera source and feeds the stream into a memory buffer. Any subsequent requests to the script, instead of fetching the feed from the camera, connect to that same memory buffer and return the camera feed to the end user as an MJPEG stream. Using PHP tools like ImageMagick, you can add image overlays onto the stream as needed. This system works really well, so long as you have different memory locations for each camera you are looking to re-stream, and it requires little in the way of server-side resources, so it can easily be used on shared hosting. The downsides are that there is no easy way to alter resolution, and MJPEG streams have questionable compatibility with most modern browsers – they have been known to crash browsers that can’t clear older frames of the video from memory.
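
The serving side of that idea looks roughly like this sketch – here a file in /dev/shm stands in for the shared memory buffer, and the path and frame rate are illustrative:

<?php
// Serve an MJPEG stream from a buffer that a separate ingest process keeps
// filled with the latest JPEG frame, so clients never touch the camera itself.
set_time_limit(0);
header('Content-Type: multipart/x-mixed-replace; boundary=frame');

while (!connection_aborted()) {
    $jpeg = file_get_contents('/dev/shm/camera1-latest.jpg'); // newest frame

    echo "--frame\r\n";
    echo "Content-Type: image/jpeg\r\n";
    echo 'Content-Length: ' . strlen($jpeg) . "\r\n\r\n";
    echo $jpeg, "\r\n";
    flush();

    usleep(100000); // ~10 frames per second
}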

Solution 2: The RTSP / RTMP re-streamer

A camera upgrade eliminated our ability to use MJPEG streaming, so we were forced to update our streaming strategy. The best tools for the job came in the form of the open source Red5 Flash streaming server and the open source FFmpeg application. The aim here is to use FFmpeg to pull the video from the camera and feed it to the Red5 streaming server. Clients then use a Flash-based player to connect to Red5 and play the video stream. This works well, and any overlays can be injected during the FFmpeg-based ingest process. The code is at https://github.com/stephen-webmad/rtsp-restream

Where this falls over is that modern browsers no longer support Flash-based players. So, we had to move to something else.

Solution 3: HTTP live streaming

Remember how we were using FFmpeg in the solution above? Well – it turns out there is another format it can output: HLS (HTTP Live Streaming). What is HLS? It’s a sequence of bite-sized chunks of the video stream, all tied together by an index file. Where HLS comes out tops for live streaming is that it enables you to pause and rewind the live stream, going back as many chunks as are stored in the index, which can be real handy. The index file can be as big as you want. The player just polls the index file (checks in on it every few seconds) to see if there are any new video chunks to download, and grabs them if there are. You can see this in action at https://canview.nz

HLS is super handy for web browsers and mobile, as it’s easily playable and you only need to fetch small chunks of video, meaning page load times as measured by Google etc. are typically much better. The downside of HLS streaming is that your live stream will be delayed a little, as it needs to build up the video chunks for the players to download, but for most purposes that is an acceptable compromise. For most of our video streams this is around the 30-second mark, but it can be shortened by tweaking the settings.
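
The ingest side, stripped back, looks something like this – the camera URL and paths are placeholders, and the FFmpeg options shown are common HLS settings rather than our exact recipe (hls_time is the main knob for that delay trade-off):

<?php
// Pull RTSP from the camera and emit HLS: small .ts chunks plus an index
// (.m3u8) file for players to poll. Run this under a supervisor that restarts it.
$cmd = 'ffmpeg -rtsp_transport tcp -i rtsp://camera.example.local/stream'
     . ' -c:v libx264 -preset veryfast'
     . ' -f hls -hls_time 4 -hls_list_size 8 -hls_flags delete_segments'
     . ' /var/www/html/stream/index.m3u8';

passthru($cmd);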

By using the wonderful FFmpeg software, which runs nicely on Linux servers, you gain the ability to overlay imagery, alter resolutions and framerates, handle almost any form of IP camera, output multiple resolutions, and grab snapshot images at whatever interval you wish, if you are keen to serve a static image fallback or splash screen to your viewers as well. HLS is also relatively easy to embed into an HTML5 <video> element (there are lots of JavaScript libraries to assist with this task), allowing things like fullscreen and picture-in-picture.

We’ve not yet created a git repository outlining how we operate this form of live streaming – that is likely to come in the future. For now, if you are looking to live stream a web camera from anywhere in New Zealand (or the world, if you have friendly international traffic allowances), do please contact us and we can help make that happen – or, if you’ve got the resources, we can assist with getting you set up to run it yourself.