To some, server log analysis may seem like an archaic SEO practice, more at home in a development studio than as part of a digital marketing campaign. However, if you are not analysing server logs for your clients or brand then you are missing a pretty significant trick.
As the more technically proficient of you at BrightonSEO will already know, analysing the raw logs from your server is the most direct way of finding out exactly how your website is being crawled. Server log files record every request your server actually received, which makes them the most trustworthy source you have, and they give you a wealth of information about all search engine activity, not just Google's.
There is a plethora of other search engine crawlers, web scrapers, spiders and bots that attempt to crawl your website every day and these demand in-depth analysis by you and your SEO team.
Admittedly, server logs can be somewhat intimidating to those who are new to the world of technical SEO or who simply aren’t comfortable sifting through masses of data in endless text files. Thankfully, because server log analysis has become so important, there is now a wealth of affordable tools available. These surface the meaningful information in your logs so that you can turn it into actionable technical SEO insight.
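For anyone curious about what those tools are doing under the bonnet, here is a minimal Python sketch of parsing a single log line. It assumes the widely used Apache/Nginx "combined" log format; the sample line and the `parse_line` helper are purely illustrative, and your own server may be configured to log differently.

```python
import re

# Assumed format: Apache/Nginx "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

# Illustrative sample line, not taken from a real log.
sample = (
    '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] '
    '"GET /products/widget HTTP/1.1" 200 2326 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(entry["path"], entry["status"], entry["agent"])
```

Every field is returned as a string, so the status code would need converting before any numeric comparison.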
Indeed, if you have international clients then Google may not be the only search engine you need to consider when it comes to your web traffic – your server logs will tell you everything you need to know about which spiders are gaining access to your server and exactly what they are finding.
It is important to point out that server log analysis will not give you all the information you will find in suites such as Google Search Console or Bing Webmaster Tools. However, when it comes to crawlability the information is far more comprehensive and more up to date.
So, what should you be looking for when analysing your all-important server log files?
The most common mistake I see when individuals start doing their server log analysis is obsessing purely over crawl errors and the implementation of hundreds or even thousands of redirects. Fixing crawl errors is a perfectly valid practice, and one that intelligent server log analysis can aid significantly, but it only scratches the surface of what you can achieve. The raw log files give you so much more data, so don’t just leave it there!
Indeed, it may be far more prudent to look at how you can stop these crawl errors from occurring in the first place, negating the need for mass 301 redirects.
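One way of getting at the cause of crawl errors, rather than just the symptoms, is to group failing requests by the page that referred them. The sketch below is a rough illustration that assumes the combined log format; the sample lines and the `error_sources` function are hypothetical.

```python
import re
from collections import Counter

# Assumed format: combined log format (request, status, size, referrer).
LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)

def error_sources(lines):
    """Count (path, referrer) pairs for requests that returned a 4xx or 5xx."""
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if m and m.group("status")[0] in "45":
            counts[(m.group("path"), m.group("referrer"))] += 1
    return counts

# Illustrative sample lines, not taken from a real log.
sample = [
    '203.0.113.5 - - [10/Oct/2016:13:55:36 +0000] "GET /old-page HTTP/1.1" 404 512 "https://example.com/blog/post" "Mozilla/5.0"',
    '66.249.66.1 - - [10/Oct/2016:13:56:01 +0000] "GET /old-page HTTP/1.1" 404 512 "https://example.com/blog/post" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2016:13:57:12 +0000] "GET /contact HTTP/1.1" 200 1024 "-" "Googlebot/2.1"',
    '198.51.100.7 - - [10/Oct/2016:13:58:40 +0000] "GET /checkout HTTP/1.1" 500 0 "-" "Mozilla/5.0"',
]
errors = error_sources(sample)
for (path, referrer), hits in errors.most_common():
    print(hits, path, "linked from", referrer)
```

If one internal page accounts for most of the hits on a broken URL, fixing the link on that page may be a better first step than adding another redirect.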
Ever the scourge of the SEO, spambots are a pain. Not only do they skew your referrals and analytics, they can also take up valuable server resources and are, basically, just downright annoying. Careful server log analysis can help you identify the culprits and block them using your website’s .htaccess file or equivalent.
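As a starting point for spotting the culprits, you could count requests per IP address and user agent and list any heavy hitters that do not identify themselves as a crawler you recognise. The sketch below assumes the combined log format; the threshold, the `KNOWN_CRAWLERS` list and the "SpamCrawler" agent are made up for illustration, and anything it flags is only a candidate for manual review, not proof of spam.

```python
import re
from collections import Counter

# Assumed format: the line starts with the client IP and ends with the quoted user agent.
LINE = re.compile(r'^(?P<ip>\S+) .*"(?P<agent>[^"]*)"\s*$')
# Hypothetical allowlist of crawler tokens you care about.
KNOWN_CRAWLERS = ("googlebot", "bingbot", "yandexbot", "baiduspider")

def suspicious_clients(lines, threshold=3):
    """List (ip, agent, hits) for busy clients that are not on the allowlist."""
    hits = Counter()
    for line in lines:
        m = LINE.match(line)
        if m:
            hits[(m.group("ip"), m.group("agent"))] += 1
    return [
        (ip, agent, count)
        for (ip, agent), count in hits.most_common()
        if count >= threshold
        and not any(bot in agent.lower() for bot in KNOWN_CRAWLERS)
    ]

def make_line(ip, agent):
    """Build an illustrative log line for the demo below."""
    return f'{ip} - - [10/Oct/2016:13:55:36 +0000] "GET / HTTP/1.1" 200 512 "-" "{agent}"'

sample = (
    [make_line("198.51.100.7", "SpamCrawler/1.0")] * 5
    + [make_line("66.249.66.1", "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)")] * 4
    + [make_line("203.0.113.5", "Mozilla/5.0")]
)
suspects = suspicious_clients(sample)
for ip, agent, count in suspects:
    print(ip, agent, count)
```

Bear in mind that user agent strings can be faked, so a client claiming to be a well-known crawler is not necessarily the real thing.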
By taking a good look at the user agents gaining access to your server, you can ensure that the search engines you are most concerned with are getting the right access and, if they aren’t, you can investigate possible reasons why. This is also another good opportunity to ensure that these various user agents are accessing the areas of the site that you want them to.
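To see which areas of the site each crawler is reaching, you could tally requests by user agent and top-level directory. This is a rough sketch assuming the combined log format; the `CRAWLERS` list, the `section_of` helper and the sample lines are illustrative choices rather than a fixed recipe.

```python
import re
from collections import Counter, defaultdict

# Assumed format: combined log format.
LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Tokens to look for in the user agent string; extend for your own markets.
CRAWLERS = ("Googlebot", "bingbot", "YandexBot", "Baiduspider")

def section_of(path):
    """Return the first path segment, e.g. '/blog' for '/blog/post-1'."""
    first = path.split("?")[0].strip("/").split("/")[0]
    return "/" + first

def sections_by_crawler(lines):
    """Map each crawler to a Counter of the site sections it requested."""
    result = defaultdict(Counter)
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        agent = m.group("agent").lower()
        for crawler in CRAWLERS:
            if crawler.lower() in agent:
                result[crawler][section_of(m.group("path"))] += 1
                break
    return result

# Illustrative sample lines, not taken from a real log.
sample = [
    '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /blog/post-1 HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '66.249.66.1 - - [10/Oct/2016:13:55:40 +0000] "GET /blog/post-2 HTTP/1.1" 200 2100 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
    '157.55.39.1 - - [10/Oct/2016:13:56:02 +0000] "GET /products/widget?ref=1 HTTP/1.1" 200 1800 "-" "Mozilla/5.0 (compatible; bingbot/2.0)"',
    '66.249.66.1 - - [10/Oct/2016:13:56:30 +0000] "GET / HTTP/1.1" 200 900 "-" "Mozilla/5.0 (compatible; Googlebot/2.1)"',
]
sections = sections_by_crawler(sample)
for crawler, counts in sections.items():
    print(crawler, dict(counts))
```

A crawler that never appears against a section you care about, or that spends most of its requests in a section you do not, is worth investigating.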
What are your top crawled pages? Are they the URLs that are the most important to your users? If not, then there could be some serious technical issues that need resolving.
You could then look at the differences between these most-crawled pages, their HTML attributes and their scripts to see what the trends are. This can also be a good way of determining whether your site architecture is as strong as it should be.
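A simple way to test this is to compare a crawler's most-requested URLs with a list of the pages you consider most important. The sketch below assumes the combined log format; the `priority` list, the function names and the sample data are hypothetical.

```python
import re
from collections import Counter

# Assumed format: combined log format.
LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def top_crawled(lines, crawler="Googlebot", n=10):
    """Return the n URLs most often requested by the given crawler."""
    hits = Counter()
    for line in lines:
        m = LINE.search(line)
        if m and crawler.lower() in m.group("agent").lower():
            hits[m.group("path")] += 1
    return hits.most_common(n)

def missing_priorities(lines, priority_urls, crawler="Googlebot", n=10):
    """List priority URLs that are absent from the crawler's top n."""
    top = {path for path, _ in top_crawled(lines, crawler, n)}
    return [url for url in priority_urls if url not in top]

def make_line(path, agent):
    """Build an illustrative log line for the demo below."""
    return f'66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET {path} HTTP/1.1" 200 512 "-" "{agent}"'

sample = (
    [make_line("/tag/misc", "Googlebot/2.1")] * 3
    + [make_line("/search?q=a", "Googlebot/2.1")] * 2
    + [make_line("/products", "Googlebot/2.1")]
    + [make_line("/about", "bingbot/2.0")]
)
priority = ["/products", "/pricing"]  # hypothetical list of key pages
top = top_crawled(sample, n=2)
missing = missing_priorities(sample, priority, n=2)
print(top)
print(missing)
```

In this made-up data the crawler spends most of its time on tag and search pages while the key pages miss the top of the list, which is the kind of pattern that might point to an architecture problem.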
The information you retrieve from your server logs can often be a lot more useful than the delayed and limited information you get from Google Search Console. Take a look at how often each bot crawls your site over time, as the trends you see here will tell you a lot about how well your website has been optimised for the various search engines.
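Crawl rate can be approximated by counting each bot's requests per day. The sketch below assumes the combined log format with English month abbreviations in the timestamp; the `CRAWLERS` list and the sample lines are illustrative.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumed format: combined log format with a [dd/Mon/yyyy:hh:mm:ss zone] timestamp.
LINE = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
CRAWLERS = ("Googlebot", "bingbot")

def crawl_rate(lines):
    """Map each crawler to a Counter of requests per calendar day."""
    rate = defaultdict(Counter)
    for line in lines:
        m = LINE.search(line)
        if not m:
            continue
        day = datetime.strptime(m.group("day"), "%d/%b/%Y").date()
        agent = m.group("agent").lower()
        for crawler in CRAWLERS:
            if crawler.lower() in agent:
                rate[crawler][day] += 1
                break
    return rate

# Illustrative sample lines, not taken from a real log.
sample = [
    '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2016:18:02:11 +0000] "GET /b HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [11/Oct/2016:02:14:50 +0000] "GET /a HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '157.55.39.1 - - [10/Oct/2016:09:30:00 +0000] "GET /a HTTP/1.1" 200 512 "-" "bingbot/2.0"',
]
rate = crawl_rate(sample)
for crawler, days in rate.items():
    for day in sorted(days):
        print(crawler, day.isoformat(), days[day])
```

Plotting these daily counts over a few weeks makes sudden drops or spikes in crawl activity much easier to spot.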
Despite the much-debated announcement by Google that 301 redirects do pass full link equity, you can still do some very useful analysis of redirects by looking at your server logs. Whether or not John Mueller’s statement is true, having endless redirects is not an ideal use of crawl budget, so they should be trimmed back as much as possible.
You can also see what temporary (302 & 307) redirects you have in place and decide, alongside your technical team, if these are absolutely necessary and if they should even be crawled in the first place.
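To get a list to take to your technical team, you could pull out every redirect a crawler has requested and separate the temporary ones from the permanent ones. This is a minimal sketch assuming the combined log format; the `crawled_redirects` function and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumed format: combined log format.
LINE = re.compile(
    r'"[A-Z]+ (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
REDIRECTS = {"301": "permanent", "302": "temporary", "307": "temporary"}

def crawled_redirects(lines, crawler="Googlebot"):
    """Count (path, status) pairs for redirects requested by the given crawler."""
    counts = Counter()
    for line in lines:
        m = LINE.search(line)
        if (
            m
            and m.group("status") in REDIRECTS
            and crawler.lower() in m.group("agent").lower()
        ):
            counts[(m.group("path"), m.group("status"))] += 1
    return counts

# Illustrative sample lines, not taken from a real log.
sample = [
    '66.249.66.1 - - [10/Oct/2016:13:55:36 +0000] "GET /old HTTP/1.1" 301 0 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2016:13:56:36 +0000] "GET /old HTTP/1.1" 301 0 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2016:13:57:36 +0000] "GET /sale HTTP/1.1" 302 0 "-" "Googlebot/2.1"',
    '66.249.66.1 - - [10/Oct/2016:13:58:36 +0000] "GET /home HTTP/1.1" 200 900 "-" "Googlebot/2.1"',
    '203.0.113.5 - - [10/Oct/2016:13:59:36 +0000] "GET /login HTTP/1.1" 307 0 "-" "Mozilla/5.0"',
]
redirects = crawled_redirects(sample)
temporary = sorted(
    path for (path, status) in redirects if REDIRECTS[status] == "temporary"
)
print(redirects.most_common())
print(temporary)
```

The redirects with the highest hit counts are the ones costing the most crawl activity, so they are a sensible place to start the conversation.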
This is just a snapshot of what is possible through intelligent server log analysis, but it hopefully gives you a good indication of the power of these files and how you can use them to boost search performance.