An outstanding bit of free software essential for every webmaster wanting to understand how their webpages appear to search engines and what issues, if any, exist.
How does your website appear to search engines? Have you crafted proper titles and descriptions? Do you need a bird’s-eye view of webpage information?
Screaming Frog SEO Spider is a website crawler tool used for analyzing on-page SEO elements. It helps users identify issues such as broken links, duplicate content, missing meta tags, and more, allowing for better optimisation and improved search engine rankings.
Enter any URL and the software fetches all onsite elements, presenting the information in rows and columns, like a spreadsheet. There are loads of uses for the tool even under the free plan.
High-Level Overview of Webpages
Sometimes, it’s handy to crawl a competitor’s website to see what they’re doing, or if they’ve missed something from their website. It’s worth doing if you suspect they have an advantage over you, perhaps to do with the way they write their SEO title tags, for example.
It’s not just for public websites either. Do you build your websites offline in a localhost for development and testing?
I use WAMP to develop sites, and I run Screaming Frog over the local copy to comb it for errors such as dead links producing 404 pages. It also shows missing meta information and reminds me that I might need to noindex certain pages before launching the website.
Introduction Screaming Frog SEO Spider
On Page Criteria
Enter the address of the website you’d like to crawl and click Start.
Once you’ve finished crawling a website you can scroll down to see how many pages have been included in the analysis.
Scroll to the right and you’ll see various columns about the website page address, the type of content it is, status code, page title, meta description, headings, meta robots, canonical links, size of page, word count, internal links and external links.
There’s an official user guide on the Screaming Frog site and I suggest you refer to it if you want the latest documentation from the horse’s mouth.
Resource Info Summary
Each of the rows represents a “resource”. Resources include, among other things, webpages and images.
If you click a row, the lower pane of the Screaming Frog interface displays information about that resource.
The same information is located in the horizontal columns at the top. I’ve broken down each column one by one, below…
Address Column
The address column is the URI, which stands for uniform resource identifier. This is computing lingo for the address of something on your server, whether it be text, image, video, audio or a script.
The screenshot below shows rows containing each URI:
The most common type of URI, also known as a URL, is an HTML webpage. Screaming Frog comes equipped with another column to the right of the Address column showing what type of URI we are looking at.
Content Column
The Content column sits to the right hand side of the Address column. Each row indicates what type of content is on each URI.
In the screenshot below we have a mixture of HTML webpages (with character set information), images (JPEG files, in this case) and applications (JavaScript).
Tip: If you just want to crawl webpages, select HTML only. The free version of Screaming Frog has a 500-URI crawl limit, and because it crawls all website assets by default, you’ll be wasting that quota on everything else besides actual pages.
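If you ever want to apply the same HTML-only idea to crawl data outside the tool, here is a rough Python sketch. It is not how Screaming Frog works internally. The function name `is_html` and the sample rows are made up for illustration, and it assumes the Content column holds values shaped like `text/html; charset=UTF-8`.

```python
def is_html(content_type: str) -> bool:
    """Return True when a Content value describes an HTML page.

    Assumes values shaped like 'text/html; charset=UTF-8' (hypothetical sample).
    """
    # Drop any parameters such as the character set, then normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "text/html"


# Hypothetical rows: (address, content) pairs as they might appear in a crawl.
rows = [
    ("https://example.com/", "text/html; charset=UTF-8"),
    ("https://example.com/logo.jpg", "image/jpeg"),
    ("https://example.com/app.js", "application/javascript"),
]

html_pages = [address for address, content in rows if is_html(content)]
print(html_pages)
```

Only the first row survives the filter, which is the point of the tip: images and scripts no longer eat into the crawl quota.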
Status Code & Status Column
The next two columns return the status of each URI. A status code of 200 means that everything is functioning successfully and that search engines are able to fetch those particular resources.
A 404 (Not Found) suggests resources from a particular URI are unavailable. There are several possible reasons.
A webpage might have been deleted, or the URI might have been changed to something else, but there could be a page on your site still trying to link to that resource.
Other reasons for seeing a 404 may be because the URI is included in your website sitemap or exists only as a canonical link with no actual page.
Click the row and use the lower pane to look for clues as to the cause of the 404:
Clicking the Inlinks and Outlinks buttons will normally reveal where the error is originating:
The cause of the errors shown above was simply because I had changed the page slug on a blog post and forgotten to update the corresponding canonical link. Tut tut for me! But I found the error thanks to this tool.
One more example: An eBook download link was broken. I could tell instantly by looking at the Inlinks information that I must be linking from an image, because there was no Anchor Text.
There was only ALT text (which I add to all my images) and that was the clue I needed to check the images on that particular webpage to locate the broken link and fix it:
By the way, I recommend looking at this useful list of HTTP status codes if you want to know what the most common ones mean.
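To make the status codes a little more concrete, here is a small Python sketch that turns a numeric code into its standard label and flags the ones worth a look. The helper names `describe_status` and `needs_attention` are my own, and treating every 4xx and 5xx code as "needs attention" is a simplifying assumption rather than a rule from Screaming Frog.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short label such as '404 Not Found' for a status code."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        # Codes missing from the standard library's list are reported as unknown.
        return f"{code} Unknown"


def needs_attention(code: int) -> bool:
    """Flag client and server errors (4xx and 5xx) for review."""
    return code >= 400


for code in (200, 301, 404):
    verdict = "check this one" if needs_attention(code) else "fine"
    print(describe_status(code), "->", verdict)
```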
Title Column
The next column is perhaps one of the most important, since webpage titles are the headlines used in search engine snippets.
Page titles are what search engines use to help make decisions about what results to return, and they are what users see when choosing which results to click.
If you’re using WordPress, do not confuse the page title with the post/page title. The title, as seen by search engines, can also be seen in the top of your browser by hovering the mouse cursor over one of the tabs.
Title Length Column
A good page title must ideally fit into search engine results snippets without becoming truncated. The reason you don’t want page titles to be truncated is because the people reading them won’t get the whole title, and might not click.
Keeping page titles short is good practice, and if you keep them to around 55 characters, they are likely to fit.
The column for title length in Screaming Frog is invaluable because it lets you see where you can make improvements.
In 2014 Moz released a handy tool for inputting title tags and generating realistic Google preview snippets. They also emphasised the importance of understanding that there really is no “magic number”, but that on average, 55 characters gives some wiggle room. You can also use a tool like the free Mangools SERPs simulator.
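If you’d like to check title lengths yourself, the idea can be sketched in a few lines of Python using only the standard library. This is a minimal sketch: `TitleParser` and `title_report` are names I’ve invented, and the 55-character default is just the rough guide mentioned above, not a hard limit.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def title_report(html: str, limit: int = 55):
    """Return (title, length, fits) using a rough character guide."""
    parser = TitleParser()
    parser.feed(html)
    # Collapse runs of whitespace so line breaks don't inflate the count.
    title = " ".join("".join(parser.parts).split())
    return title, len(title), len(title) <= limit


page = "<html><head><title>Screaming Frog SEO Spider Review</title></head></html>"
print(title_report(page))
```

Bear in mind that search engines truncate by pixel width rather than by a fixed character count, so a character check like this is only a first pass.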
Meta Description Column
If meta descriptions haven’t been crafted for a particular webpage, you’ll know about it when you run the software.
When we were discussing page titles earlier, I showed a screenshot from the Google search engine results page. Now let me highlight the meta description of that page:
Each description should be unique and provide a summary of what each page is about. Assuming you have added these, they’ll show in the corresponding Screaming Frog column.
Like page titles, meta descriptions should be kept to an ideal length, and that’s what the next column is for.
Meta Description Length Column
I suggest a meta description length of no more than 160 characters. As you can see in the screenshot below, some of mine have exceeded the recommended length but not by much.
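The same kind of check works for descriptions. Here is a hedged Python sketch that reads the description from a page’s HTML and reports how far it runs over. `MetaDescriptionParser` and `description_length` are hypothetical names, and the 160-character default is simply my suggested ceiling from above.

```python
from html.parser import HTMLParser


class MetaDescriptionParser(HTMLParser):
    """Remember the content of the first <meta name="description"> tag."""

    def __init__(self):
        super().__init__()
        self.description = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.description is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""


def description_length(html: str, limit: int = 160):
    """Return (length, over_by); length is None when no description exists."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    if parser.description is None:
        return None, 0
    length = len(parser.description.strip())
    return length, max(0, length - limit)


page = '<head><meta name="description" content="' + "a" * 170 + '"></head>'
print(description_length(page))
```

A result of `None` for the length is the "you haven’t written one yet" case, which is worth separating from a description that exists but is empty.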
H1 Column (First Occurrence)
Heading 1 tags are also known as H1. This is usually the main heading for the page, but it could also be your site title. Screaming Frog looks for two occurrences of the H1 tag and denotes them as H1-1 and H1-2.
The screenshot below shows that the words “Small Biz Geek” are used inside every page.
The reason “Small Biz Geek” is used multiple times is that, by default, WordPress has assigned an H1 tag to the title of my blog.
H1 Column (Second Occurrence)
The second occurrence of the H1 tag is denoted as H1-2. Not all websites have multiple uses of the H1 tag and I would say any more than three H1 tags is heading into risky territory.
The titles of my posts and pages use H1 tags, since that is what each page is about.
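To see how the H1-1 and H1-2 labels relate to the markup, here is a rough Python sketch that collects the H1 headings on a page in the order they appear. The names `H1Collector` and `label_h1s` are my own, the sample HTML is invented to mirror my WordPress setup, and unlike the two columns in Screaming Frog this sketch labels every H1 it finds.

```python
from html.parser import HTMLParser


class H1Collector(HTMLParser):
    """Collect the text of every <h1> element in document order."""

    def __init__(self):
        super().__init__()
        self.in_h1 = False
        self.current = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.in_h1 = True
            self.current = []

    def handle_endtag(self, tag):
        if tag == "h1" and self.in_h1:
            self.in_h1 = False
            # Join the pieces and collapse whitespace into single spaces.
            self.headings.append(" ".join("".join(self.current).split()))

    def handle_data(self, data):
        if self.in_h1:
            self.current.append(data)


def label_h1s(html: str) -> dict:
    """Return headings keyed as 'H1-1', 'H1-2', and so on."""
    collector = H1Collector()
    collector.feed(html)
    return {f"H1-{i}": text for i, text in enumerate(collector.headings, start=1)}


page = '<h1><a href="/">Small Biz Geek</a></h1><h1>Screaming Frog Review</h1>'
print(label_h1s(page))
```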
H2 Column (First Occurrence)
Heading 2 tags are denoted as H2-1 in their first occurrence. In this case I’ve used the phrase “My passion for small business design, marketing & tech” as my WordPress website tagline.
WordPress has then automatically wrapped the website tagline inside the H2 tag.
Canonical Link Element Column
If you’ve included canonical links for each of your pages they will show up here. As I mentioned earlier, any 404 errors could be a typo with a canonical link.
I forgot to correct one of these links a few weeks back and it was showing as a sort of phantom page and therefore returned the 404.
Basically, all you need to know is that when a webpage’s canonical link points to itself, it must match the URI exactly, and any canonical link must point to a page that actually exists.
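The slug mistake I described can be caught with a simple comparison. The Python sketch below pulls the canonical link out of a page and checks it against the page’s own URI. `CanonicalParser`, `canonical_matches` and the example URLs are all hypothetical, and it assumes the page is meant to be its own canonical, which won’t be true for deliberate duplicates.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def canonical_matches(page_uri: str, html: str):
    """Return True/False for an exact match, or None if no canonical is set."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        return None
    return parser.canonical == page_uri


# Hypothetical case: the slug was changed but the canonical still shows the old one.
html = '<head><link rel="canonical" href="https://example.com/old-slug/"></head>'
print(canonical_matches("https://example.com/new-slug/", html))
```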
Word Count Column
The word count is all the words between the <body> and </body> HTML tags. This can be a bit misleading because dynamic websites usually have headers, sidebars and footers.
I’m more interested in the word count of the main content area on a page.
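Here is a rough Python sketch of the difference between the two counts. It is my own approximation, not Screaming Frog’s actual method: `WordCounter` and `count_words` are invented names, words are simply split on whitespace, script and style blocks are skipped, and the "main content" count assumes your theme wraps the article in a `<main>` element.

```python
from html.parser import HTMLParser


class WordCounter(HTMLParser):
    """Count whitespace-separated words inside a chosen container tag."""

    SKIPPED = {"script", "style"}

    def __init__(self, container: str):
        super().__init__()
        self.container = container
        self.inside = 0      # how many open container tags we are within
        self.skipping = 0    # how many open script/style tags we are within
        self.words = 0

    def handle_starttag(self, tag, attrs):
        if tag == self.container:
            self.inside += 1
        elif tag in self.SKIPPED:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag == self.container and self.inside:
            self.inside -= 1
        elif tag in self.SKIPPED and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        if self.inside and not self.skipping:
            self.words += len(data.split())


def count_words(html: str, container: str = "body") -> int:
    counter = WordCounter(container)
    counter.feed(html)
    counter.close()  # flush any text still buffered by the parser
    return counter.words


page = """<html><head><title>Ignored title words</title></head>
<body>
<header>Small Biz Geek</header>
<main><p>Screaming Frog crawls your site.</p></main>
<script>var ignored = true;</script>
<footer>Contact us</footer>
</body></html>"""

print(count_words(page), count_words(page, "main"))
```

In this made-up page the header and footer account for half of the body’s words, which is exactly why the whole-body figure can mislead.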
Inlinks Column
Number of internal inlinks to the URI. ‘Internal inlinks’ are links pointing to a given URI from the same subdomain that is being crawled. This is handy, because you can detect which pages on your site don’t have enough internal links pointing at them.
If you produced a new page and published it, the chances are, it’s an orphan page, or at the very least, needs to be linked to from other relevant pages on your website.
For me, this is one of the handiest things about Screaming Frog.
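The orphan page idea is easy to sketch in Python if you picture the crawl as a map of each page to the pages it links to. Everything here is illustrative: `count_inlinks` is a made-up helper, the `site` data is invented, and counting each linking page once while ignoring self-links is my assumption, not necessarily how Screaming Frog tallies its Inlinks column.

```python
def count_inlinks(outlinks: dict) -> dict:
    """Count how many other pages link to each page.

    `outlinks` maps a page to the pages it links to (hypothetical crawl data).
    """
    counts = {page: 0 for page in outlinks}
    for source, targets in outlinks.items():
        for target in set(targets):   # count each linking page once
            if target != source:      # ignore links from a page to itself
                counts[target] = counts.get(target, 0) + 1
    return counts


site = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/", "/about"],
    "/new-post": ["/"],
}

inlinks = count_inlinks(site)
orphans = sorted(page for page, total in inlinks.items() if total == 0)
print(orphans)
```

The freshly published `/new-post` links out to the homepage but nothing links back to it, so it is the one page flagged.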
Click a URI row containing a webpage with Inlinks and you can view those links in the lower pane:
Outlinks Column
Number of internal outlinks from the URI. ‘Internal outlinks’ are links from a given URI to another URI on the same subdomain that is being crawled.
External Outlinks Column
Number of external outlinks from the URI. ‘External outlinks’ are links from a given URI to another subdomain.
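The internal versus external split comes down to comparing hostnames. Here is a minimal Python sketch of that definition, using only the standard library. `split_outlinks` and the example addresses are hypothetical, and comparing the full hostname is my reading of "same subdomain" rather than a documented rule.

```python
from urllib.parse import urljoin, urlparse


def split_outlinks(page_uri: str, hrefs: list):
    """Split a page's links into internal and external lists by hostname."""
    crawled_host = urlparse(page_uri).hostname
    internal, external = [], []
    for href in hrefs:
        absolute = urljoin(page_uri, href)  # resolve relative links first
        host = urlparse(absolute).hostname
        (internal if host == crawled_host else external).append(absolute)
    return internal, external


internal, external = split_outlinks(
    "https://www.example.com/blog/",
    [
        "/about",
        "https://www.example.com/contact",
        "https://shop.example.com/",
        "https://example.org/tool",
    ],
)
print(len(internal), "internal,", len(external), "external")
```

Note that `shop.example.com` lands in the external list even though it shares a domain with the crawled site, because it is a different subdomain.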
Response Time Column
Time in seconds to download the URI. Other tools exist online for checking webpage load speed such as Google Lighthouse.
Last Modified Column
Read from the Last-Modified header in the server’s HTTP response. If the server does not provide this, the value will be empty.
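If you ever need to work with that header value yourself, Python’s standard library can parse it. This is a small sketch: `parse_last_modified` is a name I’ve made up, the sample date is invented, and returning `None` for a missing or unreadable header is my way of mirroring the empty cell.

```python
from email.utils import parsedate_to_datetime


def parse_last_modified(header_value):
    """Turn a Last-Modified header into a datetime, or None if absent/invalid."""
    if not header_value:
        return None  # the server did not send the header
    try:
        return parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Different Python versions raise different errors for bad input.
        return None


print(parse_last_modified("Wed, 21 Oct 2015 07:28:00 GMT"))
print(parse_last_modified(""))
```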
Redirect URI Column
If you’ve created a 301 redirect for an old link, the Redirect URI column shows where it is pointing.
The example below shows that I originally published a page but later redirected the link:
500 URI Crawl Limitation
Screaming Frog is limited to 500 URIs under the free/lite version. In practice, this means that if the website you crawl has more than 500 URIs, you’ll need to upgrade to a paid licence in order to get all the data.
This might be a moot point, since many small business websites are small. You might have as few as 5 pages (5 URIs). However, if you have an eCommerce store, the product listings could run into the thousands of URIs, especially if you have lots of images.
This is why you might want to select to only crawl HTML assets if it’s just webpages you’re wanting to analyse. No need for all the other assets, which will run into the thousands.
Summary: Worth Having Even Without a Paid Licence
This is an impressive piece of software and a must-have tool for anyone running a website. The level of insight offered under the free/lite version is simply fantastic because it surfaces errors that are glaringly obvious from a search engine perspective but invisible to the naked eye.
Get started with Screaming Frog for free.