Let’s discover Googlebot, Google’s site-scanning crawler

The name immediately brings to mind something friendly, and the corporate image confirms that feeling. Above all, though, Googlebot is the fundamental spider software with which Google scans the pages of public websites, following the links that lead from one page to others on the Web and selecting the resources that deserve to be included in the search engine’s Index. In short, this little robot underpins Google’s entire crawling and indexing process, on which its ranking system is built, and it is no accident that the search engine’s team has devoted so much attention to the subject. Let’s try to find out everything we need to know about Googlebot, the crawler tasked with scanning the Web for sites and content on behalf of Big G.

What Googlebot is

Today, then, let us take a step back from optimisation practices and briefly explain what Googlebot is and how it works, but above all why it matters for a site to know how Google sees it. In a nutshell: a basic understanding of how search engine crawling and indexing work helps us anticipate, prevent or solve technical SEO problems and ensure that the site’s pages are properly accessible to the crawlers themselves.
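On that last point, accessibility can even be checked programmatically. Below is a minimal sketch, assuming a Python environment, that uses the standard library’s urllib.robotparser to test whether a page is open to Googlebot according to the site’s robots.txt; both URLs are placeholders to replace with your own.

```python
from urllib.robotparser import RobotFileParser

# Placeholder URLs: substitute your own site here.
robots_url = "https://www.example.com/robots.txt"
page_url = "https://www.example.com/some-page.html"

parser = RobotFileParser()
parser.set_url(robots_url)
parser.read()  # fetches and parses the robots.txt file

# "Googlebot" is the user-agent token Google documents for its web crawler.
if parser.can_fetch("Googlebot", page_url):
    print("Googlebot is allowed to crawl this page")
else:
    print("Googlebot is blocked by robots.txt")
```

A check like this only tells you what robots.txt permits; it says nothing about noindex directives, server errors or rendering issues, which have to be verified separately.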

The most recent occasion to delve into this topic is the update of Google’s official guide to Googlebot, but the crawler had already been the focus of an episode of SEO Mythbusting, the YouTube series by Martin Splitt, who, prompted by requests from many webmasters and developers and by a direct question from Suz Hinton (Cloud Developer Advocate at Microsoft and host of that episode), clarified some features of this software.

On that occasion, Splitt provided a clear and simple definition of Googlebot, which is basically a programme that performs three functions: the first is crawling, the in-depth analysis of the Web in search of pages and content; the second is indexing these resources; and the third is ranking, which, however, ‘Googlebot no longer does’, he further specified.

In practice, the bot takes content from the Internet, tries to understand what that content is about and what ‘material’ it can offer to users searching for ‘these things’, and finally determines which of the previously indexed resources is actually the best for a specific query at a particular moment.
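To make those steps concrete, here is a deliberately toy sketch in Python of the index-and-rank half of the pipeline: it builds an inverted index over a handful of invented pages and scores them against a query by simple word overlap. This is only an illustration of the concept, nothing like Google’s actual systems; every URL and text in it is made up.

```python
from collections import defaultdict

# Toy corpus standing in for pages that have already been crawled.
pages = {
    "https://example.com/pasta": "fresh pasta recipes with tomato and basil",
    "https://example.com/pizza": "pizza dough recipes and baking tips",
    "https://example.com/sauce": "tomato sauce basics for pasta and pizza",
}

# Indexing: map each word to the set of pages that contain it.
inverted_index = defaultdict(set)
for url, text in pages.items():
    for word in text.lower().split():
        inverted_index[word].add(url)

# Ranking: score candidate pages by how many query words they contain.
def rank(query: str) -> list[str]:
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in inverted_index.get(word, set()):
            scores[url] += 1
    return sorted(scores, key=scores.get, reverse=True)

print(rank("tomato pasta recipes"))
# ['https://example.com/pasta', 'https://example.com/sauce', 'https://example.com/pizza']
```

Real-world ranking, of course, weighs hundreds of signals rather than raw word overlap, but the separation of the three stages is the same.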

What Googlebot does and what it is for

Digging deeper, Googlebot is a special piece of software, commonly referred to as a spider, crawler or simply bot, which scans the Web by following the links it finds within pages in order to find and read new or updated content, suggesting what should be added to the Index, the ever-expanding inventory library from which Google directly extracts its online search results.
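The link-following behaviour described above can likewise be sketched in a few lines of Python using only the standard library. This is a radically simplified illustration, not how Googlebot is implemented: a real crawler also respects robots.txt, schedules requests politely, deduplicates URLs and renders pages. The seed URL is a placeholder.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags found in an HTML page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seed_url: str, max_pages: int = 10) -> set[str]:
    """Breadth-first crawl: fetch a page, queue its links, repeat."""
    to_visit = [seed_url]
    visited = set()
    while to_visit and len(visited) < max_pages:
        url = to_visit.pop(0)
        if url in visited:
            continue
        try:
            with urlopen(url, timeout=5) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that cannot be fetched
        visited.add(url)
        extractor = LinkExtractor()
        extractor.feed(html)
        # Resolve relative links against the current page's URL.
        to_visit.extend(urljoin(url, link) for link in extractor.links)
    return visited

print(crawl("https://www.example.com/"))
```

Even at this toy scale, the essential mechanic is visible: each fetched page feeds the queue of pages still to visit, which is how a crawler’s map of the Web keeps expanding.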

This software allows Google to compile over 1 million GB of information in a fraction of a second, so there is far more than meets the eye behind its cute appearance: the official image of Googlebot depicts precisely a cute little robot.