HTTP/2, an evolutionary leap in the web (including for sites and Google)
It has been called a significant step forward in the complex system of the Web, capable of offering a whole new set of opportunities for optimizing online applications and improving performance. This is certainly a bombastic presentation for HTTP/2, the second version of the HTTP protocol, which (though perhaps more slowly than expected) is also emerging in discussions of technical site optimization. Indeed, Googlebot began supporting it for crawling in November 2020, and since March 2021 it has been used for the network requests behind Google Lighthouse and PageSpeed Insights reports. So here is an overview of HTTP/2: what it is, how it works, why we should care about it and, concretely, what the pros and cons of its use are and, no less important, what benefits it can bring for SEO.
What is HTTP/2
HTTP/2 stands for Hypertext Transfer Protocol version 2 and represents the second version of the HTTP protocol, made with the specific goal of improving the speed and efficiency of data transmission, making Web pages faster and more responsive.
More precisely, it is an optimized expression of the semantics of the Hypertext Transfer Protocol (HTTP for short), the protocol that has served the Web for more than 20 years and of which it represents the evolution. HTTP/2 is therefore a protocol for controlling communication between a browser making a request and the server containing the requested information.
In addition to ensuring a technical breakthrough over the previous generation, while maintaining maximum compatibility, the HTTP/2 protocol improves the loading of web pages on various browsers and is very simple to adopt for users, who do not have to make any particular changes to the site to apply this technology.
In particular, HTTP/2 allows more efficient use of network resources and a reduced perception of latency by introducing header field compression and allowing multiple simultaneous exchanges on the same connection; it also introduces the unsolicited push of representations from servers to clients.
The history of HTTP/2: not just an evolution of HTTP
HTTP/2 was officially standardized in 2015, almost two decades after the introduction of its (last) predecessor, HTTP/1.1, and some 30 years after the first insights, related to the period of the birth of the Web.
In fact, the Hyper Text Transfer Protocol is the application protocol that made the World Wide Web possible, and over this long period it has grown from a simple means of retrieving network resources to a fast, secure, and versatile digital communication system.
The first documented version of HTTP, HTTP/0.9, was released in 1991 and was the result of the vision of Tim Berners-Lee, the web pioneer, who designed it with the goal of simplifying high-level data communication between web servers and clients. This version laid the groundwork for the evolution of the protocol, which saw the introduction of HTTP/1.0 in 1996 and HTTP/1.1 in 1997. Since then, HTTP has undergone a series of incremental improvements, always keeping its original mission at its core: to make data communication over the Internet simple and efficient.
However, the increasing complexity of the Web necessitated a new version of the protocol: in particular, the growth in the number and size of the resources required to load a single Web page was making HTTP/1.1 increasingly inefficient. HTTP/2 was introduced to address these inefficiencies and improve the speed and security of web browsing, allowing more efficient use of network resources and reduced perceived latency thanks to header compression and multiple simultaneous exchanges over the same connection, to which it adds server push.
This brings us to February 2015, when the HTTP Working Group of the Internet Engineering Task Force (IETF) approved HTTP/2, the second major version of the protocol, developed on the basis of Google’s SPDY protocol, an HTTP-compatible project that aimed to reduce latency in web browsing. In May 2015, the HTTP/2 specification was officially published as RFC 7540, marking a new chapter in the history of HTTP.
SPDY itself was later retired in favor of HTTP/2, but the push for improvement did not end there: it has spurred further evolution of HTTP, with more and more advanced features. This process of continuous innovation is what has allowed HTTP to remain at the center of the World Wide Web, despite the challenges posed by the increasing complexity of the Web.
According to the latest w3techs findings, HTTP/2 is currently used by more than 35 percent of all sites. This figure is down from the peak reached in January 2021, when w3techs announced that HTTP/2 had surpassed 50 percent of all sites worldwide, due in part to the subsequent introduction of HTTP/3, which is already adopted by 27 percent of sites.
HTTP/3 was first introduced as a working draft by the Internet Engineering Task Force (IETF) in November 2018 and was standardized as RFC 9114 on June 6, 2022. HTTP/3 retains the features of the previous version but adopts the QUIC protocol instead of TCP for data transport: QUIC, a name that originally stood for Quick UDP Internet Connections, is a transport protocol initially developed by Google that runs over UDP while providing TCP-like reliability, so that HTTP/3 can deliver these features with greater efficiency and speed.
On the subject of Google and HTTP/2, however, it is important to note that Googlebot has been able to crawl sites over HTTP/2 since November 2020 (and as early as May 2021 John Mueller confirmed that it was crawling more than half of all URLs over HTTP/2), while it is not yet able to crawl over HTTP/3.
What is a protocol and how online communication works
To better understand how such a system works, it is also good to remember what a protocol is, with the help of Ruth Everett: essentially, a protocol is the set of rules in place to manage the request between client and server. Typically it consists of three main parts: the header, the payload and the footer.
- The Header contains information such as the source and destination address of the page, as well as size and type details.
- The Payload is the actual information that will be transmitted.
- Finally, the Footer directs the request to the intended recipient and ensures that the data is free of errors when transmitted to the browser.
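As a rough illustration of the three parts just listed, here is a minimal Python sketch of a made-up packet layout. The field layout, the function names and the use of a CRC-32 checksum as the footer are all assumptions chosen for the example; no real protocol is being reproduced here.

```python
import struct
import zlib


def build_packet(source: str, destination: str, payload: bytes) -> bytes:
    """Wrap a payload in a toy header and footer (illustrative layout only)."""
    src = source.encode("utf-8")
    dst = destination.encode("utf-8")
    # Header: lengths of source, destination and payload, then both addresses.
    header = struct.pack("!BBI", len(src), len(dst), len(payload)) + src + dst
    # Footer: CRC-32 checksum over header + payload, used to detect corruption.
    footer = struct.pack("!I", zlib.crc32(header + payload))
    return header + payload + footer


def read_packet(packet: bytes) -> tuple[str, str, bytes]:
    """Parse a toy packet and verify its footer checksum."""
    src_len, dst_len, payload_len = struct.unpack("!BBI", packet[:6])
    body_end = 6 + src_len + dst_len + payload_len
    body, footer = packet[:body_end], packet[body_end:]
    (expected,) = struct.unpack("!I", footer)
    if zlib.crc32(body) != expected:
        raise ValueError("checksum mismatch: packet was corrupted in transit")
    source = body[6:6 + src_len].decode("utf-8")
    destination = body[6 + src_len:6 + src_len + dst_len].decode("utf-8")
    payload = body[6 + src_len + dst_len:]
    return source, destination, payload
```

Calling `read_packet(build_packet("10.0.0.1", "10.0.0.2", b"hello"))` returns the original addresses and payload, while a packet altered in transit raises a `ValueError`.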
The technical characteristics and main features of HTTP/2
HTTP/2 makes applications faster, simpler, and more robust (“a rare combination,” says Google’s Ilya Grigorik), making it possible to undo many of the workarounds previously adopted for HTTP/1.1 and to address these concerns within the transport layer itself.
Since its design, this protocol had as its main goals reducing latency by enabling full multiplexing of requests and responses, minimizing protocol overhead through efficient compression of HTTP header fields, and adding support for request prioritization and server push.
Technically, HTTP/2 is based on the same semantics as HTTP/1.1, so this protocol is more of an upgrade than a complete migration; this was an intentional decision, to make the transition as smooth as possible.
These are some of the main technical features of this protocol:
- Binary commands
HTTP/2 changes the transfer format from textual to binary for the request-response cycle: the same tasks are performed, but messages are encoded in binary (1s and 0s) instead of text.
This simplifies the implementation of commands, which become easier to generate and parse.
- Multiplexing
Multiplexing allows multiple requests to be made simultaneously on a single connection: the payload is divided into smaller frames, which are interleaved and transmitted on a single connection and then reassembled on arrival.
The main objective of this change was to solve problems related to resource-consuming requests and help prevent requests and responses from blocking others.
- Compression of the header
Header compression is designed to reduce the overhead that accompanies the TCP slow-start mechanism in HTTP/1.1. As most websites are rich in graphics and content, client requests cause many nearly identical headers to be sent back and forth, which can cause latency and unnecessary consumption of already limited network resources.
The header compression mechanism (HPACK) can compress a large number of redundant headers and allows the server to maintain a list of headers used in previous requests; essentially, the headers are encoded into a compressed block and sent to the client.
- Server Push
The server push mechanism allows the server to place resources in a browser’s cache before they are requested: without waiting for another request from the client, it also sends the information or resources that it expects future requests to need (based on previous requests).
This avoids the need for another round trip request and response and reduces the network latency that results from different resources used to load a page.
- Priority in the flow
Stream prioritization allows you to give preference to particular data streams, based on the dependencies and weight assigned to each.
In this way, the server can optimize the allocation of resources according to the requirements of the end user.
- HTTP/2 and HTTPS
In practice, browsers support HTTP/2 only over encrypted connections, so a site that wants to serve it must also use HTTPS, which improves the overall security of the Web.
Unsurprisingly, the two protocols complement each other in many ways: they increase security for users and applications, but they also require fewer TLS handshakes and reduce resource consumption on both the client and server sides.
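The header compression idea described in the feature list above can be illustrated with a deliberately simplified Python sketch. It is not the real HPACK wire format (which also uses a static table and Huffman coding); the class names are hypothetical and the sketch only shows how remembering previously sent headers on both sides removes redundancy.

```python
class ToyHeaderEncoder:
    """Remembers headers already sent on a connection (toy model, not HPACK)."""

    def __init__(self) -> None:
        self._index: dict[tuple[str, str], int] = {}

    def encode(self, headers: list[tuple[str, str]]) -> list:
        encoded = []
        for header in headers:
            if header in self._index:
                # Already sent before: emit a short integer reference.
                encoded.append(self._index[header])
            else:
                # First occurrence: remember it and send it in full once.
                self._index[header] = len(self._index)
                encoded.append(header)
        return encoded


class ToyHeaderDecoder:
    """Mirrors the encoder's table so references can be resolved."""

    def __init__(self) -> None:
        self._table: list[tuple[str, str]] = []

    def decode(self, encoded: list) -> list[tuple[str, str]]:
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self._table[item])
            else:
                self._table.append(item)
                headers.append(item)
        return headers
```

In this toy model, a second request that repeats the method and user agent of the first one only sends its new path in full; the repeated headers travel as small integers.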
How HTTP requests work: the truck analogy
Everett’s article also gives a useful analogy to understand HTTP requests, using a truck as a reference.
Basically, a truck represents the request from the client to the server and the road traveled by the truck is the network connection; when the truck that carries the request from the browser reaches the server, it will load the response and return it to the browser.
The HTTPS protocol adds a layer of protection to these responses, to ensure that no one is able to look inside the truck to see what it contains, such as personal data or sensitive information.
The main problem is that the trucks that carry the request cannot travel faster than the speed of light: they must also travel at a constant speed, regardless of how large the request is and how far they have to travel to deliver it.
When making requests via HTTP/1.1, every truck needs its own road, that is, its own network connection, and certain requests require yet more connections to be opened; all this adds to the latency.
Typically, a browser opens only six simultaneous connections per host, and other requests are forced to wait for a connection to become free. Waterfall diagrams are a useful way to see this latency in action.
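The effect of this connection limit can be approximated with a small back-of-the-envelope model in Python. This is a sketch under strong assumptions: every request costs exactly one round trip, each connection carries one request at a time, and bandwidth, connection setup and server time are ignored. The function names are made up for the example.

```python
import math


def http1_waves(resource_count: int, max_connections: int = 6) -> int:
    """Sequential 'waves' of requests when each connection carries one at a time."""
    if resource_count < 0 or max_connections < 1:
        raise ValueError("resource_count must be >= 0 and max_connections >= 1")
    return math.ceil(resource_count / max_connections)


def estimated_wait_ms(resource_count: int, round_trip_ms: float,
                      max_connections: int = 6) -> float:
    """Rough waiting time: one round trip per wave (toy model)."""
    return http1_waves(resource_count, max_connections) * round_trip_ms
```

Under these assumptions, a page with 30 resources and a 100 ms round trip needs 5 waves, or about 500 ms of waiting, before the last request has even been answered.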
How things change with the HTTP/2 protocol
It is at this point that HTTP/2 can intervene and provide a positive impact on the behavior of requests.
For example, thanks to multiplexing, more trucks can run on a single road simultaneously, so the network connection is able to handle more requests and deliver more responses faster.
The content of these requests and answers remains the same: they are only handled in a slightly different way.
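To make the "many trucks on one road" idea concrete, here is a minimal Python sketch of multiplexing. It assumes a simple round-robin interleaving and a tiny hypothetical frame size; real HTTP/2 implementations schedule frames according to priorities and flow control, which this toy ignores.

```python
from collections import defaultdict
from itertools import zip_longest


def split_into_frames(stream_id: int, payload: bytes,
                      frame_size: int) -> list[tuple[int, bytes]]:
    """Cut one response into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def multiplex(streams: dict[int, bytes],
              frame_size: int = 4) -> list[tuple[int, bytes]]:
    """Interleave frames from several streams, round-robin, on one connection."""
    per_stream = [split_into_frames(sid, data, frame_size)
                  for sid, data in streams.items()]
    wire = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(frame for frame in round_of_frames if frame is not None)
    return wire


def demultiplex(wire: list[tuple[int, bytes]]) -> dict[int, bytes]:
    """Reassemble each stream from the interleaved frames."""
    buffers: dict[int, bytes] = defaultdict(bytes)
    for stream_id, chunk in wire:
        buffers[stream_id] += chunk
    return dict(buffers)
```

The receiver recovers every response intact even though their frames were mixed on the wire, which is why the content of requests and responses does not change.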
The differences between HTTP/2 and HTTP
The analogy also allows us to highlight the main differences between HTTP and HTTP/2, because the new protocol not only improves the efficiency of the previous one, but also introduces a number of different features.
Basically, HTTP/2 does not change the semantics of the HTTP application in any way: all the fundamental concepts, such as methods, status codes, URIs, and header fields remain in place. Instead, HTTP/2 changes the way data is formatted and transported between the client and the server, which handle the entire process, and hides all the complexity from applications within the new framing layer. As a result, all existing applications can be delivered without modification.
In short, HTTP/2 breaks down HTTP protocol communication into an exchange of binary-encoded frames, which are then mapped to messages that belong to a particular stream, all of which are multiplexed within a single TCP connection. This is the basis that enables all the other features and performance optimizations provided by the HTTP/2 protocol.
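As an illustration of what "binary-encoded frames" means in practice, the following Python sketch packs and unpacks the fixed 9-byte frame header defined by the HTTP/2 specification (a 24-bit payload length, an 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). The function names are hypothetical, and the sketch covers only the header layout, not a working HTTP/2 implementation.

```python
import struct

FRAME_TYPE_DATA = 0x0      # DATA frame type
FRAME_TYPE_HEADERS = 0x1   # HEADERS frame type
FLAG_END_STREAM = 0x1      # marks the last frame of a stream


def pack_frame(frame_type: int, flags: int, stream_id: int,
               payload: bytes) -> bytes:
    """Prefix a payload with a 9-byte HTTP/2-style frame header."""
    if len(payload) >= 2 ** 24:
        raise ValueError("payload does not fit in the 24-bit length field")
    if not 0 <= stream_id < 2 ** 31:
        raise ValueError("stream identifier must fit in 31 bits")
    header = len(payload).to_bytes(3, "big") + struct.pack(
        "!BBI", frame_type, flags, stream_id)
    return header + payload


def unpack_frame(data: bytes) -> tuple[int, int, int, bytes]:
    """Return (frame_type, flags, stream_id, payload) from one frame."""
    if len(data) < 9:
        raise ValueError("a frame header is 9 bytes long")
    length = int.from_bytes(data[:3], "big")
    frame_type, flags, raw_stream = struct.unpack("!BBI", data[3:9])
    stream_id = raw_stream & 0x7FFFFFFF  # ignore the reserved top bit
    payload = data[9:9 + length]
    if len(payload) != length:
        raise ValueError("truncated frame payload")
    return frame_type, flags, stream_id, payload
```

Because every frame carries its own stream identifier, frames belonging to different requests can share one TCP connection and still be sorted out by the receiver.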
Returning to the version comparison, one of the first and most significant innovations lies in the fact that HTTP/2 is a binary protocol, unlike HTTP/1.1, which is a text protocol capable of handling only one request at a time per TCP connection: as clarified, this allows HTTP/2 to handle requests and responses as binary frames, which are more efficient to parse and less error-prone than text, and to process multiple requests simultaneously, reducing page load times.
Another important difference between HTTP/1.1 and HTTP/2 is the introduction of server push. In the previous protocol, the server can only send resources to the browser in response to a specific request: therefore, if a web page requires many resources, the browser must make many separate requests, each of which takes time and can lead to duplicate data being sent over the wire, requiring additional protocols to ensure that the information received is error-free. HTTP/2 introduces the concept of server push, which allows the server to actively send resources to the browser before the browser requests them, and this can further reduce page load times because the browser can start processing resources as soon as it receives them, instead of having to wait to request them.
Finally, as mentioned, all sites that use HTTP/2 must also use HTTPS, improving overall web security.
The benefits of the HTTP/2 protocol for sites
Being an up-to-date technology, the adoption of HTTP/2 brings some benefits for sites.
The first is technical and practical: the update to HTTP/2 is not a migration and does not require changes to URLs; it is simply a change of protocol that does not require application-level work or special interventions.
Everett highlighted “four of the biggest SEO benefits”, a non-exhaustive list of the overall benefits of HTTP/2.
- Web performance
Many of the new HTTP/2 features have been designed to improve site performance and to help save the resources needed to crawl sites.
For example, multiplexing means that requests and responses will not block each other, which helps reduce latency and, in turn, provides faster web performance.
The ability to send and receive more data per communication request is another practical example of performance benefits.
Moreover, prioritizing the flow allows effective use of resources, which reduces the time needed to provide content requests to the user.
- Mobile performance
In addition to overall web performance, mobile performance can also be improved with HTTP/2, which is designed in the context of today’s usage trends, which certainly favor mobile devices.
Multiplexing and header compression help in particular to reduce latency in accessing Web pages, and this also occurs on mobile networks, which can have a limited bandwidth.
In essence, HTTP/2 optimizes the web experience for mobile users in ways that were previously attributed only to desktop users, including through performance and security.
- Improved User Experience
Performance improvements also positively influence the user experience: it is easy to imagine that a fast-loading site leads to greater customer satisfaction and general brand favor.
According to Google, the probability of a bounce increases by 32% when page load time goes from 1 second to 3 seconds, and HTTP/2 is just one way we can try to improve loading speed.
- Greater security
Since HTTP/2 must be served over HTTPS, it ensures that all websites are encrypted and protected.
In addition, it also helps to ensure that the applications themselves are protected from any malicious attacks, which may result in manual penalties for the site or potentially have it removed entirely from search results.
Alongside these “pros”, HTTP/2 also brings some disadvantages to consider, as is the case with all technologies.
The first negative aspect reported by the article is that not all browsers support HTTP/2 yet. In fact, since 2015 most major browsers have added support for the new protocol, but it is good to make sure that the main browsers from which users access the site are supported.
This is however a limited problem, as the incompatibility mainly concerns obsolete versions of browsers, which have a low overall use, as revealed by the graph of the site Caniuse.com.
Due to the server push function, there is a potential waste of bandwidth due to the data that can be sent to the browser but not actually used: “just because a page upload request may require a particular resource or another request may be expected, it does not always mean that it will,” says Everett, and this is likely to determine that “unnecessary resources can be sent to the browser”.
Also, because multiplexing can make the server “receive short bursts of a number of requests at once, it has the potential to overwhelm the servers, especially if they are not limited”. There may be slight delays and complications in debugging, due to the binary format used instead of the text format used in HTTP/1.
How to implement HTTP/2 on a site
Implementing HTTP/2 on a Web site is a process that requires a number of key steps and some familiarity with Web servers. We begin by verifying web server support for HTTP/2: major services such as Apache, Nginx, and Microsoft’s IIS support HTTP/2, but may need to be upgraded to the latest version.
Once server support has been verified, the next step is to enable HTTPS on the Web site, which is necessary to actually implement HTTP/2.
The exact procedure to switch to HTTP/2 varies depending on the web server we are using, but generally involves editing the server configuration file to enable HTTP/2 and then restarting the server.
Finally, once HTTP/2 is enabled, it is important to verify that everything is working as expected by taking advantage of one of several online tools designed to check whether the Web site is actually using HTTP/2.
Upgrading to HTTP/2 ultimately depends on our server and hosting provider: if, at the moment, we are unable to support HTTP/2, we need to talk to the server administrator or hosting support to get the appropriate guidance.
If, on the other hand, the server can support HTTP/2, it may automatically serve content via the new protocol. We can make sure “that the server is able to support it by using a CDN that also supports HTTP/2 and having an up-to-date HTTPS certificate”; also, we can check if the server is able to support HTTP/2 by using the http2.pro site, which reports on the server’s capabilities with respect to HTTP/2, ALPN and server push support. Another useful tool is Chrome DevTools, which allows us to check which resources are currently being served over HTTP/2.
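Besides online tools, the same check can be sketched in a few lines of Python using only the standard library. The idea is to offer both "h2" and "http/1.1" during the TLS handshake through ALPN and see which one the server selects. The function names are made up for this example, and the result only tells us what the server negotiates with this particular client, not how it behaves with browsers or Googlebot.

```python
import socket
import ssl
from typing import Optional


def build_alpn_context() -> ssl.SSLContext:
    """TLS context that offers HTTP/2 first and HTTP/1.1 as a fallback."""
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    return context


def negotiated_protocol(host: str, port: int = 443,
                        timeout: float = 5.0) -> Optional[str]:
    """Connect to host and return the ALPN protocol chosen by the server."""
    context = build_alpn_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.selected_alpn_protocol()


def supports_http2(selected: Optional[str]) -> bool:
    """True when the negotiated ALPN protocol is HTTP/2 over TLS ("h2")."""
    return selected == "h2"
```

A call such as `supports_http2(negotiated_protocol("example.com"))` needs network access; the host name here is only a placeholder for the site we want to test.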
Google and HTTP2
As anticipated, since mid-November 2020 Google has been crawling sites over HTTP/2, and after about six months it was already crawling more than half of all URLs with the new protocol. However, Googlebot does not support all the features introduced by HTTP/2; in particular, some features such as server push, which can be useful for rendering, are still being evaluated.
In addition, since March 2021 the PageSpeed Insights tool has also used HTTP/2 to make network requests, if the server supports it; as a result, if a site is on HTTP/2, its PageSpeed Insights scores may increase, although, to be clear, that does not mean an increase in rankings.
Previously, all requests were made with HTTP/1.1 due to constraints in the connectivity infrastructure; with this improvement, it is more likely that there is “greater similarity among results of Lighthouse from PSI and Lighthouse CLI and Devtools (who have always made requests with h2)”, although “consistency between environments is almost impossible”, Google explains.
It is worth noting that it is not possible to “force” Googlebot to crawl our site over HTTP/2: if the site supports it, it will automatically be eligible to be crawled with the protocol, but for the moment Google will only do so if it considers it beneficial in terms of saving resources.
In other words, a site that supports HTTP/2 is eligible for crawling over HTTP/2, but Googlebot uses it only when that benefits both the site and Googlebot; otherwise it simply continues to crawl the site over HTTP/1.1.
Conversely, there is a way to opt out of crawling over HTTP/2: instruct the server to respond with an HTTP 421 status code when Googlebot attempts to crawl the site over HTTP/2, or (in case of problems or technical impossibility) send a message to the Googlebot team for a temporary solution.
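The opt-out rule can be expressed as a tiny decision function in Python. This is only a sketch of the logic: in a real setup the rule would live in the web server or CDN configuration, the user-agent substring match is a simplification (it does not verify that the request really comes from Google), and the "HTTP/2" version string is an assumed format that depends on how the server exposes the protocol.

```python
from http import HTTPStatus


def status_for_request(user_agent: str, http_version: str) -> HTTPStatus:
    """Return 421 for Googlebot requests over HTTP/2, 200 otherwise (sketch)."""
    # Simplified check: a substring match, not a verification of the crawler.
    is_googlebot = "googlebot" in user_agent.lower()
    if is_googlebot and http_version == "HTTP/2":
        # 421 Misdirected Request signals that this connection should not be used.
        return HTTPStatus.MISDIRECTED_REQUEST
    return HTTPStatus.OK
```

With this rule, ordinary visitors on HTTP/2 and Googlebot on HTTP/1.1 are served normally; only Googlebot requests arriving over HTTP/2 receive the 421 response.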
In general, however, Google’s preliminary tests found no problems or a negative impact on indexing caused by crawling over HTTP/2.
HTTP/2 and advantages for SEO
All the positive aspects of this protocol, combined together, can also have a positive impact on SEO.
We reiterate that there is no direct ranking increase resulting from the use of HTTP/2, as confirmed several times by Google, but in any case the performance and UX improvements can still help to achieve the objectives set by the Page Experience Update, as well as affecting, in some way, the visibility of a site in Google Search and conversions.
Then there is another interesting aspect, revealed by John Mueller in a talk at Google I/O 2021 and reported by Seroundtable: crawling over HTTP/2 could improve the crawl budget, i.e. “a mix of how many URLs Google wants to scan from your website – the scanning request (crawl demand) and how many of your URLs our systems think your server can handle without problems – the crawl capacity”.
With HTTP/2 scanning, Mueller explains, this becomes a bit more complicated “because a single connection can include multiple Urls“; overall, however, Google thinks “that HTTP/2 scanning gives our systems the ability to request multiple Urls with a similar load on your servers“.
However, the decision to scan with HTTP/2 “is based on the fact that the server supports it and that our systems determine a possible increase in efficiency“; this means that Googlebot “will not need to spend as much time scanning your server as before”, and that in general the new protocol, with its features, brings improvements that help both the Google scan and the service infrastructure of the website.
In theory, as Google’s subsequent official guide also clarified, the adoption of HTTP/2 for Googlebot should make crawling more efficient in terms of server resource utilization: with HTTP/2, Googlebot can open a single TCP connection with the server and efficiently transfer multiple files over it in parallel, instead of requiring multiple connections. The fewer connections open, the fewer resources the server and Googlebot have to spend on crawling.
Put simply, this means resource savings, both on the server side and for Googlebot. However, the same document later states that using HTTP/1.1 or HTTP/2 for crawling “does not affect how a site is indexed and, consequently, does not affect how often Google crawls a site.”