Interaction to Next Paint or INP, the new Google Core Web Vital
It has been talked about for exactly one year, and today we have official confirmation, although we will have to wait a little longer for it to take effect: starting in March 2024, INP (Interaction to Next Paint) will become the Core Web Vital metric for responsiveness, replacing FID. After much testing and gathering of feedback from the community, Google’s Chrome Team is ready to “take the training wheels off” INP, which will no longer be just experimental but will become fully effective and, soon, relevant for ranking as well.
The new metric for responsiveness: INP
INP was first introduced in 2022 in an article by Jeremy Wagner on web.dev, but attention soared after the metric was officially announced during Google I/O 2022, where Annie Sullivan and Michal Mocny’s talk focused precisely on responsiveness and on the actions Google has taken to improve the web ecosystem on that front.
After twelve months of testing and experimental use, during which the metric was made widely available in Google’s tools and discussed at length with the community to verify its effectiveness, the widely anticipated news has now arrived: Interaction to Next Paint is ready to replace First Input Delay among Google’s Core Web Vitals, as Rick Viscomi and Annie Sullivan explain.
It all revolves (again) around the concept of Core Web Vitals, the set of metrics introduced in 2020 by Google to enable a concrete, objective and measurable assessment of the experience users have on a site’s pages. At the time it was Ilya Grigorik, Web Performance Engineer, who presented the project and unveiled the first three metrics chosen to measure page performance, anticipating that the list would be reviewed periodically. Fast forward to today, and we are indeed about to get acquainted with a new Core Web Vital: Interaction to Next Paint or INP, which measures page responsiveness, that is, the ability to respond to user input.
What Interaction to Next Paint means
Interaction to Next Paint or INP is the metric that assesses responsiveness by recording the latency of all interactions throughout the page’s lifecycle; the page’s INP is recorded as the highest value of these interactions or the value closest to the highest for pages with many interactions, and a low INP value ensures that the page is always reliably responsive.
It is therefore a full-page lifecycle metric, just like Cumulative Layout Shift, and thus measures all interactions, not just the first one, changing and updating continuously throughout the entire lifecycle of the page; also, as in the case of CLS, an INP value is not recorded until the user leaves the page.
INP is also referred to as runtime responsiveness to differentiate it from simple load responsiveness, and in practical terms it measures the entire input latency from the time a user interacts until they actually see a visual response, not just the initial delay on the main thread.
What is responsiveness and why is it important for a site
Wagner also delved into the meaning of responsiveness, a value that estimates how quickly a page responds to user input and is critical to people’s positive interaction with pages.
When responsiveness is good, pages respond quickly to user interactions: when an application responds to interactions, the resulting changes in the user interface are visual feedback that “tells us, for example, whether an item we asked to add to a site’s shopping cart is actually added, whether the contents of a login form are authenticated by the server, whether a mobile menu has opened, and so on.”
Some interactions naturally take longer than others, but for particularly complex interactions it is important to quickly present initial visual feedback that signals to the user that “something is happening.” The time until the next paint is the first opportunity to do this. Therefore, the intent of INP is not to measure all possible effects of the interaction (such as network fetches and UI updates from other asynchronous operations), but only the time during which the next paint is blocked. By delaying visual feedback, we may give users the impression that the page is not responding to their actions.
The practical goal of INP is to ensure that the time from when a user initiates an interaction until the next frame is drawn is as short as possible, for all or most of the interactions made by the user.
The video clarifies these issues, showing a visual representation of poor and good responsiveness: on the left, long tasks block the accordion from opening, which causes the user to click several times, thinking the experience is interrupted. When the main thread catches up, it also processes delayed input, causing the accordion menu to open and close unexpectedly.
A better metric for measuring responsiveness
First Input Delay is “an excellent metric for measuring input responsiveness during page loading,” the Googlers say, and when it was added to Core Web Vitals in 2020 it represented “a huge step forward” from previous tools because it offered developers a new way to measure responsiveness the way real site users experience it. Unlike similar metrics that “only approximate page interactivity, such as Total Blocking Time (TBT) and Time To Interactive (TTI),” Viscomi and Sullivan specify, “FID directly measures user experience.” Essentially, a page could have poor TBT or TTI and still be perceived as responsive because of the way real users interact with it.
Although it has indeed improved the way we measure responsiveness, FID has not been without its limitations, and another factor has contributed to what we might call its obsolescence: the Web keeps becoming faster and more capable, and users expect richer, more interactive interfaces, so looking only at responsiveness during page loading does not tell the whole story. A more holistic approach to measuring responsiveness is needed, and INP goes in that direction.
The name FID itself immediately reveals the first two limitations: “first input” and “delay.” FID reports only the responsiveness of the first time a user interacts with the page. Although first impressions are important, the first interaction is not necessarily representative of all interactions over the life of a page, the guide explains. In addition, FID only measures the input delay portion of the first interaction, which is the amount of time the browser had to wait (due to main thread occupancy) before even starting to handle the interaction.
All of this was analyzed and led to the introduction of INP, which, instead of measuring only the first interaction, takes into account all interactions, reporting one of the slowest over the entire life of the page. And, instead of measuring only the delay portion, INP measures the entire duration from the beginning of the interaction, through the event handlers, and until the browser is able to paint the next frame, which explains the name Interaction to Next Paint. These implementation details make INP a much more comprehensive measure of perceived user responsiveness than FID.
What the introduction of INP among CWVs means for site owners and analytics tools
Before we get into the technical details about INP, it may also be useful to re-read what Martin Splitt, Developer Relations Engineer on the Google Search Relations team, wrote about the impact the Core Web Vitals revolution may have on Search Console reports and those working on site optimizations.
As mentioned, the highlight is that in March 2024 INP will replace FID as part of Core Web Vitals. To help site owners and developers take the necessary steps and evaluate their pages against the new metric, Search Console will include INP in the Core Web Vitals report as early as the end of this year. It will then stop showing FID at the time of the final “replacement” and use only INP as the metric for responsiveness, alongside Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS), which remain confirmed.
From an application standpoint, those already working to improve the site in compliance with Core Web Vitals have already considered page responsiveness, and, according to Splitt, the improvements made for FID are a good basis for improving INP and responsiveness. Achieving good Core Web Vitals can help succeed with search and ensure a great user experience in general, the Googler adds, but “a great page experience implies more than Core Web Vitals and good statistics within the Core Web Vitals report in Search Console or third-party Core Web Vitals reports do not guarantee good rankings.”
The definition of INP and the value of the metric
Wagner’s original article explained in detail how INP works and how to measure it, and also offered a number of initial suggestions for improving the value, starting with the assumption that good responsiveness is essential to ensure a good user experience. The technical optimization side of INP was then explored further in a subsequent in-depth discussion.
Picking up on the definition, Interaction to Next Paint is a metric that aims to represent the overall latency of a page’s interactions by selecting one of the longest individual interactions that occur when a user visits a page.
For pages with fewer than 50 total interactions, INP is the interaction with the worst latency; for pages with many interactions, INP is often the 98th percentile of interaction latency.
As of today (2023), Google has since told us, 93 percent of sites have good FID performance on mobile devices, but only 65 percent of sites have good INP on mobile devices. Since, as mentioned, INP paints a much more accurate picture of responsiveness, these numbers help us see more clearly the room for improvement ahead.
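As a rough illustration of this selection rule, here is a minimal TypeScript sketch. It assumes that one of the slowest interactions is skipped for every 50 interactions recorded on the page, which is one common way to approximate a high percentile; the helper name `estimateInp` is hypothetical and this is not Chrome’s actual implementation.

```typescript
// Hypothetical helper: picks a page-level INP value from per-interaction latencies (ms).
// Assumption: one of the slowest interactions is ignored for every 50 interactions,
// which approximates a high percentile on pages with many interactions.
function estimateInp(latenciesMs: number[]): number | undefined {
  if (latenciesMs.length === 0) {
    return undefined; // no interactions, so there is no INP value to report
  }
  // Sort a copy from slowest to fastest.
  const sorted = latenciesMs.slice().sort((a, b) => b - a);
  // Fewer than 50 interactions: nothing is skipped and the worst latency is used.
  const skipped = Math.floor(sorted.length / 50);
  return sorted[Math.min(skipped, sorted.length - 1)];
}
```

With three interactions of 40, 320 and 90 ms, the sketch returns 320. With 60 interactions whose two slowest values are 900 and 400 ms, it skips the single worst one and returns 400.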
How INP is measured
Going back to the basic definitions, an interaction is a set of related input events that are triggered during the same logical user gesture: for example, “tap” interactions on a touchscreen device include multiple events, such as pointerdown, pointerup, and click, each of which can contribute to the overall latency of the interaction.
The latency of a single interaction is the longest duration of any single event that is part of the interaction, where the duration is measured from the time the user interacted with the page until the next frame is presented after all associated event handlers have been executed.
The duration is the sum of the following times:
- The input delay, which is the time between the moment the user interacts with the page and the execution of the event handlers.
- The processing time, which is the total time required to execute the code in the associated event handlers.
- The presentation delay, which is the time between the end of the execution of the event handlers and the presentation of the next frame by the browser.
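The three components above can be sketched in TypeScript as follows. The field names follow the browser’s `PerformanceEventTiming` entries; the local interfaces and the `splitPhases` helper are hypothetical names used only for illustration.

```typescript
// Minimal shape of an Event Timing entry (all values in milliseconds).
// Field names mirror PerformanceEventTiming; the interface itself is local to this sketch.
interface EventTimingLike {
  startTime: number;       // when the user interacted
  processingStart: number; // when the event handlers started running
  processingEnd: number;   // when the event handlers finished
  duration: number;        // from startTime until the next frame was presented
}

interface InteractionPhases {
  inputDelay: number;
  processingTime: number;
  presentationDelay: number;
}

// Hypothetical helper that splits one event's duration into the three phases.
function splitPhases(entry: EventTimingLike): InteractionPhases {
  return {
    inputDelay: entry.processingStart - entry.startTime,
    processingTime: entry.processingEnd - entry.processingStart,
    presentationDelay: entry.startTime + entry.duration - entry.processingEnd,
  };
}
```

For an event that starts at 1000 ms, is handled between 1040 and 1100 ms, and has a duration of 160 ms, this yields 40 ms of input delay, 60 ms of processing time and 60 ms of presentation delay. Browsers coarsen the reported `duration`, so with real entries the parts may not add up exactly.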
In general, a low INP means that the page has consistently been able to respond quickly to all, or the vast majority, of user interactions.
Calculating INP: what are the optimal values
According to Wagner, ascribing labels such as “good” or “poor” to a responsiveness metric is difficult: on the one hand, Google wants to encourage the development of user experiences that offer good responsiveness, but on the other hand, it is necessary to take into account the fact that there is considerable variability in the capabilities of the devices people use, and accordingly set expectations that are truly achievable by selecting a goal that is not impossible to achieve on low-end devices.
In light of this, it is important that the responsiveness metric is appropriate for a wide variety of use cases, and to be certain of achieving this goal, a good threshold to measure is the 75th percentile of page loads recorded in the field, segmented between mobile and desktop devices:
- An INP value of 200 milliseconds or less means the page has good responsiveness.
- An INP greater than 200 milliseconds and less than or equal to 500 milliseconds means that the page responsiveness needs improvement.
- An INP greater than 500 milliseconds means that page responsiveness is poor.
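These thresholds can be expressed as a small TypeScript sketch. The 200 ms and 500 ms boundaries come from the list above; the helper names are hypothetical, and the nearest-rank method used for the 75th percentile is an assumption, since the exact aggregation Google applies to field data is not described here.

```typescript
type InpRating = "good" | "needs-improvement" | "poor";

// Maps an INP value in milliseconds to the rating bands listed above.
function rateInp(inpMs: number): InpRating {
  if (inpMs <= 200) return "good";
  if (inpMs <= 500) return "needs-improvement";
  return "poor";
}

// Hypothetical helper: 75th percentile of per-page-load INP values,
// using the nearest-rank method (an assumption made for this sketch).
function percentile75(values: number[]): number | undefined {
  if (values.length === 0) return undefined;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length); // 1-based rank
  return sorted[rank - 1];
}
```

For four page loads with INP values of 120, 180, 150 and 640 ms, the 75th percentile under this method is 180 ms, which rates as "good" even though one visit was poor.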
However, Wagner’s article warned that, as long as INP remained an experimental metric, the threshold indications could change over time as the metric was fine-tuned.
What’s in an interaction?
It also becomes useful at this point to understand what is meant by an interaction, and Wagner dwells particularly on this aspect.
As for Interaction to Next Paint, an interaction consists of one of the following actions:
- Clicking with the mouse on an interactive element.
- Touching an interactive element on a touchscreen-equipped device.
- Pressing a key on a physical or on-screen keyboard.
An interaction can be composed of multiple events – for example, pressing a key is composed of the keydown and keyup events and touch interactions contain the pointerup and pointerdown events – and all the events of an interaction are part of a so-called logical user interaction.
Each interaction consists of three phases: input delay, processing time, and presentation delay, and the image above shows the phases of a single interaction. The input delay occurs from the time an input is received and can be caused by factors such as blocking activities on the main thread. Processing time is the time it takes for the interaction’s event handlers to execute. At the end of execution there is the presentation delay, which is the time it takes to render and paint the next frame.
The duration of event callbacks associated with an interaction is the sum of the times of the three phases; the event with the longest duration in the logical user interaction is recorded.
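To make the “longest event wins” rule concrete, here is a minimal TypeScript sketch that groups events by interaction and keeps the longest duration for each. It assumes that events belonging to the same logical interaction share a non-zero `interactionId` (as in the Event Timing API) and that 0 marks events that are not part of any interaction; the interface and function names are hypothetical.

```typescript
// Local, simplified view of an event entry for this sketch.
interface InteractionEvent {
  interactionId: number; // assumption: 0 means "not part of an interaction"
  name: string;          // e.g. "pointerdown", "keyup"
  duration: number;      // milliseconds
}

// Returns, for each interaction id, the longest duration among its events.
function latencyPerInteraction(events: InteractionEvent[]): { [id: string]: number } {
  const latencies: { [id: string]: number } = {};
  events.forEach((event) => {
    if (event.interactionId === 0) return; // ignore events outside interactions
    const key = String(event.interactionId);
    const current = latencies[key];
    if (current === undefined || event.duration > current) {
      latencies[key] = event.duration;
    }
  });
  return latencies;
}
```

A tap made of pointerdown (48 ms), pointerup (96 ms) and click (120 ms) would therefore be recorded as a single interaction with a latency of 120 ms.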
Similar to CLS, the INP is calculated when the user leaves the page, resulting in a single value that is representative of the overall responsiveness of the page over the entire page lifecycle. If the page responds quickly to high percentile interactions, it means that interactions at all lower percentiles are also fast.
What happens if there are no interactions
In some cases, the page loads, but no user interactions occur. This can happen for several reasons:
- It is possible that a user loaded the page, but got distracted and never used it.
- The user loaded the page, scrolled through it (this is not an interaction that INP takes into account), but never clicked, tapped, or pressed a key on the keyboard. Perhaps the useful part of the page that the user was looking for required no interaction to reach.
- The page was visited by a bot (e.g., a search crawler or headless browser) that was not programmed to interact with the page.
In all these cases, no INP value will be reported.
Why INP does not evaluate the worst interaction latency
We might ask, at this point, why Google chose to measure Interaction to Next Paint by taking “one of the longest individual interactions” and not the worst interaction latency.
Wagner responds that the worst interaction might be appropriate “for pages with a relatively low number of interactions,” but not all web pages are the same and “some require more interactivity than others, such as a text editor or a video game application than a blog or news site.” For pages with a very high number of interactions in particular, sampling the worst one could be misleading: occasional hiccups occur even on websites that prioritize responsiveness, and such outliers should not define the score of the whole page.
In contrast, by focusing on a high percentile, but not always the highest, it is possible to properly assess whether the majority of a page’s interactions receive a timely response.
The relationship between INP and FID: INP is more reliable
Since its introduction in 2022, INP has attracted the interest of the international SEO community, mainly because this new metric immediately seemed to compete with First Input Delay, representing almost an evolution of it. This was also confirmed by Mocny’s presentation at Google I/O 2022, in which the Developer for Chrome Speed Metrics and Core Web Vitals (!) admitted that “FID has some pretty big blind spots,” adding that “that’s why we are introducing a new experimental responsiveness metric, Interaction to Next Paint,” leaving open even then the possibility that INP would replace (or at least sit alongside) FID within the Core Web Vitals and the Page Experience signals.
Delving into the above from a practical point of view as well, the difference between INP and FID is obvious: First Input Delay takes into account only the first interaction and measures only the input delay, not the processing time of event handlers or the delay in the presentation of the next frame, while in contrast Interaction to Next Paint considers all interactions on the page.
Being a load responsiveness metric, FID applies basic logic that “if the first interaction with a page in the loading phase has little or no input delay, the page has made a good first impression.”
INP goes beyond this “first impression,” however, because it covers the full spectrum of interactions that can occur from the moment the page begins to load to the moment the user leaves the page. By sampling all interactions, it manages to assess responsiveness comprehensively, which “makes INP a more reliable indicator of responsiveness than FID,” Wagner summarizes.
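The difference can be illustrated with a short TypeScript sketch that computes both views from the same set of interactions. The names are hypothetical, and the INP side is simplified to the worst full latency, which matches the definition only for pages with few interactions.

```typescript
// Local, simplified view of one interaction (all values in milliseconds).
interface TimedInteraction {
  startTime: number;       // when the user interacted
  processingStart: number; // when the event handlers started running
  duration: number;        // full latency, up to the next frame
}

// FID-style value: only the input delay of the earliest interaction.
function firstInputDelay(interactions: TimedInteraction[]): number | undefined {
  if (interactions.length === 0) return undefined;
  let first = interactions[0];
  interactions.forEach((item) => {
    if (item.startTime < first.startTime) first = item;
  });
  return first.processingStart - first.startTime;
}

// INP-style value (simplified): the slowest full latency among all interactions.
function worstLatency(interactions: TimedInteraction[]): number | undefined {
  if (interactions.length === 0) return undefined;
  let worst = interactions[0].duration;
  interactions.forEach((item) => {
    if (item.duration > worst) worst = item.duration;
  });
  return worst;
}
```

Consider a page whose first click waits 10 ms before its handlers run and completes in 40 ms, while a later click takes 420 ms overall. The FID-style value is 10 ms, whereas the INP-style value is 420 ms: the slow interaction is invisible to the first metric and decisive for the second.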
The differences between INP and FID and the advantages of INP
Discussing the differences between FID and INP, the three engineers point out that First Input Delay “measures the waiting time from the first user interaction to the time when the browser is able to process event handlers related to the interaction,” but does not include the time it takes to process event handlers, to process subsequent interactions on the same page, or to display the next frame after event callbacks are executed.
However, they note, responsiveness is critical to the user experience throughout the page lifecycle, since users spend about 90 percent of the time on a page after it loads. Moreover, since FID measures precisely only the input delay of the first interaction, it is likely that web developers have not proactively optimized subsequent interactions as part of their improvement process.
This is where INP comes in, which measures the time it takes for a web page to respond to user interactions, from when the user begins the interaction until the next frame appears on the screen. With this new metric, Google hopes “to obtain an aggregate measure of the perceived latency of all interactions in the page lifecycle,” a “more accurate estimate of the loading and execution responsiveness of web pages.”
FID vs. INP: features, calculation and optimizations
The article also presents a valuable summary and comparison table between the characteristics of First Input Delay and Interaction to Next Paint.
In general, INP tends to have lower pass rates than FID, and the difference in how it is measured calls for further code optimization. Specifically, the points to focus on are:
1. What it measures
- FID measures the duration between the first user input and the time when the corresponding event handler is executed.
- INP measures the overall latency of the interaction, using the delay
– of the single slowest interaction, for pages with fewer than 50 interactions.
– of one of the slowest interactions, for pages with more than 50 interactions.
2. What it depends on
- FID depends on the main thread’s readiness to run the event handler needed for the first interaction. The main thread may be blocked because it is processing other resources as part of the initial page load.
- INP depends on the availability of the main thread and the size of the script executed by the event handlers for several interactions, including the first interaction.
3. Primary cause of poor scores
- The optimization process is similar to that of FID for each interaction, but it also requires the use of rendering models that prioritize key UX updates over other rendering tasks.
How to measure INP: tools and techniques for calculating responsiveness
Instead, it is Wagner’s article again that goes into the technical details of how to measure Interaction to Next Paint on a site’s pages, while also offering useful practical tips for correcting situations where the values are not optimal.
First, INP can be measured either in the field or in the lab (with some effort) through a variety of tools.
Among the field tools are:
- PageSpeed Insights.
- Chrome User Experience Report (CrUX).
– via BigQuery in the experimental.interaction_to_next_paint table of the CrUX dataset.
– CrUX API via experimental_interaction_to_next_paint.
– CrUX dashboard.
Wagner cautions, however, that currently collecting INP metrics in the field only works on browsers that fully support the Event Timing API, including the interactionId property.
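As a hedged illustration of that caveat, the following TypeScript sketch checks for Event Timing support before observing interactions. `PerformanceObserver`, its `supportedEntryTypes` list, the `"event"` entry type, `durationThreshold` and `interactionId` belong to the browser API; the two function names are hypothetical, and the observer is accessed through `globalThis` so that the sketch simply returns `false` where the API is missing.

```typescript
// True only when the environment reports support for "event" timing entries.
function supportsEventTiming(supportedEntryTypes: readonly string[] | undefined): boolean {
  return supportedEntryTypes !== undefined && supportedEntryTypes.indexOf("event") !== -1;
}

// Hypothetical wrapper: reports each event that belongs to an interaction.
// Returns false when the Event Timing API is unavailable.
function observeInteractions(
  onEvent: (interactionId: number, durationMs: number) => void
): boolean {
  const ObserverCtor = (globalThis as any).PerformanceObserver;
  if (!ObserverCtor || !supportsEventTiming(ObserverCtor.supportedEntryTypes)) {
    return false; // unsupported browser or runtime: INP cannot be collected here
  }
  const observer = new ObserverCtor((list: any) => {
    list.getEntries().forEach((entry: any) => {
      // Entries without an interactionId are not part of a user interaction.
      if (entry.interactionId) {
        onEvent(entry.interactionId, entry.duration);
      }
    });
  });
  // buffered: true also delivers entries recorded before the observer was created.
  observer.observe({ type: "event", buffered: true, durationThreshold: 16 });
  return true;
}
```

In practice, many sites rely on Google’s web-vitals library, whose `onINP` callback takes care of these details; the sketch is only meant to show why browser support matters for field data.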
Among the lab tools instead, we can use:
- Lighthouse panel in DevTools, available in “Timespan Mode.”
- Lighthouse npm module.
- Lighthouse user flows.
- Web Vitals extension for Chrome.
How to improve the INP value
If our website reports INP values that fall outside the “good” threshold, we can obviously make some interventions to try to improve performance and thus responsiveness.
Fixes begin with identifying the critical time for responsiveness, i.e., whether during page startup or later.
How to improve INP during page startup
According to HTTP Archive, Total Blocking Time (TBT) correlates twice as well with INP as it does with FID. TBT is a lab metric, but if high TBT values are observed in lab tools “it could be a sign of higher INP values” in the field as well.
To improve responsiveness during page loading, we can examine the following solutions:
- Remove unused code using the coverage tool in Chrome’s DevTools.
- Use the performance profiler to find long tasks that can be optimized.
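To see why long tasks matter here, recall that TBT adds up the portion of each main-thread task that exceeds 50 ms. The TypeScript sketch below applies that rule to a list of task durations; the function name is hypothetical, and the durations are assumed to have been read from a performance profile of the loading phase.

```typescript
// Tasks longer than 50 ms count as long tasks; only the excess is "blocking".
const LONG_TASK_THRESHOLD_MS = 50;

// Sums the blocking portion of each task duration (milliseconds).
function totalBlockingTime(taskDurationsMs: number[]): number {
  let total = 0;
  taskDurationsMs.forEach((duration) => {
    if (duration > LONG_TASK_THRESHOLD_MS) {
      total += duration - LONG_TASK_THRESHOLD_MS;
    }
  });
  return total;
}
```

For tasks of 30, 120, 50 and 260 ms, the blocking time is 70 + 210 = 280 ms, almost all of it caused by the single 260 ms task, which is the first candidate to split or remove.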
Improving INP after page startup
But a page’s Interaction to Next Paint can also be affected by what happens after the page has started, because the metric is calculated, as mentioned, from inputs sampled throughout the page’s lifecycle.
When this happens, we can examine some areas for solutions:
- Use the postTask API to prioritize appropriately.
- Schedule non-essential work when the browser is idle with requestIdleCallback.
- Use the performance profiler to evaluate discrete interactions (e.g., activating a mobile navigation menu) and find long tasks to optimize.
- Third-party scripts. Third-party scripts, which are sometimes not needed to process an interaction (e.g., advertising scripts), can block the main thread and cause unnecessary delays. Prioritizing essential scripts can help reduce the negative impact of third-party scripts.
- Multiple event handlers. Multiple event handlers associated with each interaction, each running a different script, can interfere with each other and cause long delays. Some of these activities may be nonessential and may be scheduled on a web worker or when the browser is idle.
- Prefetching. Aggressive prefetching of the resources needed for subsequent navigations can be a performance benefit, if done well. However, if you prefetch and render SPA routes synchronously, you can end up hurting INP, as all this expensive rendering tries to complete in a single frame. Compare this with not prefetching the route and instead starting the necessary work (e.g., fetch()) and unblocking paint. It is therefore worth reviewing whether the framework’s approach to prefetching provides an optimal UX and how (if at all) this may affect INP.
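Several of these suggestions come down to the same idea: break long work into small pieces and give control back to the browser between them. Here is a minimal TypeScript sketch of that pattern. The `runInChunks` helper and its parameters are hypothetical; the default scheduler uses a zero-delay `setTimeout`, and a scheduler based on `requestIdleCallback` or the postTask API could be passed instead where those are available.

```typescript
// A scheduler decides when the next chunk runs.
type Scheduler = (callback: () => void) => void;

// Default: a zero-delay timeout, which lets pending input and rendering run first.
const yieldWithTimeout: Scheduler = (callback) => {
  setTimeout(callback, 0);
};

// Hypothetical helper: processes items in small chunks, yielding between chunks.
function runInChunks<T>(
  items: T[],
  chunkSize: number,
  work: (item: T) => void,
  onDone: () => void,
  schedule: Scheduler = yieldWithTimeout
): void {
  const size = Math.max(1, Math.floor(chunkSize)); // guard against zero or negative sizes
  let index = 0;
  function runNextChunk(): void {
    const end = Math.min(index + size, items.length);
    for (; index < end; index++) {
      work(items[index]);
    }
    if (index < items.length) {
      schedule(runNextChunk); // yield before continuing with the next chunk
    } else {
      onDone();
    }
  }
  runNextChunk();
}
```

Smaller chunks keep each task short, so an interaction that arrives in the middle of the work waits for one chunk at most rather than for the whole list. The trade-off is that the total work takes somewhat longer to finish.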
Ultimately, then, to achieve a good INP score, developers will need to focus on reviewing the code that is executed after each interaction on the page and optimize chunking, re-hydration, loading strategies, and the size of each render() update for both first-party and third-party scripts.
We welcome Interaction to Next Paint
According to Googlers, the INP score is intended to be a “better compass for websites” to improve responsiveness and performance, establishing a new level in measuring page responsiveness as truly perceived by users.
Initially, the international community reacted with curiosity but no apparent frenzy to the news, partly in light of their experience with Page Experience and Core Web Vitals, which ultimately proved to be a ranking factor of little impact (at least perceived) despite the lofty premises.
Now, however, things are getting more serious, and we have less than a year to figure out how to optimize our pages for this new metric, with the goal of making our sites increasingly high-performing and (in this case) actually responsive.