Web Performance: Leveraging the Metrics that Most Affect User Experience (Google I/O ’17)

[MUSIC PLAYING] SHUBHIE PANICKER: Hi, everyone. My name is Shubhie Panicker. I’m an engineer
working on Chrome. PHILIP WALTON: And
I’m Philip Walton. I’m an engineer working
on the web platform team. SHUBHIE PANICKER:
Over the last year, we’ve been part of our
metrics team on the web platform, developing a set
of new metrics and APIs that are user-centric
in that they capture user-perceived performance. We’ve developed a
framework for thinking about user-perceived
performance that we want to share with you today. And Phil and I are really
excited to be here sharing these metrics and APIs with you. In our past lives, we’ve
been web developers, and we understand the pains from
gaps in real world measurement. And before Google, I worked on
web frameworks for apps like Search, Photos, G+, et cetera. PHILIP WALTON: And before
working on the web platform team, I worked on
Google Analytics. So I know a lot about– and I’ve seen a lot of
the challenges around– tracking performance
in the browser. SHUBHIE PANICKER: So this is
the goal of our talk today– to help you answer
this question– how fast is my web app? You’ve certainly
asked yourself this. And this may seem like a
straightforward question. But the problem is
that “performance” and “fast,” these
are vague words. What does fast mean
in what context? Fast means different things
for navigation, clicking, scrolling, or animations. So what is performance? And what is fast
in these contexts? And fast for whom, exactly? PHILIP WALTON: Right. The truth is
performance is hard. We all know this. And for web developers, it’s
harder than it should be. That’s one of the reasons
we’re talking about this. There’s a lot of tips and tricks
that you might have heard. And when not implemented
or understood in the right context, they can
sometimes make things worse. So in this talk we
don’t want to give you more of these tips and tricks. We want to talk about a way
to think about performance, a framework, a mental model
for understanding performance measurement. And then the hope is that once
you understand this model, you have a lot more
tools at your disposal to solve performance problems
yourself in your own app. But before we do that,
let’s talk about some myths and misconceptions
around performance today. So I would say this is probably
the most common myth that I hear– some variation
of this sentence– “I tested my app, and it
loads in x.xx seconds.” So the reality is that
your app’s load time is not a single number. It’s the collection
of all the load times from every individual user. And the only way
to fully represent that is with a distribution
like the histogram you see here. In this chart, the numbers along
the x-axis show load times. And the height of the
bars on the y-axis show the relative
number of users who experience the load
in that particular bucket. As you can see, while the
largest buckets and the most users were between
maybe 1 and 2 seconds, there were many, many users who
experienced much longer load times. And it’s important to not
forget about these users. So this pattern toward the right
is often called the “long tail.” And unfortunately, it’s very
common in the real world. SHUBHIE PANICKER: And this
histogram actually illustrates the difference between
measuring performance in two very different contexts. And these contexts
are measurement in the lab versus measurement
in the real world. And by lab, I mean great tools
like DevTools, Lighthouse, WebPageTest, other continuous
integration environments you might have set up. Lab is important. It gives you a sense for
how your changes are going to behave in the real world. It helps you catch
regressions before they hit your live production site. And they give you deep
insight and breakdowns, so you can track down
and fix problems. So lab is super important. It is necessary. But lab is not sufficient. Real world measurement, on
the other hand, is messy. Real devices, various
network configurations, cache conditions, all of
these different conditions for real users are impossible
to simulate in the lab. Real user measurement
helps you understand what really matters to your users. It helps capture
their actual behavior, which may be different from
your assumptions or your lab settings. So to really answer the question
of, how fast is my app?, it’s important to measure
this in the real world. So in our talk
today, we will focus on real world measurement. PHILIP WALTON: So coming back
to this myth for a second, there’s another reason why
the statement is problematic– the question, when
exactly is load? Is an app loaded when the
window load event fires? Does that event really
actually correspond to when the user thinks
the app is loaded? So I’d argue that load is
not any one single metric. It’s an entire experience. And so it can’t
be repre– sorry. I meant to say it’s
not one single moment. It’s an entire experience. And it can’t be represented
by just one metric. So to better understand and
illustrate what I mean by that, I want to show you an example. I’m going to play a video of
the YouTube web app loading on a simulated slow network. And I want you to pay attention
to how the video loads, the app loads, and notice that things
are coming in one by one. So can we play the video? OK. So think about how
that felt. And now, I want to play the second video. And I want you to pay
attention to how you feel watching the second video. Think about the experience. Can we play the second video? So that feels
different, doesn’t it? I bet some of you were not sure
if the video was even playing. And that’s the point. When you don’t give that
feedback to the user, it makes them feel something. So these two videos– as
I’m sure you guessed– load in the exact same
amount of time. But the first one seems faster. At least, it feels nicer because
things come out right away. It’s like if you
went to a restaurant and you sat down at a
table, waited for an hour, and then they brought you your
drinks, appetizers, entree, dessert, check, and dinner
mint all at the same time. That would feel weird. You would wonder why they
waited until the very end. So again, you might look at
this and then you might think, OK, well, we should optimize
for the first initial render. Get content there
as soon as possible. That’s what this proved, right? And again, that’s–
sometimes that’s true. But that’s not always true. Sometimes when you
do that, you can make things worse in some
cases and cause other problems. So I’m going to play
another example– a real life example– from Airbnb’s mobile web site. And so for context,
I know personally that the Airbnb engineering team
cares deeply about performance and user experience. And they try to make their
pages as fast as possible. So one way they do
this is use server side rendering to
deliver all the content in the initial request. And it shows because the
page loads really fast even on a slow connection. The problem is that on slower
devices that take longer to execute JavaScript,
the page is rendered, but it’s not usable for
a couple of seconds. And you can see
that in the videos. Can we play the third video? So as you can see,
the user here tried to click a few times
in the search bar and then nothing was happening. And it wasn’t until maybe
the sixth click or so that the component pane
from the top scrolled down. And so to be clear,
this video is from a simulated slow device. It doesn’t represent the
majority of their users. But Airbnb is committed to
providing a good experience for all of their users. And they wanted to fix this,
and they care about this, and so they’re currently working
on a fix to this problem. And I just want to
mention on a personal note that I’m really happy and glad
that Airbnb was willing to let us show this to you. I think it’s cool that they
want other developers to learn from their experience. So can we go back to the slides? All of these examples
that I just showed illustrate why you
shouldn’t measure load with just one single metric. Load is an experience and
you need multiple metrics to even begin to capture it. SHUBHIE PANICKER: So this
is another commonly held misconception, “you only need
to care about performance at load time.” Now, loading is super important,
but it’s certainly not everything. And historically, we’ve
all fallen into this trap of narrowly focusing on load. And part of it is just our
own developer outreach. Our tools focus pretty much
exclusively on loading. The reality is that there’s
lots of other interactions that happen long after
load, all kinds of clicks, taps, swipes, scrolls. Think of all the time you spend
on news sites, in your email, on Twitter or Amazon. Load is a really small fraction
of this overall user session. And users associate performance
with their entire experience. And unfortunately,
the worst experiences stick with them the most. So this is a summary of
the problems that we’ve highlighted today so far. Real world metrics
are a distribution. They should be seen
on a histogram, not as an individual number. Load is an experience. It cannot be captured with
a single moment or a single metric. Third, interactivity is
a crucial part of load, but it’s often neglected. And finally, responsiveness is
always important to users way beyond load time. So these are the questions that
we want you to ask us today. And these are the
questions that we hope we can answer for
you as part of this talk. User perceived
performance is important. What are the metrics that
accurately reflect this? How can we measure these
metrics on real users? How can we integrate these
measurements to understand how well our app is doing? And finally, how to
optimize and prevent regressions going forward? So in this segment
of the talk, we want to talk about these new
metrics and the basic concepts underlying them. So we’ve all used
traditional metrics like DOMContentLoaded and Window
OnLoad to measure load time. The problem is that
they don’t really correspond to the user’s
experience of load. They have almost nothing to do
with when the users saw pixels on the screen. For example, a CSS style
might be hiding the content when DOMContentLoaded fires. And even if the
content is rendered, interaction can be blocked. The JavaScript
might not be there to hook up a critical
handler, for example. And these old metrics
completely ignore interaction, even though we know that
interaction is super important for modern web apps. PHILIP WALTON: So what
are the key experiences that matter to users and
shape their perception? I think it’s helpful to
frame these as questions that the user might be asking. Is it happening? So did the navigation
start successfully? Has this server responded? Is there anything that indicates
to the user that it’s working? And then is it useful? Has enough content rendered
that the user can actually engage with the page? And once content is rendered,
is the content usable? Can they interact with it? Is it blocked? Is something preventing that
interaction from happening? And finally, is it delightful? Are the interactions smooth,
natural, free of lag or jank. And is the overall
experience good? So now let’s look at
how these questions map to measurable metrics. Here’s an illustration of
a page’s load progress. So the first frame over there
is just the blank white screen before the browser
has loaded anything. The second frame represents
the first paint metric. It’s the point at which anything
is painted to the screen that the user can see,
anything different from what the screen looked like
before the response. The third frame shows First Contentful Paint– the second metric. It’s when any of the
content is painted. And by content, I mean
something in the DOM. It doesn’t just have to be text. It could be images, or
canvas, or SVG, something in the DOM that’s
painted to the screen. In the third– or I should
say, the fourth– frame, you see some more
stuff coming in. But it’s not quite enough
content to be meaningful. And then you get to First Meaningful Paint in the fifth frame, where
the user can actually engage with the content. Enough stuff is rendered
that the user can– what they came for is here and
they can start consuming it. And then finally,
the last metric, Time to Interactive, is when
the page is both meaningfully rendered and
usable, meaning it’s capable of receiving
input and responding in a reasonable amount of time. SHUBHIE PANICKER: So Phil
said that the First Meaningful Paint is when the page is
useful and the user can engage. This is when the primary content
of the page has rendered. But what is primary content? Which elements, exactly? Now, not all elements
on the page are equal. There are some elements
that are important. We call them Hero Elements. And when these Hero
Elements are rendered, you have arrived at the user
moment of, it is useful, and the user can meaningfully
engage with the page. So here are some
examples to show you what I’m talking about. These are Hero Elements
for some popular sites. So for YouTube, we think
on the YouTube watch page, the Hero Element is likely the
thumbnail of the primary video and the Play button. For Twitter, it is likely
the Notifications Count and that first tweet. For the Weather app, it’s
probably the primary weather content, even though
there might be tons of other stuff on the page. So when these Hero
Elements have rendered, this corresponds with first
meaningful paint and the, it-is-useful user moment. PHILIP WALTON: And
you might notice that some of these Hero
Elements are constant based and some of them are more
interactive components. Like in YouTube, for
example, the Hero Element is rendered when the
thumbnail is loaded and the Play button is visible. But it’s probably not actually
usable until the JavaScript that controls the Play button
has run and enough of the video has buffered to actually
be able to start playing. If Hero Elements
are interactive, then not only does
rendering them matter, but also when it becomes usable– when its TTI is. However, there are times, as we
mentioned, when interactivity can be blocked. SHUBHIE PANICKER: So to
understand why important elements might be blocked
and not interactive, think about a time when you
were in a long line somewhere– let’s say at the grocery
checkout or the bank– you’re standing in line, and
there’s one or two customers who are confused,
or they’re angry, and they hold up the line
causing a long delay. This is what long tasks do
on the browser’s main thread. These are tasks that run long. They occupy the main
thread for a long time. And they basically block
all the other tasks in the queue behind them. And scripts are the
most common cause of long tasks, all the
work that scripts do, in terms of parsing,
compilation, evaling, et cetera. So if you’ve used
DevTools, you’re familiar with all the primary
type of work, Style, Layout, Paint, Script. It turns out all of this
happens on the main thread. And it also so happens
that most interactions– things like taps, clicks,
and even animations– typically also need
the main thread. So you can see how
this can be a problem. A long script is running,
hogging the main thread, and the user is
trying to interact. And these interactions are
basically waiting in the queue. And this manifests as jank
to users, as delays in click, jank in scrolling,
or jank in animation. So you might wonder
how long is long? What is long? And so we define long
to be 50 milliseconds. Scripts should be broken
into small enough chunks so that even if
the browser is busy and a user happens to
interact, the browser should be able to finish
what it’s doing and service those
inputs, that interaction. And so 50 millisecond
chunks will ensure that the RAIL
guidelines for responsiveness are always met. Now, you might have heard
a lot about 60 FPS and 16 milliseconds. And some of you might wonder
why isn’t this 16 milliseconds? And so the reason is,
yes, if you are animating, then 16 milliseconds
is important. But animation issues are a small
subset of responsiveness issues at large on the web today. And if you know
you are animating, then yes, you have to
share the 16 milliseconds budget with the browser. Now, long tasks are
the cause of most of the responsiveness
issues on the web today. And scripts are, by far,
the most common cause of long tasks. PHILIP WALTON: So just
to recap, this table shows how each of these metrics
map to the user question from before. So the question, “is it
happening?” maps to the metrics for First Paint, First
Contentful Paint. “Is it useful?” maps to
the First Meaningful Paint and the Hero Element timings. “Is it usable?” maps
to Time to Interactive. And then the last one,
“is it delightful?” maps to what Shubhie just
mentioned, long tasks, or maybe more accurately, the
absence of long tasks. SHUBHIE PANICKER: So
you might be wondering how metrics like
First Meaningful Paint or Time to Interactive
can work for every app. And you’d be totally right. One size cannot fit all. We’ve actually
spent a lot of time in our metrics team
trying to develop these generic,
standardized metrics that work for every app. And what we’ve
learned is that it’s incredibly hard to do that. And that also makes it
hard to standardize. That said, there is value in
these generic, standardized metrics. And so these are baseline metrics that work for the majority of cases– let’s say, 70% to 80% of apps out there. And we have made such metrics
available in our tools. You might see them in
Lighthouse, DevTools, and WebPageTest. And we are working to
consolidate these definitions. Down the road, we
expect analytics to start surfacing
variants of these metrics. The main thing to understand
for these out-of-the-box, generic metrics is that you shouldn’t assume that they accurately capture the “is it useful” and “is it usable” moments for your apps. Try them out. See how well they work for you. And when it comes to
real user measurement, we encourage you to
supplement these metrics with your own
custom user metrics. Or customize these metrics
and make them your own. Make sure that they work
really well for your app. And we’ll show you specific
tips for doing that later. PHILIP WALTON: So now that
we understand and have these metrics, the
question is, how do we get these in JavaScript? That’s what we need in order to measure them on real users. Historically, we’ve
used– like we said– metrics like DOMContentLoaded
and WindowLoad primarily because they were easy
to get in JavaScript. I assume every
web developer here knows how to find out
when WindowLoad happens or when DOMContentLoaded
happens. But these other metrics have
traditionally been a lot harder– sometimes impossible– to get in JavaScript. And trying, sometimes, to find
them can lead to problems. This code sample shows how
you would detect long tasks before these new metrics. And this is kind of a hack. So what this code is doing
is it’s effectively making a requestAnimationFrame loop. It’s measuring frame
after frame after frame. And it’s comparing the time
stamps from the current frame to the time stamp on
the previous frame. And if this is longer
than 50 milliseconds, then it’s considering
it to be a Long Frame. But there’s a lot of
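That requestAnimationFrame hack might be sketched like this– a reconstruction for illustration, not the exact code from the slide:

```javascript
// Reconstruction of the rAF-loop hack: compare the timestamps of
// consecutive frames; a gap over 50ms implies the main thread was
// blocked during that frame (a "Long Frame").
const LONG_FRAME_THRESHOLD_MS = 50;

function isLongFrame(prevFrameTime, currentFrameTime) {
  return currentFrameTime - prevFrameTime > LONG_FRAME_THRESHOLD_MS;
}

function detectLongFrames(onLongFrame) {
  let prevFrameTime = performance.now();
  requestAnimationFrame(function frame(currentFrameTime) {
    if (isLongFrame(prevFrameTime, currentFrameTime)) {
      onLongFrame(currentFrameTime - prevFrameTime);
    }
    prevFrameTime = currentFrameTime;
    // The loop itself runs on every single frame, which is exactly
    // the overhead problem described next.
    requestAnimationFrame(frame);
  });
}
```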
problems with this method. I mean it works, but it
adds a lot of overhead. It prevents the browser from ever going idle. It’s not great for battery life. And it doesn’t even tell you
the source of the problem. You don’t know– you might know
that there was a Long Frame. And so you can assume
there was a Long Task. But you don’t know what
script caused that Long Task. And this isn’t just a
hypothetical example. This pull request
on the AMP project is basically them
taking that code out because they
realized that it was more trouble than it was worth. The number one rule of
performance measurement code is that you shouldn’t be
making your performance worse by trying to figure out how
good the performance is. So these hacks show the
need for real APIs built into the browser so
the browser can tell us when performance is bad. SHUBHIE PANICKER: So
web performance APIs are the browser solution
to real world measurement. These are standardized APIs. So they’re available in multiple
browsers, not just Chrome. And when available,
we definitely recommend that you
use these APIs. In practice,
though, you will use a combination of these APIs,
as well as your own JavaScript Polyfills. And the reason why
Polyfills are necessary is because the implementation
timeline on browsers will vary. And we are asking you to
customize and supplement these metrics. So these are the core
building blocks, as we see it, for web performance. We have High
Resolution Time which you might be familiar with from
your use of Performance.now. The Performance Observer
is an important piece. It replaces the old
Performance Timeline. And it overcomes
its limitations: there’s no polling, it’s a low-overhead API, and it avoids race conditions from a shared buffer. So this is what the usage
of Performance Observer looks like. And it also happens
to be the code that replaces the hack
that Phil showed you just a little bit earlier. So Performance Observer usage
is fairly straightforward. You create a Performance
Observer and pass it a callback. And then you call observe, expressing interest in certain entry types. And as entries of that
type become available, the callback is
invoked asynchronously. And there are many
different entry types. Long Tasks is what we
show in this example. But this could just as well
have been Resource Timing or Navigation Timing or
Paint Timing, which is a new metric we’ve introduced. This also serves as
a really good example of Long Task usage. You can basically use
this code to understand responsiveness
issues on your app. The callback is
called asynchronously when the main thread is
observed to be busy for more than 50 milliseconds at a time. And Long Task is
available in Chrome Stable today, so I encourage
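That usage might look like the following sketch– the `describeLongTask` helper and the feature-detection guard are additions for illustration:

```javascript
// Report tasks that blocked the main thread for more than 50ms.
function describeLongTask(entry) {
  return `Long task at ${Math.round(entry.startTime)}ms, duration ${Math.round(entry.duration)}ms`;
}

// supportedEntryTypes is available in newer browsers; in older ones
// you could wrap observe() in a try/catch instead.
if (typeof PerformanceObserver !== 'undefined' &&
    (PerformanceObserver.supportedEntryTypes || []).includes('longtask')) {
  const observer = new PerformanceObserver((list) => {
    // Invoked asynchronously as new long-task entries are recorded.
    for (const entry of list.getEntries()) {
      // entry.attribution (where available) points at the responsible frame.
      console.log(describeLongTask(entry));
    }
  });
  observer.observe({entryTypes: ['longtask']});
}
```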
you to try it out. PHILIP WALTON: So this table
shows what our recommendation is for how you would track these
metrics into your applications. And just to reiterate,
having these tracked in your
applications is what allows you to measure these
metrics on your real users, not just running it in the lab. So First Paint and
First Contentful Paint can be measured with Performance
Observer with the Paint entry type. This is available in
Chrome Canary today. Long Task can be measured
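A sketch of that Paint Timing usage– the guard is an addition so the code is a no-op where 'paint' entries are unsupported:

```javascript
// Observe First Paint and First Contentful Paint.
function describePaint(entry) {
  // entry.name is 'first-paint' or 'first-contentful-paint'.
  return `${entry.name}: ${Math.round(entry.startTime)}ms`;
}

if (typeof PerformanceObserver !== 'undefined' &&
    (PerformanceObserver.supportedEntryTypes || []).includes('paint')) {
  new PerformanceObserver((list) => {
    list.getEntries().forEach((entry) => console.log(describePaint(entry)));
  }).observe({entryTypes: ['paint']});
}
```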
with Performance Observer also since Chrome 58. That’s Chrome Stable right now. For Hero Elements, it’s
a little bit trickier. Because you have to identify
what your Hero Elements are. And you basically have to
write some code to figure out when that’s visible. And I should mention that
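For instance, if your Hero Element were an image, one hedged sketch might look like this– the 'hero-image' id and the double requestAnimationFrame heuristic are assumptions, not a standard API:

```javascript
// Hypothetical hero-element timing: mark the time when the hero image
// has loaded and (approximately) been painted. Two nested animation
// frames roughly approximate "the pixels are on screen".
function markName(heroId) {
  return `hero-rendered:${heroId}`;
}

function markHeroRendered(heroId) {
  requestAnimationFrame(() => {
    requestAnimationFrame(() => {
      performance.mark(markName(heroId));
    });
  });
}

if (typeof document !== 'undefined') {
  const hero = document.getElementById('hero-image'); // assumed id
  if (hero) {
    hero.addEventListener('load', () => markHeroRendered('hero-image'));
  }
}
```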
along with this talk, I’m going to be
publishing an article on developers.google.com/web
very soon. It will be up when
this video goes up. That goes into more detail on
how to do all of these things. So you don’t have to worry
about if you’re taking notes or whatever. Also I should mention that
we’re working on a native API to make this easier,
where you could annotate, tell the browser what
are the Hero Elements. And then the browser would
tell you when they’re loaded– or when they’re rendered. For First Meaningful
Paint, at this point, before we develop a
standardized metric, we think you should use Hero
Element timing as a substitute for First Meaningful Paint. The First Meaningful Paint
metric is, like we said, very generic. It tries to be one-size-fits-all. Hero Element timing is specific to your site. And so it will always be more
accurate than First Meaningful Paint. And finally, TTI, we released
the Polyfill today, actually. For the TTI Polyfill,
it’s on GitHub. And you can go try
it out right now. To give an example of
what the usage looks like, you essentially import
the module in JavaScript, and then you call the Get
First Consistently Interactive method. And that returns a promise. And the promise resolves
to the TTI metric value in milliseconds. And then once you have that,
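Sketched in code– `sendToAnalytics` here is a placeholder reporting function, not part of the Polyfill:

```javascript
// Usage sketch of the TTI Polyfill. Loaded via a script tag it exposes
// a `ttiPolyfill` global; it can also be imported as a module.
function sendToAnalytics(metricName, valueMs) {
  const rounded = Math.round(valueMs);
  // Real reporting would go here, e.g.:
  // navigator.sendBeacon('/analytics', JSON.stringify({[metricName]: rounded}));
  return rounded;
}

if (typeof ttiPolyfill !== 'undefined') {
  ttiPolyfill.getFirstConsistentlyInteractive().then((tti) => {
    sendToAnalytics('TTI', tti); // tti is in milliseconds
  });
}
```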
you can send it to analytics. To get a sense for
what the Polyfill does, I should mention that the Get
First Consistently Interactive method takes an options object. You configure it for your site. And what you can do is you
can pass it a lower bound– the Polyfill won’t search for the quiet window before that point. By default, it’s
DOMContentLoaded. But you can give it a
better metric for your site. So the way this works is
you have the main thread with Long Tasks and Short Tasks. And you have the
Network Timeline. And then you have
your lower bound, which, by default,
is DOMContentLoaded. What the Polyfill does is it
uses these Resource Timing and Long Task entries
to search forward in time for a quiet
window of 5 seconds– at least 5 seconds– where there
are no Long Tasks and no more than two network requests. Basically it’s saying, once
we get to that quiet window, we think that the app is
most likely interactive now. And then it considers the moment of interactivity to be the end of the last Long Task. So that’s a bit of how
this Polyfill works. Again, you can pass it a custom
lower bound for your site. And one example of what
you might want to use is the Hero Element timing. That would be a great example. You also might want
to pass, basically, the moment all of your
event handlers are added. Because if your event handlers
have not been added yet, the site is probably
not interactive yet. SHUBHIE PANICKER:
So Phil showed you how Long Task can push out
your Time to Interactive. But there’s lots of
other interactions that we’re asking
you to care about, way beyond loading– like clicks and taps. And delays in these can
basically cause pretty bad user experiences. So you’ve probably
wanted to know when these important
events are delayed. And ideally, there would be a
first-class platform API that would answer this question. And we actually are
working on such an API. But today you can actually
use this code sample to understand the gap. You can basically use the
difference of event.timeStamp and the current time in your event handler. Now, event.timeStamp is our best guess of when the event was created. So this can be the hardware timestamp– our best guess of when you actually tapped the screen. And this difference
will tell you how long the event was
spending waiting around on the queue for
the main thread. Now here, if that difference
is more than 100 milliseconds, we send it to Analytics. Now, we haven’t shown this
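That check can be sketched as follows– the 100-millisecond threshold is the one just mentioned, and the reporting is left as a console.log placeholder:

```javascript
// Measure how long a click event waited in the queue before our
// handler ran, and report delays over 100ms.
const DELAY_THRESHOLD_MS = 100;

function queueingDelay(eventTimeStamp, handlerTime) {
  // How long the event sat waiting for the main thread.
  return handlerTime - eventTimeStamp;
}

if (typeof document !== 'undefined') {
  document.addEventListener('click', (event) => {
    const delay = queueingDelay(event.timeStamp, performance.now());
    if (delay > DELAY_THRESHOLD_MS) {
      // Send to your analytics service here.
      console.log('Click delayed by', Math.round(delay), 'ms');
    }
  });
}
```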
here, but you can also correlate this back to
your Long Task Observer. You can actually look at what
Long Task happened in this time when my event was
blocked, waiting. And those are
likely the culprits. PHILIP WALTON: So once you’ve
measured these key metrics and sent them to some
analytics service, you want to report on them to see how you’re doing. That will allow you to better answer the question, is your app fast? So this is just one
example of a histogram that I threw together
from TTI data for an app that I maintain
using the Polyfill that we just showed you. And the point is not to look at
these numbers or compare them, but the main point
that I want to make is when you’re tracking
your performance metrics in your
analytics tool, then you can drill down
by any dimension that your analytics
tool provides. So in this case, we can see the
difference between Performance on Desktop versus Mobile. You might also want to consider
the difference between one country from another country,
or geographic locations where maybe network availability
is not as great, or network speeds
are not as high. It’s important to know how
those differences manifest in the real world on real users. In cases where you can’t
show a whole histogram, I recommend using
percentile data. So you could show the
50%, the median number. You can also show things
like the 75th percentile, the 90th percentile. These numbers give a
much better indication of what the distribution
was, and they’re much better than just averages
or just one single value. So a really important question
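Computing those percentile values from raw samples is straightforward; here is a minimal nearest-rank sketch, with made-up load times:

```javascript
// Percentiles from raw metric samples (nearest-rank method).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Hypothetical load-time samples, in milliseconds.
const loadTimes = [1200, 850, 3400, 950, 7800, 1100, 2300, 1500];
const median = percentile(loadTimes, 50); // 1200
const p90 = percentile(loadTimes, 90);    // 7800 - the long-tail view
```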
is, do performance metrics correlate with business metrics? And again, if you’re tracking
your business metrics and your performance metrics in the same analytics tool– and this shows the value of tracking this stuff on real users– then you can see and you
can answer this question. All the research that
we’ve done at Google suggests that good performance
is good for business. But the really important
thing is, is this true for your users
for your application? So some example questions
you might want to know, do users who have
faster interactivity times buy more stuff? Do users who experience more
Long Tasks during the check out flow drop off
at higher rates? These are important questions. And once you know the
answers to these questions, you can then make
the business case for investing in performance. I hear a lot of
developers saying they want to invest
in performance, but somebody at the
company won’t let them or won’t prioritize it. This is how you can
make that a priority. And finally, we haven’t
talked about this yet, but you may have been wondering. All of the data
we’ve been showing is for real users who
made it to interactivity. And you probably know some
users don’t make it there. Some users get frustrated with
the slow-loading experience and they leave. And so it’s important to
also know when that happens. Because if it happens 90% of
the time, the data that you have will not be very accurate. And so you can’t know
where the TTI value would have been for one
of those users, but you can measure
how often this happens. And perhaps more
importantly, you can measure how long they
stayed before they left. SHUBHIE PANICKER:
So we’ve discussed a lot of specific
metrics and APIs, and we’ve shown
you code samples. And so now we want to
back up a little bit and provide some
higher level guidance on how to best leverage
these metrics and APIs. So one great thing
about everything we’ve introduced today
is that all of these are user-centric
metrics and APIs. So by definition,
improving these will improve your
users’ experience. So the first piece of wisdom
is drive down First Paint and First Contentful Paint. And all of the traditional
wisdom for fast loads applies here. Remove those render-blocking
scripts from Head. Identify the minimum
set of styles you need and inline
them in Head. You might have heard of
the app shell pattern. That helps improve
user perception. The idea there is to
very quickly render the header and any sidebars. Now First Paint and First
Contentful Paint are important, but they are certainly
not sufficient. It’s really important to
improve your overall load time. So it’s not just enough to be
off to a good start in a race. It’s really important to make
it past that finish line. And Time to Interactive
is the finish line for loading for
interactive apps. So more specifically,
minimize the time between First Meaningful
Paint and Time to Interactive. We saw in the Airbnb demo
it was important for users to interact with
that search box. Now, to shorten your
Time to Interactive, identify what is the primary
interaction for your users. Don’t make assumptions here. Do they tend to browse? Or do they tend to interact with
a certain element right away? And then figure out what is
the critical JavaScript that’s needed to power that
interaction and make the JavaScript
available right away. One common culprit
we’ve seen are large, monolithic JavaScript bundles. So splitting up JS
like code splitting will take you a long way here. And the PRPL pattern fits in
here– specifically the first two letters, the P and R of PRPL. Ideally, ship less JavaScript. But if not, at least
defer the JavaScript. There’s tons of JavaScript
that the user is never going to need, all those
pages that they are not going to visit, all the
features that they’re not going to interact with. If there is a
widget in the Footer that’s below the fold or that they’re unlikely to interact with, defer all of that JavaScript.
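One way to sketch that deferral: load the widget’s JS only when its element nears the viewport. The observer implementation is injected so the helper is easy to test; in the browser you’d pass the real IntersectionObserver, and `loadWidget` would be a dynamic import:

```javascript
// Defer a below-the-fold widget's JS until the element nears the viewport.
// `loadWidget` stands in for something like () => import('./widget.js').
function loadWhenVisible(el, loadWidget, ObserverImpl) {
  const observer = new ObserverImpl((entries, obs) => {
    for (const entry of entries) {
      if (entry.isIntersecting) {
        obs.disconnect();   // only load once
        loadWidget();       // fetch and run the widget JS now
      }
    }
  }, { rootMargin: '200px' }); // start just before it scrolls into view
  observer.observe(el);
  return observer;
}

// In the browser (hypothetical element id):
// loadWhenVisible(document.querySelector('#footer-widget'),
//                 () => import('./widget.js'), IntersectionObserver);
```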
The third thing we have is reduce Long Tasks. Cracking down on Long Tasks will really help responsiveness in your app overall. However, if you really
need to prioritize, at least think about Long Tasks that are in the way of those really critical interactions. On load, it’s Long Tasks
that are pushing out Time to Interactive
or Long Tasks that are in the way
of the checkout flow and other important
interactions for your app. Scripts are, by far, the
biggest culprits here. So breaking up scripts will certainly help.
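For instance, a long, script-driven loop can be broken into short slices that yield back to the browser between runs. This is a sketch; the 40ms budget is arbitrary, chosen to stay under the 50ms long-task threshold:

```javascript
// Process items in short slices, yielding between slices so input and
// rendering can happen. `schedule` defaults to setTimeout; in a browser
// you might use requestIdleCallback instead.
function processInChunks(items, fn, schedule = cb => setTimeout(cb, 0)) {
  return new Promise(resolve => {
    let i = 0;
    function slice() {
      const deadline = Date.now() + 40; // stay under the 50ms threshold
      while (i < items.length && Date.now() < deadline) fn(items[i++]);
      if (i < items.length) schedule(slice); // yield, then continue
      else resolve(i);
    }
    slice();
  });
}
```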
And it’s not just about breaking up scripts on initial load. Scripts that load on single
page app navigations– like going from the Facebook
Home page to the Profile page, or clicking around like on
the Checkout for Amazon, or the Compose button in Gmail– all of this JavaScript
needs to be broken up so it doesn’t cause
responsiveness issues. And the final thing
we have for you today is holding third
parties accountable. Ads and social widgets
are known to cause the majority of Long Tasks. And they can undermine all of
your hard work on performance. You might have
done a ton of work to split out all
your code carefully, but then you embed a
social plug-in or an ad, and they undo all of that work. They get in the way of
critical interactions. So to get an idea of
this, we’re actually doing a partnership with SOASTA,
a major analytics company. And so they’re doing a
bunch of case studies. And there’s some preliminary
data that came in. They picked a couple of their
sites, their customers who had third-party content. And on the first site, they
found that 93% of Long Tasks were because of ads. On the second site, they
found 62% of Long Tasks were about evenly split
between ads and social widgets. Now, the Long Tasks API actually gives you enough attribution to implicate these third-party iframes. So we encourage you to use the Long Tasks API. Find out what damage these third parties are doing on your apps.
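As a sketch, the attribution on longtask entries can be tallied per container. The field names (`duration`, `attribution`, `containerSrc`, `containerName`) follow the Long Tasks API, but the summarizer takes plain objects so it can run anywhere:

```javascript
// Tally long tasks by the container (e.g. a third-party iframe) that
// the Long Tasks API attributes them to.
function summarizeLongTasks(entries) {
  const byContainer = {};
  let blockingTime = 0;
  for (const entry of entries) {
    // Only time beyond 50ms actually blocks responsiveness.
    blockingTime += Math.max(0, entry.duration - 50);
    for (const attr of entry.attribution || []) {
      const key = attr.containerSrc || attr.containerName || '(same origin)';
      byContainer[key] = (byContainer[key] || 0) + entry.duration;
    }
  }
  return { blockingTime, byContainer };
}

// In the browser, wire it to real entries:
// new PerformanceObserver(list => {
//   console.log(summarizeLongTasks(list.getEntries()));
// }).observe({ entryTypes: ['longtask'] });
```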
PHILIP WALTON: And once
you’ve optimized your app, you obviously want to make sure
that you don’t regress and go back to being slow. You don’t want to put a
bunch of work into this and then have it all be for
nothing if one new release turns everything bad. So it’s critical that you have a
plan for preventing regression. So this is a workflow
that I promote. You start off with writing code. You implement a
feature, fix a bug, improve the user
experience in some way. And then before you release
it, you test it in the lab. I assume a lot of
people do this. You run it through Lighthouse. You run it through DevTools. Make sure that it’s not slower
than your previous release. And then once you
release it to your users, you also are going
to want to validate that it is fast for those
users that you released it to. You can’t just test in one place.
complement each other. You should be testing both in
the lab and in the real world. And so for some
automation ideas, the best way to
prevent regression is to automate this process. You’re probably going
to slack off a little bit if you don’t have it built into the release process and automated. So Lighthouse runs on CI. And there’s actually a
talk tomorrow afternoon by Eric Bidelman
and Brendan Kinney that goes into how to do this. And I recommend
checking that out if you want to learn how
to run Lighthouse on CI. If you’re using
Google Analytics, you can set up custom
alerts that trigger when some condition is met. So for example, you
could get an alert if suddenly the number of
Long Tasks per user spikes. Maybe a third party you were using changed their JavaScript file, and things got worse, and you didn’t know. And so this is a good way of finding out that stuff.
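As a sketch of the reporting side, with `send` standing in for your analytics call (for example `ga('send', 'event', ...)`):

```javascript
// Count long tasks and forward each one to analytics so a custom alert
// can fire when the per-user count spikes.
function makeLongTaskReporter(send) {
  let count = 0;
  return {
    record(entry) {
      count += 1;
      send('event', 'Performance', 'longtask', Math.round(entry.duration));
    },
    get count() { return count; },
  };
}

// In the browser:
// const reporter = makeLongTaskReporter((...args) => ga('send', ...args));
// new PerformanceObserver(list =>
//   list.getEntries().forEach(e => reporter.record(e))
// ).observe({ entryTypes: ['longtask'] });
```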
So getting back to the original question, how fast is your web app? In this talk, I hope we’ve
given you enough of a framework to think about performance
in the big picture in a user-centric way. I also hope we’ve given you
enough specific tools, metrics, and APIs that you need to answer
this question for yourself. We know the situation
isn’t perfect. We know we have more work to do. And Shubhie is working on
this, leading our efforts here at Google on the standards side. And so she can talk about
some of the things that are coming down the road. SHUBHIE PANICKER: And so this
is our final slide. And I just want to say that,
yes, we know there are gaps. And there’s a number of
APIs that we are working on. We’d love to have a first class
API for Hero Element timing. The idea there is that you
guys can annotate the elements that matter most for
your sites and then the browser can put those times
on the Performance Timeline. Secondly, we are working
on improving Long Tasks, mostly by improving Attribution. We really want to tell you which
scripts are causing problems and give you a more detailed breakdown so you can actually take action right away and fix those issues. Third, we really want to have an API for Input Latency, so you don’t have to go through all those workarounds that we showed you for Event Timestamp.
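That event-timestamp workaround can be sketched roughly like this; the 100ms threshold is just illustrative:

```javascript
// Approximate input delay: the gap between when the browser queued the
// event (event.timeStamp) and when the listener finally ran.
function inputDelay(eventTimeStamp, handlerStart) {
  return Math.max(0, handlerStart - eventTimeStamp);
}

// In the browser:
// button.addEventListener('click', event => {
//   const delay = inputDelay(event.timeStamp, performance.now());
//   if (delay > 100) console.warn('slow interaction: ' + delay + 'ms');
// });
```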
Ideally, for the important interactions in your app, you should be able to know
how delayed they were, which Long Tasks
were in the way, and when the next
render happens. And then there’s other
inputs that we haven’t even touched on that are in our
backlog, things like scrolling, and composited animations. And finally, I just want
to leave you with this: we said a lot today, but we really want this
to be a two-way dialogue. We want to hear from you. We want to hear about
your frustrations. Don’t be quiet about
those gaps in measurement and your frustrations
with performance. Try out these APIs
and polyfills. And please, file bugs on
the Spec Repos on GitHub. This is actually the best
way to report issues and make feature requests. And if you’re working
with Analytics, whether it’s a different
team or a third party, push on your Analytics to
adopt these new metrics. Ask them for these histograms
like Phil showed you. And we are pushing on
analytics providers too on our end. Star the Chromium
bugs on Performance. This is actually a signal we use
for prioritization internally. And we need these
signals to make a case for working on measurement. And finally, as
Phil said, we have all the links in the article
that he will publish shortly. And they will also be
linked from the video. PHILIP WALTON: Yes. So thank you. And this is how you
can get a hold of us. [APPLAUSE] [MUSIC PLAYING]
