Learn Web Security with Google (Google I/O ’17)


ELIE BURSZTEIN: Bonjour. My name is Elie Bursztein, and I
work on anti-abuse and security research. And today with Yuan, who works
on web security research, and Eric who works on
webmaster relationship, we’re going to tell you how you
can learn about web security with Google. But before we get started,
let’s do a quick show of hands. How many of you
had their website compromised or know someone
who had their website hacked? Come on. Don’t be shy. Raise your hand. Yep. Most of you raised your hand. This is not surprising. In 2016, hacking has
never been so prevalent. We found 32% more compromised
websites than the year before. So hackers try to hack your
website for many reasons, from attacking your user
by sending malware to them or trying to phish them to using
your resources as an attack platform to hack other websites
to trying to steal your data and expose it. Either way, when
you get hacked, the consequences
are pretty dire. You are losing the
trust of your user and potentially
suffer financial loss. Some of the lasting
effects, such as user trust, can take years to recover. So this is why it is
essential to keep security at the forefront of
your web strategy and make sure that you
invest in security constantly from when you develop a new
thing, to when you maintain it, to when you upgrade it. I’m pretty sure you already
know that, because otherwise you would not be here today, right? So today, what
we’re going to do, we’re going to walk you
through the resources that Google provides to
help you secure your website and defend against hackers. More precisely, we’re going
to go into two topics. First, Eric will cover how you
can get help from Google when you get hacked. What are the resources we
have to help you clean up and to help you
restructure your website? Then with Yuan, we’re
going to give you a sneak preview of our
upcoming web security course, which we are very excited to
release later this summer. To make it very
practical and give you a sense of what
you’re going to learn, we’re going to give you a
short overview, followed by a sneak preview of one of
the lectures– the SQL injection lecture. And of course, because
we wanted to have something which
is very hands-on, we have a lot of demos today. That is, if that
works, of course. And let’s go and jump right in. Eric, they’re all yours.

ERIC KUAN: Thanks, Elie. So first off, hi, Mom. I’m on livestream right
now, and my mom’s watching. So I’m super excited. I’ve always wanted to do that. So hi, Mom. [APPLAUSE] Let’s jump right in. So, what should you do if
your site gets hacked? You’re most likely going to
get some type of notification. From Google’s perspective,
that notification is this red interstitial
that you may have seen. It could be some type of
notification in search results. Now, notifications
don’t only have to come from Google, right? If your users are emailing
you, and they’re saying, I’m getting redirected, there’s
something wrong with your site, or if you’re monitoring
your web traffic and you see sudden spikes,
those are all early warning signs that your
site might be compromised. And you’ll want to
check those out. As Elie said, if your
site is compromised, there are financial
burdens, brand reputation. It’s really annoying. And worst of all, your users
are trying to get to your site, and they can’t get to your site. So it is in your best
interest to clean up as soon as possible. Now, the process of cleaning
up, it can be quite daunting. It’s going to take some time. But if you follow these
steps methodically, there’s a really high chance
that you can clean your site. And so I’m going to walk
through these really fast. The first thing is to
quarantine your site. That means taking
your site offline. Or if you can isolate
it to certain parts, take certain parts
of your site offline. In this phase,
you’ll want to change any usernames, passwords,
user permissions. The whole point
being, you don’t want to be hacked while you’re
trying to fix your site. And also, you don’t
want your users trying to get to your site
while it’s compromised. So taking your site
offline temporarily is usually the best move. You also want to start
building your team here. Talk to your hosting provider. They may be able to help you. Talk to your wider team. Let them know that in
these next few steps, you’re probably going
to need their help. Identification. This is probably the
most difficult part and the most time consuming. Hackers are constantly
trying to prevent you from removing the
hack on your site. And so they’ll do weird little
tricky things like cloaking, where you’ll go to a
page, and you’ll see, oh, this page is
showing an HTTP 404, but it’s still serving spam to
your users and search engines. So identification
is really important. But it’s going to
take some time. At this phase, you’ll also want
to identify the vulnerability, because you want to understand
how hackers got into your site. Cleaning up is just about
removing those files and then testing, making
sure that your site is running well again. That’s the main
part of cleaning up, and that’s really
all you need to do. Patching is about closing
those vulnerabilities that you identified earlier to
make sure that hackers can’t compromise your site again. This is super important. A lot of people miss
this, and their sites do get compromised again. So don’t forget to close
those vulnerabilities. One of the easiest
things you can do is just to update your
site software, your CMSs, your plugins, things like that. Just make sure that
they’re updated, and that’ll close a
lot of vulnerabilities. Finally, if your site
was flagged by Google– you saw those red interstitials
on your site or you saw some warnings
in search results– you’ll want to tell Google
that your site was hacked or that your site is cleaned up
so they can remove those flags. The steps we talked
through just now, they’ll work for most types
of hacks on your site. But we’ve been doing
a lot of research, and we’ve realized that a
lot of hacking campaigns, they work in very,
very similar ways. And this helps us understand
how attackers are scaling their attacks, how they’re
trying to fly under the radar, go against detection. And this also helps us build
better detection systems and better documentation
for our users. We’ve identified three major
hacking campaigns so far. The cloaked keywords and
link hack, where they create cloaked pages. They drop keywords and
links on those pages. The gibberish hack. The gibberish hack creates a
bunch of long-tail keywords and then uses the domain’s
reputation to rank well. And then when users
click in, they get redirected to these
spammy malware bad sites. And finally, the Japanese
keyword injection. This targets specifically
Japanese brand-name goods and sends users to
counterfeit websites and attempts to sell
them fake goods. Because we’ve been able to
cluster the sites in this way, we’ve been able to create
really great documentation, step-by-step guides for
each one of these hacks. And I’ll link you to these
documents in a second, because I think it
will really help if you, your friends or
someone that you know, as you’ve shown in the show
of hands, has been hacked or will be hacked in the future. First, though, I want to talk
you through the gibberish hack. Understanding how
these hacks work is super interesting,
because it’ll help you with remediation in the future. Now, this is what the
gibberish hack looks like. It’s really plain. Again, long-tail
keywords, trying to use a domain’s reputation
to rank well, redirects users. The underlying
mechanics behind it are also pretty
simple when you put it into three separate pieces. The user clicks on
the site, they’re redirected in some way or
fashion to a PHP generator script, and then those users
are then redirected to the spam. So the first important
part of this whole chain is this redirect. If you can identify how that
redirect is happening, where it’s happening, you can identify
the other parts of your site that have been
compromised as well. In this example, the htaccess
file has been compromised. Three lines of code. It’s going to redirect your
users coming from major search engines and then send them
to this page right here, the spam dot php
page generator page. So you can see right now
that you’ve identified the other piece of the chain. And so this is the piece that
you want to identify later. It’s not going to be
called spam dot php. That’s way too obvious. We’ve seen hackers call
things horse duck 2. We’ve seen them try to
mask as core files like wp underscore config instead
of wp dash config. They’re really
trying to keep users from accurately and
quickly fixing their sites. So from the redirect,
we see that you’re sent to the page generator page. Now, you’re probably
going to open this file. You want to figure
out what’s going on. You’re curious. You’re curious how they’re
doing all this damage. And you’re going to get
something like this. This is just pure gibberish. They’ve obfuscated. They’ve encrypted. They don’t want you to find
out exactly what they’re doing. It’s really difficult
to understand what a lot of these
scripts are doing. Even if you do
de-obfuscate, and you take the time to figure out this
is exactly what the code is, it’s still not really
human readable. This is not coding
best practices. Your CS professors would be
appalled by looking at this. Luckily for you, all you really
need to do is remove this file. I would ask that you back up
these files just in case– just in case they
are good files. And later on, if you do want
to do some forensic work, it could be helpful having
these files as a backup. So just back up just in case. You can see these two files,
they’ve done a lot of damage to a website. And so that’s why
you don’t even want to be in the phase of
cleaning a website. Cleaning a website is difficult.
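
The identification step for the gibberish hack described above (find the redirect, then follow it to the generator script) can be partly automated. The following is a minimal sketch, not Google's tooling: the sample `.htaccess` content is hypothetical because the three lines shown on the slide are not in the transcript, and the function names, the list of search engines, and the look-alike rule for core files are assumptions made for illustration.

```python
import re

# Hypothetical example of a compromised .htaccess file (assumption: the
# real slide content is not available in the transcript).
SAMPLE_HTACCESS = """\
RewriteEngine On
RewriteCond %{HTTP_REFERER} (google|bing|yahoo) [NC]
RewriteRule ^(.*)$ /spam.php?q=$1 [L]
"""

REFERER_COND = re.compile(r"^\s*RewriteCond\s+%\{HTTP_REFERER\}\s+(\S+)", re.IGNORECASE)
REWRITE_RULE = re.compile(r"^\s*RewriteRule\s+\S+\s+(\S+)", re.IGNORECASE)
SEARCH_ENGINES = ("google", "bing", "yahoo")  # assumed list, extend as needed


def find_suspicious_redirects(htaccess_text):
    """Return rewrite targets guarded by a search-engine referrer condition."""
    targets = []
    referer_guard = False
    for line in htaccess_text.splitlines():
        cond = REFERER_COND.match(line)
        if cond:
            pattern = cond.group(1).lower()
            if any(name in pattern for name in SEARCH_ENGINES):
                referer_guard = True
            continue
        rule = REWRITE_RULE.match(line)
        if rule:
            if referer_guard:
                targets.append(rule.group(1))
            # RewriteCond lines only apply to the next RewriteRule.
            referer_guard = False
    return targets


def lookalike_core_files(filenames, known_core=("wp-config.php",)):
    """Flag names that match a core file once '-' and '_' are treated alike."""
    core = set(known_core)
    normalized = {name.replace("-", "_") for name in known_core}
    return [
        name for name in filenames
        if name not in core and name.replace("-", "_") in normalized
    ]
```

Running `find_suspicious_redirects` on each `.htaccess` file under the web root gives a list of scripts to inspect and back up before removal, and `lookalike_core_files` catches the `wp_config` versus `wp-config` trick mentioned above. Both are heuristics and will miss redirects implemented in other ways.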
It’s financially costly. It’s annoying. You have your brand
reputation on the line. And that’s why the key takeaway
here is that prevention is key. So let’s talk about
a couple quick things that you can do today
right after the session in order to help
with prevention. First off, back up your site. There are a lot of people
that don’t back up their site. And that’s a little
bit baffling to me. Definitely back up your
site as often as possible. If you do get compromised, one
of the easiest ways to recover is just to restore a backup
version of your site. Remember that even if you’re
restoring the backed up version of your site, that vulnerability
that the hackers initially got to probably still exists. So you want to fix
that vulnerability. Secondly, sign up
for Search Console. If Google does find that there
is compromise on your site, we’ll send a notification
via Search Console. This is one of the ways
that Google communicates with webmasters about,
not only security issues, but a lot of search
issues as well. And finally, update your
code, your CMS, your plugins, your themes. Like I said before, this is
one of the most common ways that attackers
compromise a site. And I know this is
difficult, because I’ve talked to a developer before,
and she said, my clients, they don’t want to update their
site, because if we update the core CMS build
files, they’re going to mess up a
whole bunch of plugins. Yeah. That does happen, but it’s
really in your clients’ best interest to update. And you have to convince them. You have to talk to
them about making a site that is both secure
and still works for them. So that’s a really,
really important piece. As I said earlier, there’s
a lot of documentation that Google can
give to help you. Our webmaster hacked help center
is at g.co/hackedwebmasters, and that’s where our
security guides are, the ones for the specific
hack campaigns that we’ve identified. And we’re constantly
building more of these guides for different
types of hacking things that we’ve identified. The second thing is the
webmaster help forums. We have a lot of awesome
top contributors, experienced webmasters, and
Googlers in these help forums to help you remediate
your site, fix your site, identify any vulnerabilities. And finally, follow
us on Twitter. We give not only
security updates, we’ll give you
updates about search. Now I’m going to
hand it back to Elie to talk to you about the second
part of securing your site, and that’s building
a safer website.

ELIE BURSZTEIN: Thank you, Eric. [APPLAUSE] So as Eric said, when it comes
to security, prevention is key. And so far at Google, we didn’t
have much of a public course to help you out. So we decided over the
last year and a half to develop a new
security course which is meant to be very
hands-on so you can have a very practical
knowledge you can apply to secure your website. The core idea
behind the course is to have very short, focused
lectures on core concepts that you can apply
immediately, and a lot of exercises so you can build
hands-on experience and have the knowledge that you
need to protect your site. So let’s look at what
the course looks like. And then we’ll do a few
demos along the way. The course will be a set of 12
lectures, which will be grouped into three main categories. The first thing we’re
going to discuss is how to handle
user data safely, from how do I
authenticate my user? How do I maintain
session with them safely? And finally, of course, how
do I encrypt my communications so that when they interact
with your website, your users are safe? So second thing we’re going
to cover is web attacks. These are attacks which are
specific to web security and that you need to know to
make sure that you are not vulnerable to them. We’re going to cover
the four big ones, which are XSS, CSRF, SQL
injection, and clickjacking. And finally, but not least,
we are going to tell you how to integrate
third-party content securely, from how to embed
a widget from a third-party website and make sure
that when you do that, you do not get hacked, to how do you deal
with user content? How do I make sure
that I don’t have toxic content on
my website or they don’t post infamous picture
into my beautiful stream? And finally because, as Eric
said, when you get hacked, it’s really
difficult to recover, we have a lecture
dedicated to investigation and show you concrete
case of hacking so you can learn how to
investigate them and clean up. So in the event you get hacked,
you’re already prepared. So for each lecture, we’re going
to provide you a few materials. First, we’re going to
give you slides where you can review the lecture. We’ll do a video of the
slide with some explanation. And the most important
part, we will give you exercises you can
do and the quizzes so you can know how well
you’re doing and if you have understood the concept. So exercise was really
a key development point. We have built a ton of those. We have over 50
exercises for you. They are going to be in multiple
aspects in which [INAUDIBLE] aspect of web security. First, you have attack exercise,
where you wear your black hat and you try to attack websites. So you get into the
mindset of the attacker and understand how they go
about hacking your website. Then, of course,
we go to defense, where you know how to apply the
best state-of-the-art technique to protect your website, and
you really understand what is the secret to [INAUDIBLE]
and how you can apply them successfully. And finally, for
some of the things, especially for hacking, we
will cover investigation, where we give you some puzzle
and some interesting hack to look at and understand
if you can figure them out. One of the essential
challenges we had to overcome was it’s not easy to
teach web security, because you manipulate
vulnerable code, right? So we can’t put it
online because we’re going to get hacked,
so how you do that? Originally, people
come up with the idea of using virtual
machines or have people install a ton of packages. And you have to run
all those things. This is not very ideal, because
it’s very resource-intensive. You have to do
[INAUDIBLE] stuff. It’s on your computer. It also limits the number
of devices you can use. There is no way you can
use a tablet, for example. So about a year ago, I was
thinking about the problem, and we said, well, web
technology has evolved so much. Let’s try to do
something different. Let’s try to use
web app technology. And the crazy idea we had
is, let’s build a web server directly into a web page. I know that sounds
crazy, but the idea was we have service workers,
so we can make it work offline. We can make it intercept the
request and just respond to it. So let’s build that. And you know what? Let’s throw in a Web SQL
database and maybe a PHP interpreter and see if it works. And it was crazy, but we
tried, and it actually worked really well. So now what we have
for you is a test bed where you just go to the
website and then everything happens to your browser,
nothing to install. You get started. It’s very easy,
and it all happens as you are on the real website. So that being said, this
is our world, right? And you probably want to see it? Yes? [APPLAUSE] All right. So let’s jump to our
first demo, which will show you the framework. So we’re going to do a simple
website, which is going to allow us to login our users. So to do that, we’re
going first to need to have a server system. So everything really looks
like node with [? express. ?] So if you’re familiar
with node.js, you will recognize the syntax. We need to declare our routes. So because we’re doing a login
page, we need two web pages. First, we need somewhere where
the user would land and would have a form. So let’s create that. Then let’s add a
second route, which would be processing
the login information and deciding whether or not
you are authorized to log in. So we’re going to
create two routes. One is a GET, as you can see,
and the other one is a POST. And for now, we’re just
going to say hello world, just to see if it works. Remember, this all is local
so you can see it on the bar. This is all local host. No connection. All offline. And we hope it’s going to work. All right. So let’s load the exercise. We load the framework and go to
the page and see if it works. And here you have it. We have a working web
page in our browser. Now, let’s add a few more
things because, you know, it’s a login page,
so we need a form. So let’s add a form. Fortunately, our
framework supports templates like
any normal web framework. So let’s add a little bit
of CSS, which looks pretty, a password, and a login field. And let’s reload to make
sure we see our login form. So to add the template, I
forgot to add– thank you, Yuan. We have to add the index HTML. So we’re going to load the
template, return the template, reload, and hopefully you
have a nice Google I/O form. Yeah. So that’s great. But it doesn’t do
much right now. We have a form, but
we need to process it. So let’s have a bunch of users. So here what we’re
going to do is we’re going to add a database. So adding a database in
the framework is very easy. All you have to do is create
a database, one line of code. The framework will do all
the magic for us. Then we’re going to add a user. So let’s create a user table,
which will just contain a login and the password. All right. Let’s add the user– a Google
I/O user– for the demo. And by the way, do
not do that for real. Do not store your
password in here. Do not do this, right? That’s just for demo. Please don’t do that. It’s insecure. But for the demo, it will do. So let’s add that and that other
little bit of Javascript code to check where is the database
and check if it’s happened. Here it is. So we get two variables
from the environment. Run it into an SQL query. Test if it’s correct, and we
should have everything for– and we’re good to go, right? So let’s try it. We reload. Now, we have a page. Let’s try with a wrong password. So GoogleIO, and I don’t
know, password 123? User not recognized. Works as intended. Now, let’s copy the
same, which is insecure, and let’s try again. GoogleIO as a user and the
password and click on it. Ooh. We are logged in. So we have a fully functional
login system with a database website, few lines of code. This is a framework we built,
and this is the framework we use to create our exercise. Pretty cool? Yes? [APPLAUSE] All right. Let’s go back to the slide
and talk a little bit more about the content. So we showed you the technical
framework we have behind it. So the framework is
great, content is better. You need it. So let’s jump into one of our
lectures, the SQL injection lecture, so we show you
what kind of content you’re going to learn. So SQL injection is one
of the most deadly attacks you can suffer from. And so I’m going to tell
you a little bit about why SQL injection vulnerabilities exist and,
of course, how to prevent them. And then after that, Yuan
will do a few of the exercises to show you how you
can learn it hands on. So why does SQL injection exist? Well, it turns out that
during our little demo, we introduced a vulnerability. The source of the
vulnerability is that when you have an SQL
statement, it contains both keywords,
which tell the database what to do, and also parameters, which
tell it what to look for. If an attacker is
able to control one of those parameters, he
can actually inject keywords, and the server has no way to
distinguish between the two. We actually had
the vulnerability in our very own code. So it is this mix between
keywords and parameters which makes the injection possible.
can do with that? Well, an attacker
with the ability to control an SQL query can
bypass any type of security check you have and can
run unexpected queries, from reading sensitive data
to deleting your database, encrypting it, and doing
all sorts of nefarious things. More formally, the consequences
of such an attack, if you have it, are that it breaches the
confidentiality and integrity of your website, because
everything is in the database, and also it allows an attacker
to authenticate without knowing the
passwords, for example, because you can defeat
any type of check. So there are multiple
types of SQL injection. There is a classic one we’re
going to demonstrate today, and there are more advanced ones
such as the blind SQL injection and a second-order injection. I’m not going to talk
to them about today, because we don’t have time. Let’s just focus
on the classic one. So the classic one, as I
explained, works very simply. The attacker, instead of
sending what you expect, will do the unexpected and will
try to manipulate the SQL query by sending specific payloads. Then it will result in
an unexpected SQL query, and then your
database will be just happy to do whatever
the attacker wants and potentially extract
very sensitive data. Here’s a concrete example. If the attacker can
inject a username, then it can decide instead of
say username to say, let’s say, Google, then close the field
and then add dash dash, which [INAUDIBLE]. As a result of that, as
you can see on the screen, well, you are basically
bypassing authentication. So to make it more concrete,
let’s jump to our second demo. So we’ll show you the
attack live, hopefully. Demo, please. All right. So we’re back to our demo. And if you remember– can
you go back to the code? If you remember, we
took the username and directly put
it into our query. You can see it on the screen. So we didn’t do any checks. We’re like, OK, we trust the
user input, which you should never do, and as a result we’ll
be able to log in to the website without the password. So let’s demonstrate that. So the way to do that
is, as we explained, is you use your username,
so that’s GoogleIO. But then, instead of
doing what is expected, we’re going to add a quote
to close the parameters, and then we’re going to escape. And escape in SQL is just this. How many of you believe
that’s going to work? 1% of you. You have no faith in me, man. Come on. All right. So we’re going to do it. OK. Let’s try it. All right. So here it is. We were able to log in to the
website without any password, because we let the attacker
manipulate the input, which you should never do. All right. Let’s go back to the site. So how do you prevent that? There is a simple way
to defend against that. This is called a parameterized
query, also known as a prepared statement. The idea is, instead of
putting the variable directly into your query, you’re going
to actually write your query and specify where the
parameters should go and then supply them afterward. That will prevent SQL injection. You should also always escape
the user input, because you should not trust it. Trusting it will open you to many
other attacks such as XSS. So that concludes our short
explanation of SQL injection. Yuan is going to,
in a few minutes, show you how the exercise works
and how the framework works. But before that, how
many of you would like to get early
access to the course? All of you. That’s brilliant. So Yuan is going to tell you
that just before the exercise. Yuan, it’s all yours.

YUAN NIU: All right. So signing up is very easy. You just have to register
using the link on the screen, starting today. g.co/learnwebtech. OK. Let’s switch to the demo, and
we’ll show that slide again at the end of our presentation. So since we’re familiar
with SQL injection already, we’ll just skip to that. As Elie mentioned, each
topic will have material so that you can
learn on your own. So slides, the lecture, a
related reading list, quizzes and, of course,
the exercises that are going to give you
some hands-on experience so that you can reinforce
what you’ve been learning. You’ll be attacking, defending,
and investigating sites. And to make it a
little more fun, we’re taking inspiration
from the pie versus cake war to craft some of our scenarios. So in our world,
the pie syndicate is a little worried that a new
cake shop has just opened up. And so they’re going to try to
attack their rival’s website. And then the cake
guys are going to be defending themselves and
investigating these attacks. All right. So we’ll be sticking with
the basic exercises today. And since we’ve pretty much
done attack number one already with Elie, we’ll just
skip to number two. And for this, I’ll
need my black hat. OK. All right. So we have our
objective on the left. Hey, pie minion, that cake
shop is still in business. We can’t keep losing
slices of our territory to rival industries. They still have a mostly
online operation for now, but we must act quickly. Word is that they’re still
vulnerable to SQL injections. We’ve gotten a leaked
copy of their server code, but, unfortunately, no idea
who any users might be. We need you to batter
down the defenses and get their customer list. Get to it. Pie Boss. All right. So very helpfully, we have
a direct link to the page that we’re going to attack. And we have a copy
of the server code. So we have the same
vulnerability as before, but because we don’t
know the user name, our previous bad
input won’t work. So let’s take a
look at the code. And on line 12, we see that
actually it doesn’t really matter what the select statement
returns as long as it returns anything at all. And so this is where
we’ll be targeting our newly crafted query. We just need to get the WHERE
clause to evaluate to true at all times. And to do that, we’ll just add,
let’s see, an OR 1 equals 1. And this has the effect of
saying select from users where username
equals admin or true. And so now we’re going
to force the statement to always evaluate to true. And now we have access
to their customer list. And that popup means
that we’ve succeeded. So we can move on
to the defense. OK. Let’s close this. OK. For this, I’m back on team cake. And I got my white hat for that. So our mission, once again,
greetings, fellow baker. As you know, there’s been
chatter of an upstart pie maker with crusty connections. We suspect they’ve
been trying to attack cake shops for some
nefarious and no doubt irrational purpose. Your site has been identified
as one of these targets. In particular, it seems
your login page leaves you vulnerable to SQL injection. Please fix it at your
earliest convenience. Cake Boss. OK. So this time we’re
going to use the editor. And let’s see, I’m
going to refresh. Let’s see. Refresh again. Sometimes the service workers
need to get updated properly. So it takes a
little bit of time. OK. There we go. And let’s verify first
that we are still vulnerable to this
SQL injection attack. And indeed we are. OK. So we’re going to use
our editor to fix things. And this editor comes
with syntax highlighting, linting, all the nice stuff. And we’ll also need these
three buttons on the side. So the first one will run our
code against the test framework and make sure that what
we’re doing will succeed, will not succeed. You know, it just tells us when
we’ve finished the exercise. In this case, we failed
because we haven’t actually fixed anything. The second button is
going to give us a hint. And finally, the third
one will reset everything so that we can
start from scratch. So to go about
actually fixing this, we’ll just use
parameterized queries, or it’s also called
a prepared statement. So remember, we’re not supposed
to use user input directly. So instead we’ll
have SQL prepare a statement that
specifies exactly where to expect external input. So here, this is the
parameterized statement. And then we’ll give it the
user name and password here. Let’s see if that
passes our test. OK. So this should be good. Now let’s refresh
and double-check that it actually works. Oops. And we’ve successfully
defended ourselves against this particular
SQL injection attack. So remember that this is
just the basic exercise. So in practice, in creating
actual live websites, you would also sanitize
user input, as Elie said. And it can’t be
said enough times, never trust external input. So that’s it for the demo today. Let’s not tempt the
demo gods any further. Back to the slides please. [APPLAUSE] So thank you for
attending our session. You can sign up for early
access to the course, which will be released this summer. And if you’re interested
in learning more about web tech or
maps or whatever, head over to some
of the sandboxes. I think there’s one
by stage 6 that’s going to talk about
progressive web apps. And thank you again, and
enjoy the rest of I/O.
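
The login demo, the two attacks, and the parameterized-query fix from the SQL injection lecture can be approximated outside the course framework. This is a rough sketch under stated assumptions, not the code shown on stage: the talk's framework resembles Node with Express and a Web SQL database, while this sketch uses Python's standard `sqlite3` module as a stand-in. The table layout (a login and a password) follows the talk; the function names and the demo password value are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory users table holding the single demo account."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (login TEXT, password TEXT)")
    # Plaintext password for illustration only, as the talk itself warns.
    # The value 'insecure' is an assumption, not taken from the demo.
    db.execute("INSERT INTO users VALUES ('GoogleIO', 'insecure')")
    return db


def login_vulnerable(db, login, password):
    """Concatenates user input into the SQL text, so input can inject keywords."""
    query = ("SELECT login FROM users WHERE login = '" + login +
             "' AND password = '" + password + "'")
    return db.execute(query).fetchone() is not None


def login_safe(db, login, password):
    """Parameterized query: values bound to ? are treated as data, not SQL."""
    query = "SELECT login FROM users WHERE login = ? AND password = ?"
    return db.execute(query, (login, password)).fetchone() is not None


if __name__ == "__main__":
    db = make_db()
    # First attack: close the quote, then comment out the password check.
    print(login_vulnerable(db, "GoogleIO' --", ""))
    # Second attack: no known username, force the WHERE clause to be true.
    print(login_vulnerable(db, "' OR 1=1 --", ""))
    # The same payloads against the parameterized version.
    print(login_safe(db, "GoogleIO' --", ""))
    print(login_safe(db, "' OR 1=1 --", ""))
```

With the concatenated query both payloads log in without a password; with placeholders the payload is compared literally against the `login` column and matches no row. As the speakers note, parameterized queries address SQL injection only, and user input still needs to be escaped wherever it is rendered.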
