Enhancing Web 2.0 Accessibility Via AxsJAX: A Tutorial at Google – Charles L….


woman: Good evening,
ladies and gentlemen. Welcome to the latest instance
in the Open Source Developers at Google
Speaker Series. Our speaker series is designed
to help the public understand more about what some
of the open sourcers here at Google are doing
in their time here at Google. And I am pleased this evening
to introduce Charles Chen and T.V. Raman, both of whom share
an office here and work on
accessibility issues, and these gentlemen tonight
will be discussing enhancing Web 2.0 accessibility
via AxsJAX. Gentlemen, take it away. Raman: Thank you, Leslie. [applause] Thanks for coming, everyone. It’s a Monday evening. We really appreciate you
showing up and taking the time
to come to this talk. So my role in this talk
will be to talk as little as possible and allow
Charles to talk. So that’s my challenge
and at the end you can grade me
on how well I do. And Charles’ challenge,
of course, will be to talk without pausing so that I don’t
get a chance to cut in. So what we’ll talk about is
injecting accessibility and usability into
Web 2.0 applications. The background
for this work is that– As you know, if you take a look
at the kinds of web applications
Google builds, we and other Web 2.0 developers
have been for the last two years
pushing the envelope on what can be done
in the web browser. And the result has been
extremely dynamic, highly interactive
HTML interfaces that work really well,
but in the past have also presented
significant challenges to users with special needs. So, for instance,
if you cannot see and you are using a screen reader
or a talking browser, a lot of the AJAX applications
caused no end of pain. We’ve been looking
at this problem for the last two years
or so and we’ve basically gone down
the route of asking, “How can you use
the same AJAX technologies to enhance the usability of
Web 2.0 applications for somebody who cannot see?” The background for this work
is something which is now in sort of
W3C Last Call, or getting ready
for W3C Last Call, called W3C ARIA,
which is Accessible Rich Internet Applications. It’s a spec, but the spec
can be sort of summarized in half a dozen lines. If you think of HTML
and JavaScript as the assembly language
of the web, it’s low-level
assembly language. You’re doing spans and divs
and attaching event handlers and styling it. So think of that
as the assembly language. The way to conceptualize ARIA
is that it’s a couple of additional opcodes
in that assembly language that helps adaptive technologies
like screen readers and self-voicing browsers
discover, introspect, reflect, and then provide
the right feedback for the HTML widgets. That’s a lot of words. Basically, in simple terms,
it allows a screen reader to know that a menu is a menu
and then treat it as such when you interact with it. What Charles and I
have been building is– If you have this
assembly language– Now when you write
your AJAX app, you typically use a toolkit. You typically use Dojo or any
other JavaScript library that you like or
in the case of Google, you might use GWT
and those things give you higher-level design patterns
to do your GUIs nicely. What we have created
is an open source library that we call AxsJAX
that allows you to do the same for good auditory
user interfaces. All of the output
that you will hear is actually produced
by the screen reader, so we don’t do the talking. We provide the
web developer the right JavaScript calls to make adaptive
technology talk. For today’s talk from here when
Charles takes over, he’ll show you everything
running on a Mac. He’s using Fire Vox,
which is Charles’s project. He started that as a student
and here at Google he is doing it
on his 20% time. We are doing the demos today– We are doing all of this today
on the Mac, because it’s one
of the best TTS engines that you can find
out there today. So for that I will
hand it off to Charles and, Charles, from here on
you’re supposed to prevent me from talking. Chen: Okay. Thanks, Raman. So now I’d just like to add
in a few other things about the AxsJAX library. One of the key benefits
of this library is that not only is it something
that a web developer can use to improve their web page and
enhance their usability, it’s also something that a
hacker– you know, someone who’s just
really into coding, into doing things–
open source developers such as all of us here,
we can use this library and apply it to web sites that
aren’t using it themselves. Because at the end of the day,
this is just some JavaScript and the JavaScript
can be inserted into a web page through
various mechanisms. For example, bookmarklets
or Greasemonkey. And so that’s one of the key
benefits here is that even if the web developer– even if you don’t
own the web site, you can still enhance the
usability and accessibility of that web site using AxsJAX. And so with that, I’m going
to show you all a demo, and I believe that’s
the best way to show some of the benefits that you
can get from using AxsJAX. Now before we use it,
let’s look at a before-and-after scenario. So I’m gonna do
the before case first and I am going
to do a search for flowers on everyone’s favorite
search engine. Okay, so this brought up a page
of results for flowers, and you’ll see that, you know,
we have results. We have sponsored links. We have related searches. And… you know, you have many pages
of results that you can go to. Now if you’re trying to browse
this with a screen reader and get through it,
it’s rather painful. Because we have– Although we have a well
structured content– we have headings and section
tags and all of that– it’s still not that easy
to navigate. Because to go through
the results, what you have to do
is you have to say, “Take me to the next heading
or the first heading.” So it’s gonna take you here
to this first result. And all you will hear
at this point is you will hear
the title text. You won’t hear any
of the snippet content. So to determine whether
or not this result is something that you want
to click on, you’re going to have
to tell your screen reader to read the next bit, and
when it reads the next bit, then you’re gonna hear
the snippet. Now if this is the link
that you wish to click on, now you have to tell
your screen reader to go to the previous heading
to get back up to the link so that
you can click it, and you just repeat
this process until you go through
all the results. If you need to go
to the next page, then it’s gonna
take several tabs. So you’ll be tabbing
for quite a bit here to get to the next link. And if you’re a power
screen reader user or a power Google user,
you probably know of a few shortcuts that can shorten
this process. But still,
it’s relatively painful. And then you also run
into the problem of trying to group
related items. So if you, say, brought up all
the links on this page and you got, you know,
all the URLs, but then– You know, how do you
differentiate between the sponsored links
versus the results? I mean, they’re all links
to other pages here. So that sort of grouping,
visually, you look at it and it’s obvious. All the sponsored links
are to the side. All the results are centered
to the left. But if you can’t see
and you’re just trying to determine that,
it becomes a little bit difficult. So let’s see– Let’s step back and think
about what the eye is doing for you
when you look at this page. So the eye is helping you
group things and it’s allowing you
to quickly scan the page and just look at the parts
that you wanted to look at that are important
to the task at hand. And so if the task at hand
is that you want to do some research
about flowers, then you want to look
at the results. You’ll see a Wikipedia link. You can click on that. If, on the other hand,
the task that you’re trying to accomplish is you want
to order some flowers, then you probably want
to look at the sponsored links to the right-hand side,
so you can order some. And so with that in mind,
you can think of a busy page
with a lot of content as a page with multiple lists
of related items, and you should be able
to switch between the lists depending on what
you’re trying to do and then go through the items
in each of the lists. And that is what AxsJAX– one of the things
that AxsJAX can provide. So I’m going to enable AxsJAX
and repeat the same query. Raman: You need me to get up? Chen: Uh, no.
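A minimal sketch of the model Chen has just described: a busy page treated as several named lists, with one action to switch lists and another to step through the items of the current list. This is not the AxsJAX API. The class name `ListNavigator`, its methods, and the sample data are all hypothetical, and each list is assumed to be non-empty.

```javascript
// Hypothetical model, not the AxsJAX API: a page is a set of named lists,
// and the user moves between lists and between items in the current list.
class ListNavigator {
  constructor(lists) {
    this.lists = lists;   // [{ title: string, items: string[] }, ...]
    this.listIndex = 0;
    this.itemIndex = -1;  // -1 means no item visited yet in this list
  }

  // Switch to the next list, wrapping to the first, and return its title.
  nextList() {
    this.listIndex = (this.listIndex + 1) % this.lists.length;
    this.itemIndex = -1;
    return this.lists[this.listIndex].title;
  }

  // Move to the next item in the current list, wrapping to its first item.
  nextItem() {
    const items = this.lists[this.listIndex].items;
    this.itemIndex = (this.itemIndex + 1) % items.length;
    return items[this.itemIndex];
  }
}

// Illustrative data only; a real page would supply its own lists.
const nav = new ListNavigator([
  { title: "Results", items: ["Wikipedia: Flower", "Flickr: flowers"] },
  { title: "Sponsored links", items: ["1-800-flowers.com"] },
]);
console.log(nav.nextItem()); // first item of "Results"
console.log(nav.nextList()); // "Sponsored links"
console.log(nav.nextItem()); // first item of "Sponsored links"
```

In a real page the strings would be DOM nodes handed to the screen reader; plain strings keep the sketch runnable outside a browser.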
Hang on. I’m just loading it. Raman: All right. computer voice: Plants,
gift basket delivery. 1-800-flowers.com. Welcome to 1-800-flowers.com,
the world’s number one online florist. Be sure to visit our
summer garden of– Chen: So notice
how it started speaking both the title and the snippet,
and you heard both of those things together. Now instead of trying to hunt
around for the next result, I can simply press
a key and get there. So, okay. So now I’m going to go– computer voice: Send flowers,
plants, and gift baskets. Flower delivery from– Chen: And I can keep
doing this. [computer voice skips] Chen: And now you’ll notice
that I’m at the bottom of the page of results. Now remember earlier
I had mentioned that if you wanted to try
to get to the next page, you’d have to tab
multiple times. But if you think
about the task at hand, what you’re really trying to do
when you go to the next page is you’re just trying
to get the next result. So logically, if you press
the same button to go next, then when you hit “next”
at this point it should just take you
to the first result on the next page
and just do it for you. And that’s what it will do. computer voice: Flickr, flowers.
Flickr is almost– Chen: So notice how when
the next page loaded, it immediately spoke
the first result. So as an end user,
your mental model now isn’t about, “Well, a result
page has ten results on it “and now that I’m
on the tenth result, “I need to hunt
for the next page link “and then I need
to click that and then I need to find
the first result.” Your mental model
becomes simplified and you just need to think,
“Well, I want to get the next result.” You press a key for
a next result and you hear it and it’s just right there. Now I’d mentioned that sometimes
you might want other categories. For example, sponsored links, if you’re trying
to order flowers. So I’m going to show you
going through the different categories and
traversing within a category. computer voice: Sponsored links. Related searches.
Alternate search categories. Results. Sponsored links. Chen: Okay, so I’m gonna
go through the sponsored links
on this page. computer voice:
City of Flowers. Flowers and gifts for every– [computer voice skipping] Chen: And you’ll
see I can do the same sort of navigation
with that. man: Are you doing tabbing? Chen: No, I’m not doing tabbing. That’s a very
good question, yeah. With tabbing,
you’re jumping through all the focusable elements,
and– whereas with this,
what you’re doing is you’re going through
logical chunks of information. Raman: So he’s using
cursor keys. Basically using
the four cursor keys to do the navigation. Chen: Yeah, we’re using
the arrow keys, so. [clears throat] man: Keyboard arrows? Chen: Yes, exactly. So now one of the challenges is,
you know, whether you’re tabbing
or doing this, when you reach
the end of a page, to know that
you’ve reached the end and that the next time
you do this action, you’ve wrapped around
to the top. Because if you’re
not sure about where you are
on the page, then you could end up
just looping forever. So you’ll notice that
as I was doing this, it was playing
this little tick sound. So that’s an earcon. And so we have
different earcons depending on whether
you’re traversing through items,
traversing through lists, or if you’re circled around. So I’m gonna circle around here
and you’ll hear it. [computer voice skipping
and electronic ticks] Chen: Notice that it made
a blip sound that was different
from the regular ticks. The other thing
that you’ll notice is that there’s
an enlargement of what you’re currently
viewing, so. [computer voice skipping] Chen: So, for example,
I brought up this search result, and if I wanted
to magnify it, I can just press “+”
and I can make it larger. So you’ll see that
this is getting larger. But this is different
from using a traditional magnifier tool,
because unlike a magnifier– which enlarges
the entire screen and can be a problem sometimes
because you lose context of where you are overall relative to the rest
of the page– when you do this,
you’re only enlarging the thing that you’re
interested in. And so you still have a sense
of approximately where you are. Uh, yes? man: Could that also work
in IE? Chen: So right now
this will work in any browser that is ARIA
aware and works with ARIA. So currently IE doesn’t
support ARIA yet, but IE 8 is working on it. man: 8? Chen: 8, so, yes. Okay, so with that you’ve seen
a basic demonstration of what AxsJAX can do
to enhance navigation of a complex web page. And you’re probably thinking,
“Okay, this is really cool. How can I do this
for my web page?” Or, “How can I do this
for any web page that I’m interested
in enhancing?” And that’s what the rest
of today’s talk will be about. So with that,
let’s go ahead and start. By the way, all of this stuff, all of this information
is available online on the AxsJAX home page
and on the tutorials. So I am going to go
to the tutorial right now. So for tonight,
what we’ll be doing is we’ll be applying AxsJAX
to Google Product Search, and we’ll go through
the whole thing from start to finish and show
you just how you can do it on your own
for any web site. Now, first off, you need
to do a little bit of setup on your machine
to get all the parts that you need
to build this. So as I had mentioned earlier,
this technology relies on WAI-ARIA. And currently the browser
that supports WAI-ARIA the best is Firefox 3. So… I highly recommend getting
Firefox 3 Portable Edition, because this is a sandboxed
version of Firefox that you can download,
unzip, develop on, and it’s completely separate
from your default browsing profile,
so it won’t, you know, change any of your settings
or mess anything up. There are also– man: Safari? Chen: No.
Safari doesn’t support it yet. WebKit–WebKit
is also working on it, but they don’t
have it just yet. man: Today is just Firefox? Chen: For today, yes. But all the other ones are
working on it. So Opera, IE, and WebKit all
have it in development. There are also several
helpful extensions. So one of them is Greasemonkey. Remember,
I had mentioned earlier that what you’re running
with AxsJAX is you’re just running
JavaScript, and it can be inserted
in a variety of ways. Greasemonkey happens
to be a very easy delivery mechanism
because it will automatically load the script for you
when the page loads, so it’s a very helpful way
of developing this. The other thing
that’s nice here is that– Okay, so is everyone
here pretty much familiar with XPath or at least
know of it somewhat? Okay, cool. So if you want extra help
with XPath, then you can always Google
for an XPath tutorial, and there’s, you know– there are several excellent ones
that you can find that way. [man speaking indistinctly] Okay, that’s nice. [man speaking indistinctly] So there’s a tool for Firefox
called XPather, and this allows you
to use the DOM Inspector so that when you click
on an element on the page, it will then tell you what
the XPath to that element is. So you can very quickly discover
what all the XPaths are without doing too much work. There’s also something
called Event Spy. We won’t be using it tonight,
but it’s very helpful for dealing with rich
AJAX applications, because it lets you know what
the style changes are and which events are coming
from which elements, and that’s a very useful thing
to watch for. And then there’s Fire Vox,
because it’s probably nice to get some spoken feedback just
to hear results of your work. Raman: And so the advantage
of prototyping with Fire Vox is that the way we’ve done
the library, we’ve made sure that
the speech feedback you produce will work consistently across
all the screen readers that support ARIA. But Fire Vox basically
gives you an end-to-end
open source solution that you can prototype with. And so rather than, you know,
doing the write once and debug under every
browser and so forth, so– write once and then debug
on every browser with every screen reader, this gives you a way
to write and debug against an open source,
free solution that talks and then release it
into the wild with the hope
that it actually works with the rest
of the screen readers. Chen: The other thing
that’s nice to help speed up development is– So when you’re usually working
with Greasemonkey, you end up editing it
in line and then you have to save it
and then you close that and you reload and
that’s sort of a pain. So the easiest way of doing
it is to actually inject your script through
a local web server. And so there are several
self contained web servers where you just have
to unzip it and run. On Windows, the one
that we recommend is running Server2Go,
because that’s just really easy. You unzip it and it’s there
for your local machine. I’m on a Mac right now,
so I’m actually running something called m-a-m-p,
MAMP. It works the exact same way. And then after you’ve
gotten that, you’re all set up. All you need to do
is download a few simple skeleton files. And these skeleton files
form the basis for–right? So you just replace
the variables there with your content
and you can begin. And so with that,
let’s go ahead and start AxsJAXing Google
Product Search. Okay. So I’ve brought up
Product Search. Let me make sure that I turn off
some stuff here. computer voice:
The text field is emp– Chen: I’m gonna mute that
so I can talk. Now before we begin
AxsJAXing Product Search, let’s take a moment
and think about what the correct interaction
model would be, so let’s do a search
inside Product Search and let’s try
to find something. Okay, so I’ve done a search
for “Google” inside Google Product Search and
I’ve come up with some products. And you’ll see that there’s
an image. There’s a title
for the product. If it’s a book, it says,
you know, who the authors are. It has the price range,
where you can buy it from, and then a description
and some other information, such as whether
or not you can use your Google Checkout account. And that’s pretty much
the way the whole page is. You have that
and at the bottom you have some ways
to do query refinement. But mostly it’s about getting
through these results. So let’s see how
we can do that. I’m going to start up
DOM Inspector and then use XPather to find
the appropriate XPaths. Raman: So as he brings
that up, what we are essentially
doing is– There’s two steps, right? Finding out what to speak and
then how to speak it. The “what” to speak– Basically you are going
through the HTML and figuring out
the right bits. When he showed you the demo,
we showed you this thing as– Think of it as these
logical collections of categories of things
that you move through. From a programmer’s
point of view, think of it as trails that you
walk through the DOM. And so what he’s going
to show you is how you define
these trails and then how you navigate
them programmatically. Chen: So first off,
just to verify that we have our development
environment set up correctly, you’ll notice that
on the skeleton file, the part that you’ll be
editing is inside this initialization function. And there are
two key parts here. The first part says,
“Put the CNR XML here.” I’ll get into that
a little shortly, which is this is basically
an XML string that defines what
the groupings are. And then there’s this line
here that says, “AxsSkel loaded
and initialized.” So when you load– If your environment is set up
when you first load it, you’ll get this message
that says it’s been loaded, it’s initialized
and it’s working. Okay, so now I’m going
to go ahead and select the first result. It’s going to look
a little bit better if I minimize this
and show it side by side. Hang on. Raman: So this
is the part where you sort of pick up what to speak. Chen: Okay, so I’ve clicked
on the content section of that result,
and now DOM Inspector is showing me something,
so I’m gonna look at it. Oh, by the way. On Windows,
this actually draws a little red rectangle around
what you just clicked on. On the Mac, it doesn’t seem
to do that. It seems to be
a Firefox issue. But it’s still really easy
to find out where you are, because you can look
at the DOM node inside DOM Inspector and it’s
pretty obvious where you are. Because if you go
to the JavaScript object, you can go to the text content
and you’ll see where you are. man: Could use Firebug. Chen: Yeah, you can use
Firebug as well. But the XPather tool
that I’m showing, that is only for DOM Inspector,
which is why I’m using that. But as you can see here
from the text content, you have “Google Talking
by Brian Baskin, et cetera,” and that’s where you are. So from that, if you just
click on the element in the DOM tree, and
if you click on “evaluate” here, you’ll get the XPath expression
generated for you. And so this one tells you, “Okay, so this is your XPath
expression,” and that’s just given
to you by the tool. Now next you’ll want
to find out what– See, this expression
only selects one of the nodes for you. And really what you want
is you want to select all
of the results. And so if you just
come back here, you can go ahead
and click on the next result and see what XPath
you get there, and that should make
it very apparent what the correct
expression is. man: But there’s no XPath
that they’ve further refined, instead of having Firefox
or Safari XPath? Like, if all of that
applied… [continues indistinctly] Chen: Sure, you can do that. man: I could do it by hand. Chen: Yes, yes.
Of course you can do it by hand. I’m just showing a way
where even people who aren’t really
that sophisticated in terms of their
development– man: And just have Firefox. Chen: Yeah, and they
just have Firefox. They can basically
just use these tools and click through it,
and even if they don’t understand XPath that well,
they can still write a very good CNR file
that still works. Raman: In general, if you know
XPath after using these tools, you will probably go in and,
you know, optimize them by hand. Chen: Yes. Raman: I write these things
by hand. I mean, I don’t use
all these nifty tools. man: I do it by hand. Raman: Exactly.
Chen: Exactly. It’s much–right. Raman: More power
to us with this. Chen: But for everyone else,
there are these tools. So now I’ve clicked
on the second one and I’m going
to evaluate that. So notice the previous one
had TR2 here. So if I evaluate that,
now–big surprise– it’s changed to TR3. So a very easy way then
to get all the nodes is I just put a star here and
that’ll select everything. Raman: One is a prime.
Two is a prime. Three is a prime.
Everything is a prime. Chen: And this is
almost correct, right? It’s giving me
all the results. It is giving me one
extra header row here about showing all the items
and all of that. But if I go back here
and I look at my DOM, I’ll notice that, hey,
all of these rows, they don’t have class names,
but this one header does. So I’ll make a very minor
change here and just exclude things
that have a class name. man: So in this case,
you have to do it manually? Raman: Absolutely.
Chen: Yeah. So once I do that– And you’ll see
that I’ve now filtered it down
to just ten results on the page that I want. And so this is the XPath
that I’m going to use. Now if I can– Now while I bring up my
development environment– Okay, so the place that you’ll
be adding this to is a CNR file. CNR stands for “content
navigation rule,” and it’s basically just
a bunch of XML text that defines
where the items are and how you can
navigate them, the keys. So at top level
you have a CNR element, and the next
and previous tell you the keys that you use to go
to the next and previous lists. And for each of the lists
inside the CNR, you also have next
and previous telling you how to go through
the items in that list, and then you have the item and
that’s where you put the XPath. And there are a few other things
that you can do with this. They’re all documented
in the tutorial. There’s a reference
for what the elements are and what they do. But you can do a couple
of other things, such as you can add triggers,
so that when you reach the end of a list,
it’ll automatically click on that element for you,
and you can add targets where if you just type
that hotkey, it will automatically
click on that target for you. Yes? [man speaking indistinctly] Yes, it is like vi. The idea is that– man: Do you think you tried
to make it like vi? Chen: Yes. This way you can
do the whole thing without ever taking your hand
off the home row. So that’s sort
of an efficiency thing. But again, these keys can
be whatever you want to define. So as a web developer,
if another set of keys makes more sense to you,
then you can use those as well. Raman: So whatever keys
you define here, that actually will end up
getting generated as online help,
so your end user will basically be able to hit
a question mark and, you know,
hear all the key bindings. In general, we tried
to make the arrow keys sort of navigate
among categories and within a category. And I am a Unix user. I hate taking my hands
off the home row. So, you know, “h, j, k, l,”
obviously. Yes, “vi.” Chen: So what I’ve done now is
because we’re doing something that’s fairly simple– I’ve removed the rest
of the CNR file, because we don’t need it and
this is what you get. So this one is saying that– Actually, I don’t even need the
next and previous keys here, because we’re not going
anywhere else. But for within the list item,
you can go to the next and previous
item and here’s the– here’s the XPath that you’ll use
to fetch those results. Now I’m going to go ahead
and embed this in the JavaScript file. To make it easier,
rather than trying to reformat this manually, I can simply copy and paste this
into a tool that we have. Okay, so this has given me
a copy-and-paste friendly JavaScript string. And now I am going to put it
into that JavaScript skeleton file
that I’ve been using. Okay. And now I don’t need
this debug statement, so I’m gonna take that out. And we’re gonna save it. And now when I reload
the product page, it should
do the right thing, so let’s watch
what happens. Okay. And it helps to turn
on Greasemonkey when you’re actually
gonna do this. Okay, it didn’t– Okay. So now this is loaded
and I’m gonna try going through the items. So we had set “j”
to go to the next item. Let’s see what happens here. computer voice:
http://books.google.com/books [rapidly speaking
URL characters] “Google Talking,”
by Brian Baskin, Joshua Brashars, Johnny Long. Computers, Syngress,
Elsevier Science distributor. 2007, paperback, 257 pages. Nationwide and around
the world, instant messaging use
is growing, with more than 7 billion
instant messages being sent every day worldwide
according to IDC. comScore Media
accepts Google Checkout. $4 to $284 at 81 stores. Compare prices. Chen: Okay, so that– man: The arrow key
still works, right? Chen: Yeah, the arrow keys
still work. Raman: That’s why we fit– Chen: That’s why
it’s put right. If you look at the– Raman: Yes, so it’s got both
bindings set up in the file. Chen: If you look
at this file here, I’m saying I’m binding both
“down and j” and– So I can bind
whatever keys I want. I could, you know, put “z”
if I wanted to do that. I can do whatever
I want here. So you listen to that and,
you know, the good news is we didn’t have
to do too much work. We just clicked
on a few tools, got an XPath, pasted it in, and we got something
that works, you know? I can go to the next,
previous item. I can do all of that. The bad news is, you know,
that utterance isn’t exactly ideal. So let’s look at what it did
earlier here. It went ahead and it started– First, it tried to read
this image URL, which is kind of annoying,
because you probably don’t want to hear that very long
image URL string at the beginning. It got to this title,
which is good. But then after that,
it went ahead and it started reading
this description, whereas when you’re shopping,
you probably want to hear the prices first. And then it went
to that and it gave you the Google Checkout
availability, which is nice, but still, it put the prices
at the end. So what you really want is you
want to hear the title, you want to hear
the prices to see if it’s something reasonable,
and if it is, then you want
to continue listening on to the description to verify
that it is what you think it is based on the title. And at the end
you want to hear some additional information
like whether or not you can use your
Google Checkout account. man: So the order of reading,
is it from left to right, or–but you could order– Chen: Exactly. So this is where we get
into the next part. It’s–how do you
take text from this and then reformat it into
something that’s more pleasant, more appropriate,
for an aural interface? Because, you know, for a visual
interface, this is fine, because the human eye
can scan it and jump to the correct part
of it at the right time. But when you’re
listening to it, it’s just going through it
in DOM order, and that, you know– that may or may not be the most
pleasant experience. There are also
a few other issues here. You’ll notice that
we did this very simply, so we didn’t enable going
to the next page, for example. So that’s still
something that needs to be done. And finally, notice
that when it started up, it didn’t actually
read the first item. I had to hit “j”
to get it to start. So we’re going to fix all of
that in the next iteration here. [man speaking indistinctly] Raman: Sure. Chen: Sure. At the end of the day,
it’s JavaScript, so you can write
whatever you can. man: Special key
that I could customize… Raman: Yeah, absolutely. Chen: Yes, definitely.
Definitely. Raman: So basically,
the way to think about CNR is it’s– the XPaths tell you
what collections of nodes. The arrow keys tell you how
to walk through them. And then it’s sort of, you know– if you use design patterns
language, right? It’s a visitor pattern. The default action
when you visit something is to speak its contents. But then you can
customize that and that’s what Charles
is going to show you by providing your
own formatter functions and then you can sort of
speak it the way you want. And you get to do this
as the web developer or the person, you know,
enhancing somebody else’s web application
by focusing on the quality
of the auditory output without worrying about how
different screen readers are then going to use it,
because by the fact that the screen readers
implement ARIA, which is sort of this
base-level standard, you sort of get
this common behavior across all the tools. Chen: Okay, so one of the things
that we wanted to do was get the next and previous
page links in there. So let’s find the XPaths
for that. I’m going to go ahead
and go on to the second page, because it’s saying it has both
the next and previous arrows. Can look at that. Oops. Okay, so I’ve found
the previous page link. It’s this one and one
easy way to identify it is if I look
at the image, and I’ll see that that’s what
its source is, is that. I can do the same thing
and find the next page link. Okay, so now I’m
on the next page link, and you’ll notice that
its image says, “nav next,” whereas the previous one
had the “g” Google image. So that’s one very easy way
to differentiate between the two links. So I can write an XPath
for that. And then the other thing
that I need to do is I need to add it– add a target so that when
the user presses “enter” on the current result, that
they get taken there directly. And so you’ll notice that
the link that you’ll be using is the link
that’s the first link. The element with
the class content. So that’s–that’s
how you get the link. And all of these targets,
by the way, are relative to the node
that you’re currently on. So you can specify it
such that you write one XPath, and because it’s all
structured the same, you then get it
for all of the items. So you don’t have to
do it individually. And so this– Whoops. Sorry.
Wrong CNR file. Okay. And so this is
the CNR file that we get. So this is
the CNR file from before. And after we add in all of that,
this is what we get. So you’ll notice
that we have the same– we have the same CNR
for going through the results, but we’ve added
a few things. So we’ve added that “Next”
and “Previous” link, and they’re called the triggers,
ListTail and ListHead, and we’ve also added the ability
to click on the current item just by pressing “Enter.” So it’s, “Go to result.” And so the way that ListTail
and ListHead work is that if you’re on
the very last item in a list, and you try to go
to the next item, then you’ve reached
the tail of the list so it should
take that action. And the same thing
with ListHead, which is if you’re on
the first item of a list and you try to go to
the previous item before that, and go backwards,
then you’re at the ListHead and it should
execute that action. And with something
that’s just a target where you just
have a hot key, that means that whenever
you press that hot key, that action
should happen. In this case
the action by default is to click on
whatever the target is. So this is what’s causing it to
click on the next and previous links, click–and click
on “Enter.” So let’s go ahead
and let’s change our CNR to use this
newer version. So again I’m doing
the same thing. I’m putting it
into the tool, and it will generate
a CNR for me– CNR string for me. man:
It added the quotes for you. Chen: Yeah, it added the quotes,
it did the formatting. It made sure that, you know,
I did it with 80 characters. So yeah. man:
[indistinct] Chen: Exactly. So yeah.
Copy and paste programming. But it works
in this case. And, you know, so this tool
is also part of the tutorial, and we have it online, so anyone
can go ahead and use it. So let’s go ahead
and let’s try reloading this now
with these changes. Again, we still haven’t
reformatted the speech, but we’ve added
these features so that you can go to the next or previous page,
you can activate an element. And let’s see
how this looks like. M’kay. Oops. computer voice:
Base.Google hosted– Chen: Yeah, it works better
if I turn on the sound–unmute. So first I’m gonna see if I can
activate this link and go to it. And so I just pressed “Enter”
and there I go. So it worked.
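The ListTail and ListHead behavior described above can be sketched in plain JavaScript. This is only an illustration of the idea, not the AxsJAX implementation: `makeNavigator`, `onListTail`, and `onListHead` are hypothetical names, and in the talk the equivalent wiring comes from the CNR file rather than hand-written code.

```javascript
// Hypothetical sketch of list navigation with head/tail triggers.
// None of these names come from the AxsJAX library itself.
function makeNavigator(items, triggers) {
  let index = -1; // nothing selected yet

  return {
    // Move forward; on the last item, fire the tail trigger instead.
    next() {
      if (index >= items.length - 1) {
        triggers.onListTail();
        return null;
      }
      index += 1;
      return items[index];
    },
    // Move backward; on (or before) the first item, fire the head trigger.
    previous() {
      if (index <= 0) {
        triggers.onListHead();
        return null;
      }
      index -= 1;
      return items[index];
    },
    current() {
      return index >= 0 ? items[index] : null;
    },
  };
}

// Usage: record which page-level action each trigger would take.
const actions = [];
const nav = makeNavigator(['result 1', 'result 2'], {
  onListTail: () => actions.push('click next-page link'),
  onListHead: () => actions.push('click previous-page link'),
});

nav.next();     // moves to 'result 1'
nav.next();     // moves to 'result 2'
nav.next();     // already on the last item, so the tail trigger fires
nav.previous(); // moves back to 'result 1'
nav.previous(); // already on the first item, so the head trigger fires
console.log(actions);
```

Keeping the triggers as callbacks means the walking logic never needs to know what "next page" means for a given site; only the trigger does.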
Okay? Now let’s see
what happens when I reach the last result
on this page. [rapidly speaking
URL characters] Chen: And so it took me to the
first result on the next page. computer voice:
HTTP:/– Chen: Now if I try to go
to the previous page, let’s see what happens. And indeed it went
to the previous page. So cool, the new CNR
has improved what we can do
on this page. But we still need to fix that
audio and make it sound better. So let’s look at how
we can accomplish that. So as I had
mentioned before, the order in which
it was reading things isn’t very pleasant because
what you really want to hear is you want to hear the title,
the price, the description, and then some other information
such as where you can buy it and whether or not
you can use Google Checkout. And so the solution
to doing that is to write
your own function that will generate
a formatted string and that will then speak it
using the AxsJAX library. And so here is
the code for doing that. You’ll see that this code
has several parts where all it’s doing is
it’s just fetching the content. So it’s fetching a title,
the description, the price, the seller,
the ratings, and whether or not
you can use Checkout there. And then it’s taking
all of that information and it’s
reconstructing it so that you get the title first,
followed by the price, followed by a description,
the seller, the ratings for that seller,
and then Checkout information. So it’s restructured
that thing into a message. It’s using the lens library to
create that enlargement effect. It’s scrolling you
to the result, and then at the end
it’s speaking that message
that you had generated using the
AxsJAX library. And again, the AxsJAX
library will then set all the appropriate
WAI-ARIA properties, which then trigger
the screen reader into saying the correct
thing to the user. man:
When it speaks the text, if I put a lot of spaces will it
pause by the number of spaces? Chen: It depends on your
Text-to-Speech engine and your screen reader,
so that varies. Generally if you want
to insert pauses, use a period. Raman: Yeah, spaces will not
cause the TTS engines in general to introduce a pause. They typically treat multiple
spaces as a single space. So that would work, yeah. Chen: So in the interest of time
I’m going to go ahead and just load up the final
version that uses this. And I will step through
what we did there and what
the changes were. Okay, so this is the same thing
as the skeleton file, but there have been
a few changes. So one of the key things
that you’ll notice is that now we have
all of these get functions. So we have “getTitle,”
“getDescription,” and what these things are using
is they’re just using XPath to find the node
with that content, and then it’s returning
that node’s text content. So it’s fairly standard
and straightforward. And here’s the function
that I’d shown earlier for speaking
the results. So this is a function
that’s reconstructing it and then passing
that message into the library and using
the library to speak. Raman: So earlier
the default of that action was to just take all
of that content and speak it. Now you basically use the custom
formatter that you wrote, and you get
the output you want. Chen: And you’ll
notice that we’ve also added a setTimeout
here at the front. So this means that
when it loads it’s going to try to speak
the first result. And that will help
because now the user will know when the page is loaded
and it’s ready to go. So let’s go ahead
and run this version of it. Raman: You mute it? Chen: So yet another
reason why I like having a local web server
and using Greasemonkey is that it’s very easy
to just try something out and choose
the different file. For example,
right now I can switch from the skeleton file
that I was working with before to this
finished version with just– just by commenting out a line
and un-commenting another. Okay, so that just
switched the file for me. And now if I reload, it should
give me the correct one. So let’s try this. computer voice:
…to $284. By Brian Baskin,
Joshua Brashars, Johnny Long. Computers, Syngress,
Elsevier Science Distributor. %C2%A0.
2007. Paperback. 257 pages. Nationwide and around the world,
instant messaging use is gro– Chen: And then if I go
to another one… computer voice:
HTTP://books.google.com/books?=[rapidly speaking
URL characters] HTTP://– Chen: Okay. All right. computer voice: HTTP://base.
googlehosted.com/base [rapidly speaking
URL characters] “Google It, Revisited.”
“Google It, Revisited.” Add to shopping list,
accept Google Checkout, 80– Chen: Okay, so this
is what you get when you don’t
completely reload the page. As you’ll see it was
running the previous script. So let’s reload
this properly. [computer voice and Raman
speaking simultaneously] Chen: Yeah. So it’s very important that you’re loading
the right scripts. Otherwise you get results
that you don’t really want. computer voice:
“Google Talking.” $4 to $284. By Brian Baskin,
Joshua Brashars, Johnny Long. Computers, Syngress,
Elsevier Science Distributor. %C2%A0.
2007. Paperback. 257 pages. Chen: So as you can see,
now it’s reformatted. It stopped speaking
the image at the beginning. It’s now reading title,
the price, the description, exactly the way
that we want it to. And if I keep going next
I can go to the next page and… [computer rapidly speaking
URL characters] “Biography: The Google Boys,”
DVD, $14.99, DVD documentary. Buy this
at Circuit City. 4,054, %C2%A0 sellers.
%C2%A0 ratings. Averages out
to 3.5 stars. Chen: See, it even told me how
many stars there were on that, which is kind of nice. So with that,
you know, we’ve– in just a short amount of time,
less than an hour, we’ve AxsJAXed
the Google Products search page. Raman: So the power
of all of this as Charles said
at the beginning is that it doesn’t
limit it to the web developer
who owns a particular site. In fact, even if you’re
doing it to your own site, sort of doing this in some sense
external to that thing– WAI-ARIA, Greasemonkey,
doing it with the script– allows you
to experiment very rapidly and figure out
what actually works, and then go
and do it on your site. If you don’t own that site,
and you are just someone who’s got an hour
of hacking time to spare and want to help a friend
of yours who cannot see, this is a very nice way for you
to go in there and do that. And then finally
if you are– you know, you can also
write something like this if you are
a screen reader vendor or an open source
screen reader creator. The nice thing is
we’re not tying this to any specific
adaptive technology. It’s sort of–
we’re using some bits that are
implemented commonly across the various
screen readers. Firefox 3 is implementing it,
and as Charles said, the other browsers are
going to follow very soon. So it sort of gives you
a very good way of leveraging
the little time that you have to pay attention
to accessibility. So accessibility has been a big
success in the last few years in terms of getting
people’s awareness. People are now aware that
there are actually blind users who use computers and that you
need to worry about the problem. But once you
sort of get that far with knowing that you should be
doing the right thing, doing the right thing
has been quite difficult in terms of debugging it
with each browser, with each screen reader
combination. What should I do? Can I use AJAX?
Can I not use AJAX? Am I, like, doomed
to writing static HTML if I have to support
a blind user? And so one of our
goals here is to sort of– to dispel some of those
myths with respect to– you should not use
technology “X” if you want
blind users to use it. And try to
turn it around and sort of leverage
those same technologies that make the web dynamic
for the mainstream user, and make it work just as well
for the user with special needs. And now that we have this
huge level of awareness with respect to web developers
wanting to do the right thing, can we then give the web
developers the right tools so that that time
that you have to put into it gets leveraged
to the maximum. I will take questions, and I’m sort of breaking my goal
of, you know, not talking, so I’ll shut up. man: I don’t mean
to dominate the questions. What’s a good
guideline resource for accessible
user interfaces? Raman: You know, it’s– I really would answer
that question differently. So I would think of– I sort of think
of accessibility as just a special case
of usability. And a lot of times
there are guidelines written, and you can
look at guidelines, but you know
your application best, and just as you design
your application to be usable based on what
you are trying to do, I think the same applies
for the speech user or the low-vision user. And what I’d personally like
to see is a lot of innovation as opposed to simply
adhering to guidelines– So you can always
adhere to guidelines, and they’re sort of
necessary conditions. They’re not
really sufficient. You know your
application the best. You know your
user the best. The goal is to sort of
experiment a lot and discover
things that work because I don’t believe
all the guidelines that need to be discovered
and written down have been
written down yet. man:
Perhaps start pulling out feedback
from the users. Raman: Yeah, so that’s
the thing, right? So the question is: what do you
do to sort of iterate rapidly, get user feedback,
and then move back? The way Charles and I
have been doing this is, you know, over the web–
distribute code over the web, open source it,
get users to use it early, get feedback from
early adopters via things– you know, mechanisms
like Google Groups or whatever else your
favorite online mechanism is– and then
loop back rapidly and, you know,
continue to iterate on the code. So that actually applies
to all of those libraries. In fact, though we describe
this thing to you as a library it did not
start as a library. The library was actually derived
by first writing enhancements for about four
major Google applications: Google Reader, Google Search,
Google Scholar, and Google News. And then we
sort of looked back and extracted
the library out of it. And as we do more, that’s
actually what we are doing, and that’s
actually where we’d like to take
the community as well. So you know, from all
the questions you are asking you clearly understand
the problem very well. So you shouldn’t sort of
wait for guidelines, for somebody
to tell you what to do, but, you know, write,
play with the library, use it,
and you’ll probably discover things
that we’ve missed. And you know,
we can always add it back in. Chen: Yes. man:
So Raman and Charles, I’m very impressed
with your usability, your advances
in usability. But I do have one concern,
and that is that it seems fairly fragile,
what you’ve done, because you depend upon
knowledge of the DOM. Raman: Yes.
Mm-hmm, mm-hmm. man: And if Google changes
the format of the pages or if any author of the page
changes the structure, then you have to
do this all over again. Raman: So–so-so you bring up
a very good point. So basically the world
we live in today, unfortunately, is that HTML
and JavaScript and the DOM is being used
as a very low-level assembly language
for the web. That’s not the world
I would have designed, but that’s
the world we have. So how do you code
in the face of that situation? The best way
I know to do that, in terms of doing
good software, is when you have
fragility like that, isolate the fragility
in a very small component so that it does not
permeate the code base. So if you notice
in the thing that we built, all of that dependence
on the shape of the DOM is in that
one small XML file. It’s not anywhere else. Also if you
look at it… let’s say we do
100 of these, okay? And let’s say a year from now
ten of them break, five of them nobody is using
and nobody therefore notices, and five are being used. Guess which one
will get fixed. The five that are
being used will get fixed. And so it sort of has
this interesting property that you have now pushed out
the fragility to the leaves. So the web
is basically not a set of six
applications or eight. The web is thousands
and thousands of applications. Every web site
is an application. And over time
what you’ve then done is you’ve isolated
that fragility into very small pockets. So I did this
about ten years ago, now. Feels like a long time. 1998 I started doing this
in the Emacspeak world. And everybody
looked at it and said, “This is very fragile. It’ll break
because you are depending on the shape
of the web site.” The first one I did this for was
actually Yahoo! Maps in 1998. And you know,
Yahoo! Maps came online and it was this
wonderful thing, right? You could get
textual directions. But you had
to sort of hunt for the four fields
to type the thing in, and then you
had to click “Submit,” and then you got
this busy page and then you had
to go hunt it out. In Lisp I basically wrote
the same XPath hack. Go, you know… Basically prompt the user
for the start and end location, make up a URL, submit it,
get the results back, pull out an XPath, and,
you know, speak the right thing. And everybody looked at it
and said, “This is bogus. “It’s completely dependent
on that web site, and it will break.” There are two things to notice
over ten years of experience. One, a sophisticated
web site– It costs that web site a lot
more to rearrange their DOM than it does you
as an open-source hacker to change one XPart. So in fact
those things don’t change. And the other
thing is that there are two halves,
I told you, right? One is the URL params
that you sort of fill in, and the other
is the shape of the DOM. In ten years of use, the shape
of that Yahoo! Maps DOM has probably changed
about six times. The URL parameters
have not changed. And today we think
of that as a REST API. So today that’s what
all web APIs are founded on. So, you know, this is–
You bring up a good question, but in the context of how
web applications are done, this has zero impedance
mismatch, so it just works. And it sort of–
you know, so things that
get used get fixed and become robust over time;
things that nobody uses fall down, and it’s like
a tree falling in a forest with nobody around–
you don’t hear it. Chen: Just a quick plug. So I’ve subtly or
not-so-subtly put up a page that tells you
how you can get your own Google
AxsJAX shirt. So you’ll notice
that we’re wearing very stylish
Google T-shirts right now. They have the Google Braille
design on front, and the AxsJAX logo
on the back. So if we could get
the camera on me for a minute. Raman: Charles
even does modeling. man:
Is it lace? Very nice. Raman: I’ll model it too.
Chen: Yeah. So yes, if you write
any AxsJAX script and it’s good, and you just post it to the
Google Accessible Groups list, send it to us, and, you know,
we’ll look at it. And we’ll send you
a T-shirt. Raman: So a short
explanation of the logo. Charles and I conceptualized it
over a healthy Google breakfast. Drew it on a napkin and a
graphics artist drew it for us. It’s–Web 2.0 is basically
powered by AJAX. So Web 2.0 has been
styled to look like a car. And so if you have
a running car, you can’t, like,
stop the car, like, change– …in that context for
accessibility and special needs. The enhancements
also need to come by that same
distribution mechanism for it to keep pace. If what you’re relying on
is some piece of shrink-wrapped software that you got
that sort of knew how Yahoo! Maps looked last week
or how Gmail looked last week, a year from now,
it’s going to be clearly broken. In fact it will be way more
broken with respect to fragility that Bob pointed out
than what we are doing here, because here what we have
is we’ve isolated the fragilities that are specific
to a particular application into that piece. And we can rev
that independently without doing
a fresh release. So the magic here is
that the Google AxsJAX project– code base
is on Subversion. A consequence of it
being on Subversion is that everything
has an HTTP URL. So as Charles and I change
the code and check it in, our end users don’t even know
that something changed. It just works better. And if you think about it,
that’s how all the Google applications
that run on the web work, right? You don’t download something
every Monday morning to see what we’ve done;
it just works. Chen: Yes. man:
I have a technical question. Is there a browser capability
through the user agent to tell the web page
if accessibility is turned on
in the user’s browser? Raman: No. man:
Well, I could rent out the accessibility
DS file. Raman: Right. Unfortunately
there isn’t such a thing. That would be
a nice thing to have. man:
But will we ask the browser vendors
to support this? Raman: So I have–I have mixed
feelings about that. I mean, the thing is,
so now speaking not– so as a developer,
I would love to have that. man: In the user agent
they would have– Raman: Yes.
man: Just the… Raman: No, no. man: It’d be nice if they
would also add access– Raman: I understand. So that’s
what I’m saying. As a web developer, as somebody who writes
web applications, I would like
to have that. As an end user
who is blind, I hesitate about that
from a privacy standpoint. So this is why I think a better
solution is for sites– for a web site to sort of
have user preferences where a user can opt in for a
particular type of interaction. So notice that now you’ve
suddenly changed the equation. You have changed it
from saying, you know, “When you are blind,
I will always give you this,” to one that says, “You are allowing
the user to say what they want.” And, you know, on a given day,
you might not be willing or prepared to look at your
monitor even though you can see, and you might
go pick that feature and get
the right interaction. So my preference would be
to put it in the cloud where it’s
under user control, as opposed to something
like a browser sort of advertising
to the world that you are
blind or whatever. man: Still with user
opting in, still it doesn’t properly
protect their privacy. Raman: No, but at least
you are doing– you are making
the choice. You are making the choice,
and it’s under your control. And the thing is
it should be capability-based, not disability-based,
so it should not– The preference
you are setting shouldn’t be saying, “I am blind”
or “I am deaf.” The preference
you are setting should be, “I want you
to talk to me,” or, “I want you
to show me captions.” It should not be, “I want you to talk to me
because I cannot see,” because all
you are asking for is for the thing
to talk intelligently. You’re not asking for,
you know… man: That makes
a lot of sense. Thanks. Raman: Huh?
Sorry? man:
It makes a lot of sense. Raman: Yeah, yeah.
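The capability-based opt-in Raman argues for might be sketched as follows. This is a hedged illustration only: the preference names `spokenFeedback` and `captions` and the function `resolvePreferences` are hypothetical and do not come from any Google product or from AxsJAX. The point it shows is that the stored setting records what the user wants, never why.

```javascript
// Hypothetical preference names: each one describes a capability the
// user is asking for, not a disability.
const DEFAULT_PREFERENCES = { spokenFeedback: false, captions: false };

function resolvePreferences(userChoices) {
  // Start from the defaults and copy over only known capability keys.
  // Anything else the client sends (for example a "reason") is dropped.
  const resolved = { ...DEFAULT_PREFERENCES };
  for (const key of Object.keys(DEFAULT_PREFERENCES)) {
    if (typeof userChoices[key] === 'boolean') {
      resolved[key] = userChoices[key];
    }
  }
  return resolved;
}

// Usage: the user opts in to speech; the extra field is never stored.
const prefs = resolvePreferences({ spokenFeedback: true, reason: 'blind' });
console.log(prefs);
```

Because only an allow-list of capability keys is copied, the site cannot accidentally persist a statement about the user, which is the privacy concern raised in the exchange above.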
So that– So that’s sort of
a more complex answer. As a hacker it would be
an easy thing to say the browser should just
sniff the user’s machine and send all that
up into the header. I’m sort of very hesitant
to go that route. Chen: And, you know,
we can even show it to you in action, a reader. Raman: And this is
Google Reader, and this is– computer voice: Click here for
already enhanced Google Reader. Geeky 904. “Cool Tools 3.” Raman: So now, Bob, returning
to your fragility question, now that the script
is actually integrated
into the Reader product code base, we basically have
regression unit tests that make sure that
as the shape of the DOM changes we catch those
in our unit testing. And then, you know,
go fix the script, so… computer voice: Life, make,
Pat, Pete, Google–Google– John, official
Google blog 18. Chen: Okay, so that’s
how I read that. I can open it. computer voice:
Articles loaded. Raman: So then notice
that it said, “Articles loaded,” and I’m pointing
at Jim specifically because I want
to highlight the fact that we can actually make
the screen reader say things without actually
putting it on the screen, which is sort of
a cute hack. computer voice: 5:10 p.m.,
three hours ago, “‘In their own words’: political videos meet Google
speech-to-text technology.” Chen: So you can do that.
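Making a screen reader say something that is not shown on the screen, as Raman describes, is commonly done with an off-screen ARIA live region. The sketch below assumes that approach; it is not the AxsJAX source, and `createAnnouncer` and `makeStubDocument` are hypothetical names. The stub stands in for `document` so the example runs outside a browser; in a page you would pass the real `document`.

```javascript
// Hypothetical sketch: announce text through an off-screen ARIA live region.
function createAnnouncer(doc) {
  const region = doc.createElement('div');
  // aria-live asks assistive technology to report changes to this element.
  region.setAttribute('aria-live', 'assertive');
  // Move the region off-screen rather than using display:none, which
  // would hide it from screen readers as well.
  region.setAttribute(
    'style',
    'position:absolute; left:-10000px; width:1px; height:1px; overflow:hidden;'
  );
  doc.body.appendChild(region);

  return function speak(message) {
    region.textContent = message;
  };
}

// A tiny stand-in for `document`, only so the sketch runs without a browser.
function makeStubDocument() {
  const makeElement = () => ({
    attributes: {},
    children: [],
    textContent: '',
    setAttribute(name, value) { this.attributes[name] = value; },
    appendChild(child) { this.children.push(child); return child; },
  });
  return { body: makeElement(), createElement: makeElement };
}

// Usage: nothing visible changes, but the live region now holds the message.
const stubDoc = makeStubDocument();
const speak = createAnnouncer(stubDoc);
speak('Articles loaded.');
console.log(stubDoc.body.children[0].textContent);
```

How promptly the message is spoken still depends on the screen reader and browser honoring the live region.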
computer voice: Star added. Chen: And it’ll even tell you
if you’ve added a star, taken a star away. And by the way, all of these
keyboard bindings are just part
of Google Reader by default, and all we’ve done
is we’ve watched for events, and when
the event happens, we use the AxsJAX library
to speak the appropriate thing. Raman: So basically
in an application, as I said, you need
to do three things. You need to pick
the content to speak, you need
to give the user a keyboard means
of moving among things, and then as he moves, you need to sort of format what
you’re going to speak properly. In the case of Reader, Google Reader is actually
a power user app. It’s completely
keyboard-driven, it’s– You know, you can also
use the mouse if you like, but it’s–a power user would
just use this with the keyboard. And it’s very,
very power– It’s a very efficient
user interface. So what we did was
to basically add in addit– augment the user interaction
with additional handlers that produce the right stream
of spoken feedback. woman:
How about Gmail? Raman: So basically the way
we are doing our work is that–so for–
so we’ve– the AxsJAX work we’ve done has been integrated
into Reader and Books, but there are a number
of other products that we have enhancements for that
are an early prototype. If you’ll go to
the AxsJAX web site, you will see the early ones
that we have, and that does
include Gmail. It works very fluently
with Gmail using the same navigation
that Charles showed you. So you can use H-J-K-L
or the arrow keys to browse through
your folders and threads. You can chat.
All of that works. And basically
as these things get robust, as we get happier
with the user interaction, they get integrated
over time. So Gmail, yes, it is there
in the prototype set. So if you go to the Google
AxsJAX project site there is a page there
called the “Showcase” that sort of has links
to all the things that run, and you can go
play with that. Chen: And, you know, you can–
computer voice: Unread. J. Kirkland. Hendricks
bookmarked page window is bad. [rapidly speaking
URL characters] Chen: So you can do that.
computer voice: Starred. Chen: So yeah.
The same thing works. man:
So Raman, when will we see
your work on a cell phone? Raman: When are you
going to sell me a phone? [laughter] man:
iPhone? Raman: iPhone,
youPhone, wePhone… man: Your voice keeps saying,
“percent.” Chen: Yeah, that’s actually
a minor problem with the TTS. I’m not escaping
all of the characters properly. That’s my bug
on the Fire Vox end, but that’s one of the things
I’m working on fixing. Raman: But I really like the
quality of the Text-to-Speech that Apple shipped
on Leopard by default. It’s a real, very,
very good engine. It’s really nice. So, you know, if you
actually pay–spend a little more time
formatting information, you can make it sound
really, really good. man: So is there…
technical diversity among the speech APIs
that may work on one platform– Raman: So again, that is
something we are abstracting for you through
the AxsJAX layer. So, no, no, no–
so there are two–there’s– So how does
this work, right? The user who needs this
probably has a screen reader
of some kind. If you are a web developer,
you’re probably using a self-voicing browser like
Fire Vox to do your testing. But the end user
probably already has a screen reader
that he’s using. The screen reader
probably came with the Text-to-Speech engine
that it talks to, right? In the past,
for a web developer, the problem has been– how do you sort of
talk to that screen reader and get it
to say something? The only way of doing it
was to sort of, you know, put things on the screen
and sort of go try it through the screen reader
to see what it said. What we are giving you
is JavaScript calls where you get to abstract it
from all of that. So you, as a web developer,
are at a much higher level. So you’re not
talking to the– So yeah,
your platform specificity about how each
TTS engine works is being handled
by somebody else for you. Chen: Yes. woman:
[indistinct] Chen: So we have a showcase
on the main AxsJAX home page. So that showcase lists all the currently available
AxsJAX scripts. Raman: Well, many of them
are prototypes for Google– Google applications
that, you know, as they become robust,
you know, get integrated, and some of them
are for external ones. So the JawBreaker game,
for instance. Another of my favorites
is the “XKCD” comic. You want to show “XKCD”? Chen: Um, yeah. Raman: So the “XKCD” comic
is very interesting. So Randall came and gave
a talk here at Google, and during his talk he said
something about blind hackers, so then we had to go– computer voice: %00–
Raman: Make “XKCD” talk. computer voice:
Sound transcript. A man is sitting on a couch
talking to another man. They are both
stick figures. First man,
“Make me a sandwich.” Second man,
“What? Make it yourself.” First man,
“Sudo, make me a sandwich.” Second man, “Okay.” Comment, “Proper user policy
apparently means ‘Simon Says.’” Raman: So now there’s
something interesting– [audience members laughing] Raman: So there’s something
interesting happening here. So if you think
of a comic like “XKCD,” part of the fun
when you can see the cartoon is to look at
the stick figures, see that little
caption under it, and interpret it
for yourself. There is a site
that is separate from “XKCD” where volunteers go in
and type up transcripts to help the OhNoRobot
comic search engine. Now those transcripts
are typed into that other site. They are actually verified
by the comic author– in this case, Randall–
and then they go in. Now you could say, you know,
for accessibility reasons, for blind users
to appreciate it, Randall’s site should
just show that transcript. That would be
sort of bogus because it’s part of the comic
for anybody who can see. So you could answer, well,
you know, for a blind user, you go to the site,
you look at the comic, then you go
to that other site, and then you look
at that transcript, and then you can
understand it. How often did I do that?
Never. I was too lazy, right? So the lazy
programmer’s answer is to make the computer
do that work. And so that’s what
that AxsJAX enhancement did. It’s actually a mash-up. So mash-ups don’t
have to be just visual. Mash-ups–a mash-up is about
bringing content on the web from two different places
into a single user experience. And in this case we brought
together the “XKCD” comic and the transcript
for that particular episode into a single
auditory user experience, and that’s what you had. Chen: Okay.
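The auditory mash-up Raman describes, content from two places joined into one spoken message, can be sketched like this. The data shapes, the `buildSpokenComic` name, and the sample values are all hypothetical; the sketch does not show either site's real markup or how the actual AxsJAX script fetches the transcript.

```javascript
// Hypothetical data shapes: one record from the comic page, and a lookup
// of transcripts that would come from the separate transcript site.
function buildSpokenComic(comic, transcriptsById) {
  const transcript = transcriptsById[comic.id];
  const parts = [comic.title];
  // Fall back to a short notice when nobody has typed a transcript yet.
  parts.push(transcript ? transcript : 'No transcript available.');
  if (comic.hoverText) {
    parts.push('Comment: ' + comic.hoverText);
  }
  // Each part ends in a period, which most speech engines read as a pause.
  return parts.join(' ');
}

// Usage with illustrative values only.
const comic = { id: 'example', title: 'Example comic.', hoverText: 'Example comment.' };
const transcripts = { example: 'First man: Make me a sandwich.' };
const spoken = buildSpokenComic(comic, transcripts);
console.log(spoken);
```

The resulting string is what would then be handed to the speech layer, so the listener gets the comic and its transcript in one pass instead of visiting two sites.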
So anything else, or…? woman: I believe
there is [indistinct]… I guess there are things
for accessibility programs where you could
stick to content, something like that,
right? Chen: Mm-hmm. woman:
So it would be best if the web site developers
could actually include such invisible text,
which could, you know, display just like an all text perimeter,
so [indistinct]… Raman: Well,
you could do that, but then let’s say Randall
gets 10 million hits a day. I don’t know
how many he gets, but let’s say he gets
10 million hits a day. And out of that,
100 users cannot see. So, well,
Randall probably pays for the bytes that
he sells, right? [laughs] So there are–there are–
there are design choices. The thing is the web
is a very flexible environment. And yes, so, you know,
doing an invisible link for, say, two words,
is one thing. Doing an invisible link
for, say, 200 words and then giving
all the 10 million users all of those 200 words
that they never see is probably
not a good idea. Now on the other hand,
if you sort of say, well, you know,
“Do a mouse hover over and then you see the transcript,”
and everybody likes it, then that’s
a good idea. So you should sort of
have that ability to do it
in many, many ways. And so, you know. Chen: And I think the other
key point to remember here is that we were able to do this
without getting help from either Randall
or from the OhNoRobot site that does
the text transcripts. This is something that,
as a third party, you could just step in
and do it through AxsJAX, and that’s very empowering that
you can do that for someone. Raman: Because the thing is–
so after you do this, and you show it to, say,
the OhNoRobot guys or Randall, they say, “Wow, this is
wonderful,” right? But also if you sort of say, “You know, if you
had done that, “and if you had
taken the transcript “and you had put that
on your site, “then you would have
all the blind users come and read your comic,”
and you know, you’ll sort of probably
get a long yawn and say, “Yeah, maybe when I get time
I’ll do it.” [laughs] So it’s a very
empowering sort of way. And this is how the rest
of the web works, right? If you don’t like– So the old Unix joke
is, you know, “Get out of the way
or I’ll turn you– Get out of my way or I’ll
turn you into a shell script.” Right? So on the web
you can basically go in
and program it yourself and, you know, give it to
everybody else and your friends and take it from there. Chen: So any
other questions, or…? Okay, well, thank you
all for coming. Raman: Thank you,
everybody. [applause]
