1
00:00:18,500 --> 00:00:22,980
There is no practical obstacle
whatever now
2
00:00:22,980 --> 00:00:26,020
to the creation of an efficient index
3
00:00:26,020 --> 00:00:30,140
to all human knowledge,
ideas and achievements.
4
00:00:31,620 --> 00:00:33,620
To the creation, that is,
5
00:00:33,620 --> 00:00:38,420
of a complete planetary
memory for all mankind.
6
00:00:39,660 --> 00:00:43,620
He was one of the early inventors
of science fiction.
7
00:00:46,140 --> 00:00:48,140
The idea of time travel,
8
00:00:48,140 --> 00:00:50,420
the possibility of invisibility...
9
00:00:50,420 --> 00:00:54,140
LAUGHTER
10
00:00:54,140 --> 00:00:56,060
..of intergalactic struggles.
11
00:00:57,980 --> 00:01:00,380
And then, he came up with ideas
12
00:01:00,380 --> 00:01:03,980
of how we might reorganize the
knowledge apparatus of the world,
13
00:01:03,980 --> 00:01:05,940
which he called the World Brain.
14
00:01:05,940 --> 00:01:09,060
For Wells, the World Brain
had to contain
15
00:01:09,060 --> 00:01:11,940
all that was learnt and known
16
00:01:11,940 --> 00:01:15,780
and that was being learnt and known.
17
00:01:15,780 --> 00:01:19,260
If you have access to anything
that's been written,
18
00:01:19,260 --> 00:01:21,060
not just theoretical access,
19
00:01:21,060 --> 00:01:24,140
but like instant access
next to your brain,
20
00:01:24,140 --> 00:01:27,020
that changes your idea
of who you are.
21
00:01:27,020 --> 00:01:29,980
It can be reproduced exactly
22
00:01:29,980 --> 00:01:35,580
and fully in Peru, China, Iceland,
23
00:01:35,580 --> 00:01:39,340
Central Africa or wherever else.
24
00:01:39,340 --> 00:01:42,420
They were frank in their ambition
25
00:01:42,420 --> 00:01:46,460
and dazzling in their ability
to execute it.
26
00:01:47,660 --> 00:01:50,300
The Google Books scanning project
27
00:01:50,300 --> 00:01:54,780
is clearly the most ambitious
World Brain scheme
28
00:01:54,780 --> 00:01:56,660
that has ever been invented.
29
00:01:58,700 --> 00:02:03,100
This is no remote dream, no fantasy.
30
00:02:03,100 --> 00:02:07,300
It is a plain statement
of a contemporary state of affairs.
31
00:02:16,100 --> 00:02:18,500
The nightmare scenario,
in 20 years' time,
32
00:02:18,500 --> 00:02:21,140
would be Google tracking
everything we read.
33
00:02:21,140 --> 00:02:24,820
Google could basically hold
the whole world hostage.
34
00:02:26,060 --> 00:02:27,860
Ever since Wells,
35
00:02:27,860 --> 00:02:30,700
science fiction is always
about the possibility
36
00:02:30,700 --> 00:02:32,900
that people won't really matter
in the future.
37
00:02:32,900 --> 00:02:35,060
And the plot is always
about some heroic person
38
00:02:35,060 --> 00:02:36,940
that either succeeds
or doesn't succeed
39
00:02:36,940 --> 00:02:39,140
in proving that people really matter
after all.
40
00:02:54,900 --> 00:02:57,620
It's a library, a public library,
41
00:02:57,620 --> 00:03:01,220
where people go to look at books,
and read them and take them away.
42
00:03:02,620 --> 00:03:06,100
That girl works at the library
and she checks on books
43
00:03:06,100 --> 00:03:08,900
that are going out
and books that are coming back in.
44
00:03:08,900 --> 00:03:10,420
I love libraries.
45
00:03:10,420 --> 00:03:12,220
I like the smell,
46
00:03:12,220 --> 00:03:15,100
the smell of paper
properly preserved.
47
00:03:15,100 --> 00:03:17,620
It's as if it's the smell
of a hay barn
48
00:03:17,620 --> 00:03:19,900
that's been cleared
of all its animals
49
00:03:19,900 --> 00:03:22,580
and made into a human intelligence.
50
00:03:22,580 --> 00:03:25,780
And in a library, you really...even
if you're sitting in the tearoom,
51
00:03:25,780 --> 00:03:28,820
discussing your latest findings,
52
00:03:28,820 --> 00:03:32,900
it's amazing how much social
interaction with other people
53
00:03:32,900 --> 00:03:36,140
will actually help you
to enrich what you're doing.
54
00:03:36,140 --> 00:03:37,740
'In this part of the library,
55
00:03:37,740 --> 00:03:40,420
'the grown-ups can read
the stories to the children.'
56
00:03:40,420 --> 00:03:43,940
People sometimes say to me,
aren't libraries obsolete?
57
00:03:43,940 --> 00:03:45,900
Um... It's... It's absurd -
58
00:03:45,900 --> 00:03:48,140
they are nerve centres,
59
00:03:48,140 --> 00:03:50,140
centres of intellectual energy.
60
00:03:50,140 --> 00:03:52,620
Libraries stand for an ideal,
61
00:03:52,620 --> 00:03:55,060
which is an educated public.
62
00:03:55,060 --> 00:03:57,140
And to the degree that knowledge
is power,
63
00:03:57,140 --> 00:03:59,100
they also stand there for the idea
64
00:03:59,100 --> 00:04:03,340
that power should be disseminated
and not centralised.
65
00:04:27,420 --> 00:04:32,020
The first appeal of Google's
enterprise,
66
00:04:32,020 --> 00:04:38,500
when we saw it, was just digitising
millions and millions of books.
67
00:04:38,500 --> 00:04:40,620
At Harvard, we have, by far,
68
00:04:40,620 --> 00:04:43,260
the greatest university library
in the world.
69
00:04:43,260 --> 00:04:46,620
It's enormous - 17 million volumes.
70
00:04:46,620 --> 00:04:50,900
And every library wants
its holdings digitised
71
00:04:50,900 --> 00:04:53,940
for lots of reasons,
including preservation.
72
00:04:53,940 --> 00:04:57,700
But, beyond that,
it raises the possibility
73
00:04:57,700 --> 00:05:00,300
of sharing your intellectual wealth.
74
00:05:00,300 --> 00:05:03,500
I think of the Harvard Library
as an international asset.
75
00:05:03,500 --> 00:05:05,580
Something that should be opened up
76
00:05:05,580 --> 00:05:09,020
and shared with the general
population.
77
00:05:09,020 --> 00:05:11,220
So here comes Google.
78
00:05:11,220 --> 00:05:13,860
They've got the energy,
they've got the technology,
79
00:05:13,860 --> 00:05:17,740
they've got the money and they said,
"We'll do it for you. Free!"
80
00:05:17,740 --> 00:05:21,980
Google did such a fabulous job
in creating a vision,
81
00:05:21,980 --> 00:05:26,060
not only that a universal digital
library could be created,
82
00:05:26,060 --> 00:05:28,460
but that it could be done today.
83
00:05:28,460 --> 00:05:34,140
The Google engineers are
like good engineers everywhere,
84
00:05:34,140 --> 00:05:35,700
they just like to think about,
85
00:05:35,700 --> 00:05:37,700
"How do we surmount
these challenges?"
86
00:05:37,700 --> 00:05:43,020
They sort of leave the lawsuit
to the lawyers to worry about.
87
00:05:48,180 --> 00:05:52,220
Google's a company that believes
in its fundamental mission
88
00:05:52,220 --> 00:05:55,060
of empowering everyone in this world
89
00:05:55,060 --> 00:05:58,340
with all the information they need.
90
00:05:58,340 --> 00:06:00,700
Enriched with the right information,
91
00:06:00,700 --> 00:06:03,420
people can make better decisions
for themselves,
92
00:06:03,420 --> 00:06:06,020
their families and their communities.
93
00:06:06,020 --> 00:06:09,980
This world is full
of wonderful individuals
94
00:06:09,980 --> 00:06:11,820
which have varied needs.
95
00:06:11,820 --> 00:06:15,180
From a farmer in Africa
to a mother in India,
96
00:06:15,180 --> 00:06:17,620
to a business person in Japan.
97
00:06:17,620 --> 00:06:21,300
Everyone needs information
in this modern day and age.
98
00:06:21,300 --> 00:06:24,140
And Google believes
in breaking all the barriers
99
00:06:24,140 --> 00:06:28,020
between every individual
and the information they seek.
100
00:06:28,020 --> 00:06:31,740
When you actually negotiate
with Google
101
00:06:31,740 --> 00:06:35,340
and do so on their turf,
102
00:06:35,340 --> 00:06:38,660
you enter a strange world.
103
00:06:38,660 --> 00:06:43,820
A Google office doesn't have chairs
like this chair,
104
00:06:43,820 --> 00:06:47,900
the furniture consists
of large inflated balls
105
00:06:47,900 --> 00:06:51,980
that are coloured green
or red or yellow
106
00:06:51,980 --> 00:06:54,940
and the young Google engineers
are sitting on these.
107
00:06:54,940 --> 00:06:58,060
It's a kind of Never Never Land
feeling.
108
00:07:00,060 --> 00:07:05,780
About ten years ago, I got a visit
from a vice president of Google.
109
00:07:05,780 --> 00:07:07,540
And she walked into my office
110
00:07:07,540 --> 00:07:11,060
and described a project
that Google had in mind,
111
00:07:11,060 --> 00:07:12,660
which was to digitise
112
00:07:12,660 --> 00:07:15,580
all the books in
the Harvard Library.
113
00:07:15,580 --> 00:07:19,460
My first thought was,
to put it bluntly,
114
00:07:19,460 --> 00:07:23,220
that maybe they were smoking
something, because I didn't think
it was possible.
115
00:07:23,220 --> 00:07:26,900
Harvard had been digitising books
from time to time,
116
00:07:26,900 --> 00:07:30,740
but they were very limited
in number and we didn't do many,
117
00:07:30,740 --> 00:07:34,420
it was a very expensive
and complicated project.
118
00:07:34,420 --> 00:07:36,100
I don't remember exactly,
119
00:07:36,100 --> 00:07:39,460
but it was several hundred dollars
just for a single book.
120
00:07:39,460 --> 00:07:45,380
But they had invented
a copying station
121
00:07:45,380 --> 00:07:49,100
that was a lot cheaper
and easier to use,
122
00:07:49,100 --> 00:07:51,140
that didn't damage the books
123
00:07:51,140 --> 00:07:53,900
or, at least, went out of its way
not to damage the books.
124
00:07:53,900 --> 00:07:57,060
And it seemed to me
that it had a lot of plausibility.
125
00:07:57,060 --> 00:08:00,540
And so, we decided to...
to give it a try.
126
00:08:00,540 --> 00:08:04,500
Every great library did digitising,
sometimes on a large scale,
127
00:08:04,500 --> 00:08:09,340
our Open Collections Programme
digitised 2.3 million pages.
128
00:08:09,340 --> 00:08:10,780
I mean, that's big.
129
00:08:10,780 --> 00:08:15,580
But nothing like as big
as what Google attempted to do.
130
00:08:15,580 --> 00:08:19,700
The sheer ambition
of digitising everything.
131
00:08:30,820 --> 00:08:33,940
In the ancient world,
at the Library of Alexandria,
132
00:08:33,940 --> 00:08:36,900
they copied rolls and tablets,
133
00:08:36,900 --> 00:08:40,620
and attempted to copy
all that was known.
134
00:08:40,620 --> 00:08:44,700
And, eventually, the library
was destroyed by Julius Caesar
135
00:08:44,700 --> 00:08:48,020
and the loss of that library
in Alexandria
136
00:08:48,020 --> 00:08:51,340
was an international catastrophe.
137
00:08:53,420 --> 00:08:56,260
The universal library's been
talked about for millennia.
138
00:08:56,260 --> 00:08:59,220
There's a kind of a continuity
of development
139
00:08:59,220 --> 00:09:01,380
and, you know, we mustn't forget
the important role
140
00:09:01,380 --> 00:09:04,180
that libraries and scholars
have always made
141
00:09:04,180 --> 00:09:06,420
for millennia of copying.
142
00:09:06,420 --> 00:09:09,660
And then, you see,
with the development of printing,
143
00:09:09,660 --> 00:09:11,740
the multiplicity of texts,
144
00:09:11,740 --> 00:09:13,820
the copying of original texts.
145
00:09:13,820 --> 00:09:15,940
It was possible to think
in the Renaissance
146
00:09:15,940 --> 00:09:19,620
that you might be able to amass
the whole of published knowledge
147
00:09:19,620 --> 00:09:22,060
in a single room
or a single institution.
148
00:10:04,340 --> 00:10:06,380
Then, in the 19th century,
149
00:10:06,380 --> 00:10:10,260
you have various suggestions
in France and Belgium
150
00:10:10,260 --> 00:10:12,460
that you can create
a catalogue of everything.
151
00:10:12,460 --> 00:10:14,660
What will come next is microfilm.
152
00:10:14,660 --> 00:10:18,340
And so, you start finding
huge microfilming projects.
153
00:10:18,340 --> 00:10:21,580
And so, for us, the Google Project
was a sort of a natural extension
154
00:10:21,580 --> 00:10:23,420
of that process of development.
155
00:10:28,700 --> 00:10:32,740
Project Gutenberg, Michael Hart,
was the first digital library.
156
00:10:32,740 --> 00:10:35,620
He started on the fourth of July,
in the early 1970s,
157
00:10:35,620 --> 00:10:37,860
by going and typing
the Declaration of Independence
158
00:10:37,860 --> 00:10:40,060
so that everybody
could have access to it.
159
00:10:40,060 --> 00:10:43,100
Thousands of volunteers worked
from all over the world
160
00:10:43,100 --> 00:10:44,500
to go and build this.
161
00:10:44,500 --> 00:10:47,180
He even had the idea
that it ought to be possible
162
00:10:47,180 --> 00:10:51,020
to download the entire library that
he had created if you wanted that.
163
00:10:51,020 --> 00:10:54,060
And I think it did act as a kind
of example of something
164
00:10:54,060 --> 00:10:57,100
that, later on, Google and others
165
00:10:57,100 --> 00:11:01,820
took up in a much bigger,
more extensive way.
166
00:11:06,660 --> 00:11:09,300
My name is Raymond Kurzweil
and I'm from Queens, New York.
167
00:11:09,300 --> 00:11:13,180
'When I was 12, I became fascinated
with pattern recognition.'
168
00:11:13,180 --> 00:11:14,780
And, as a young teenager,
169
00:11:14,780 --> 00:11:20,060
I did a project to teach computers
how to recognise patterns in music.
170
00:11:20,060 --> 00:11:21,500
I've built a computer
171
00:11:21,500 --> 00:11:23,940
and, by feeding it certain
relationships in music,
172
00:11:23,940 --> 00:11:25,940
I was able to write music with it.
173
00:11:25,940 --> 00:11:27,700
Raymond, how old are you? I'm 17.
174
00:11:27,700 --> 00:11:29,820
Do your parents know
what you've been up to?
175
00:11:29,820 --> 00:11:31,020
LAUGHTER
176
00:11:31,020 --> 00:11:34,100
Recognising printed letters
was a classical unsolved problem
177
00:11:34,100 --> 00:11:35,980
in the field of pattern recognition.
178
00:11:35,980 --> 00:11:40,580
And so, I created the first
omni-font optical character
recognition.
179
00:11:40,580 --> 00:11:43,420
This was about 1975.
180
00:11:43,420 --> 00:11:45,980
1978, we developed
a commercial version.
181
00:11:45,980 --> 00:11:49,340
And we talked about how you could
ultimately scan all books
182
00:11:49,340 --> 00:11:52,180
and all printed material.
183
00:11:52,180 --> 00:11:55,220
'When automobiles came along first,
184
00:11:55,220 --> 00:11:58,700
'they seemed likely to become
a rich man's monopoly.
185
00:11:58,700 --> 00:12:02,140
'They cost upward
of a thousand pounds.
186
00:12:02,140 --> 00:12:04,780
'Henry Ford altered all that.
187
00:12:04,780 --> 00:12:06,940
'He put the poor man on the road.
188
00:12:06,940 --> 00:12:08,940
'We want a Henry Ford today
189
00:12:08,940 --> 00:12:12,460
'to modernise the distribution
of knowledge,
190
00:12:12,460 --> 00:12:15,020
'make good knowledge cheap and easy,
191
00:12:15,020 --> 00:12:17,980
'in this still very ignorant,
ill-educated,
192
00:12:17,980 --> 00:12:22,020
'ill-served English-speaking world
of ours,
193
00:12:22,020 --> 00:12:27,100
'which might be the greatest power
on Earth for the good of mankind.'
194
00:12:36,980 --> 00:12:39,900
We started the Internet Archive
in 1996.
195
00:12:41,740 --> 00:12:45,140
The idea was to have all
the published works of humankind
196
00:12:45,140 --> 00:12:47,220
available to everybody,
197
00:12:47,220 --> 00:12:49,940
that this was the opportunity
of our generation,
198
00:12:49,940 --> 00:12:53,780
that...like the previous generation
had put a man on the moon.
199
00:12:53,780 --> 00:12:57,380
The Internet Archive had been
completely open with Google.
200
00:12:57,380 --> 00:13:00,300
In fact, I'd gone and given
a speech that was attended
201
00:13:00,300 --> 00:13:02,420
by, I think, all of the senior
executives
202
00:13:02,420 --> 00:13:05,660
on how one could go about
building a digital library
203
00:13:05,660 --> 00:13:07,100
of all books, music, video,
204
00:13:07,100 --> 00:13:10,140
and I'd hoped that there was going
to be a way to work with them,
205
00:13:10,140 --> 00:13:11,820
but that was not to be.
206
00:13:11,820 --> 00:13:14,540
Libraries had signed secret
agreements with Google...
207
00:13:14,540 --> 00:13:16,860
We didn't know what
was really going on.
208
00:13:16,860 --> 00:13:19,300
When it started coming out
as a completely separate project,
209
00:13:19,300 --> 00:13:21,140
and not working with others,
210
00:13:21,140 --> 00:13:23,580
then, I started
to become suspicious.
211
00:13:25,180 --> 00:13:28,260
Larry Page,
who founded Google with me,
212
00:13:28,260 --> 00:13:32,140
first proposed that we digitise
all books a decade ago,
213
00:13:32,140 --> 00:13:34,540
when we were a fledgling start-up.
214
00:13:34,540 --> 00:13:37,220
Five years later, in 2004,
215
00:13:37,220 --> 00:13:39,900
Google Books was born.
216
00:13:39,900 --> 00:13:43,300
Despite a number of important
digitisation efforts to date,
217
00:13:43,300 --> 00:13:45,500
none have been at a comparable scale,
218
00:13:45,500 --> 00:13:50,300
simply because no-one else has chosen
to invest the requisite resources.
219
00:13:50,300 --> 00:13:54,540
If Google Books is successful,
others will follow.
220
00:13:54,540 --> 00:13:58,100
I don't think that Google is aware
of the fact that it's a corporation.
221
00:13:58,100 --> 00:14:02,100
I think Google does think
of itself as an NGO
222
00:14:02,100 --> 00:14:04,380
that just happens
to make a lot of money.
223
00:14:04,380 --> 00:14:07,740
And they think of themselves
as social reformers
224
00:14:07,740 --> 00:14:13,340
who just happen to have their stock
traded on stock exchanges
225
00:14:13,340 --> 00:14:16,620
and who just happen to have
investors and shareholders,
226
00:14:16,620 --> 00:14:18,420
but they do think of themselves
227
00:14:18,420 --> 00:14:21,860
as ultimately being in the business
of making the world better.
228
00:14:27,700 --> 00:14:30,380
There are few more irreparable
property losses
229
00:14:30,380 --> 00:14:32,260
than vanished books.
230
00:14:32,260 --> 00:14:36,500
Nature, politics and war
have always been
231
00:14:36,500 --> 00:14:39,300
the mortal enemies of written works.
232
00:14:39,300 --> 00:14:42,380
Most recently, Hurricane Katrina
dealt a blow
233
00:14:42,380 --> 00:14:45,420
to the libraries of the Gulf Coast.
234
00:14:45,420 --> 00:14:51,180
At Tulane University, the main
library sat in nine feet of water.
235
00:14:51,180 --> 00:14:54,980
In the 1970s, the Khmer Rouge regime,
in Cambodia,
236
00:14:54,980 --> 00:14:58,740
decimated cultural institutions
throughout the country.
237
00:14:58,740 --> 00:15:01,700
Khmer Rouge fighters took over
the National Library
238
00:15:01,700 --> 00:15:05,140
throwing the books into the street,
burning them,
239
00:15:05,140 --> 00:15:07,980
while using the stacks as a pigsty.
240
00:15:07,980 --> 00:15:12,660
Now, with Google, the University
of Michigan is involved
241
00:15:12,660 --> 00:15:15,820
in one of the most extensive
preservation projects
242
00:15:15,820 --> 00:15:17,540
in world history.
243
00:15:17,540 --> 00:15:22,580
Google Books is a potent idea
on a number of dimensions.
244
00:15:22,580 --> 00:15:24,900
What I like about Google Books
245
00:15:24,900 --> 00:15:29,020
is the idea of not losing books,
246
00:15:29,020 --> 00:15:33,180
especially books that might be
genuinely abandoned.
247
00:15:33,180 --> 00:15:35,580
The idea of getting
all that stuff online
248
00:15:35,580 --> 00:15:37,740
is, of course,
going to be a benefit,
249
00:15:37,740 --> 00:15:39,660
so that, we have to love.
250
00:16:02,220 --> 00:16:05,460
I went to Google in January 2003.
251
00:16:05,460 --> 00:16:09,300
I actually made, what now I feel
quite embarrassed about,
252
00:16:09,300 --> 00:16:11,780
I made a presentation to them,
253
00:16:11,780 --> 00:16:14,180
telling them what they ought
to be doing.
254
00:16:14,180 --> 00:16:15,900
Only to find out a few months later
255
00:16:15,900 --> 00:16:18,660
that they'd actually been doing it
for a while already.
256
00:16:20,300 --> 00:16:24,860
Project Ocean was the kind of
code name, development code name,
257
00:16:24,860 --> 00:16:28,860
that Google were giving to what
eventually became Google Books.
258
00:16:28,860 --> 00:16:31,860
So it was called Project Ocean
because it was big, I imagine.
259
00:16:31,860 --> 00:16:32,980
HE CHUCKLES
260
00:16:35,180 --> 00:16:37,420
Google seemed to think
that they could do
261
00:16:37,420 --> 00:16:39,580
almost a million in three years.
262
00:17:05,180 --> 00:17:08,580
You could say that this mass
digitisation
263
00:17:08,580 --> 00:17:14,540
is something like running
a huge machine through a library.
264
00:17:14,540 --> 00:17:17,420
You take books by the shelf.
265
00:17:17,420 --> 00:17:20,820
They are put in cartons, on carts.
266
00:17:20,820 --> 00:17:22,740
They are loaded onto trucks.
267
00:17:22,740 --> 00:17:28,340
And then, Google at this time
had three places in the country
268
00:17:28,340 --> 00:17:31,380
where it was doing digitisation.
269
00:17:31,380 --> 00:17:35,660
Supposedly, it didn't give
the address of where they were.
270
00:17:35,660 --> 00:17:39,700
Google won't say how much
scanning all the books cost.
271
00:17:39,700 --> 00:17:41,740
But there are estimates that...
272
00:17:41,740 --> 00:17:44,380
well, it's somewhere between
$30 and $100 per book,
273
00:17:44,380 --> 00:17:48,180
so if you multiply that times
20 million...
274
00:17:48,180 --> 00:17:50,060
Google, early on,
275
00:17:50,060 --> 00:17:53,940
bent over backwards to keep us
from communicating
276
00:17:53,940 --> 00:17:55,980
with the other libraries.
277
00:17:55,980 --> 00:17:57,980
There were three or four large ones
278
00:17:57,980 --> 00:18:02,460
and each of us was told
we should not tell the others
279
00:18:02,460 --> 00:18:06,340
what kind of a contract we had
and how we were working with Google.
280
00:18:06,340 --> 00:18:09,540
To begin with, it had
to be kept fairly quiet.
281
00:18:09,540 --> 00:18:15,060
It was probably mid 2003 when
I started to take the wraps off
282
00:18:15,060 --> 00:18:18,700
in terms of this is going
to be a possibility
283
00:18:18,700 --> 00:18:20,740
that we might be working with Google.
284
00:18:22,620 --> 00:18:26,220
I witnessed the scale of the
operation and it was very impressive.
285
00:18:26,220 --> 00:18:29,380
20 very large work stations
286
00:18:29,380 --> 00:18:34,140
with very high-resolution cameras
287
00:18:34,140 --> 00:18:36,220
sitting on top of a cradle
288
00:18:36,220 --> 00:18:38,620
with very intense lights.
289
00:18:38,620 --> 00:18:42,060
And, underneath, a lot of black
boxes, which, presumably,
290
00:18:42,060 --> 00:18:45,300
contained all of Google's algorithms
291
00:18:45,300 --> 00:18:48,140
that make Google search what it is.
292
00:18:48,140 --> 00:18:51,900
And they uploaded that stuff
straight to Mountain View,
293
00:18:51,900 --> 00:18:54,300
straight from Oxford.
294
00:18:54,300 --> 00:18:57,580
Google certainly depends on knowing
more and more and more
295
00:18:57,580 --> 00:19:01,300
for their algorithm to be better
and better and better.
296
00:19:01,300 --> 00:19:06,140
And this is the core of the way
economics in this space now works.
297
00:19:06,140 --> 00:19:12,340
They had a specific interest
in having lots of things in Google
298
00:19:12,340 --> 00:19:15,100
that would lead people to use Google
299
00:19:15,100 --> 00:19:18,420
so they could make money
by having advertisements there.
300
00:19:20,060 --> 00:19:22,380
What are books?
They are full of data
301
00:19:22,380 --> 00:19:25,100
and so, the more data you have,
302
00:19:25,100 --> 00:19:28,900
the more you can fine-tune
your search technologies.
303
00:19:31,740 --> 00:19:37,340
Some of the enthusiasts for Google's
way of gathering data,
304
00:19:37,340 --> 00:19:39,300
and it's not just Google at all,
I mean,
305
00:19:39,300 --> 00:19:40,900
it's Silicon Valley in general.
306
00:19:40,900 --> 00:19:42,820
It's the current cultural moment
307
00:19:42,820 --> 00:19:45,460
and includes the other
Silicon Valley companies,
308
00:19:45,460 --> 00:19:47,660
but also the modern world of finance.
309
00:19:47,660 --> 00:19:50,180
And also, the modern world
of spy craft for states
310
00:19:50,180 --> 00:19:52,380
and also the modern world
of criminality.
311
00:19:52,380 --> 00:19:55,940
And the modern world of insurance
and health care.
312
00:19:55,940 --> 00:19:58,020
All these things have this idea
313
00:19:58,020 --> 00:20:01,540
that you grab all this data
in order to become very powerful,
314
00:20:01,540 --> 00:20:04,340
you create a differential
in your ability to see information
315
00:20:04,340 --> 00:20:05,740
versus the ordinary person.
316
00:20:05,740 --> 00:20:08,580
And you create these new incredible
castles of power,
317
00:20:08,580 --> 00:20:11,740
but it's OK, it's not just
traditional power mongering,
318
00:20:11,740 --> 00:20:14,660
because you're making the world
more efficient.
319
00:20:24,340 --> 00:20:28,140
I was a little boy in the '70s
growing up in India,
320
00:20:28,140 --> 00:20:32,780
watching re-runs of Star Trek
on our family's black-and-white TV.
321
00:20:32,780 --> 00:20:36,420
And from that, those times,
322
00:20:36,420 --> 00:20:41,820
the picture of a Star Trek computer
was deeply ingrained in my head.
323
00:20:41,820 --> 00:20:45,780
As a little boy, I was just
fascinated by the fact
324
00:20:45,780 --> 00:20:48,820
that you can walk up to a computer
and ask it,
325
00:20:48,820 --> 00:20:51,940
"Computer, what's the atmosphere
of that planet?"
326
00:20:51,940 --> 00:20:55,260
That was just the most fascinating
thing to a little boy
327
00:20:55,260 --> 00:20:56,940
and, from that day on,
328
00:20:56,940 --> 00:21:01,100
it was my dream to build
that Star Trek computer.
329
00:21:01,100 --> 00:21:05,300
Only later would I grow up
and realise it's really hard,
330
00:21:05,300 --> 00:21:07,740
because computers
don't understand language.
331
00:21:07,740 --> 00:21:12,500
And I went through this brief period
of disbelief as a graduate student,
332
00:21:12,500 --> 00:21:17,340
where I didn't think I would reach
my dream in my lifetime.
333
00:21:17,340 --> 00:21:18,540
But thanks to Google
334
00:21:18,540 --> 00:21:21,340
and all the technologies
that we have built here,
335
00:21:21,340 --> 00:21:23,300
and what I see in the pipeline,
336
00:21:23,300 --> 00:21:25,620
I'm closer to my dream than ever.
337
00:24:23,940 --> 00:24:24,820
Um...
338
00:24:51,020 --> 00:24:54,580
Google were and are free to do
what they want with the scans.
339
00:24:54,580 --> 00:24:57,460
And why should that concern us?
340
00:25:00,300 --> 00:25:02,860
I mean, part of our ethos
341
00:25:02,860 --> 00:25:05,820
and part of our objective
as a library
342
00:25:05,820 --> 00:25:09,660
is to make the information that's
contained in our library available
343
00:25:09,660 --> 00:25:13,900
as free of charge as we can possibly
make it to anybody who needs it.
344
00:25:13,900 --> 00:25:18,980
And if Google is going to do that
on a larger scale, that's fine.
345
00:25:18,980 --> 00:25:23,660
If they are going to make money
out of it down the line, why not?
346
00:25:23,660 --> 00:25:26,700
You know, they've invested
a lot of money in it.
347
00:25:26,700 --> 00:25:29,540
Um... There's no such thing
as a free lunch.
348
00:25:32,620 --> 00:25:35,620
Who wouldn't want to have all
of the world's knowledge available
349
00:25:35,620 --> 00:25:37,460
to everyone on the planet?
350
00:25:37,460 --> 00:25:41,740
The problem is that Google,
as an intermediary in this process,
351
00:25:41,740 --> 00:25:45,180
has certain interests
and has a certain agenda
352
00:25:45,180 --> 00:25:47,500
that is not always transparent.
353
00:25:48,860 --> 00:25:51,420
If you're in Silicon Valley,
you have another job,
354
00:25:51,420 --> 00:25:54,260
which is you're building
this new life form
355
00:25:54,260 --> 00:25:56,500
that's going to take over the world
356
00:25:56,500 --> 00:25:59,020
and Google is providing
the memories for its brain
357
00:25:59,020 --> 00:26:01,660
or the other companies
are providing the memories,
358
00:26:01,660 --> 00:26:04,340
and this is something
that's openly talked about.
359
00:26:04,340 --> 00:26:09,340
It's all human knowledge
in books and out of books
360
00:26:09,340 --> 00:26:14,420
woven together
into a single entity
361
00:26:14,420 --> 00:26:18,260
that's accessible by anybody,
anywhere in the world, any time.
362
00:26:18,260 --> 00:26:22,140
And that "all knowledge"
is transformative.
363
00:26:22,140 --> 00:26:26,900
It really kicks up the civilisation
in our society into another level.
364
00:26:28,740 --> 00:26:31,300
Shortly after the launch
of Google Books,
365
00:26:31,300 --> 00:26:35,540
in different events, I ran
into Larry Page and Sergey Brin
366
00:26:35,540 --> 00:26:39,100
and had this brief exchange
with them about the potential.
367
00:26:39,100 --> 00:26:40,220
And, you know,
368
00:26:40,220 --> 00:26:42,780
there was a characteristic
Google-founder response,
369
00:26:42,780 --> 00:26:45,660
which was a kind of glint
in their eyes and a smile
370
00:26:45,660 --> 00:26:48,500
and the sense that this was
just the beginning
371
00:26:48,500 --> 00:26:52,420
of something much bigger than even
you at this point can imagine.
372
00:26:58,140 --> 00:27:00,100
At Harvard, we only permitted Google
373
00:27:00,100 --> 00:27:02,740
to digitise books
in the public domain,
374
00:27:02,740 --> 00:27:07,620
but the other research libraries
that Google first went to
375
00:27:07,620 --> 00:27:10,900
permitted Google to digitise books
covered by copyright.
376
00:27:10,900 --> 00:27:14,340
As soon as you get
into the copyright area,
377
00:27:14,340 --> 00:27:16,980
things get rapidly complicated.
378
00:27:26,940 --> 00:27:29,780
We're allowing Google
to scan all of our books,
379
00:27:29,780 --> 00:27:33,300
those in the public domain
and those still in copyright.
380
00:27:33,300 --> 00:27:35,700
We believe it is legal,
381
00:27:35,700 --> 00:27:41,380
ethical and a noble endeavour
that will transform our society.
382
00:27:41,380 --> 00:27:44,620
Legal because we believe
copyright law allows us fair use
383
00:27:44,620 --> 00:27:47,660
of the millions of books
that are being digitised.
384
00:27:58,060 --> 00:28:02,260
Fair use is a piece of American
copyright law that allows us
385
00:28:02,260 --> 00:28:05,340
to make copies without
ever asking any permission,
386
00:28:05,340 --> 00:28:08,980
without paying any fee
for certain carved-out uses.
387
00:28:08,980 --> 00:28:11,700
I happen to think Google's
fair use defence is strong.
388
00:28:11,700 --> 00:28:14,300
One of the things that courts
have done,
389
00:28:14,300 --> 00:28:16,100
over the last decade or so,
390
00:28:16,100 --> 00:28:18,940
is decided that search engines,
391
00:28:18,940 --> 00:28:23,420
who routinely make copies
of information,
392
00:28:23,420 --> 00:28:27,220
are making fair uses when they do it
in order to help people
393
00:28:27,220 --> 00:28:29,660
find information
that they are looking for.
394
00:28:29,660 --> 00:28:32,780
One of the things Google
has done is provide links
395
00:28:32,780 --> 00:28:35,460
to places where you can
buy the book.
396
00:28:35,460 --> 00:28:38,220
They scanned, but they did not
release the copy.
397
00:28:38,220 --> 00:28:41,100
You could not search,
except for key words.
398
00:28:41,100 --> 00:28:43,980
You could not see a page,
except for snippets.
399
00:28:43,980 --> 00:28:48,300
They were trying to allow
indexing and searching,
400
00:28:48,300 --> 00:28:51,580
without allowing people
to get copies.
401
00:28:51,580 --> 00:28:54,300
And we will protect
all copyrighted materials,
402
00:28:54,300 --> 00:28:56,660
your work in that archive.
403
00:28:56,660 --> 00:28:58,580
Let me repeat that.
404
00:28:58,580 --> 00:29:05,180
I guarantee you we will protect
all copyrighted materials.
405
00:29:05,180 --> 00:29:06,980
I assure you we understand
406
00:29:06,980 --> 00:29:10,940
that providing public access
to materials in copyright,
407
00:29:10,940 --> 00:29:14,980
particularly those still in print,
would be unlawful.
408
00:29:14,980 --> 00:29:17,940
One of the things that you need
to understand about Google
409
00:29:17,940 --> 00:29:22,300
is that they try to roll out
projects first
410
00:29:22,300 --> 00:29:24,740
and then, to think about
the consequences later.
411
00:29:24,740 --> 00:29:29,540
So you will often see them experiment
with something that looks very cool,
412
00:29:29,540 --> 00:29:31,980
maybe the Google Street View
Project...
413
00:29:31,980 --> 00:29:35,620
Google launched Street View in 2007,
414
00:29:35,620 --> 00:29:37,540
part of the search engine's
long-term goal
415
00:29:37,540 --> 00:29:40,380
to create a virtual
3D map of the whole planet,
416
00:29:40,380 --> 00:29:42,500
right down to street level.
417
00:29:42,500 --> 00:29:44,220
But investigations have revealed
418
00:29:44,220 --> 00:29:45,660
that Google Street View cars
419
00:29:45,660 --> 00:29:49,020
were collecting more than just
photographs for their databanks.
420
00:29:49,020 --> 00:29:51,980
Their antennas were also hoovering
up personal information
421
00:29:51,980 --> 00:29:54,620
from unencrypted Wi-Fi networks,
422
00:29:54,620 --> 00:29:57,140
including Internet history
and passwords.
423
00:29:59,060 --> 00:30:03,140
I think the case of Google
collecting Wi-Fi information,
424
00:30:03,140 --> 00:30:05,380
it reveals a complete lack of respect
425
00:30:05,380 --> 00:30:07,620
for privacy within the corporation.
426
00:30:07,620 --> 00:30:12,100
Such projects often reveal that
Google does not fully understand
427
00:30:12,100 --> 00:30:14,340
the social consequences
of its own work.
428
00:30:27,740 --> 00:30:30,140
We actually do more search
queries in China alone
429
00:30:30,140 --> 00:30:34,060
than any other search company does
in any other single-national market,
430
00:30:34,060 --> 00:30:37,060
by which I really mean
Google in the United States.
431
00:30:37,060 --> 00:30:39,300
So we certainly do aspire
to be a World Brain.
432
00:30:39,300 --> 00:30:41,020
I think HG Wells was, I mean,
433
00:30:41,020 --> 00:30:43,180
he is well known for having been
quite prescient
434
00:30:43,180 --> 00:30:44,980
about a lot of the things
that he envisaged.
435
00:30:44,980 --> 00:30:46,900
Sure we don't have
the time machine yet,
436
00:30:46,900 --> 00:30:49,060
but pretty much the rest of it
was dead on.
437
00:30:49,060 --> 00:30:51,900
We have a product, which is a very,
very popular product,
438
00:30:51,900 --> 00:30:53,340
it's called Baidu Wenku,
439
00:30:53,340 --> 00:30:56,580
the Chinese name of it
is the Baidu Library.
440
00:30:56,580 --> 00:31:00,860
It allows people to upload
materials that they have
441
00:31:00,860 --> 00:31:03,780
that are either
of their own creation,
442
00:31:03,780 --> 00:31:09,380
or that they have the intellectual
property rights to, to our site.
443
00:31:54,860 --> 00:31:57,100
There isn't an area
of human knowledge
444
00:31:57,100 --> 00:32:00,340
that hasn't been filled out
and made more rich and wondrous
445
00:32:00,340 --> 00:32:01,980
by the fact of the Internet.
446
00:32:01,980 --> 00:32:05,660
I am often sort of shocked
by people who see it
447
00:32:05,660 --> 00:32:08,180
as the beginnings
of this dystopian future.
448
00:32:08,180 --> 00:32:10,060
I embrace it unequivocally.
449
00:32:10,060 --> 00:32:12,540
The Fundamental Knowledge System
450
00:32:12,540 --> 00:32:15,980
which accumulates, sorts,
keeps in order
451
00:32:15,980 --> 00:32:19,860
and renders available
everything that is known
452
00:32:19,860 --> 00:32:22,300
centres on Barcelona.
453
00:32:22,300 --> 00:32:26,220
With its 17 million active workers,
454
00:32:26,220 --> 00:32:29,580
it is the Memory Of Mankind.
455
00:33:03,100 --> 00:33:06,540
You can look at the Internet
as something divine.
456
00:33:06,540 --> 00:33:09,940
We eventually will come, I think,
457
00:33:09,940 --> 00:33:13,900
to revere some of our
technological creations,
458
00:33:13,900 --> 00:33:15,340
like the Internet,
459
00:33:15,340 --> 00:33:18,140
to be almost like cathedrals
of redwoods,
460
00:33:18,140 --> 00:33:21,020
to be as complicated
and as beautiful
461
00:33:21,020 --> 00:33:25,660
as natural creations.
462
00:33:25,660 --> 00:33:28,500
And that, in a real sense,
463
00:33:28,500 --> 00:33:33,380
that there is more of God
in a cellphone
464
00:33:33,380 --> 00:33:35,620
than there is in a tree frog,
465
00:33:35,620 --> 00:33:40,020
because a cellphone is
an additional layer of evolution
466
00:33:40,020 --> 00:33:42,980
over the natural frog.
467
00:34:18,900 --> 00:34:21,260
It's a new form of medieval church
or something like that.
468
00:34:21,260 --> 00:34:23,780
Everybody is to give their data
469
00:34:23,780 --> 00:34:26,740
in service of worship
of this digital god.
470
00:34:26,740 --> 00:34:29,420
And I think it's really,
really dumb.
471
00:34:47,500 --> 00:34:48,900
It's not unique to this era,
472
00:34:48,900 --> 00:34:52,180
you can look at previous
technologies, whether it was radio,
473
00:34:52,180 --> 00:34:53,820
whether it was television,
474
00:34:53,820 --> 00:34:56,220
whether it was the telegraph,
it was electricity,
475
00:34:56,220 --> 00:35:00,300
you do have many similar hopes -
476
00:35:00,300 --> 00:35:05,220
that those technologies will bring
universal communication,
477
00:35:05,220 --> 00:35:09,300
people will talk to one another,
there will be peace everywhere,
478
00:35:09,300 --> 00:35:11,580
education will spread globally...
479
00:35:11,580 --> 00:35:14,820
A lot of similar hopes
have been expressed
480
00:35:14,820 --> 00:35:17,100
in connection with earlier
technologies.
481
00:35:17,100 --> 00:35:20,540
So this is nothing new, but I think
there is something about the scale
482
00:35:20,540 --> 00:35:27,340
at which projects and groups and
various companies and organisations
483
00:35:27,340 --> 00:35:30,460
now are putting those cyber-utopian
beliefs to work
484
00:35:30,460 --> 00:35:33,500
that is different now
than from what it was before.
485
00:35:35,140 --> 00:35:38,300
Science fiction never imagined
Google.
486
00:35:38,300 --> 00:35:40,860
Google is a game-changing tool
487
00:35:40,860 --> 00:35:44,380
on the order of the equally handy
flint hand axe.
488
00:35:44,380 --> 00:35:46,420
But Google is not ours.
489
00:35:46,420 --> 00:35:50,900
We are its unpaid content providers,
in one way or another.
490
00:35:50,900 --> 00:35:53,020
We generate product for Google,
491
00:35:53,020 --> 00:35:55,540
our every search a minuscule
contribution.
492
00:35:56,780 --> 00:35:58,420
Google is made of us,
493
00:35:58,420 --> 00:36:03,180
a sort of coral reef of human minds
and their products.
494
00:36:03,180 --> 00:36:05,700
We have yet to take
Google's measure.
495
00:37:09,700 --> 00:37:11,740
I do think that Google genuinely
496
00:37:11,740 --> 00:37:17,900
wants to make all of the world's
information organised and available
497
00:37:17,900 --> 00:37:20,260
to people throughout the globe.
498
00:37:20,260 --> 00:37:23,260
I do think that they genuinely
believe in that mission.
499
00:37:23,260 --> 00:37:29,700
Um... But they also happen to believe
that nothing will get lost
500
00:37:29,700 --> 00:37:32,060
and no-one will get harmed
501
00:37:32,060 --> 00:37:35,180
if it's Google who will implement
that mission.
502
00:37:35,180 --> 00:37:36,620
And I think it's normal.
503
00:37:36,620 --> 00:37:39,900
If they didn't trust themselves
to do it, then they would be...
504
00:37:39,900 --> 00:37:44,220
you know, they would have some
weird schizophrenic problem,
505
00:37:44,220 --> 00:37:46,500
you know, if they don't trust
themselves
506
00:37:46,500 --> 00:37:48,300
to implement their own project.
507
00:38:52,860 --> 00:38:56,860
One of the concerns which came out,
as you would expect from France,
508
00:38:56,860 --> 00:38:59,780
was that this was really
part of a plot
509
00:38:59,780 --> 00:39:04,260
in the United States to make English
the universal language
510
00:39:04,260 --> 00:39:07,340
and, as we know, the most important
thing about France,
511
00:39:07,340 --> 00:39:10,340
aside from its wine,
is its language.
512
00:39:10,340 --> 00:39:12,820
And there was a real sense
513
00:39:12,820 --> 00:39:18,660
that who are we to be digitising
all those books in English?
514
00:39:18,660 --> 00:39:21,740
And I remember some correspondence
about the fact
515
00:39:21,740 --> 00:39:26,420
that we, at Harvard, were not
just digitising English books,
516
00:39:26,420 --> 00:39:31,540
but were digitising a very large
number of books in French.
517
00:39:31,540 --> 00:39:34,740
To which, if I remember correctly,
the response came back,
518
00:39:34,740 --> 00:39:37,180
"Who are you to digitise
books in French?"
519
00:39:56,060 --> 00:39:59,940
First, we learned that Google
was scanning books.
520
00:39:59,940 --> 00:40:01,980
And I remember loving that idea,
521
00:40:01,980 --> 00:40:06,060
because I'm a reader and I write
non-fiction books and I do research
522
00:40:06,060 --> 00:40:09,500
and I wanted access to those books.
523
00:40:09,500 --> 00:40:12,860
Then, we heard that they were
scanning our books,
524
00:40:12,860 --> 00:40:14,660
they were scanning copyrighted books
525
00:40:14,660 --> 00:40:16,780
and they hadn't asked
anyone's permission.
526
00:40:16,780 --> 00:40:19,020
The libraries had just
handed them over.
527
00:40:19,020 --> 00:40:22,540
Well, that was obviously a
violation of our copyrights
528
00:40:22,540 --> 00:40:27,180
and a little bit of a surprise,
to put it mildly.
529
00:40:27,180 --> 00:40:30,180
I remember being very curious
about what they were doing
530
00:40:30,180 --> 00:40:32,460
and I popped my name into Google
531
00:40:32,460 --> 00:40:37,100
and saw that it came up
with snippets of my books.
532
00:40:37,100 --> 00:40:39,740
So what I did was
I searched for terms
533
00:40:39,740 --> 00:40:41,740
that I knew were common in my book,
534
00:40:41,740 --> 00:40:44,260
like "star", "galaxy",
535
00:40:44,260 --> 00:40:46,860
and there were lots and lots of hits
536
00:40:46,860 --> 00:40:49,540
and it would display
several snippets.
537
00:40:49,540 --> 00:40:51,700
And then, I would search
for other common words
538
00:40:51,700 --> 00:40:54,660
and it was clear that if you were
clever about your searches,
539
00:40:54,660 --> 00:40:57,380
you could see quite a bit
of the text, if not all of it.
540
00:40:57,380 --> 00:41:02,460
The problem that most authors have
is obscurity.
541
00:41:02,460 --> 00:41:06,540
That's the issue.
There are a gazillion books.
542
00:41:06,540 --> 00:41:09,180
How do you get people
to pay attention to yours?
543
00:41:09,180 --> 00:41:14,660
Google claimed that its use of these
millions of copyrighted books
544
00:41:14,660 --> 00:41:18,380
that it had digitised
was an example of fair use.
545
00:41:18,380 --> 00:41:20,380
Why? I'm not sure.
546
00:41:20,380 --> 00:41:23,620
I still don't understand
how that can be justified.
547
00:41:23,620 --> 00:41:26,460
The point is that the entire book
has been copied
548
00:41:26,460 --> 00:41:30,660
and it's been copied by a single
company that's doing it for purposes
549
00:41:30,660 --> 00:41:32,820
of profiting off the work.
550
00:41:32,820 --> 00:41:36,980
If you allow a profit-making company
to copy a million books,
551
00:41:36,980 --> 00:41:40,500
then, how can you say no
to the next enterprise
552
00:41:40,500 --> 00:41:42,740
that also wants to copy
the million books?
553
00:41:42,740 --> 00:41:46,740
So The Authors Guild organised
a class action suit,
554
00:41:46,740 --> 00:41:49,220
asking them to stop doing that.
555
00:41:49,220 --> 00:41:53,780
The Authors Guild on Tuesday filed a
lawsuit against search engine Google
556
00:41:53,780 --> 00:41:56,900
alleging that scanning
and digitising library books
557
00:41:56,900 --> 00:42:00,580
constitutes a massive copyright
infringement.
558
00:42:00,580 --> 00:42:03,940
The Authors Guild represents
more than 8,000 authors
559
00:42:03,940 --> 00:42:06,460
and it's the largest society
of published writers
560
00:42:06,460 --> 00:42:08,500
in the United States.
561
00:42:08,500 --> 00:42:12,420
When Google made its decision
to scan these millions of books,
562
00:42:12,420 --> 00:42:16,740
it certainly realised that, depending
upon how litigation developed,
563
00:42:16,740 --> 00:42:20,420
this could be a bet-the-company
decision.
564
00:42:20,420 --> 00:42:25,140
Because copyright liability in the
United States can be quite extreme -
565
00:42:25,140 --> 00:42:28,460
$150,000 per copyrighted work.
566
00:42:28,460 --> 00:42:32,020
And, depending on the number
of copyrighted works at stake,
567
00:42:32,020 --> 00:42:34,340
it could be in the billions
of dollars.
568
00:42:34,340 --> 00:42:36,780
The Association of American
Publishers
569
00:42:36,780 --> 00:42:39,380
has filed a lawsuit against Google
570
00:42:39,380 --> 00:42:41,820
alleging the Internet company's
plan to scan
571
00:42:41,820 --> 00:42:45,460
and digitally distribute the text
of major library collections
572
00:42:45,460 --> 00:42:47,620
would violate copyright protections.
573
00:42:51,700 --> 00:42:58,660
I think the issue of copyright
is an archaic, unproductive view.
574
00:42:58,660 --> 00:43:00,500
When you create something,
575
00:43:00,500 --> 00:43:03,740
you're building on the work
of other people,
576
00:43:03,740 --> 00:43:05,300
no matter who you are,
577
00:43:05,300 --> 00:43:07,820
whether you are JK Rowling
or Shakespeare.
578
00:43:07,820 --> 00:43:12,060
You're basing your work
on the work of others.
579
00:43:12,060 --> 00:43:14,100
You're basically taking their ideas.
580
00:43:14,100 --> 00:43:16,620
An artist does not own their ideas.
581
00:43:16,620 --> 00:43:18,580
No artist does.
582
00:43:18,580 --> 00:43:23,740
Any useful information exists
because of the efforts of real people
583
00:43:23,740 --> 00:43:27,980
and copyright is our way of
remembering who those people are.
584
00:43:27,980 --> 00:43:30,780
It's crucial to not lose that.
585
00:43:30,780 --> 00:43:34,820
And I think cyber culture is missing
the point of copyright.
586
00:43:34,820 --> 00:43:36,860
You might say, "Well,
who cares about authors?
587
00:43:36,860 --> 00:43:39,500
"Let a few authors not make as much
money as they would have."
588
00:43:39,500 --> 00:43:42,140
But it's a precedent.
The whole Internet will become
589
00:43:42,140 --> 00:43:46,620
a tool for the concentration of
power and that would be a disaster.
590
00:43:46,620 --> 00:43:50,660
The Internet is the world's largest
copy machine,
591
00:43:50,660 --> 00:43:52,860
anything that touches it,
it's been copied.
592
00:43:52,860 --> 00:43:56,180
And, just to transmit something
along the way,
593
00:43:56,180 --> 00:43:59,580
um...people are making copies
of things.
594
00:43:59,580 --> 00:44:03,260
Copies are valueless,
they have no worth at all
595
00:44:03,260 --> 00:44:05,020
until there was a focus on copies
596
00:44:05,020 --> 00:44:08,420
because that's an industrial-age
artefact.
597
00:45:41,380 --> 00:45:47,260
A book is really a plateau
that a person reaches to say,
598
00:45:47,260 --> 00:45:50,020
"This is my testament,
this is what I can offer."
599
00:45:50,020 --> 00:45:52,660
A book is not just
an extra long tweet,
600
00:45:52,660 --> 00:45:55,180
a book is something
that's hard to do.
601
00:45:55,180 --> 00:45:57,220
It's hard to finish.
It's hard to publish.
602
00:45:57,220 --> 00:46:00,260
It's a certain achievement of scale,
603
00:46:00,260 --> 00:46:03,100
it's a declaration of this is
what my life has learned,
604
00:46:03,100 --> 00:46:04,780
this is what I can offer.
605
00:46:04,780 --> 00:46:07,780
And that is not something
that can be dissected
606
00:46:07,780 --> 00:46:11,220
and the little minced pieces
simply can't mean the same thing.
607
00:46:15,020 --> 00:46:19,660
The lawsuits were commenced
in the fall of 2005
608
00:46:19,660 --> 00:46:22,820
and, within six months,
609
00:46:22,820 --> 00:46:26,860
The Authors Guild and the publishers
came to Google
610
00:46:26,860 --> 00:46:31,140
with a proposal about
settling the lawsuit.
611
00:46:31,140 --> 00:46:34,020
I was sitting innocently in my office
612
00:46:34,020 --> 00:46:36,820
and a lawyer for the university
appeared and he said,
613
00:46:36,820 --> 00:46:40,700
"You are about to take
a non-disclosure oath."
614
00:46:40,700 --> 00:46:43,460
Well, I'd never had anything
to do with lawyers,
615
00:46:43,460 --> 00:46:46,780
except once in my life
when I made a will and I thought,
616
00:46:46,780 --> 00:46:50,860
"Um, I'm in deep water now.
What is this all about?"
617
00:46:50,860 --> 00:46:54,900
Well, it turned out that
there were secret negotiations
618
00:46:54,900 --> 00:46:57,580
between Google, on the one hand,
619
00:46:57,580 --> 00:47:02,420
and The Authors Guild and The
Association of American Publishers
on the other.
620
00:47:02,420 --> 00:47:06,180
They were suing Google
for infringement of copyright
621
00:47:06,180 --> 00:47:09,740
and, as happens frequently
with suits,
622
00:47:09,740 --> 00:47:12,500
they began to negotiate a settlement.
623
00:47:12,500 --> 00:47:15,620
Well, we were not part of that
at Harvard.
624
00:47:15,620 --> 00:47:19,260
However, we had to be informed
about it because we had the books.
625
00:47:19,260 --> 00:47:20,900
It took three years to work it out,
626
00:47:20,900 --> 00:47:23,140
because there were a lot of issues
to be discussed.
627
00:47:23,140 --> 00:47:26,500
There were publishers at the table
as well as authors.
628
00:47:26,500 --> 00:47:29,460
And publishers and authors
did not have identical interests.
629
00:47:29,460 --> 00:47:33,700
There were libraries, not at the
table, but very much in the picture.
630
00:47:33,700 --> 00:47:37,140
They were talking to Google
away from the room.
631
00:47:37,140 --> 00:47:39,500
And I'm not sure how much I can say.
632
00:47:39,500 --> 00:47:42,940
I definitely cannot talk
specifically about the negotiations
633
00:47:42,940 --> 00:47:45,900
because I signed a non-disclosure
agreement,
634
00:47:45,900 --> 00:47:47,900
which I'm told is still in force,
635
00:47:47,900 --> 00:47:49,940
and I don't want to go to jail.
636
00:47:49,940 --> 00:47:53,380
Google's long-running legal battle
with the US publishing industry
637
00:47:53,380 --> 00:47:55,900
came to an unexpected halt
this morning
638
00:47:55,900 --> 00:47:57,940
as the parties announced
a settlement
639
00:47:57,940 --> 00:47:59,700
that would see both sides cooperate
640
00:47:59,700 --> 00:48:02,860
on online access
to copyrighted books.
641
00:48:02,860 --> 00:48:07,900
Google have agreed to pay
$125 million in the settlement.
642
00:48:07,900 --> 00:48:11,860
$35.5 million of that sum
will go towards the establishment
643
00:48:11,860 --> 00:48:15,140
of a rights collecting body
for digital books.
644
00:48:15,140 --> 00:48:18,820
$45 million has been set aside
to compensate writers
645
00:48:18,820 --> 00:48:22,860
whose copyrighted books
Google has already scanned.
646
00:48:22,860 --> 00:48:25,900
They will get around $60 per book.
647
00:48:25,900 --> 00:48:29,540
The largest portion of the
settlement, $45.5 million,
648
00:48:29,540 --> 00:48:32,580
will go just on the legal fees.
649
00:48:32,580 --> 00:48:35,260
But the most striking aspect
of the agreement
650
00:48:35,260 --> 00:48:38,980
is that it turns Google into a book
seller, selling online access
651
00:48:38,980 --> 00:48:42,580
to out-of-print but
still-in-copyright works.
652
00:48:42,580 --> 00:48:46,020
For those of you who don't know the
details of the settlement agreement,
653
00:48:46,020 --> 00:48:49,260
it's 385 pages,
654
00:48:49,260 --> 00:48:51,700
it has 46 sections of definitions,
655
00:48:51,700 --> 00:48:54,540
it's got 15 sections
on Google's obligations,
656
00:48:54,540 --> 00:48:57,060
it's got nine sections
on the economic terms,
657
00:48:57,060 --> 00:49:00,220
it's got six sections
on libraries' obligations.
658
00:49:00,220 --> 00:49:03,940
So this is not a little three-or-four
page memorandum of understanding
659
00:49:03,940 --> 00:49:06,140
that we are talking about here.
660
00:49:06,140 --> 00:49:08,980
This is a very heavily-negotiated
agreement.
661
00:49:08,980 --> 00:49:11,020
So how many people have not
read the 334 pages?
662
00:49:11,020 --> 00:49:12,220
CHUCKLING
663
00:49:12,220 --> 00:49:13,420
OK.
664
00:49:13,420 --> 00:49:16,780
We proposed something that was
a little bit outside the box
665
00:49:16,780 --> 00:49:20,740
and that was - if money
is being made,
666
00:49:20,740 --> 00:49:23,380
share the money with
the rights holders.
667
00:49:23,380 --> 00:49:25,060
It couldn't be simpler.
668
00:49:25,060 --> 00:49:28,820
So I thought it would be pretty
non-controversial.
669
00:49:28,820 --> 00:49:31,500
That apparently was naive of me.
670
00:49:31,500 --> 00:49:34,740
I personally became increasingly
disenchanted
671
00:49:34,740 --> 00:49:37,620
with what originally
looked like a great idea.
672
00:49:37,620 --> 00:49:41,060
They basically transformed
the search service
673
00:49:41,060 --> 00:49:44,540
into a gigantic commercial
enterprise.
674
00:49:44,540 --> 00:49:49,380
They really thought they would
digitise every book in existence
675
00:49:49,380 --> 00:49:51,820
and make it available,
for a price, everywhere.
676
00:50:00,780 --> 00:50:04,620
The settlement would allow Google
to have essentially a licence
677
00:50:04,620 --> 00:50:08,340
to commercialize all books
that are out of print.
678
00:50:08,340 --> 00:50:12,940
There were certainly
hundreds of thousands
679
00:50:12,940 --> 00:50:15,980
and probably millions of books,
680
00:50:15,980 --> 00:50:19,260
for whom, even if they were
in copyright,
681
00:50:19,260 --> 00:50:23,260
no author, no publisher,
no rights holder would come forward.
682
00:50:23,260 --> 00:50:26,140
And those books are orphans
683
00:50:26,140 --> 00:50:28,780
and Google would be able
to commercialize those
684
00:50:28,780 --> 00:50:30,980
and nobody else would.
685
00:50:30,980 --> 00:50:36,020
A monopoly was being created,
a monopoly of access to knowledge.
686
00:50:36,020 --> 00:50:40,140
Did we want the greatest library
that would ever exist
687
00:50:40,140 --> 00:50:42,820
to be in the hands
of one giant corporation,
688
00:50:42,820 --> 00:50:47,460
which could really charge almost
anything it wanted for access to it?
689
00:50:47,460 --> 00:50:49,900
It's not a library, it's a bookstore
690
00:50:49,900 --> 00:50:52,740
and, you know, sell it
as a bookstore, if you want,
691
00:50:52,740 --> 00:50:54,780
but don't pretend
that it's a library.
692
00:50:54,780 --> 00:50:56,980
When I talk to people
in the publishing industry,
693
00:50:56,980 --> 00:51:00,140
they find it humorous cos
it's like, "Well, they're orphan
for a reason..."
694
00:51:00,140 --> 00:51:01,500
CHUCKLING
695
00:51:01,500 --> 00:51:04,740
And that in fact if we suddenly
found this goldmine
696
00:51:04,740 --> 00:51:08,180
where the future of the book
are the orphan books... Yeah.
697
00:51:08,180 --> 00:51:11,460
..OK, then, boy, those publishers
sure aren't very smart.
698
00:51:20,100 --> 00:51:23,220
Our principal concern here today
in this discussion
699
00:51:23,220 --> 00:51:25,260
is that, under the proposed
settlement,
700
00:51:25,260 --> 00:51:28,500
Google would be the only entity
that could treat copyright
701
00:51:28,500 --> 00:51:30,180
as an opt-out mechanism.
702
00:51:30,180 --> 00:51:32,780
Everyone else would have to treat it
as opt-in.
703
00:51:32,780 --> 00:51:35,300
There are other problems
with this proposed settlement.
704
00:51:43,180 --> 00:51:45,980
Listed below are various potential
revenue streams for Google
705
00:51:45,980 --> 00:51:47,820
as identified within
the settlement -
706
00:51:47,820 --> 00:51:49,580
institutional subscriptions,
707
00:51:49,580 --> 00:51:54,060
consumer purchases, advertising
uses, public access service,
708
00:51:54,060 --> 00:51:56,860
print-on-demand, custom publishing,
709
00:51:56,860 --> 00:52:00,620
PDF downloads,
consumer subscription model,
710
00:52:00,620 --> 00:52:04,460
summaries, abstracts,
compilations of books.
711
00:52:04,460 --> 00:52:06,820
That's what you are going
to end up with at a minimum.
712
00:52:06,820 --> 00:52:09,020
What I'm saying to you, Mr Drummond,
713
00:52:09,020 --> 00:52:14,060
does this, in fact, place Google
at such a tremendous advantage
714
00:52:14,060 --> 00:52:18,260
in disregard of what has been
historically copyright law?
715
00:52:18,260 --> 00:52:20,980
How do you respond
to those concerns?
716
00:52:20,980 --> 00:52:23,820
As of today, we have zero market
share in any sort of books,
717
00:52:23,820 --> 00:52:25,940
so we're a new entrant
to the market.
718
00:52:25,940 --> 00:52:29,260
So far from being someone
who's controlling the market,
719
00:52:29,260 --> 00:52:31,700
we're not even in it yet
and we're trying to get in there.
720
00:52:31,700 --> 00:52:34,740
They thought, "All we have to do is
kind of announce this to the world
721
00:52:34,740 --> 00:52:37,660
"and the world will go,
'God, what a great agreement!'"
722
00:52:37,660 --> 00:52:40,220
And, for a while, some people did.
723
00:52:40,220 --> 00:52:43,900
But then, you started reading
the agreement really carefully
724
00:52:43,900 --> 00:52:46,100
and there were lots of questions.
725
00:52:51,700 --> 00:52:56,380
The problem was there was nothing
in the agreement
726
00:52:56,380 --> 00:52:59,700
that respected the privacy
of the people
727
00:52:59,700 --> 00:53:03,180
who were looking at the books.
728
00:53:03,180 --> 00:53:05,460
Google was going to be keeping track
729
00:53:05,460 --> 00:53:08,380
of who exactly was reading
that book,
730
00:53:08,380 --> 00:53:13,300
how long they were reading it
and what they read next.
731
00:53:13,300 --> 00:53:17,980
That information could get back
to the government,
732
00:53:17,980 --> 00:53:21,180
could get back to the FBI,
could get back to the police,
733
00:53:21,180 --> 00:53:24,300
could get back to their employer.
734
00:53:24,300 --> 00:53:26,900
Because Google wasn't making
any kind of guarantees
735
00:53:26,900 --> 00:53:30,540
about what they were going to do
in respect of this privacy.
736
00:53:35,900 --> 00:53:40,340
If people find that the privacy
policies of a particular technology
737
00:53:40,340 --> 00:53:44,020
are not to their liking,
they should unplug it.
738
00:53:44,020 --> 00:53:46,460
They should retreat
from the Internet.
739
00:53:47,900 --> 00:53:50,900
They should cut off
their phone lines
740
00:53:50,900 --> 00:53:54,140
and they should go up
and hide in a mountain.
741
00:53:54,140 --> 00:53:55,780
They have that choice.
742
00:54:00,940 --> 00:54:03,860
Wells' conception
of the World Brain was that
743
00:54:03,860 --> 00:54:08,460
it was intended to have a power
of surveillance over mankind -
744
00:54:08,460 --> 00:54:12,460
information gathered
and organised in such a way
745
00:54:12,460 --> 00:54:17,300
that we had an eye
that could actually survey
746
00:54:17,300 --> 00:54:19,940
everything that was going on.
747
00:54:19,940 --> 00:54:23,820
It would be able to register
where everybody was,
748
00:54:23,820 --> 00:54:25,220
everywhere they went,
749
00:54:25,220 --> 00:54:28,460
potentially, all the transactions
that they were engaged in.
750
00:54:28,460 --> 00:54:31,420
And he seemed to think
this is likely to be a good thing.
751
00:54:31,420 --> 00:54:33,780
It was a gradual process
752
00:54:33,780 --> 00:54:37,620
of getting to know the details
of Google Book Search
753
00:54:37,620 --> 00:54:41,060
and it was the cumulative
effect of these details
754
00:54:41,060 --> 00:54:47,820
that made me feel this project
was, actually,
755
00:54:47,820 --> 00:54:51,940
something that I myself
could not recommend
756
00:54:51,940 --> 00:54:54,260
to the president and fellows
of Harvard
757
00:54:54,260 --> 00:54:58,340
as something that we should
enthusiastically support.
758
00:55:03,900 --> 00:55:07,340
HG Wells' idea of the World Brain
759
00:55:07,340 --> 00:55:12,100
was a dictatorship of technologists
and intellectuals.
760
00:55:12,100 --> 00:55:13,820
These are the geeks of their day
761
00:55:13,820 --> 00:55:16,620
and, gradually, he saw
their power would spread
762
00:55:16,620 --> 00:55:19,460
from laboratory to laboratory,
from university to university,
763
00:55:19,460 --> 00:55:23,540
as these people with the expertise
began to coalesce
764
00:55:23,540 --> 00:55:27,180
into sort of almost like
managerial groups
765
00:55:27,180 --> 00:55:31,660
that would mean that we don't need
the politicians
766
00:55:31,660 --> 00:55:35,100
and the conflicts and the noise,
767
00:55:35,100 --> 00:55:37,940
the confusion, the babble.
768
00:55:37,940 --> 00:55:41,940
But for the World Brain there was
to be a further component
769
00:55:41,940 --> 00:55:44,860
and this is the component
that is what disturbs me.
770
00:55:44,860 --> 00:55:46,900
It's how that would be used
771
00:55:46,900 --> 00:55:50,540
to achieve the ultimate goals
of civilisation,
772
00:55:50,540 --> 00:55:53,780
as it appears to have been
evolving towards.
773
00:56:22,700 --> 00:56:25,700
It's going to change
how we interface with information.
774
00:56:25,700 --> 00:56:27,900
People are going to ask,
"How did it do that?
775
00:56:27,900 --> 00:56:29,540
"How did it accomplish this task
776
00:56:29,540 --> 00:56:32,540
"which before we thought only humans
could ever hope to do?"
777
00:56:32,540 --> 00:56:35,100
David Hume held this view
778
00:56:35,100 --> 00:56:39,820
that sense and experience are
the sole foundation of knowledge.
779
00:56:39,820 --> 00:56:41,300
Watson?
780
00:56:41,300 --> 00:56:43,060
What is empiricism?
781
00:56:43,060 --> 00:56:45,740
After IBM's success with Deep Blue,
782
00:56:45,740 --> 00:56:50,060
they looked around for other kinds
of games that they could take on.
783
00:56:50,060 --> 00:56:51,700
And they wanted something
784
00:56:51,700 --> 00:56:54,620
that was a very different
kind of game than chess.
785
00:56:54,620 --> 00:56:56,540
And so, they picked Jeopardy!,
786
00:56:56,540 --> 00:56:58,580
which is basically
a fancy trivia game,
787
00:56:58,580 --> 00:57:01,020
it's one of those games
that you or I could play.
788
00:57:01,020 --> 00:57:05,100
It's a human standing there
with their carbon and water
789
00:57:05,100 --> 00:57:07,780
versus the computer
with all of its silicon
790
00:57:07,780 --> 00:57:09,980
and its main memory and its disk.
791
00:57:09,980 --> 00:57:12,820
After Germany invaded
the Netherlands,
792
00:57:12,820 --> 00:57:16,180
this Queen, her family
and cabinet fled to London. Maria?
793
00:57:16,180 --> 00:57:17,300
Who is Beatrice?
794
00:57:17,300 --> 00:57:18,940
No, Watson?
795
00:57:18,940 --> 00:57:20,740
Who is Wilhelmina?
796
00:57:20,740 --> 00:57:22,420
That is correct.
797
00:57:22,420 --> 00:57:25,340
This US President negotiated
the Treaty of Portsmouth
798
00:57:25,340 --> 00:57:27,020
ending the Russo-Japanese War.
799
00:57:27,020 --> 00:57:28,580
Watson?
800
00:57:28,580 --> 00:57:30,660
Who is Theodore Roosevelt?
801
00:57:30,660 --> 00:57:31,900
Good for $800...
802
00:57:31,900 --> 00:57:35,060
I did talk to Larry Page
when Google first started
803
00:57:35,060 --> 00:57:37,300
because I was really perplexed
804
00:57:37,300 --> 00:57:42,420
about why would anybody
make a new search engine
805
00:57:42,420 --> 00:57:44,100
when we had AltaVista,
806
00:57:44,100 --> 00:57:46,940
which was the current search engine.
807
00:57:46,940 --> 00:57:48,540
It seemed good enough.
808
00:57:48,540 --> 00:57:52,380
And he said, "Oh, it's not to make a
search engine, it's to make an AI."
809
00:57:55,020 --> 00:57:57,660
Most of my discussions
have been with Larry Page.
810
00:57:57,660 --> 00:58:01,140
We've talked in general
about their quest
811
00:58:01,140 --> 00:58:03,580
to digitise all knowledge
812
00:58:03,580 --> 00:58:07,260
and then develop true AI.
813
00:58:07,260 --> 00:58:11,500
You can create intelligent systems
if you have very large databases.
814
00:58:11,500 --> 00:58:14,940
And books are actually
probably more valuable
815
00:58:14,940 --> 00:58:17,820
than all the other stuff
on the Internet,
816
00:58:17,820 --> 00:58:20,660
cos we have a high standard
for what we put in books.
817
00:58:28,780 --> 00:58:31,380
The computer industry
and its implications
818
00:58:31,380 --> 00:58:33,460
in terms of information technology
819
00:58:33,460 --> 00:58:36,820
is a multi-trillion-dollar
part of the economy.
820
00:58:36,820 --> 00:58:42,580
It will be, you know, the basis
of everything we do in the future.
821
00:58:42,580 --> 00:58:47,060
What Watson showed was you can take
a very large, very messy set of data
822
00:58:47,060 --> 00:58:49,700
and if you can use
those inputs correctly,
823
00:58:49,700 --> 00:58:52,420
you can actually answer
really sophisticated questions.
824
00:58:52,420 --> 00:58:57,580
And, certainly, the presence of large
amounts of data on the Internet
825
00:58:57,580 --> 00:59:01,140
is going to be as much an input
for machines as it is for people.
826
00:59:01,140 --> 00:59:03,700
What we really will need to top that
827
00:59:03,700 --> 00:59:07,340
is computer systems that can
understand natural language.
828
00:59:07,340 --> 00:59:11,380
And natural language understanding
is actually coming along very well.
829
00:59:11,380 --> 00:59:15,700
IBM's Watson is a very good example
of the current state of the art
830
00:59:15,700 --> 00:59:18,500
in computers understanding
natural language,
831
00:59:18,500 --> 00:59:20,860
cos not only did Watson
have to understand
832
00:59:20,860 --> 00:59:23,900
the convoluted language
in the Jeopardy! query,
833
00:59:23,900 --> 00:59:28,220
which includes metaphors and similes
and puns, and riddles and jokes,
834
00:59:28,220 --> 00:59:31,340
but it got its knowledge
to respond to the query
835
00:59:31,340 --> 00:59:35,620
from actually reading 200 million
pages of natural-language documents,
836
00:59:35,620 --> 00:59:39,580
including all of Wikipedia,
and several other encyclopaedias.
837
00:59:39,580 --> 00:59:43,300
And when you see a computer play it
better than we ever could,
838
00:59:43,300 --> 00:59:45,740
it's one of those moments
where you realise,
839
00:59:45,740 --> 00:59:47,820
"Oh, yes, the world really
IS different."
840
00:59:47,820 --> 00:59:51,020
An IBM supercomputer named Watson
841
00:59:51,020 --> 00:59:55,820
has won the first ever
Jeopardy! quiz show competition
842
00:59:55,820 --> 00:59:59,740
starring a computer as a player.
843
00:59:59,740 --> 01:00:04,940
Google Book Project is, in a sense,
trying to make that universal library
844
01:00:04,940 --> 01:00:10,980
which could then be read by an AI
or a Watson-like supercomputer.
845
01:00:10,980 --> 01:00:14,620
By 2045, we'll have expanded,
according to my calculations,
846
01:00:14,620 --> 01:00:18,980
the intelligence and capability of
the human-machine civilisation
847
01:00:18,980 --> 01:00:20,580
a billionfold.
848
01:00:20,580 --> 01:00:23,060
So that's such a
profound transformation,
849
01:00:23,060 --> 01:00:26,740
such a singular transformation,
that we call it the singularity.
850
01:00:26,740 --> 01:00:31,220
Now, this is not yet
inside my body or brain.
851
01:00:31,220 --> 01:00:33,580
It may as well be.
I'm very dependent on it.
852
01:00:33,580 --> 01:00:35,700
I think this is part of who I am.
853
01:00:35,700 --> 01:00:38,820
Ultimately, this kind of device
will be the size of blood cells
854
01:00:38,820 --> 01:00:40,940
and will go inside our body
to keep us healthy,
855
01:00:40,940 --> 01:00:44,380
go inside our brains, put our brains
directly on the Internet,
856
01:00:44,380 --> 01:00:48,500
give us direct access to the entire
library of all books.
857
01:00:51,180 --> 01:00:53,500
AI is just a religion.
It doesn't matter.
858
01:00:53,500 --> 01:00:56,860
What's really happening is real
world examples from real people
859
01:00:56,860 --> 01:00:59,900
who entered their answers,
their trivia,
860
01:00:59,900 --> 01:01:03,300
their experiences into some
online database.
861
01:01:03,300 --> 01:01:06,780
It's actually just a giant
puppet theatre repackaging
862
01:01:06,780 --> 01:01:08,540
inputs from real people
who are forgotten.
863
01:01:08,540 --> 01:01:11,260
We are pretending they aren't there.
864
01:01:11,260 --> 01:01:14,340
This is something
I really want people to see.
865
01:01:14,340 --> 01:01:16,620
The insane structure of modern
finance is exactly
866
01:01:16,620 --> 01:01:19,980
the same as the insane structure
of modern culture on the Internet.
867
01:01:19,980 --> 01:01:21,420
They're precisely the same.
868
01:01:21,420 --> 01:01:25,500
It's an attempt to gather all
the information into a high castle,
869
01:01:25,500 --> 01:01:30,180
optimise the world and pretend that
all the people the information came
870
01:01:30,180 --> 01:01:33,820
from don't deserve anything.
It's all the same mistake.
871
01:01:33,820 --> 01:01:38,300
Google Search is going to be
assisted intelligence
872
01:01:38,300 --> 01:01:41,540
and not artificial intelligence.
873
01:01:41,540 --> 01:01:45,180
In my mind I think of Search
as this beautiful symphony
874
01:01:45,180 --> 01:01:50,140
between the user and the search
engine and we make music together.
875
01:02:18,540 --> 01:02:22,660
Before the law,
there stands a guard.
876
01:02:24,580 --> 01:02:29,260
A man comes from the country
begging admittance to the law.
877
01:03:07,860 --> 01:03:09,980
The man tries to peer
through the entrance.
878
01:03:09,980 --> 01:03:12,940
He had been taught that the law
should be accessible to every man.
879
01:03:14,780 --> 01:03:19,220
"Do not attempt to enter without
my permission," says the guard.
880
01:03:25,780 --> 01:03:31,020
This tale is told during the story
called The Trial.
881
01:03:31,020 --> 01:03:34,460
I've been surprised
at the level of controversy there
882
01:03:34,460 --> 01:03:38,540
because digitising the world's books
and making them available,
883
01:03:38,540 --> 01:03:42,900
there's really... there's nobody
else who's attempted it at our scale
884
01:03:42,900 --> 01:03:45,620
or who is really working on it.
885
01:03:45,620 --> 01:03:47,980
And I feel like we had a number of
technical challenges
886
01:03:47,980 --> 01:03:50,860
which we've overcome.
887
01:03:50,860 --> 01:03:54,380
There was this legal dispute
which we have a settlement,
888
01:03:54,380 --> 01:03:58,220
settlements proposed, that we
at least jointly agree to with
889
01:03:58,220 --> 01:04:01,980
the authors and publishers and so
forth but it remains somewhat
890
01:04:01,980 --> 01:04:08,260
controversial, so I'm surprised at
the amount of resistance that's had
891
01:04:08,260 --> 01:04:12,460
but, ultimately, I'm optimistic that
we're going to be successful.
892
01:04:44,340 --> 01:04:49,220
It's important to understand that
the Google Books settlement was
893
01:04:49,220 --> 01:04:54,340
negotiated by a small number
of people claiming to represent
894
01:04:54,340 --> 01:04:57,420
authors and claiming to
represent publishers,
895
01:04:57,420 --> 01:05:00,780
but not every author and not every
publisher was in the room
896
01:05:00,780 --> 01:05:05,700
so once the settlement's announced,
there's a six-month period
897
01:05:05,700 --> 01:05:10,540
in which it's required to notify them
about the terms of the settlement
898
01:05:10,540 --> 01:05:14,300
and give them a chance to opt out
if they don't like the settlement
899
01:05:14,300 --> 01:05:17,940
or to give them a chance to object to
the terms of the settlement.
900
01:07:56,380 --> 01:08:01,980
The first time I realised Google
scanned my book was 2009, November.
901
01:08:01,980 --> 01:08:05,060
Actually my lawyer called me
902
01:08:05,060 --> 01:08:09,140
and he said, "Do you know your book
be scanned by Google Book?"
903
01:08:09,140 --> 01:08:13,180
The search engine Google came under
intense fire from Chinese authors
904
01:08:13,180 --> 01:08:16,380
as the digital library used books
written by Chinese authors
905
01:08:16,380 --> 01:08:17,700
without permission.
906
01:08:17,700 --> 01:08:23,900
The reader, they can search my book
by the keyword and maybe around
907
01:08:23,900 --> 01:08:28,580
100 keyword, but I remember the most
ridiculous keyword of my book
908
01:08:28,580 --> 01:08:31,740
is 'bed', B-E-D, and 'telephone'.
909
01:08:31,740 --> 01:08:34,820
That's two words I remember
and that made me laugh.
910
01:08:34,820 --> 01:08:37,340
This is not intellectual at all.
911
01:08:37,340 --> 01:08:40,980
Me and my lawyer
decide to sue Google.
912
01:08:40,980 --> 01:08:44,060
My lawyer asked 60,000,
something like that.
913
01:08:44,060 --> 01:08:47,260
My journalist friends said, "I don't
want to help you but I know you.
914
01:08:47,260 --> 01:08:52,340
"Why you ask such low money?"
so I wrote this blog that night.
915
01:08:52,340 --> 01:08:57,020
When I wake up, it's, like,
400 messages at my blog saying,
916
01:08:57,020 --> 01:09:00,220
"Damage this girl,"
and, "This girl's a bitch."
917
01:09:00,220 --> 01:09:04,140
Blah blah blah. Really disgusting,
horrible messages.
918
01:09:04,140 --> 01:09:08,620
I become a public enemy after Google
say they will leave China.
919
01:09:08,620 --> 01:09:12,260
Also, Chinese young people started
sending flowers to the Google office
920
01:09:12,260 --> 01:09:15,860
which has made even my best friend
be confused.
921
01:09:15,860 --> 01:09:19,340
She say, "Is the government sending
you to sue Google?"
922
01:12:10,260 --> 01:12:13,060
Before the court is
the plaintiff's motion to approve
923
01:12:13,060 --> 01:12:15,500
the settlement as fair
and reasonable.
924
01:12:15,500 --> 01:12:18,100
Numerous materials
have been submitted.
925
01:12:18,100 --> 01:12:20,660
Did anyone count up
the number of objections?
926
01:12:20,660 --> 01:12:23,820
We have in the range of 500.
Thank you.
927
01:12:25,820 --> 01:12:31,060
I flew to New York
and it was very exciting.
928
01:12:31,060 --> 01:12:35,980
There were 25 outside parties that
929
01:12:35,980 --> 01:12:39,380
made presentations to Judge Chin.
930
01:12:39,380 --> 01:12:42,540
There were 500 objections
for him to read.
931
01:12:42,540 --> 01:12:45,580
The judge basically said, "I'm not
going to rule from the bench,"
932
01:12:45,580 --> 01:12:48,180
but people were
hanging on every word.
933
01:12:48,180 --> 01:12:53,420
This is a fascinating turning point
actually in the whole history of
934
01:12:53,420 --> 01:12:55,540
knowledge and of access to knowledge
935
01:12:55,540 --> 01:12:58,340
and it was being played out
in a New York courtroom
936
01:12:58,340 --> 01:13:00,420
before Judge Denny Chin
937
01:13:00,420 --> 01:13:03,220
in the Southern Federal District
Court of New York.
938
01:13:13,580 --> 01:13:18,020
I confirm that one of my books has
been digitally scanned by Google
939
01:13:18,020 --> 01:13:20,100
without my permission.
940
01:13:20,100 --> 01:13:23,700
Because this act is a clear
violation of the copyright
941
01:13:23,700 --> 01:13:27,820
law of Japan, I have asked
the Metropolitan Police Department
942
01:13:27,820 --> 01:13:33,700
of Japan to criminally charge Google
and its CEO for this violation.
943
01:13:33,700 --> 01:13:38,140
The court's decision was to
a considerable extent going to
944
01:13:38,140 --> 01:13:42,220
determine the future of books,
of digital books.
945
01:13:42,220 --> 01:13:46,700
The proposed settlement results in
a de facto monopoly on information
946
01:13:46,700 --> 01:13:51,580
and an intensification of media
concentration on Google.
947
01:13:51,580 --> 01:13:55,380
As a result, the right of free
access to information,
948
01:13:55,380 --> 01:13:59,900
as well as the existing cultural
diversity in both Germany and Europe
949
01:13:59,900 --> 01:14:01,660
will be usurped.
950
01:14:01,660 --> 01:14:05,580
Would it be basically in the hands
of commercial speculators,
951
01:14:05,580 --> 01:14:10,420
whose responsibility was
to their shareholders
952
01:14:10,420 --> 01:14:12,860
or would it be organised
for the public good?
953
01:14:12,860 --> 01:14:16,140
There was a risk
of monopolisation there,
954
01:14:16,140 --> 01:14:18,780
that the Department of Justice saw.
955
01:14:20,620 --> 01:14:23,860
The proposed settlement would
establish a marketplace
956
01:14:23,860 --> 01:14:26,100
in which only one competitor
957
01:14:26,100 --> 01:14:29,980
would have authority to use
a vast array of works.
958
01:14:29,980 --> 01:14:35,500
The risk was that Google could
basically hold the whole
959
01:14:35,500 --> 01:14:40,500
world hostage to
the price of access to these books
960
01:14:40,500 --> 01:14:44,180
and, because no-one else
would have a licence,
961
01:14:44,180 --> 01:14:47,940
no-one else would have a
corpus like the corpus they had,
962
01:14:47,940 --> 01:14:51,380
we'd have to pay whatever
they wanted to charge.
963
01:14:52,700 --> 01:14:55,820
The core concerns seem to
be that this would diminish
964
01:14:55,820 --> 01:14:58,300
the availability to read
books in private.
965
01:14:58,300 --> 01:15:04,460
That is not true. This service would
be available at public libraries.
966
01:15:04,460 --> 01:15:08,060
You can walk into your neighbourhood
library, you can sit down at
967
01:15:08,060 --> 01:15:10,820
a free access terminal, anonymously.
968
01:15:10,820 --> 01:15:13,020
You can search for and read a book.
969
01:15:14,260 --> 01:15:17,220
And if you want to
look at it at home, then what?
970
01:15:17,220 --> 01:15:22,140
Well, if you want to look at it
at home, that may present an issue.
971
01:15:22,140 --> 01:15:23,940
Here's the rub.
972
01:15:23,940 --> 01:15:27,060
This is a tension
between requirements for security
973
01:15:27,060 --> 01:15:30,420
that are insisted on in order
not to have these works be
974
01:15:30,420 --> 01:15:32,100
sort of freely disseminated.
975
01:15:34,420 --> 01:15:39,620
In my view, the Google Book Search
settlement is no different from the
976
01:15:39,620 --> 01:15:44,500
piracy cases in which the Internet
and digital technology are abused.
977
01:15:44,500 --> 01:15:49,100
I strongly urge the court to reject
the proposed settlement.
978
01:15:49,100 --> 01:15:52,660
I remember there being
a Japanese writer there
979
01:15:52,660 --> 01:15:55,660
and the language was very vivid.
980
01:15:55,660 --> 01:15:57,900
It was as though, you know,
981
01:15:57,900 --> 01:16:01,140
copyright was going to
be swept away,
982
01:16:01,140 --> 01:16:05,620
and that copyright was going to be
destroyed and the approval of this
983
01:16:05,620 --> 01:16:09,380
settlement was going to, you know,
984
01:16:09,380 --> 01:16:12,940
make the United States out of
compliance with treaty obligations.
985
01:16:14,140 --> 01:16:17,820
There's a real risk that, should
the court approve the settlement,
986
01:16:17,820 --> 01:16:22,100
members of the World Trade
Organisation will initiate
987
01:16:22,100 --> 01:16:25,380
settlement proceedings
against the US government.
988
01:16:25,380 --> 01:16:28,980
And if the US government were
to lose such proceedings,
989
01:16:28,980 --> 01:16:33,260
which is a very real possibility,
our partners would be
990
01:16:33,260 --> 01:16:37,900
entitled to impose trade sanctions
against the United States.
991
01:16:37,900 --> 01:16:40,340
You don't use words
like that very often.
992
01:16:40,340 --> 01:16:42,700
It wasn't kind of like,
"Oh, gee, there are these issues
993
01:16:42,700 --> 01:16:44,580
"and we're concerned
about something."
994
01:16:44,580 --> 01:16:47,260
It was like,
"THIS VIOLATES A TREATY!
995
01:16:47,260 --> 01:16:50,500
"HOW CAN THE JUDGE DO SOMETHING
THAT'S GOING TO VIOLATE A TREATY?
996
01:16:50,500 --> 01:16:51,740
"THIS IS CRAZY!"
997
01:16:51,740 --> 01:16:54,060
I am not going to rule today.
998
01:16:54,060 --> 01:16:58,860
There is just too much to digest.
I will reserve decision.
999
01:16:58,860 --> 01:17:02,300
There's much to think about.
All rise.
1000
01:17:02,300 --> 01:17:05,340
And then Judge Chin
thought about it.
1001
01:17:05,340 --> 01:17:07,700
He thought about it
and he thought about it.
1002
01:17:26,260 --> 01:17:31,220
He took a very long time and every
morning I got up and I thought,
1003
01:17:31,220 --> 01:17:34,580
"Is Judge Chin going to
announce his decision today?"
1004
01:17:34,580 --> 01:17:38,140
And when he finally did,
I myself felt thrilled
1005
01:17:38,140 --> 01:17:43,100
because the court actually refused
to sanction the settlement.
1006
01:17:43,100 --> 01:17:47,220
Then Google Book Search could not
take place, at least according
1007
01:17:47,220 --> 01:17:49,900
to Google's original business plan.
1008
01:17:49,900 --> 01:17:54,300
US circuit judge Denny Chin said
the creation of a universal library
1009
01:17:54,300 --> 01:17:57,420
would benefit many
but would simply go too far.
1010
01:17:57,420 --> 01:18:00,100
Chin said the settlement of
a class action lawsuit that the
1011
01:18:00,100 --> 01:18:03,660
company reached with US authors
and publishers would grant Google
1012
01:18:03,660 --> 01:18:06,140
significant rights
to exploit entire books
1013
01:18:06,140 --> 01:18:08,540
without permission
of copyright owners.
1014
01:18:08,540 --> 01:18:11,940
Chin also said the deal gives Google
a significant advantage over
1015
01:18:11,940 --> 01:18:15,220
competitors and it would be
rewarding it for engaging in
1016
01:18:15,220 --> 01:18:17,860
wholesale copying of copyrighted
works without permission.
1017
01:18:58,340 --> 01:19:03,900
I think you could read the decision
by Judge Chin as a defeat
1018
01:19:03,900 --> 01:19:05,740
of the screen by the book.
1019
01:19:05,740 --> 01:19:09,060
But this is a long war.
1020
01:19:09,060 --> 01:19:12,020
This is one battle and,
1021
01:19:12,020 --> 01:19:16,340
whatever triumph there
might have been for books,
1022
01:19:16,340 --> 01:19:17,980
it's going to be short-lived,
1023
01:19:17,980 --> 01:19:20,540
because the screen
will ultimately triumph.
1024
01:19:28,300 --> 01:19:31,940
They spent several months trying
to negotiate a new settlement,
1025
01:19:31,940 --> 01:19:35,380
couldn't reach a new settlement
that was mutually acceptable,
1026
01:19:35,380 --> 01:19:39,500
so they're going to
have to go to trial.
1027
01:19:59,020 --> 01:20:03,540
'Baidu, China's search engine giant,
has been blamed by Chinese
1028
01:20:03,540 --> 01:20:07,340
'writers for participating
in copyright violation.
1029
01:20:07,340 --> 01:20:11,260
'This is because the website offers
free online excerpts of stories
1030
01:20:11,260 --> 01:20:15,260
'and books without
the authors' prior approval.'
1031
01:20:15,260 --> 01:20:18,900
I think very late March
or early April of 2011,
1032
01:20:18,900 --> 01:20:23,780
we purged the site of about 2.8
million files that we believed
1033
01:20:23,780 --> 01:20:27,460
might be copyright infringing within
a period of 72 hours.
1034
01:20:27,460 --> 01:20:30,700
I think a good number of them
were books or chapters of books.
1035
01:20:30,700 --> 01:20:35,300
We implemented a rule where no-one
could upload anything of more
1036
01:20:35,300 --> 01:20:41,020
than 1,000 Chinese characters without
it being manually inspected
1037
01:20:41,020 --> 01:20:42,260
for copyright infringement
1038
01:20:42,260 --> 01:20:45,300
or automatically inspected
for copyright infringement.
1039
01:20:45,300 --> 01:20:49,220
The problem is then people started
uploading parts of books
1040
01:20:49,220 --> 01:20:53,060
in 1,000-character increments
so they would avoid detection.
1041
01:20:53,060 --> 01:20:56,460
So there's always people who want
to abuse the system.
1042
01:21:00,700 --> 01:21:03,020
The question is,
1043
01:21:03,020 --> 01:21:09,300
has Google already been able to make
its search engine better because
1044
01:21:09,300 --> 01:21:14,300
of the Google Books corpus and the
scanning of 20 million books?
1045
01:21:14,300 --> 01:21:16,380
I think the answer to that is yes.
1046
01:21:16,380 --> 01:21:19,820
The question of
whether large Internet
1047
01:21:19,820 --> 01:21:23,900
companies are making our lives
easier or gaining power over us,
1048
01:21:23,900 --> 01:21:28,180
I think it presents a kind of false
binary because they're doing both.
1049
01:21:28,180 --> 01:21:29,980
If they were not
making our lives easier,
1050
01:21:29,980 --> 01:21:31,940
no-one would be
using their services.
1051
01:21:31,940 --> 01:21:34,940
This is the tricky,
complicated question
1052
01:21:34,940 --> 01:21:37,140
that we'll have to face
down the road.
1053
01:21:37,140 --> 01:21:39,580
All of them
are making our lives easier.
1054
01:21:39,580 --> 01:21:41,580
They're making products cheaper.
1055
01:21:41,580 --> 01:21:47,900
They're making our commute less
bothersome and more exciting.
1056
01:21:47,900 --> 01:21:52,300
Google will be supplying us with
glasses that will augment reality
1057
01:21:52,300 --> 01:21:54,860
and tell us about where
our friends are in the city.
1058
01:21:54,860 --> 01:21:57,580
They'll tell us the weather.
They'll tell us everything.
1059
01:21:57,580 --> 01:22:00,140
The question is what would
the trade-offs be?
1060
01:22:00,140 --> 01:22:04,540
What happens with all
of the information that would pass
1061
01:22:04,540 --> 01:22:07,540
through Google Glasses?
Surely it will be stored somewhere.
1062
01:22:07,540 --> 01:22:10,060
I'm sure Google will not be
discarding it because they will
1063
01:22:10,060 --> 01:22:12,460
need to know what it is
that I've seen yesterday
1064
01:22:12,460 --> 01:22:15,340
so that they can customise
what I see today even better.
1065
01:22:15,340 --> 01:22:19,540
But then the question is, would the
National Security Agency be able to
1066
01:22:19,540 --> 01:22:21,580
go to Google and ask for that data?
1067
01:22:21,580 --> 01:22:24,220
Ask for everything I've seen
through my Google Glasses?
1068
01:22:24,220 --> 01:22:27,300
And if that would be the case
then the question should be
1069
01:22:27,300 --> 01:22:30,620
do we actually want to have a
society where citizens are wearing
1070
01:22:30,620 --> 01:22:32,580
CCTV cameras on their heads?
1071
01:23:24,380 --> 01:23:27,220
Getting to a better system
where people are rewarded
1072
01:23:27,220 --> 01:23:30,940
for their information contribution
to the world, getting to that system
1073
01:23:30,940 --> 01:23:36,260
from where we are, where people
are expected to get by with less,
1074
01:23:36,260 --> 01:23:38,180
that's going to be a hard
transition.
1075
01:23:38,180 --> 01:23:41,820
They might involve government but
they might involve the big companies
1076
01:23:41,820 --> 01:23:45,900
and the reason why is the big
companies like Google and Amazon
1077
01:23:45,900 --> 01:23:48,540
are shooting themselves in the foot
with what we're doing
1078
01:23:48,540 --> 01:23:52,820
because what we're doing is
shrinking the economy. I mean...
1079
01:23:52,820 --> 01:23:58,620
My concern is not so much
the direction in which Google,
1080
01:23:58,620 --> 01:24:01,620
Facebook for that matter,
want to take the world.
1081
01:24:01,620 --> 01:24:03,580
My concern is the fact
1082
01:24:03,580 --> 01:24:07,260
that it's Google and Facebook
taking us in that direction.
1083
01:24:42,460 --> 01:24:47,180
Our current policy to open up the
library and make it part of this
1084
01:24:47,180 --> 01:24:53,020
really very ambitious project, more
ambitious I think than Google's,
1085
01:24:53,020 --> 01:24:57,100
which we call the Digital
Public Library of America.
1086
01:24:57,100 --> 01:25:00,060
You know, I think that we
owe a great deal to Google.
1087
01:25:01,700 --> 01:25:06,100
I can't imagine that this
Digital Public Library of America
1088
01:25:06,100 --> 01:25:12,220
would ever have gotten off the
ground had Google not started to
1089
01:25:12,220 --> 01:25:17,500
race ahead with its own version of
digitisation on this massive scale.
1090
01:25:20,140 --> 01:25:24,180
However, you know, Google,
wonderful as it is,
1091
01:25:24,180 --> 01:25:26,860
is not familiar with books.
1092
01:25:26,860 --> 01:25:31,940
For example, Walt Whitman's famous
book of poems, Leaves Of Grass,
1093
01:25:31,940 --> 01:25:34,380
was catalogued under gardening.
1094
01:26:12,580 --> 01:26:15,580
We are designing the Digital
Public Library of America
1095
01:26:15,580 --> 01:26:19,780
so that it will be perfectly
compatible with Europeana
1096
01:26:19,780 --> 01:26:23,460
and that means soon we will have
a worldwide network.
1097
01:26:23,460 --> 01:26:25,460
A gigantic world library.
1098
01:26:28,220 --> 01:26:34,900
HG Wells' view of science and
technology was what sustained him
1099
01:26:34,900 --> 01:26:37,460
and sustained his ideas
throughout his whole life.
1100
01:26:37,460 --> 01:26:41,380
He had this sense that, if only
we could get the scientists and the
1101
01:26:41,380 --> 01:26:45,020
technologists
working in the right way,
1102
01:26:45,020 --> 01:26:47,700
we could transform the world
1103
01:26:47,700 --> 01:26:51,740
and he continued with
that belief up until
1104
01:26:51,740 --> 01:26:55,860
the absolute final disillusionment
with the entire human world.
1105
01:26:55,860 --> 01:26:58,260
It was a book which he called,
so fittingly,
1106
01:26:58,260 --> 01:27:01,100
Mind At The End Of Its Tether.
1107
01:27:01,100 --> 01:27:04,580
He felt that the whole evolutionary
process that he had been studying
1108
01:27:04,580 --> 01:27:08,620
and he felt was leading us
to something new and wonderful,
1109
01:27:08,620 --> 01:27:10,660
had failed.
1110
01:27:10,660 --> 01:27:16,540
And his last words were that there
was no way out or round or through.
1111
01:27:20,180 --> 01:27:24,660
HG WELLS: Our world of self-delusion
will perish amidst its evasions
1112
01:27:24,660 --> 01:27:27,140
and fortuities.
1113
01:27:27,140 --> 01:27:33,660
It is like a convoy lost in darkness
along an unknown rocky coast
1114
01:27:33,660 --> 01:27:38,780
with quarrelling pirates in the chart
room and savages clambering up
1115
01:27:38,780 --> 01:27:44,500
the sides of the ship to plunder and
do evil as the whim may take them.
1116
01:27:45,740 --> 01:27:49,180
That is the rough outline of the more
1117
01:27:49,180 --> 01:27:53,220
and more jumbled movie
on the screen before us.
1118
01:27:54,700 --> 01:27:57,460
There is no way out.
1119
01:27:57,460 --> 01:27:59,380
Or round.
1120
01:27:59,380 --> 01:28:01,460
Or through.
1121
01:28:21,340 --> 01:28:24,380
Subtitles by Red Bee Media Ltd