1
00:00:01,900 --> 00:00:04,700
♪ ♪
2
00:00:06,800 --> 00:00:08,733
SARAH BRAYNE:
We live in this era where
3
00:00:08,733 --> 00:00:11,033
we leave digital traces
throughout the course
4
00:00:11,033 --> 00:00:13,133
of our everyday lives.
5
00:00:13,133 --> 00:00:14,866
ANDY CLARNO:
What is this data,
6
00:00:14,866 --> 00:00:16,133
how is it collected,
how is it being used?
7
00:00:16,133 --> 00:00:20,700
NARRATOR:
One way it's being used
is to make predictions
8
00:00:20,700 --> 00:00:22,033
about who might commit
a crime...
9
00:00:22,033 --> 00:00:23,833
Hey, give me all
your money, man!
10
00:00:23,833 --> 00:00:25,366
NARRATOR:
...and who should get bail.
11
00:00:25,366 --> 00:00:27,300
JUDGE:
On count one, you're charged
with felony intimidation...
12
00:00:27,300 --> 00:00:30,300
ANDREW FERGUSON:
The idea is that
if you look at past crimes,
13
00:00:30,300 --> 00:00:32,833
you might be able to predict
the future.
14
00:00:32,833 --> 00:00:34,933
WILLIAM ISAAC:
We want safer communities,
15
00:00:34,933 --> 00:00:36,633
we want societies that are
less incarcerated.
16
00:00:36,633 --> 00:00:39,433
NARRATOR:
But is that what we're getting?
17
00:00:39,433 --> 00:00:41,066
Are the predictions reliable?
18
00:00:41,066 --> 00:00:43,566
CATHY O'NEIL:
I think algorithms can,
19
00:00:43,566 --> 00:00:44,833
in many cases,
be better than people.
20
00:00:44,833 --> 00:00:47,333
But, of course, algorithms
don't have consciousness.
21
00:00:47,333 --> 00:00:50,233
The algorithm only knows
what it's been fed.
22
00:00:50,233 --> 00:00:51,300
RUHA BENJAMIN:
Because it's technology,
23
00:00:51,300 --> 00:00:54,000
we don't question them as
much as we might
24
00:00:54,000 --> 00:00:56,833
a racist judge
or a racist officer.
25
00:00:56,833 --> 00:01:00,033
They're behind this veneer
of neutrality.
26
00:01:00,033 --> 00:01:03,233
ISAAC:
We need to know
who's accountable
27
00:01:03,233 --> 00:01:06,333
when systems harm
the communities
28
00:01:06,333 --> 00:01:07,600
that they're designed to serve.
29
00:01:07,600 --> 00:01:11,600
NARRATOR:
Can we trust the justice
of predictive algorithms?
30
00:01:11,600 --> 00:01:13,100
And should we?
31
00:01:13,100 --> 00:01:14,900
"Computers Vs. Crime,"
32
00:01:14,900 --> 00:01:17,733
right now, on "NOVA."
33
00:01:39,900 --> 00:01:42,000
(computers booting up)
34
00:01:42,000 --> 00:01:44,766
♪ ♪
35
00:01:44,766 --> 00:01:47,433
NARRATOR:
We live in a world of big data,
36
00:01:47,433 --> 00:01:49,666
where computers look
for patterns
37
00:01:49,666 --> 00:01:52,200
in vast collections
of information
38
00:01:52,200 --> 00:01:54,033
in order to predict the future.
39
00:01:54,033 --> 00:01:58,700
And we depend on their accuracy.
40
00:01:58,700 --> 00:02:00,500
Is it a good morning
for jogging?
41
00:02:00,500 --> 00:02:02,966
Will this become cancer?
42
00:02:02,966 --> 00:02:05,366
What movie should I choose?
43
00:02:05,366 --> 00:02:08,033
The best way to beat traffic?
44
00:02:08,033 --> 00:02:09,766
Your computer can tell you.
45
00:02:09,766 --> 00:02:13,400
Similar computer programs,
called predictive algorithms,
46
00:02:13,400 --> 00:02:17,000
are mining big data
to make predictions
47
00:02:17,000 --> 00:02:19,266
about crime and punishment--
48
00:02:19,266 --> 00:02:22,633
reinventing how our
criminal legal system works.
49
00:02:22,633 --> 00:02:24,666
Policing agencies have used
these computer algorithms
50
00:02:24,666 --> 00:02:30,200
in an effort to predict where
the next crime will occur
51
00:02:30,200 --> 00:02:31,866
and even who the perpetrator
will be.
52
00:02:31,866 --> 00:02:34,200
ASSISTANT DISTRICT ATTORNEY:
Here, the state is
recommending...
53
00:02:34,200 --> 00:02:35,000
NARRATOR:
Judges use them
54
00:02:35,000 --> 00:02:37,100
to determine who should get bail
55
00:02:37,100 --> 00:02:38,566
and who shouldn't.
56
00:02:38,566 --> 00:02:41,200
JUDGE:
If you fail to appear
next time, you get no bond.
57
00:02:41,200 --> 00:02:44,166
NARRATOR:
It may sound like
the police of the future
58
00:02:44,166 --> 00:02:45,666
in the movie
"Minority Report."
I'm placing you under arrest
59
00:02:45,666 --> 00:02:47,633
for the future murder
of Sarah Marks.
60
00:02:47,633 --> 00:02:49,566
NARRATOR:
But fiction it's not.
61
00:02:49,566 --> 00:02:55,133
How do these predictions
actually work?
62
00:02:55,133 --> 00:02:56,933
Can computer algorithms
63
00:02:56,933 --> 00:03:01,433
make our criminal legal system
more equitable?
64
00:03:01,433 --> 00:03:08,300
Are these algorithms truly fair
and free of human bias?
65
00:03:11,200 --> 00:03:12,300
ANDREW PAPACHRISTOS:
I grew up in Chicago
66
00:03:12,300 --> 00:03:14,766
in the 1980s and early 1990s.
67
00:03:14,766 --> 00:03:21,666
♪ ♪
68
00:03:24,300 --> 00:03:26,466
My dad was an immigrant
from Greece,
69
00:03:26,466 --> 00:03:29,700
we worked in my
family's restaurant,
70
00:03:29,700 --> 00:03:31,266
called KaMar's.
71
00:03:31,266 --> 00:03:37,000
NARRATOR:
Andrew Papachristos
was a 16-year-old kid
72
00:03:37,000 --> 00:03:39,766
in the North Side of Chicago
in the 1990s.
73
00:03:39,766 --> 00:03:42,933
I spent a lot of my formative
years busing tables,
74
00:03:42,933 --> 00:03:47,400
serving people hamburgers
and gyros.
75
00:03:47,400 --> 00:03:48,933
It kind of was a whole
family affair.
76
00:03:48,933 --> 00:03:53,933
NARRATOR:
Young Papachristos was aware
the streets could be dangerous,
77
00:03:53,933 --> 00:04:00,033
but never imagined the violence
would touch him or his family.
78
00:04:00,033 --> 00:04:02,866
REPORTER:
Two more gang-related
murders Monday night.
79
00:04:02,866 --> 00:04:05,166
PAPACHRISTOS:
And of course, you know,
the '80s and '90s in Chicago
80
00:04:05,166 --> 00:04:06,633
was some of the historically
most violent periods in Chicago.
81
00:04:06,633 --> 00:04:11,700
Street corner drug markets,
street organizations.
82
00:04:11,700 --> 00:04:14,933
And then like a lot of other
businesses on our, on our block
83
00:04:14,933 --> 00:04:16,166
and in our neighborhood,
84
00:04:16,166 --> 00:04:19,466
local gangs tried to extort
my family and the business.
85
00:04:19,466 --> 00:04:23,466
And my dad had been running
KaMar's for 30 years
86
00:04:23,466 --> 00:04:24,866
and kind of just said no.
87
00:04:24,866 --> 00:04:27,066
♪ ♪
88
00:04:27,066 --> 00:04:32,166
(sirens blaring)
89
00:04:32,166 --> 00:04:37,066
NARRATOR:
Then, one night, the family
restaurant burned to the ground.
90
00:04:37,066 --> 00:04:40,066
Police suspected arson.
91
00:04:40,066 --> 00:04:42,033
PAPACHRISTOS:
It was quite a shock
to our family,
92
00:04:42,033 --> 00:04:43,300
'cause everybody
in the neighborhood worked
93
00:04:43,300 --> 00:04:45,866
in the restaurant at one point
in their life.
94
00:04:45,866 --> 00:04:51,533
And my parents lost
30 years of their lives.
95
00:04:51,533 --> 00:04:54,500
That was really one of the
events that made me want to
96
00:04:54,500 --> 00:04:55,466
understand violence.
97
00:04:55,466 --> 00:04:56,900
Like, how could this happen?
98
00:04:56,900 --> 00:04:59,200
♪ ♪
99
00:04:59,200 --> 00:05:00,466
NARRATOR:
About a decade later,
100
00:05:00,466 --> 00:05:06,066
Papachristos was a graduate
student searching for answers.
101
00:05:06,066 --> 00:05:07,100
PAPACHRISTOS:
In graduate school,
102
00:05:07,100 --> 00:05:11,166
I was working on
a violence prevention program
103
00:05:11,166 --> 00:05:12,133
that brought together
community members,
104
00:05:12,133 --> 00:05:17,300
including street outreach
workers.
105
00:05:17,300 --> 00:05:19,333
And we were sitting at a table,
106
00:05:19,333 --> 00:05:21,566
and one of these
outreach workers asked me,
107
00:05:21,566 --> 00:05:22,866
the university student,
108
00:05:22,866 --> 00:05:24,766
"Who's next?
109
00:05:24,766 --> 00:05:27,133
Who's going to get shot next?"
110
00:05:30,533 --> 00:05:32,900
And where that led
was me sitting down
111
00:05:32,900 --> 00:05:36,566
with stacks of shooting and,
and homicide files
112
00:05:36,566 --> 00:05:39,500
with a red pen and a legal pad,
113
00:05:39,500 --> 00:05:41,633
by hand creating these
network images
114
00:05:41,633 --> 00:05:43,300
of, this person shot this
person,
115
00:05:43,300 --> 00:05:46,033
and this person was involved
with this group and this event,
116
00:05:46,033 --> 00:05:49,600
and creating a web
of these relationships.
117
00:05:49,600 --> 00:05:50,933
And then I learned that
118
00:05:50,933 --> 00:05:52,966
there's this whole science
about networks.
119
00:05:52,966 --> 00:05:55,366
I didn't have to invent
anything.
120
00:05:55,366 --> 00:05:57,366
♪ ♪
121
00:05:57,366 --> 00:06:00,633
NARRATOR:
Social network analysis
was already influencing
122
00:06:00,633 --> 00:06:01,800
popular culture.
123
00:06:01,800 --> 00:06:07,200
"Six Degrees of Separation"
was a play on Broadway.
124
00:06:07,200 --> 00:06:10,933
And then, there was
Six Degrees of Kevin Bacon.
125
00:06:10,933 --> 00:06:12,366
PAPACHRISTOS:
The idea was,
you would play this game,
126
00:06:12,366 --> 00:06:15,100
and whoever got the
shortest distance to Kevin Bacon
127
00:06:15,100 --> 00:06:16,733
would win.
128
00:06:16,733 --> 00:06:19,533
So Robert De Niro was in
a movie with so-and-so,
129
00:06:19,533 --> 00:06:20,800
who was in a movie with
Kevin Bacon.
130
00:06:20,800 --> 00:06:24,400
It was creating, essentially,
a series of ties
131
00:06:24,400 --> 00:06:26,200
among movies and actors.
132
00:06:26,200 --> 00:06:28,900
And in fact,
there's a mathematics
133
00:06:28,900 --> 00:06:30,033
behind that principle.
134
00:06:30,033 --> 00:06:34,566
It's actually old mathematical
graph theory, right?
135
00:06:34,566 --> 00:06:36,466
That goes back
to 1900s mathematics.
136
00:06:36,466 --> 00:06:41,000
And lots of scientists started
seeing that there were
137
00:06:41,000 --> 00:06:42,066
mathematical principles,
138
00:06:42,066 --> 00:06:46,666
and computational resources--
computers, data--
139
00:06:46,666 --> 00:06:48,933
were at a point that you could
test those things.
140
00:06:48,933 --> 00:06:51,066
So it was in a
very exciting time.
141
00:06:51,066 --> 00:06:54,433
We looked at arrest records,
at police stops,
142
00:06:54,433 --> 00:06:56,533
and we looked at
victimization records.
143
00:06:56,533 --> 00:06:57,700
Who was the victim
of a homicide
144
00:06:57,700 --> 00:07:00,700
or a non-fatal shooting?
145
00:07:00,700 --> 00:07:02,033
♪ ♪
146
00:07:02,033 --> 00:07:08,033
The statistical model starts by
creating the social networks of,
147
00:07:08,033 --> 00:07:09,433
say, everybody who may have
been arrested
148
00:07:09,433 --> 00:07:10,700
in a,
in a particular neighborhood.
149
00:07:10,700 --> 00:07:14,300
So Person A and Person B
were in a robbery together,
150
00:07:14,300 --> 00:07:16,500
they have a tie,
and then Person B and Person C
151
00:07:16,500 --> 00:07:19,966
were, were stopped by the police
in another instance.
152
00:07:19,966 --> 00:07:22,200
And it creates networks of
thousands of people.
153
00:07:22,200 --> 00:07:25,266
Understanding that events
are connected,
154
00:07:25,266 --> 00:07:26,666
places are connected.
155
00:07:26,666 --> 00:07:28,433
That there are old things,
like disputes between crews,
156
00:07:28,433 --> 00:07:35,133
which actually drive behavior
for generations.
157
00:07:35,133 --> 00:07:37,166
What we saw was striking.
158
00:07:37,166 --> 00:07:38,933
(snaps):
And you could see it
immediately,
159
00:07:38,933 --> 00:07:40,433
and you could see it
a mile away.
160
00:07:40,433 --> 00:07:42,833
Which was, gunshot victims
clumped together.
161
00:07:42,833 --> 00:07:45,833
You, you very rarely see one
victim.
162
00:07:45,833 --> 00:07:47,266
You see two, three, four.
163
00:07:47,266 --> 00:07:48,400
Sometimes they string across
time and space.
164
00:07:48,400 --> 00:07:53,933
And then the model predicts,
what's the probability
165
00:07:53,933 --> 00:07:56,566
that this is going to lead to
a shooting
166
00:07:56,566 --> 00:07:59,033
on the same pathway
in the future?
167
00:07:59,033 --> 00:08:02,433
(gun firing, people shouting)
168
00:08:02,433 --> 00:08:05,133
REPORTER:
Another young man lies dead.
169
00:08:05,133 --> 00:08:08,100
NARRATOR:
In Boston, Papachristos found
170
00:08:08,100 --> 00:08:10,500
that 85% of all
gunshot injuries
171
00:08:10,500 --> 00:08:13,333
occurred within a single
social network.
172
00:08:13,333 --> 00:08:15,266
Individuals in this network
173
00:08:15,266 --> 00:08:17,500
were less than
five handshakes away
174
00:08:17,500 --> 00:08:20,200
from the victim of a
gun homicide
175
00:08:20,200 --> 00:08:22,733
or non-fatal shooting.
176
00:08:22,733 --> 00:08:24,833
The closer a person was
177
00:08:24,833 --> 00:08:26,066
connected to a gunshot victim,
he found,
178
00:08:26,066 --> 00:08:32,500
the greater the probability that
that person would be shot.
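The mechanics behind a finding like that can be pictured with a small, purely illustrative Python sketch. The names, ties, and victim list below are invented, and this is not Papachristos's actual model; it only shows the two steps the narration describes: turning co-arrest records into a network, then counting the "handshakes" that separate each person from the nearest gunshot victim.

from collections import deque, defaultdict

# Hypothetical records: people arrested or stopped together share a tie.
co_arrests = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
gunshot_victims = {"D"}            # hypothetical prior shooting victims

# Build the undirected co-arrest network.
network = defaultdict(set)
for p, q in co_arrests:
    network[p].add(q)
    network[q].add(p)

def handshakes_to_nearest_victim(person):
    """Breadth-first search: how many ties separate this person from a victim?"""
    seen, frontier, hops = {person}, deque([person]), 0
    while frontier:
        for _ in range(len(frontier)):
            current = frontier.popleft()
            if current in gunshot_victims:
                return hops
            for neighbor in network[current] - seen:
                seen.add(neighbor)
                frontier.append(neighbor)
        hops += 1
    return None                    # no path to any victim in this network

for person in sorted(network):
    d = handshakes_to_nearest_victim(person)
    if d is None:
        print(person, "has no network path to a gunshot victim")
    else:
        print(person, "is", d, "handshakes from the nearest gunshot victim")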
179
00:08:32,500 --> 00:08:35,566
Around 2011,
when Papachristos was presenting
180
00:08:35,566 --> 00:08:38,500
his groundbreaking work
on social networks
181
00:08:38,500 --> 00:08:40,300
and gang violence,
182
00:08:40,300 --> 00:08:43,900
the Chicago Police Department
wanted to know more.
183
00:08:43,900 --> 00:08:45,033
PAPACHRISTOS:
We were at a conference.
184
00:08:45,033 --> 00:08:46,900
The then-superintendent
of the police department,
185
00:08:46,900 --> 00:08:49,200
he was asking me a bunch
of questions.
186
00:08:49,200 --> 00:08:50,700
He had clearly read the paper.
187
00:08:50,700 --> 00:08:53,266
NARRATOR:
The Chicago Police Department
188
00:08:53,266 --> 00:08:55,766
was working on its own
predictive policing program
189
00:08:55,766 --> 00:08:57,566
to fight crime.
190
00:08:57,566 --> 00:09:00,266
They were convinced
that Papachristos's model
191
00:09:00,266 --> 00:09:05,966
could make their new policing
model even more effective.
193
00:09:05,966 --> 00:09:07,166
LOGAN KOEPKE:
Predictive policing involves
194
00:09:07,166 --> 00:09:12,600
looking to historical crime data
to predict future events,
195
00:09:12,600 --> 00:09:14,800
either where police believe
crime may occur
196
00:09:14,800 --> 00:09:18,600
or who might be involved
in certain crimes.
197
00:09:18,600 --> 00:09:21,366
♪ ♪
198
00:09:21,366 --> 00:09:22,566
So it's the use of historical
data to forecast a future event.
199
00:09:22,566 --> 00:09:27,133
NARRATOR:
At the core of these programs
is software,
200
00:09:27,133 --> 00:09:30,133
which, like all
computer programs,
201
00:09:30,133 --> 00:09:33,400
is built around an algorithm.
202
00:09:33,400 --> 00:09:35,366
So, think of an algorithm
like a recipe.
203
00:09:35,366 --> 00:09:40,000
♪ ♪
204
00:09:40,000 --> 00:09:41,800
You have inputs,
205
00:09:41,800 --> 00:09:44,600
which are your ingredients,
you have the algorithm,
206
00:09:44,600 --> 00:09:45,433
which is the steps.
207
00:09:45,433 --> 00:09:50,266
♪ ♪
208
00:09:50,266 --> 00:09:51,766
And then there's the output,
209
00:09:51,766 --> 00:09:53,866
which is hopefully the
delicious cake you're making.
210
00:09:57,400 --> 00:09:59,100
GROUP:
Happy birthday!
211
00:09:59,100 --> 00:10:02,533
So one way to think about
algorithms is to think about
212
00:10:02,533 --> 00:10:03,800
the hiring process.
213
00:10:03,800 --> 00:10:07,166
In fact, recruiters have been
studied for a hundred years.
214
00:10:07,166 --> 00:10:10,533
And it turns out
many human recruiters
215
00:10:10,533 --> 00:10:13,366
have a standard algorithm
216
00:10:13,366 --> 00:10:16,033
when they're looking at
a résumé.
217
00:10:16,033 --> 00:10:19,466
So they start with your name,
and then they look to see
218
00:10:19,466 --> 00:10:21,266
where you went to school,
and then finally,
219
00:10:21,266 --> 00:10:23,900
they look at what your
last job was.
220
00:10:23,900 --> 00:10:26,433
If they don't see the pattern
they're looking for...
221
00:10:26,433 --> 00:10:28,133
(bell dings)
...that's all the time you get.
222
00:10:28,133 --> 00:10:32,100
And in a sense, that's exactly
what artificial intelligence
223
00:10:32,100 --> 00:10:35,366
is doing, as well,
in a very basic level.
224
00:10:35,366 --> 00:10:37,733
It's recognizing sets
of patterns and using that
225
00:10:37,733 --> 00:10:42,333
to decide what the next step in
its decision process would be.
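As a purely hypothetical sketch of that recruiter-style screening (the schools, job titles, and rule below are invented for illustration), the inputs are a few fields from a résumé, the steps are a fixed sequence of pattern checks, and the output is a quick yes or no:

TARGET_SCHOOLS = {"State University", "Tech Institute"}   # hypothetical pattern
TARGET_TITLES = {"engineer", "analyst"}

def screen(resume):
    """Check school, then last job; reject as soon as the pattern breaks."""
    if resume["school"] not in TARGET_SCHOOLS:
        return "rejected"
    if resume["last_job"] not in TARGET_TITLES:
        return "rejected"
    return "passed to interview"

print(screen({"name": "J. Doe", "school": "State University", "last_job": "analyst"}))
print(screen({"name": "A. Roe", "school": "City College", "last_job": "engineer"}))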
226
00:10:42,333 --> 00:10:47,366
♪ ♪
227
00:10:47,366 --> 00:10:48,566
NARRATOR:
What is commonly referred to
228
00:10:48,566 --> 00:10:51,600
as artificial intelligence,
or A.I.,
229
00:10:51,600 --> 00:10:54,333
is a process called
machine learning,
230
00:10:54,333 --> 00:10:56,866
where a computer algorithm
will adjust on its own,
231
00:10:56,866 --> 00:10:58,000
without human instructions,
232
00:10:58,000 --> 00:11:03,200
in response to the patterns
it finds in the data.
233
00:11:03,200 --> 00:11:07,000
These powerful processes
can analyze more data
234
00:11:07,000 --> 00:11:08,000
than any person can,
235
00:11:08,000 --> 00:11:13,266
and find patterns never
recognized before.
236
00:11:13,266 --> 00:11:15,066
The principles for
machine learning
237
00:11:15,066 --> 00:11:16,133
were invented in the 1950s,
238
00:11:16,133 --> 00:11:22,266
but began proliferating
only after about 2010.
239
00:11:22,266 --> 00:11:23,366
What we consider
machine learning today
240
00:11:23,366 --> 00:11:28,066
came about because hard drives
became very cheap.
241
00:11:28,066 --> 00:11:31,233
So it was really easy to get
a lot of data on everyone
242
00:11:31,233 --> 00:11:32,933
in every aspect of life.
243
00:11:32,933 --> 00:11:35,700
And the question is, what can
we do with all of that data?
244
00:11:35,700 --> 00:11:39,300
Those new uses are things like
predictive policing,
245
00:11:39,300 --> 00:11:42,233
they are things like deciding
whether or not a person's
246
00:11:42,233 --> 00:11:44,333
going to get a job or not,
247
00:11:44,333 --> 00:11:45,800
or be invited for
a job interview.
248
00:11:45,800 --> 00:11:51,266
NARRATOR:
So how does such a powerful tool
like machine learning work?
249
00:11:51,266 --> 00:11:53,766
Take the case of a
hiring algorithm.
250
00:11:53,766 --> 00:11:56,733
First, a computer needs to
understand the objective.
251
00:11:56,733 --> 00:11:59,266
Here, the objective is
identifying
252
00:11:59,266 --> 00:12:01,600
the best candidate
for the job.
253
00:12:01,600 --> 00:12:04,666
The algorithm looks at résumés
of former job candidates
254
00:12:04,666 --> 00:12:10,566
and searches for keywords
in résumés of successful hires.
255
00:12:10,566 --> 00:12:14,433
The résumés are what's
called training data.
256
00:12:14,433 --> 00:12:18,033
The algorithm assigns values
to each keyword.
257
00:12:18,033 --> 00:12:20,466
Words that appear more
frequently in the résumés
258
00:12:20,466 --> 00:12:23,833
of successful candidates
are given more value.
259
00:12:23,833 --> 00:12:27,433
The system learns from
past résumés the patterns
260
00:12:27,433 --> 00:12:31,133
of qualities that are associated
with successful hires.
261
00:12:31,133 --> 00:12:33,266
Then it makes its predictions
by identifying these
262
00:12:33,266 --> 00:12:36,500
same patterns from the résumés
of potential candidates.
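A minimal sketch of that keyword-weighting idea, using invented résumés and outcomes rather than any vendor's real training data, might look like this:

from collections import Counter

# Hypothetical training data: (résumé text, was the candidate a successful hire?)
training = [
    ("python statistics sql teamwork", True),
    ("python machine learning research", True),
    ("retail cashier customer service", False),
    ("warehouse forklift night shift", False),
]

hired, not_hired = Counter(), Counter()
for text, success in training:
    (hired if success else not_hired).update(text.split())

# Each keyword's value: how much more often it appears among successful hires.
weights = {w: hired[w] - not_hired[w] for w in set(hired) | set(not_hired)}

def score(resume_text):
    """Sum the learned keyword weights found in a new résumé."""
    return sum(weights.get(word, 0) for word in resume_text.split())

print(score("python sql reporting"))        # higher score: resembles past hires
print(score("cashier customer service"))    # lower score: resembles past rejections

Because the weights come entirely from past outcomes, whatever patterns those outcomes contain are exactly what the score learns to reward.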
263
00:12:36,500 --> 00:12:39,366
♪ ♪
264
00:12:39,366 --> 00:12:41,300
In a similar way,
265
00:12:41,300 --> 00:12:43,666
the Chicago police wanted to
find patterns in crime reports
266
00:12:43,666 --> 00:12:47,533
and arrest records to predict
who would be connected
267
00:12:47,533 --> 00:12:48,666
to violence in the future.
268
00:12:48,666 --> 00:12:54,800
They thought Papachristos's
model could help.
269
00:12:54,800 --> 00:12:58,066
Obviously we wanted to,
and tried, and framed and wrote
270
00:12:58,066 --> 00:12:59,233
all the caveats and made
our recommendations to say,
271
00:12:59,233 --> 00:13:03,733
"This research should be
in this public health space."
272
00:13:03,733 --> 00:13:06,300
But once the math is out there,
273
00:13:06,300 --> 00:13:07,500
once the statistics
are out there,
274
00:13:07,500 --> 00:13:10,933
people can also take it
and do what they want with it.
275
00:13:13,033 --> 00:13:16,700
NARRATOR:
While Papachristos saw
the model as a tool to identify
276
00:13:16,700 --> 00:13:17,866
future victims of gun violence,
277
00:13:17,866 --> 00:13:21,966
CPD saw the chance to identify
not only future victims,
278
00:13:21,966 --> 00:13:24,833
but future criminals.
279
00:13:26,800 --> 00:13:28,866
First it took me, you know,
by, by surprise,
280
00:13:28,866 --> 00:13:30,200
and then it got me worried.
281
00:13:30,200 --> 00:13:31,466
What is it gonna do?
282
00:13:31,466 --> 00:13:33,733
Who is it gonna harm?
283
00:13:33,733 --> 00:13:35,966
♪ ♪
284
00:13:35,966 --> 00:13:39,000
NARRATOR:
What the police wanted
to predict was who was at risk
285
00:13:39,000 --> 00:13:42,400
for being involved in
future violence.
286
00:13:42,400 --> 00:13:43,866
Gimme all your money, man.
287
00:13:43,866 --> 00:13:47,466
NARRATOR:
Training on hundreds of
thousands of arrest records,
288
00:13:47,466 --> 00:13:51,366
the computer algorithm looks
for patterns or factors
289
00:13:51,366 --> 00:13:53,833
associated with violent crime
290
00:13:53,833 --> 00:13:55,266
to calculate the risk
that an individual
291
00:13:55,266 --> 00:14:00,566
will be connected to
future violence.
292
00:14:00,566 --> 00:14:03,200
Using social network analysis,
293
00:14:03,200 --> 00:14:05,233
arrest records of associates
294
00:14:05,233 --> 00:14:08,366
are also included in that
calculation.
295
00:14:08,366 --> 00:14:14,033
The program was called the
Strategic Subject List, or SSL.
296
00:14:14,033 --> 00:14:16,400
It would be one of the most
controversial
297
00:14:16,400 --> 00:14:17,866
in Chicago policing history.
298
00:14:17,866 --> 00:14:20,933
ANDY CLARNO:
The idea behind the
Strategic Subjects List,
299
00:14:20,933 --> 00:14:22,866
or the SSL,
300
00:14:22,866 --> 00:14:24,433
was to try to identify
the people who would be
301
00:14:24,433 --> 00:14:29,666
most likely to become involved
as what they called
302
00:14:29,666 --> 00:14:32,666
a "party to violence," either as
a shooter or a victim.
303
00:14:32,666 --> 00:14:34,400
♪ ♪
304
00:14:34,400 --> 00:14:36,666
NARRATOR:
Chicago police would use
Papachristos's research
305
00:14:36,666 --> 00:14:40,433
to evaluate what was called
an individual's
306
00:14:40,433 --> 00:14:43,500
"co-arrest network."
307
00:14:43,500 --> 00:14:45,600
And the way that the
Chicago Police Department
308
00:14:45,600 --> 00:14:49,133
calculated an
individual's network was through
309
00:14:49,133 --> 00:14:50,633
kind of two degrees of removal.
310
00:14:50,633 --> 00:14:53,966
Anybody that I'd been arrested
with and anybody that they
311
00:14:53,966 --> 00:14:56,600
would, had been arrested with
counted as people who were
312
00:14:56,600 --> 00:14:58,266
within my network.
313
00:14:58,266 --> 00:15:01,666
So my risk score would be
based on my individual history
314
00:15:01,666 --> 00:15:03,866
of arrest and victimization,
as well as the histories
315
00:15:03,866 --> 00:15:06,600
of arrest and victimization
of people within that
316
00:15:06,600 --> 00:15:10,066
two-degree network of mine.
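A toy version of that scoring idea (hypothetical people, ties, and counts, not the CPD's actual SSL formula) sums a person's own arrest and victimization history together with the histories of everyone within two degrees of their co-arrest network:

co_arrests = [("me", "A"), ("A", "B"), ("B", "C"), ("D", "E")]
history = {  # hypothetical counts per person: (arrests, times victimized)
    "me": (1, 0), "A": (3, 1), "B": (2, 2), "C": (5, 0), "D": (0, 0), "E": (4, 1),
}

ties = {}
for p, q in co_arrests:
    ties.setdefault(p, set()).add(q)
    ties.setdefault(q, set()).add(p)

def two_degree_network(person):
    """People arrested with me, plus people arrested with them."""
    first = ties.get(person, set())
    second = set().union(*(ties.get(p, set()) for p in first)) if first else set()
    return (first | second) - {person}

def risk_score(person):
    group = {person} | two_degree_network(person)
    return sum(arrests + victimizations for arrests, victimizations in
               (history[p] for p in group))

print(two_degree_network("me"))   # {'A', 'B'}: C is three degrees away, so excluded
print(risk_score("me"))           # my history plus my two-degree network's history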
317
00:15:10,066 --> 00:15:11,566
It was colloquially known
as the "heat list."
318
00:15:11,566 --> 00:15:12,700
If you were hot, you were on it.
319
00:15:12,700 --> 00:15:15,566
And they gave you literally
a risk score.
320
00:15:15,566 --> 00:15:16,666
At one time,
it was zero to 500-plus.
321
00:15:16,666 --> 00:15:18,866
If you're 500-plus, you are
a high-risk person.
322
00:15:18,866 --> 00:15:21,700
♪ ♪
323
00:15:21,700 --> 00:15:22,900
And if you made this heat list,
324
00:15:22,900 --> 00:15:26,933
you might find a detective
knocking on your front door.
325
00:15:26,933 --> 00:15:32,133
♪ ♪
326
00:15:32,133 --> 00:15:33,166
NARRATOR:
Trying to predict
327
00:15:33,166 --> 00:15:38,533
future criminal activity is not
a new idea.
328
00:15:38,533 --> 00:15:41,166
Scotland Yard in London
began using this approach
329
00:15:41,166 --> 00:15:45,166
by mapping crime events
in the 1930s.
330
00:15:48,866 --> 00:15:50,100
But in the 1990s,
331
00:15:50,100 --> 00:15:55,133
it was New York City Police
Commissioner William Bratton
332
00:15:55,133 --> 00:15:57,933
who took crime mapping
to another level.
333
00:15:57,933 --> 00:16:00,766
BRATTON:
I run the New York City
Police Department.
334
00:16:00,766 --> 00:16:03,166
My competition is the
criminal element.
335
00:16:03,166 --> 00:16:06,766
NARRATOR:
Bratton convinced policing
agencies across the country
336
00:16:06,766 --> 00:16:09,200
that data-driven policing
was the key
337
00:16:09,200 --> 00:16:10,566
to successful
policing strategies.
338
00:16:10,566 --> 00:16:12,566
Part of this is to prevent
crime in the first place.
339
00:16:12,566 --> 00:16:17,566
♪ ♪
340
00:16:17,566 --> 00:16:18,733
NARRATOR:
Bratton was inspired
341
00:16:18,733 --> 00:16:22,066
by the work of his own
New York City Transit Police.
342
00:16:22,066 --> 00:16:24,066
As you see all those,
343
00:16:24,066 --> 00:16:25,933
uh, dots on the map,
344
00:16:25,933 --> 00:16:27,300
that's our opponents.
345
00:16:27,300 --> 00:16:29,166
NARRATOR:
It was called
Charts of the Future,
346
00:16:29,166 --> 00:16:34,000
and credited with cutting
subway felonies by 27%
347
00:16:34,000 --> 00:16:34,633
and robberies by a third.
348
00:16:34,633 --> 00:16:39,433
Bratton saw potential.
349
00:16:39,433 --> 00:16:42,300
He ordered all
New York City precincts
350
00:16:42,300 --> 00:16:43,266
to systematically map crime,
351
00:16:43,266 --> 00:16:48,533
collect data, find patterns,
report back.
352
00:16:48,533 --> 00:16:50,933
The new approach was
called CompStat.
353
00:16:50,933 --> 00:16:54,733
BRAYNE:
CompStat, I think, in a way,
is kind of a precursor
354
00:16:54,733 --> 00:16:55,900
of predictive policing,
355
00:16:55,900 --> 00:17:00,600
in the sense that many of
the same principles there--
356
00:17:00,600 --> 00:17:03,066
you know, using data tracking,
year-to-dates,
357
00:17:03,066 --> 00:17:06,433
identifying places where
law enforcement interventions
358
00:17:06,433 --> 00:17:07,566
could be effective, et cetera--
359
00:17:07,566 --> 00:17:10,266
really laid the groundwork
for predictive policing.
360
00:17:10,266 --> 00:17:12,700
♪ ♪
361
00:17:12,700 --> 00:17:14,533
NARRATOR:
By the early 2000s,
362
00:17:14,533 --> 00:17:16,966
as computational power
increased,
363
00:17:16,966 --> 00:17:18,766
criminologists were
convinced this new data trove
364
00:17:18,766 --> 00:17:23,700
could be used in machine
learning to create models
365
00:17:23,700 --> 00:17:24,666
that predict when and where
366
00:17:24,666 --> 00:17:27,833
crime would happen
in the future.
367
00:17:27,833 --> 00:17:30,033
♪ ♪
368
00:17:30,033 --> 00:17:32,100
REPORTER:
L.A. police now say
the gunmen opened fire
369
00:17:32,100 --> 00:17:33,800
with a semi-automatic weapon.
370
00:17:33,800 --> 00:17:35,400
NARRATOR:
In 2008,
371
00:17:35,400 --> 00:17:38,800
now as chief of the
Los Angeles Police Department,
372
00:17:38,800 --> 00:17:41,466
Bratton joined with academics
at U.C.L.A.
373
00:17:41,466 --> 00:17:44,700
to help launch a
predictive policing system
374
00:17:44,700 --> 00:17:46,166
called PredPol,
375
00:17:46,166 --> 00:17:49,100
powered by a machine learning
algorithm.
376
00:17:49,100 --> 00:17:52,533
♪ ♪
377
00:17:52,533 --> 00:17:54,166
ISAAC:
PredPol started
378
00:17:54,166 --> 00:17:56,100
as a spin-off
of a set of, like,
379
00:17:56,100 --> 00:18:00,033
government contracts that were
related to military work.
380
00:18:00,033 --> 00:18:02,166
They were developing
381
00:18:02,166 --> 00:18:05,566
a form of an algorithm that was
used to predict I.E.D.s.
382
00:18:05,566 --> 00:18:07,766
(device explodes)
383
00:18:07,766 --> 00:18:08,966
And it was a technique
that was used
384
00:18:08,966 --> 00:18:13,233
to also detect aftershocks
and seismographic activity.
385
00:18:13,233 --> 00:18:15,866
(dogs barking and whining,
objects clattering)
386
00:18:15,866 --> 00:18:17,133
And after those contracts ended,
387
00:18:17,133 --> 00:18:19,300
the company decided
they wanted to apply this
388
00:18:19,300 --> 00:18:20,466
in the domain of, of
policing
389
00:18:20,466 --> 00:18:22,400
domestically in the
United States.
390
00:18:22,400 --> 00:18:25,266
(radio beeping)
391
00:18:25,266 --> 00:18:27,266
NARRATOR:
The PredPol model
392
00:18:27,266 --> 00:18:28,666
relies on three types
of historical data:
393
00:18:28,666 --> 00:18:35,266
type of crime, crime location,
and time of crime,
394
00:18:35,266 --> 00:18:37,400
going back two to five years.
395
00:18:37,400 --> 00:18:38,966
The algorithm is looking for
patterns
396
00:18:38,966 --> 00:18:44,266
to identify locations where
crime is most likely to occur.
397
00:18:44,266 --> 00:18:46,866
As new crime incidents
are reported,
398
00:18:46,866 --> 00:18:51,733
they get folded into
the calculation.
399
00:18:51,733 --> 00:18:52,800
The predictions are displayed
on a map
400
00:18:52,800 --> 00:18:57,400
as 500 x 500 foot areas
that officers are then
401
00:18:57,400 --> 00:18:59,833
directed to patrol.
402
00:18:59,833 --> 00:19:01,700
ISAAC:
And then from there,
the algorithm says,
403
00:19:01,700 --> 00:19:04,800
"Okay, based on what we know
about the kind of
404
00:19:04,800 --> 00:19:06,633
"very recent history,
405
00:19:06,633 --> 00:19:08,766
"where is likely that
we'll see crime
406
00:19:08,766 --> 00:19:11,100
in the next day
or the next hour?"
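A deliberately simplified sketch can show the flavor of that kind of place-based prediction. It is not the published PredPol model (reportedly a self-exciting point process related to the aftershock models mentioned earlier), and it ignores crime type for brevity: past incidents are bucketed into 500-by-500-foot cells, recent incidents count for more, and the highest-scoring cells become the boxes officers are sent to patrol. New reports are simply appended and the scores recomputed, which is how they get "folded into the calculation."

import math

CELL_FT = 500                      # size of each map square, in feet

# Hypothetical incident history: (x_feet, y_feet, days_ago).
incidents = [
    (120, 430, 1), (180, 460, 3), (900, 2100, 2),
    (140, 380, 10), (2600, 700, 45), (160, 420, 60),
]

scores = {}
for x, y, days_ago in incidents:
    cell = (x // CELL_FT, y // CELL_FT)       # which 500 x 500 ft square?
    weight = math.exp(-days_ago / 30)         # newer incidents weigh more
    scores[cell] = scores.get(cell, 0.0) + weight

# Rank the cells; the top ones become the boxes officers are sent to patrol.
for cell, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print("cell", cell, "score", round(score, 2))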
407
00:19:11,100 --> 00:19:14,233
♪ ♪
408
00:19:14,233 --> 00:19:15,766
BRAYNE:
One of the key reasons
409
00:19:15,766 --> 00:19:16,800
that police start using
these tools
410
00:19:16,800 --> 00:19:20,066
is the efficient
and even, to a certain extent,
411
00:19:20,066 --> 00:19:21,366
like in their logic,
412
00:19:21,366 --> 00:19:24,300
more fair, um, and, and
justifiable allocation
413
00:19:24,300 --> 00:19:25,533
of their police resources.
414
00:19:25,533 --> 00:19:28,133
♪ ♪
415
00:19:28,133 --> 00:19:29,800
NARRATOR:
By 2013,
416
00:19:29,800 --> 00:19:33,400
in addition to PredPol,
predictive policing systems
417
00:19:33,400 --> 00:19:37,100
developed by companies like
HunchLab, IBM, and Palantir
418
00:19:37,100 --> 00:19:39,500
were in use across the country.
419
00:19:39,500 --> 00:19:42,000
(radios running in background)
420
00:19:42,000 --> 00:19:44,900
And computer algorithms
421
00:19:44,900 --> 00:19:46,166
were also being adopted
in courtrooms.
422
00:19:46,166 --> 00:19:52,866
BAILIFF:
21CF3810, State of Wisconsin
versus Chantille...
423
00:19:52,866 --> 00:19:55,833
KATHERINE FORREST:
These tools are used in
pretrial determinations,
424
00:19:55,833 --> 00:19:58,966
they're used in sentencing
determinations,
425
00:19:58,966 --> 00:20:00,300
and they're used in
housing determinations.
426
00:20:00,300 --> 00:20:05,733
They're also used, importantly,
in the plea bargaining phase.
427
00:20:05,733 --> 00:20:08,133
They're used really throughout
the entire process
428
00:20:08,133 --> 00:20:13,000
to try to do what judges
have been doing,
429
00:20:13,000 --> 00:20:15,200
which is the very, very
difficult task
430
00:20:15,200 --> 00:20:16,400
of trying to understand
and predict
431
00:20:16,400 --> 00:20:21,666
what will a human being
do tomorrow, or the next day,
432
00:20:21,666 --> 00:20:23,366
or next month, or three years
from now.
433
00:20:23,366 --> 00:20:25,333
ASSISTANT DISTRICT ATTORNEY:
Bail forfeited.
434
00:20:25,333 --> 00:20:27,666
He failed to appear
12/13/21.
435
00:20:27,666 --> 00:20:29,533
Didn't even make it
to preliminary hearing.
436
00:20:29,533 --> 00:20:33,200
The software tools are an
attempt to try to predict
437
00:20:33,200 --> 00:20:34,366
it better than humans can.
438
00:20:34,366 --> 00:20:36,366
MICHELLE HAVAS:
On count one, you're charged
with
439
00:20:36,366 --> 00:20:38,266
felony intimidation
of a victim.
440
00:20:38,266 --> 00:20:40,866
SWEENEY:
So, in the United States,
you're innocent
441
00:20:40,866 --> 00:20:44,300
until you've been proven guilty,
but you've been arrested.
442
00:20:44,300 --> 00:20:45,800
Now that you've been arrested,
443
00:20:45,800 --> 00:20:48,166
a judge has to decide
whether or not
444
00:20:48,166 --> 00:20:49,533
you get out on bail,
445
00:20:49,533 --> 00:20:51,700
or how high or low that bail
should be.
446
00:20:51,700 --> 00:20:55,200
You're charged with driving
on a suspended license.
447
00:20:55,200 --> 00:20:56,400
I've set that bond at $1,000.
448
00:20:56,400 --> 00:20:58,833
No insurance, I've set that bond
at $1,000.
449
00:20:58,833 --> 00:21:01,633
ALISON SHAMES:
One of the problems is,
450
00:21:01,633 --> 00:21:04,400
judges often are relying on
money bond
451
00:21:04,400 --> 00:21:05,600
or financial conditions
of release.
452
00:21:05,600 --> 00:21:08,200
JUDGE:
So I'm going to
lower his fine
453
00:21:08,200 --> 00:21:10,000
to make it a bit
more reasonable.
454
00:21:10,000 --> 00:21:13,166
So instead of $250,000 cash,
455
00:21:13,166 --> 00:21:15,300
surety is $100,000.
456
00:21:15,300 --> 00:21:17,866
SHAMES:
It allows people who have
access to money to be released.
457
00:21:17,866 --> 00:21:20,566
If you are poor, you are often
being detained pretrial.
458
00:21:20,566 --> 00:21:27,066
Approximately 70% of the people
in jail are there on pretrial.
459
00:21:27,066 --> 00:21:29,500
These are people who are
presumed innocent,
460
00:21:29,500 --> 00:21:32,533
but are detained during the
pretrial stage of their case.
461
00:21:32,533 --> 00:21:38,200
NARRATOR:
Many jurisdictions use
pretrial assessment algorithms
462
00:21:38,200 --> 00:21:42,100
with a goal to reduce
jail populations and decrease
463
00:21:42,100 --> 00:21:43,566
the impact of judicial bias.
464
00:21:43,566 --> 00:21:49,233
SHAMES:
The use of a tool like this
takes historical data
465
00:21:49,233 --> 00:21:51,366
and assesses, based on research,
466
00:21:51,366 --> 00:21:56,866
associates factors that are
predictive of the two outcomes
467
00:21:56,866 --> 00:21:59,133
that the judge is
concerned with.
468
00:21:59,133 --> 00:22:02,066
That's community safety
and whether that person
469
00:22:02,066 --> 00:22:05,866
will appear back in court
during the pretrial period.
470
00:22:05,866 --> 00:22:08,300
♪ ♪
471
00:22:08,300 --> 00:22:11,900
NARRATOR:
Many of these algorithms
are based on a concept called
472
00:22:11,900 --> 00:22:12,833
a regression model.
473
00:22:12,833 --> 00:22:17,033
The earliest,
called linear regression,
474
00:22:17,033 --> 00:22:24,333
dates back to 19th-century
mathematics.
475
00:22:24,333 --> 00:22:27,066
O'NEIL:
At the end of the day,
machine learning algorithms
476
00:22:27,066 --> 00:22:29,900
do exactly what
linear regression does,
477
00:22:29,900 --> 00:22:31,966
which is predict--
478
00:22:31,966 --> 00:22:34,600
based on the initial conditions,
the situation they're seeing--
479
00:22:34,600 --> 00:22:36,333
predict what will happen
in the future,
480
00:22:36,333 --> 00:22:38,033
whether that's, like, in
the next one minute
481
00:22:38,033 --> 00:22:40,433
or the next four years.
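As a worked example of that regression idea, with made-up numbers and a single input feature, ordinary least squares fits a line to past cases and then scores a new case from its "initial conditions":

prior_arrests = [0, 1, 2, 3, 5, 8]        # x: one feature per past defendant
failed_to_appear = [0, 0, 0, 1, 1, 1]     # y: the outcome observed later

n = len(prior_arrests)
mean_x = sum(prior_arrests) / n
mean_y = sum(failed_to_appear) / n

# Ordinary least squares for a single feature: slope and intercept of the line.
pairs = list(zip(prior_arrests, failed_to_appear))
slope = (sum((x - mean_x) * (y - mean_y) for x, y in pairs)
         / sum((x - mean_x) ** 2 for x in prior_arrests))
intercept = mean_y - slope * mean_x

def predicted_risk(arrests):
    """Score a new case from its 'initial conditions.'"""
    return intercept + slope * arrests

print(round(predicted_risk(1), 2))   # about 0.17: few priors, low score
print(round(predicted_risk(6), 2))   # about 0.93: many priors, high score

A real risk tool would use many more features and a more elaborate model, but the prediction step has this same shape.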
482
00:22:41,800 --> 00:22:44,566
NARRATOR:
Throughout the United States,
over 60 jurisdictions
483
00:22:44,566 --> 00:22:49,366
use predictive algorithms
as part of the legal process.
484
00:22:49,366 --> 00:22:53,100
One of the most widely used
is COMPAS.
485
00:22:53,100 --> 00:22:55,166
The COMPAS algorithm
weighs factors,
486
00:22:55,166 --> 00:22:57,266
including a defendant's answers
to a questionnaire,
487
00:22:57,266 --> 00:23:02,566
to provide a
risk assessment score.
488
00:23:02,566 --> 00:23:07,266
These scores are used every day
by judges to guide decisions
489
00:23:07,266 --> 00:23:12,566
about pretrial detention, bail,
and even sentencing.
490
00:23:12,566 --> 00:23:15,700
But the reliability
of the COMPAS algorithm
491
00:23:15,700 --> 00:23:17,266
has been questioned.
492
00:23:17,266 --> 00:23:24,066
In 2016, ProPublica published
an investigative report
493
00:23:24,066 --> 00:23:25,666
on the COMPAS
risk assessment tool.
494
00:23:25,666 --> 00:23:31,300
BENJAMIN:
Investigators wanted to see
if the scores were accurate
495
00:23:31,300 --> 00:23:33,233
in predicting whether
these individuals
496
00:23:33,233 --> 00:23:35,866
would commit a future crime.
497
00:23:35,866 --> 00:23:38,166
And they found two things
that were interesting.
498
00:23:38,166 --> 00:23:42,933
One was that the score
was remarkably unreliable
499
00:23:42,933 --> 00:23:46,833
in predicting who would commit
a, a crime in the future
500
00:23:46,833 --> 00:23:48,133
over this two-year period.
501
00:23:48,133 --> 00:23:52,333
But then the other thing that
ProPublica investigators found
502
00:23:52,333 --> 00:23:57,700
was that Black people were
much more likely to be deemed
503
00:23:57,700 --> 00:24:01,600
high risk and white people
low risk.
504
00:24:01,600 --> 00:24:04,833
NARRATOR:
This was true even in cases
when the Black person
505
00:24:04,833 --> 00:24:06,866
was arrested for a minor offense
and the white person
506
00:24:06,866 --> 00:24:12,233
in question was arrested for
a more serious crime.
507
00:24:12,233 --> 00:24:17,900
BENJAMIN:
This ProPublica study
was one of the first to begin
508
00:24:17,900 --> 00:24:22,100
to burst the bubble
of technology
509
00:24:22,100 --> 00:24:24,733
as somehow objective
and neutral.
510
00:24:24,733 --> 00:24:31,566
NARRATOR:
The article created
a national controversy.
511
00:24:31,566 --> 00:24:35,166
But at Dartmouth, a student
convinced her professor
512
00:24:35,166 --> 00:24:36,533
they should both be
more than stunned.
513
00:24:36,533 --> 00:24:40,566
HANY FARID:
As it turns out, one of my
students, Julia Dressel,
514
00:24:40,566 --> 00:24:42,400
reads the same article and said,
515
00:24:42,400 --> 00:24:44,200
"This is terrible.
516
00:24:44,200 --> 00:24:45,533
We should do something
about it."
(chuckles)
517
00:24:45,533 --> 00:24:48,266
This is the difference between
an awesome idealistic student
518
00:24:48,266 --> 00:24:50,333
and a jaded, uh, professor.
519
00:24:50,333 --> 00:24:52,400
And I thought,
"I think you're right."
520
00:24:52,400 --> 00:24:54,733
And as we were sort of
struggling to understand
521
00:24:54,733 --> 00:24:58,766
the underlying roots of the bias
in the algorithms,
522
00:24:58,766 --> 00:25:00,933
we asked ourselves
a really simple question:
523
00:25:00,933 --> 00:25:05,100
are the algorithms today, are
they doing better than humans?
524
00:25:05,100 --> 00:25:07,900
Because presumably, that's why
you have these algorithms,
525
00:25:07,900 --> 00:25:11,566
is that they eliminate some of
the bias and the prejudices,
526
00:25:11,566 --> 00:25:14,066
either implicit or explicit,
in the human judgment.
527
00:25:14,066 --> 00:25:17,866
NARRATOR:
To analyze COMPAS's
risk assessment accuracy,
528
00:25:17,866 --> 00:25:20,866
they used the crowdsourcing
platform Mechanical Turk.
529
00:25:20,866 --> 00:25:25,566
Their online study included
400 participants
530
00:25:25,566 --> 00:25:29,166
who evaluated 1,000 defendants.
531
00:25:29,166 --> 00:25:30,766
FARID:
We asked participants to
532
00:25:30,766 --> 00:25:34,433
read a very short paragraph
about an actual defendant.
533
00:25:34,433 --> 00:25:35,866
How old they were,
534
00:25:35,866 --> 00:25:37,866
whether they were
male or female,
535
00:25:37,866 --> 00:25:40,400
what their prior
juvenile conviction record was,
536
00:25:40,400 --> 00:25:42,433
and their prior
adult conviction record.
537
00:25:42,433 --> 00:25:45,066
And, importantly, we didn't
tell people their race.
538
00:25:45,066 --> 00:25:46,366
And then we ask a very simple
question,
539
00:25:46,366 --> 00:25:48,066
"Do you think this person
will commit a crime
540
00:25:48,066 --> 00:25:49,966
in the next two years?"
541
00:25:49,966 --> 00:25:50,933
Yes, no.
542
00:25:50,933 --> 00:25:53,933
And again,
these are non-experts.
543
00:25:53,933 --> 00:25:55,600
These are people being paid
544
00:25:55,600 --> 00:25:58,433
a couple of bucks online
to answer a survey.
545
00:25:58,433 --> 00:26:00,633
No criminal justice experience,
546
00:26:00,633 --> 00:26:02,500
don't know anything about
the defendants.
547
00:26:02,500 --> 00:26:05,566
They were as accurate
as the commercial software
548
00:26:05,566 --> 00:26:07,366
being used in the courts today,
549
00:26:07,366 --> 00:26:09,233
one particular piece
of software.
550
00:26:09,233 --> 00:26:11,500
That was really surprising.
551
00:26:11,500 --> 00:26:13,766
We would've expected
a little bit of improvement.
552
00:26:13,766 --> 00:26:14,966
After all, the algorithm has
access
553
00:26:14,966 --> 00:26:17,833
to huge amounts
of training data.
554
00:26:19,333 --> 00:26:22,200
NARRATOR:
And something else
puzzled the researchers.
555
00:26:22,200 --> 00:26:24,500
The MTurk workers' answers to
questions
556
00:26:24,500 --> 00:26:27,833
about who would commit crimes in
the future and who wouldn't
557
00:26:27,833 --> 00:26:30,733
showed a surprising pattern of
racial bias,
558
00:26:30,733 --> 00:26:33,866
even though race wasn't
indicated
559
00:26:33,866 --> 00:26:35,933
in any of the profiles.
560
00:26:35,933 --> 00:26:38,600
They were more likely to say a
person of color
561
00:26:38,600 --> 00:26:41,333
will be high risk when they
weren't,
562
00:26:41,333 --> 00:26:44,066
and they were more likely to say
that a white person
563
00:26:44,066 --> 00:26:46,833
would not be high risk when in
fact they were.
564
00:26:46,833 --> 00:26:49,700
And this made no sense to us at
all.
565
00:26:49,700 --> 00:26:50,933
You don't know the race of the
person.
566
00:26:50,933 --> 00:26:53,066
How is it possible that you're
biased against them?
567
00:26:53,066 --> 00:26:54,833
(radios running in background)
568
00:26:54,833 --> 00:26:56,866
In this country,
if you are a person of color,
569
00:26:56,866 --> 00:26:59,733
you are significantly more
likely, historically,
570
00:26:59,733 --> 00:27:00,866
to be arrested,
571
00:27:00,866 --> 00:27:03,566
to be charged,
and to be convicted of a crime.
572
00:27:03,566 --> 00:27:06,300
So in fact, prior convictions
573
00:27:06,300 --> 00:27:09,466
is a proxy for your race.
574
00:27:09,466 --> 00:27:11,500
Not a perfect proxy,
but it is correlated,
575
00:27:11,500 --> 00:27:13,633
because of the historical
inequities
576
00:27:13,633 --> 00:27:14,800
in the criminal justice system
577
00:27:14,800 --> 00:27:17,766
and policing in this country.
578
00:27:17,766 --> 00:27:19,166
(siren blaring)
579
00:27:19,166 --> 00:27:22,100
MAN:
It's my car, bro, come on,
what are y'all doing?
580
00:27:22,100 --> 00:27:23,700
Like, this, this is racial
profiling.
581
00:27:23,700 --> 00:27:25,733
NARRATOR:
Research indicates a
Black person
582
00:27:25,733 --> 00:27:29,266
is five times more likely to be
stopped without cause
583
00:27:29,266 --> 00:27:30,766
than a white person.
584
00:27:30,766 --> 00:27:33,133
Black people are at least twice
as likely
585
00:27:33,133 --> 00:27:35,833
as white people to be
arrested for drug offenses,
586
00:27:35,833 --> 00:27:37,933
even though Black and
white people
587
00:27:37,933 --> 00:27:39,866
use drugs at the same rate.
588
00:27:39,866 --> 00:27:43,666
Black people are also about 12
times
589
00:27:43,666 --> 00:27:45,966
more likely to be wrongly
convicted of drug crimes.
590
00:27:45,966 --> 00:27:51,700
FORREST:
Historically, Black men have
been arrested at higher levels
591
00:27:51,700 --> 00:27:53,000
than other populations.
592
00:27:53,000 --> 00:27:58,333
Therefore, the tool predicts
that a Black man, for instance,
593
00:27:58,333 --> 00:28:00,666
will be arrested at a rate and
recidivate at a rate
594
00:28:00,666 --> 00:28:04,266
that is higher than a white
individual.
595
00:28:06,400 --> 00:28:09,133
FARID:
And so what was happening is,
you know, the big data,
596
00:28:09,133 --> 00:28:10,333
the big machine
learning folks are saying,
597
00:28:10,333 --> 00:28:13,233
"Look, we're not giving it
race-- it can't be racist."
598
00:28:13,233 --> 00:28:15,633
But that
is spectacularly naive,
599
00:28:15,633 --> 00:28:18,433
because we know that other
things correlate with race.
600
00:28:18,433 --> 00:28:19,600
In this case, number
of prior convictions.
601
00:28:19,600 --> 00:28:23,366
And so when you train an
algorithm on historical data,
602
00:28:23,366 --> 00:28:24,533
well, guess what.
603
00:28:24,533 --> 00:28:26,900
It's going to reproduce
history-- of course it will.
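A crude, entirely invented demonstration of that proxy effect: the rule below never sees the group label, only prior convictions, yet because one group accumulated more convictions in the historical data, the labels split along group lines anyway.

past_cases = [
    # (group, prior_convictions): the group label is shown only for bookkeeping;
    # it is never given to the scoring rule below.
    ("group_1", 3), ("group_1", 4), ("group_1", 2),
    ("group_2", 1), ("group_2", 0), ("group_2", 1),
]

def risk_label(prior_convictions):
    """The 'race-blind' rule learned from history: more priors, higher risk."""
    return "high risk" if prior_convictions >= 2 else "low risk"

for group, priors in past_cases:
    print(group, "->", risk_label(priors))
# Every group_1 case comes out high risk and every group_2 case low risk,
# even though the rule never looked at the group label.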
604
00:28:28,500 --> 00:28:31,233
NARRATOR:
Compounding the problem is the
fact that
605
00:28:31,233 --> 00:28:32,766
predictive algorithms can't be
put on the witness stand
606
00:28:32,766 --> 00:28:38,233
and interrogated about their
decision-making processes.
607
00:28:38,233 --> 00:28:39,733
FORREST:
Many defendants have had
difficulty
608
00:28:39,733 --> 00:28:45,200
getting access to
the underlying information
609
00:28:45,200 --> 00:28:47,300
that tells them,
610
00:28:47,300 --> 00:28:50,633
what was the data set that was
used to assess me?
611
00:28:50,633 --> 00:28:53,300
What were the inputs
that were used?
612
00:28:53,300 --> 00:28:55,200
How were those inputs weighted?
613
00:28:55,200 --> 00:28:57,633
So you've got what can be,
these days,
614
00:28:57,633 --> 00:28:58,866
increasingly, a black box.
615
00:28:58,866 --> 00:29:02,433
A lack of transparency.
616
00:29:04,266 --> 00:29:05,666
NARRATOR:
Some black box algorithms get
their name
617
00:29:05,666 --> 00:29:08,500
from a lack of transparency
about the code
618
00:29:08,500 --> 00:29:10,500
and data inputs they use,
619
00:29:10,500 --> 00:29:13,566
which can be deemed proprietary.
620
00:29:13,566 --> 00:29:17,700
But that's not the only
kind of black box.
621
00:29:17,700 --> 00:29:21,066
A black box is any system
which is so complicated
622
00:29:21,066 --> 00:29:24,166
that you can see what goes in
and you can see what comes out,
623
00:29:24,166 --> 00:29:26,966
but it's impossible to
understand
624
00:29:26,966 --> 00:29:29,266
what's going on inside it.
625
00:29:29,266 --> 00:29:32,233
All of those steps in the
algorithm
626
00:29:32,233 --> 00:29:37,233
are hidden inside phenomenally
complex math
627
00:29:37,233 --> 00:29:39,600
and processes.
628
00:29:39,600 --> 00:29:43,066
FARID:
And I would argue that when you
are using algorithms
629
00:29:43,066 --> 00:29:45,900
in mission-critical
applications,
630
00:29:45,900 --> 00:29:47,066
like criminal justice system,
631
00:29:47,066 --> 00:29:49,433
we should not be deploying
black box algorithms.
632
00:29:55,400 --> 00:29:58,366
NARRATOR:
PredPol, like many
predictive platforms,
633
00:29:58,366 --> 00:30:00,733
claimed a proven record for
crime reduction.
634
00:30:00,733 --> 00:30:05,533
In 2015, PredPol published its
algorithm
636
00:30:05,533 --> 00:30:08,500
in a peer-reviewed journal.
637
00:30:08,500 --> 00:30:11,133
William Isaac and Kristian Lum,
638
00:30:11,133 --> 00:30:13,000
research scientists
who investigate
639
00:30:13,000 --> 00:30:14,900
predictive policing platforms,
640
00:30:14,900 --> 00:30:17,366
analyzed the algorithm.
641
00:30:17,366 --> 00:30:19,633
ISAAC:
We just kind of saw the
algorithm
642
00:30:19,633 --> 00:30:21,266
as going back to the
same one or two blocks
643
00:30:21,266 --> 00:30:24,433
every single time.
644
00:30:26,200 --> 00:30:27,766
And that's kind of strange,
645
00:30:27,766 --> 00:30:30,833
because if you had a truly
predictive policing system,
646
00:30:30,833 --> 00:30:33,366
you wouldn't necessarily see it
going to the same locations
647
00:30:33,366 --> 00:30:35,500
over and over again.
648
00:30:38,333 --> 00:30:40,266
NARRATOR:
For their experiment,
649
00:30:40,266 --> 00:30:41,633
Isaac and Lum used a different
data set,
650
00:30:41,633 --> 00:30:47,466
public health data, to map
illicit drug use in Oakland.
651
00:30:47,466 --> 00:30:50,933
ISAAC:
So, a good chunk
of the city was kind of
652
00:30:50,933 --> 00:30:53,666
evenly distributed in terms of
where
653
00:30:53,666 --> 00:30:55,400
potential illicit drug use might
be.
654
00:30:55,400 --> 00:30:59,333
But the police predictions were
clustering around areas
655
00:30:59,333 --> 00:31:02,133
where police had, you know,
656
00:31:02,133 --> 00:31:04,233
historically found incidents of
illicit drug use.
657
00:31:04,233 --> 00:31:08,066
Specifically, we saw significant
numbers of neighborhoods
658
00:31:08,066 --> 00:31:10,166
that were predominantly
non-white and lower-income
659
00:31:10,166 --> 00:31:16,133
being deliberate targets of the
predictions.
660
00:31:16,133 --> 00:31:19,466
NARRATOR:
Even though illicit drug use was
a citywide problem,
661
00:31:19,466 --> 00:31:21,133
the algorithm focused its
predictions
662
00:31:21,133 --> 00:31:25,300
on low-income neighborhoods
and communities of color.
663
00:31:26,466 --> 00:31:29,733
ISAAC:
The reason why is
actually really important.
664
00:31:29,733 --> 00:31:31,000
It's very hard to divorce
665
00:31:31,000 --> 00:31:33,066
these predictions from those
histories
666
00:31:33,066 --> 00:31:35,933
and legacies of over-policing.
667
00:31:37,133 --> 00:31:41,333
As a result of that, they
manifest themselves in the data.
668
00:31:41,333 --> 00:31:43,566
NARRATOR:
In an area where there
is more police presence,
669
00:31:43,566 --> 00:31:46,166
more crime is uncovered.
670
00:31:47,466 --> 00:31:49,766
The crime data indicates to the
algorithm
671
00:31:49,766 --> 00:31:52,433
that the heavily
policed neighborhood
672
00:31:52,433 --> 00:31:55,200
is where future crime will be
found,
673
00:31:55,200 --> 00:31:56,466
even though there may be other
neighborhoods
674
00:31:56,466 --> 00:32:02,100
where crimes are being committed
at the same or higher rate.
675
00:32:03,300 --> 00:32:05,733
ISAAC:
Every new prediction that
you generate
676
00:32:05,733 --> 00:32:08,100
is going to be
increasingly dependent
677
00:32:08,100 --> 00:32:11,466
on the behavior of the algorithm
in the past.
678
00:32:11,466 --> 00:32:13,366
So, you know,
if you go ten days, 20 days,
679
00:32:13,366 --> 00:32:15,833
30 days into the future, right,
after using an algorithm,
680
00:32:15,833 --> 00:32:19,366
all of those predictions have
changed the behavior
681
00:32:19,366 --> 00:32:20,700
of the police department
682
00:32:20,700 --> 00:32:23,733
and are now being folded back
into the next day's prediction.
683
00:32:26,166 --> 00:32:27,566
NARRATOR:
The result can be a
feedback loop
684
00:32:27,566 --> 00:32:32,000
that reinforces
historical policing practices.
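That loop can be made concrete with a toy simulation, using invented numbers rather than any real department's data: two neighborhoods have identical underlying offense rates, offenses are only recorded where patrols are sent, and tomorrow's patrols follow today's recorded counts.

TRUE_OFFENSES_PER_DAY = 10          # identical in both neighborhoods
DETECTED_IF_PATROLLED = 0.9         # fraction of offenses recorded with patrols
DETECTED_IF_NOT = 0.3               # fraction recorded without patrols

recorded = {"Neighborhood A": 12.0, "Neighborhood B": 10.0}  # slight historical skew

for day in range(30):
    # The "prediction": send patrols wherever recorded crime is currently highest.
    patrolled = max(recorded, key=recorded.get)
    for place in recorded:
        rate = DETECTED_IF_PATROLLED if place == patrolled else DETECTED_IF_NOT
        recorded[place] += TRUE_OFFENSES_PER_DAY * rate  # fold new data back in

print(recorded)  # A's recorded total pulls far ahead despite equal true crime

After 30 days, Neighborhood A's recorded total is nearly three times Neighborhood B's, even though nothing about the underlying crime ever differed.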
685
00:32:34,433 --> 00:32:35,866
SWEENEY:
All of these different types
686
00:32:35,866 --> 00:32:38,200
of machine learning algorithms
are all trying to help us
687
00:32:38,200 --> 00:32:41,066
figure out, are there some
patterns in this data?
688
00:32:41,066 --> 00:32:43,733
It's up to us to then figure
out,
689
00:32:43,733 --> 00:32:46,233
are those legitimate patterns,
do they,
690
00:32:46,233 --> 00:32:47,400
are they useful patterns?
691
00:32:47,400 --> 00:32:49,100
Because the computer has no
idea.
692
00:32:49,100 --> 00:32:51,366
It didn't make a logical
association.
693
00:32:51,366 --> 00:32:54,800
It just made it, made a
correlation.
694
00:32:56,066 --> 00:32:59,333
MING:
My favorite definition of
artificial intelligence
695
00:32:59,333 --> 00:33:02,800
is, it's any autonomous system
696
00:33:02,800 --> 00:33:04,366
that can make decisions under
uncertainty.
697
00:33:04,366 --> 00:33:09,900
You can't make decisions under
uncertainty without bias.
698
00:33:11,300 --> 00:33:14,766
In fact, it's impossible
to escape from having bias.
699
00:33:14,766 --> 00:33:16,466
It's a mathematical reality
700
00:33:16,466 --> 00:33:21,333
about any intelligent system,
even us.
701
00:33:21,333 --> 00:33:23,033
(siren blaring in distance)
702
00:33:23,033 --> 00:33:26,166
NARRATOR:
And even if the goal
is to get rid of prejudice,
703
00:33:26,166 --> 00:33:31,700
bias in the historical data can
undermine that objective.
704
00:33:33,733 --> 00:33:35,500
Amazon discovered this
705
00:33:35,500 --> 00:33:38,066
when they began a search for top
talent
706
00:33:38,066 --> 00:33:40,733
with a hiring algorithm
whose training data
707
00:33:40,733 --> 00:33:45,066
depended on hiring successes
from the past.
708
00:33:45,066 --> 00:33:49,333
MING:
Amazon, somewhat famously
within the A.I. industry,
709
00:33:49,333 --> 00:33:54,200
they tried to build a hiring
algorithm.
710
00:33:54,200 --> 00:33:56,866
They had a massive data set.
711
00:33:56,866 --> 00:33:58,566
They had all the right answers,
712
00:33:58,566 --> 00:34:00,966
because they knew literally who
got hired
713
00:34:00,966 --> 00:34:04,033
and who got that
promotion in their first year.
714
00:34:04,033 --> 00:34:05,766
(typing)
715
00:34:05,766 --> 00:34:07,866
NARRATOR:
The company created
multiple models
716
00:34:07,866 --> 00:34:10,133
to review past
candidates' résumés
717
00:34:10,133 --> 00:34:15,833
and identify some 50,000
key terms.
718
00:34:15,833 --> 00:34:18,466
MING:
What Amazon actually
wanted to achieve
719
00:34:18,466 --> 00:34:21,100
was to diversify their hiring.
720
00:34:21,100 --> 00:34:24,766
Amazon, just like every other
tech company,
721
00:34:24,766 --> 00:34:25,833
and a lot of other companies, as
well,
722
00:34:25,833 --> 00:34:29,966
has enormous bias built
into its hiring history.
723
00:34:29,966 --> 00:34:35,533
It was always biased, strongly
biased, in favor of men,
724
00:34:35,533 --> 00:34:37,833
in favor, generally,
725
00:34:37,833 --> 00:34:40,766
of white or sometimes Asian men.
726
00:34:40,766 --> 00:34:44,033
Well, they went and built
a hiring algorithm.
727
00:34:44,033 --> 00:34:45,366
And sure enough, this thing was
728
00:34:45,366 --> 00:34:50,000
the most sexist recruiter you
could imagine.
729
00:34:50,000 --> 00:34:52,633
If you said the word "women's"
in your résumé,
730
00:34:52,633 --> 00:34:53,966
then it wouldn't hire you.
731
00:34:53,966 --> 00:34:54,966
If you went to a women's
college,
732
00:34:54,966 --> 00:34:57,833
it didn't want to hire you.
733
00:34:57,833 --> 00:35:00,400
So they take out all the gender
markers,
734
00:35:00,400 --> 00:35:02,266
and all of the women's
colleges--
735
00:35:02,266 --> 00:35:04,300
all the things that explicitly
says,
736
00:35:04,300 --> 00:35:05,600
"This is a man," and, "This is a
woman,"
737
00:35:05,600 --> 00:35:08,666
or even the ones that,
obviously, implicitly say it.
738
00:35:08,666 --> 00:35:11,100
So they did that.
739
00:35:11,100 --> 00:35:13,666
And then they trained up their
new deep neural network
740
00:35:13,666 --> 00:35:16,166
to decide who Amazon would hire.
741
00:35:16,166 --> 00:35:18,400
And it did something amazing.
742
00:35:18,400 --> 00:35:19,633
It did something no
human could do.
743
00:35:19,633 --> 00:35:22,933
It figured out who was a woman
and it wouldn't hire them.
744
00:35:24,566 --> 00:35:26,266
It was able to look through
745
00:35:26,266 --> 00:35:29,566
all of the correlations that
existed
746
00:35:29,566 --> 00:35:30,766
in that massive data set
747
00:35:30,766 --> 00:35:35,633
and figure out which ones most
strongly correlated
748
00:35:35,633 --> 00:35:37,433
with someone getting a
promotion.
749
00:35:37,433 --> 00:35:40,866
And the single biggest
correlate
750
00:35:40,866 --> 00:35:42,666
of getting a promotion was being
a man.
751
00:35:42,666 --> 00:35:46,733
And it figured those patterns
out and didn't hire women.
752
00:35:47,966 --> 00:35:54,033
NARRATOR:
Amazon abandoned its
hiring algorithm in 2017.
753
00:35:54,033 --> 00:35:55,933
Remember the way machine
learning works, right?
754
00:35:55,933 --> 00:35:57,900
It's like a student who doesn't
really understand
755
00:35:57,900 --> 00:35:59,166
the material in the class.
756
00:35:59,166 --> 00:36:03,100
They got a bunch of questions,
they got a bunch of answers.
758
00:36:03,100 --> 00:36:04,600
And now they're trying to
pattern match
759
00:36:04,600 --> 00:36:06,166
for a new question and say,
"Oh, wait.
760
00:36:06,166 --> 00:36:08,066
"Let me find an answer that
looks pretty much
761
00:36:08,066 --> 00:36:09,833
like the questions and answers I
saw before."
762
00:36:09,833 --> 00:36:13,066
The algorithm only worked
because someone has said,
763
00:36:13,066 --> 00:36:16,600
"Oh, this person whose data you
have,
764
00:36:16,600 --> 00:36:18,166
"they were a good employee.
765
00:36:18,166 --> 00:36:19,400
This other person was a bad
employee," or, "This person
766
00:36:19,400 --> 00:36:22,466
performed well," or, "This
person did not perform well."
767
00:36:24,966 --> 00:36:27,200
O'NEIL:
Because algorithms
don't just look for patterns,
768
00:36:27,200 --> 00:36:29,300
they look for patterns of
success, however it's defined.
769
00:36:29,300 --> 00:36:32,800
But the definition of success
is really critically important
770
00:36:32,800 --> 00:36:35,800
to what that ends up being.
771
00:36:35,800 --> 00:36:37,733
And a lot of, a lot of opinion
772
00:36:37,733 --> 00:36:40,733
is embedded in, what, what does
success look like?
773
00:36:43,066 --> 00:36:44,600
NARRATOR:
In the case of algorithms,
774
00:36:44,600 --> 00:36:47,500
human choices play a critical
role.
775
00:36:47,500 --> 00:36:50,166
O'NEIL:
The data itself was curated.
776
00:36:50,166 --> 00:36:52,600
Someone decided what
data to collect.
777
00:36:52,600 --> 00:36:55,700
Somebody decided what data
was not relevant, right?
778
00:36:55,700 --> 00:36:58,233
And they don't exclude
it necessarily
779
00:36:58,233 --> 00:36:59,833
intentionally--
they could be blind spots.
780
00:36:59,833 --> 00:37:03,400
NARRATOR:
The need to identify
such oversights
781
00:37:03,400 --> 00:37:04,400
becomes more urgent
782
00:37:04,400 --> 00:37:09,566
as technology takes on more
decision making.
783
00:37:09,566 --> 00:37:11,966
♪ ♪
784
00:37:11,966 --> 00:37:15,200
Consider facial recognition
technology,
785
00:37:15,200 --> 00:37:16,366
used by law enforcement
786
00:37:16,366 --> 00:37:19,100
in cities around the world for
surveillance.
787
00:37:22,933 --> 00:37:24,633
In Detroit, 2018,
788
00:37:24,633 --> 00:37:28,466
law enforcement looked to facial
recognition technology
789
00:37:28,466 --> 00:37:29,866
when $3,800 worth of watches
790
00:37:29,866 --> 00:37:34,433
were stolen from an upscale
boutique.
791
00:37:35,766 --> 00:37:38,066
Police ran a still frame from
the shop's surveillance video
792
00:37:38,066 --> 00:37:42,833
through their facial recognition
system to find a match.
793
00:37:42,833 --> 00:37:46,233
How do I turn a face into
numbers
794
00:37:46,233 --> 00:37:47,866
that equations can act with?
795
00:37:47,866 --> 00:37:50,666
You turn the individual pixels
in the picture of that face
796
00:37:50,666 --> 00:37:53,933
into values.
797
00:37:56,133 --> 00:37:59,466
What it's really looking for
are complex patterns
798
00:37:59,466 --> 00:38:01,733
across those pixels.
799
00:38:01,733 --> 00:38:04,700
The sequence of taking a
pattern of numbers
800
00:38:04,700 --> 00:38:06,666
and transforming it
801
00:38:06,666 --> 00:38:08,366
into little edges and angles,
802
00:38:08,366 --> 00:38:14,100
then transforming that into eyes
and cheekbones and mustaches.
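A rough sketch of the pixels-to-values-to-edges step, assuming only NumPy and a tiny made-up grayscale image; real systems learn their filters rather than hard-coding one, and this is not any police vendor's pipeline.

import numpy as np

# A tiny 6x6 grayscale "image": dark on the left, bright on the right.
image = np.array([[0, 0, 0, 255, 255, 255]] * 6, dtype=float)

# A simple vertical-edge kernel (Sobel-like); deep networks learn such filters.
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

def convolve2d(img, k):
    kh, kw = k.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

edges = convolve2d(image, kernel)
print(edges)  # large values where the dark-to-bright boundary sits, zeros elsewhere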
803
00:38:15,633 --> 00:38:16,766
NARRATOR:
To find that match,
804
00:38:16,766 --> 00:38:21,900
the system can be trained on
billions of photographs.
805
00:38:23,533 --> 00:38:26,100
Facial recognition uses a class
of machine learning
806
00:38:26,100 --> 00:38:28,433
called deep learning.
807
00:38:28,433 --> 00:38:30,666
The models built by deep
learning techniques
808
00:38:30,666 --> 00:38:35,833
are called neural networks.
809
00:38:35,833 --> 00:38:37,033
VENKATASUBRAMANIAN:
A neural network
810
00:38:37,033 --> 00:38:39,266
is, you know, stylized as, you
know, trying to model
811
00:38:39,266 --> 00:38:41,800
how neural
pathways work in the brain.
812
00:38:43,566 --> 00:38:44,566
You can think of a neural
network
813
00:38:44,566 --> 00:38:50,033
as a collection of neurons.
814
00:38:50,033 --> 00:38:51,633
So you put some values into a
neuron,
815
00:38:51,633 --> 00:38:55,900
and if they're, sufficiently,
they add up to some number,
816
00:38:55,900 --> 00:38:57,233
or they cross some threshold,
817
00:38:57,233 --> 00:39:01,200
this one will fire and send off
a new number to the next neuron.
818
00:39:01,200 --> 00:39:03,666
NARRATOR:
At a certain threshold,
819
00:39:03,666 --> 00:39:05,966
the neuron will fire to the
next neuron.
820
00:39:05,966 --> 00:39:10,666
If it's below the threshold,
the neuron doesn't fire.
821
00:39:10,666 --> 00:39:12,766
This process repeats and
repeats
822
00:39:12,766 --> 00:39:14,900
across hundreds, possibly
thousands of layers,
823
00:39:14,900 --> 00:39:18,066
making connections like the
neurons in our brain.
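A minimal sketch of that threshold behaviour, with invented weights and far fewer neurons than any real face-recognition network: each neuron sums its weighted inputs and only passes a value on when the sum crosses the threshold, and stacking such layers gives a tiny feedforward network.

import numpy as np

def layer(inputs, weights, threshold=0.0):
    # Each row of `weights` is one neuron. Sum the weighted inputs;
    # fire (pass the value on) only if the sum crosses the threshold.
    sums = weights @ inputs
    return np.where(sums > threshold, sums, 0.0)

x = np.array([0.2, 0.9, 0.4])                 # input values (e.g. pixel features)

w1 = np.array([[ 0.5, -0.3,  0.8],            # two neurons in the first layer
               [-0.6,  0.9,  0.1]])
w2 = np.array([[ 1.0,  0.7]])                 # one output neuron

hidden = layer(x, w1)
output = layer(hidden, w2)
print(hidden, output)   # neurons below the threshold stay silent (0.0)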
824
00:39:18,066 --> 00:39:19,833
♪ ♪
825
00:39:19,833 --> 00:39:24,133
The output is a predictive
match.
826
00:39:27,266 --> 00:39:28,433
Based on a facial recognition
match,
827
00:39:28,433 --> 00:39:32,466
in January 2020, the
police arrested Robert Williams
828
00:39:32,466 --> 00:39:35,000
for the theft of the watches.
829
00:39:37,033 --> 00:39:38,833
The next day, he was released.
830
00:39:38,833 --> 00:39:42,733
Not only did Williams have an
alibi,
831
00:39:42,733 --> 00:39:46,233
but it wasn't his face.
832
00:39:46,233 --> 00:39:49,066
MING:
To be very blunt about it,
these algorithms are probably
833
00:39:49,066 --> 00:39:51,666
dramatically over-trained on
white faces.
834
00:39:51,666 --> 00:39:57,033
♪ ♪
835
00:39:57,033 --> 00:39:59,300
So, of course, algorithms that
start out bad
836
00:39:59,300 --> 00:40:01,133
can be improved, in general.
837
00:40:01,133 --> 00:40:04,400
The Gender Shades project found
that
838
00:40:04,400 --> 00:40:07,400
certain facial recognition
technology,
839
00:40:07,400 --> 00:40:09,166
when they actually
tested it on Black women,
840
00:40:09,166 --> 00:40:15,233
it was 65% accurate, whereas for
white men, it was 99% accurate.
841
00:40:16,600 --> 00:40:19,733
How did they improve it?
Because they did.
842
00:40:19,733 --> 00:40:21,566
They built an algorithm
843
00:40:21,566 --> 00:40:23,633
that was trained on more diverse
data.
844
00:40:23,633 --> 00:40:26,433
So I don't think it's
completely a lost cause
845
00:40:26,433 --> 00:40:30,233
to improve algorithms to be
better.
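The disaggregated test behind those numbers can be outlined in a few lines of pandas; the figures below are placeholders, not the Gender Shades data, and simply show accuracy computed per group instead of one overall score.

import pandas as pd

# Hypothetical evaluation results: one row per test image.
results = pd.DataFrame({
    "group":     ["Black women"] * 4 + ["white men"] * 4,
    "true_id":   [1, 2, 3, 4, 5, 6, 7, 8],
    "predicted": [1, 9, 9, 4, 5, 6, 7, 8],   # two misses for one group, none for the other
})

results["correct"] = results["true_id"] == results["predicted"]

# Accuracy broken out by group, rather than a single overall number.
print(results.groupby("group")["correct"].mean())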
846
00:40:31,700 --> 00:40:36,300
MAN (in ad voiceover):
I used to think my job
was all about arrests.
847
00:40:36,300 --> 00:40:38,033
LESLIE KENNEDY:
There was a commercial a few
years ago
848
00:40:38,033 --> 00:40:40,633
that showed a police officer
going to a gas station
849
00:40:40,633 --> 00:40:42,700
and then waiting for the
criminal to show up.
850
00:40:42,700 --> 00:40:44,233
MAN:
We analyze crime data,
851
00:40:44,233 --> 00:40:46,300
spot patterns,
852
00:40:46,300 --> 00:40:49,166
and figure out where to send
patrols.
853
00:40:49,166 --> 00:40:51,433
They said, "Well, our algorithm
will tell you exactly
854
00:40:51,433 --> 00:40:53,466
where the crime, the next crime
is going to take place."
855
00:40:53,466 --> 00:40:56,266
Well, that's just silly, uh,
it, that's not how it works.
856
00:40:56,266 --> 00:40:59,566
MAN:
By stopping it before it
happens.
857
00:40:59,566 --> 00:41:00,433
(sighs)
858
00:41:00,433 --> 00:41:03,100
MAN:
Let's build a smarter planet.
859
00:41:08,233 --> 00:41:10,566
♪ ♪
860
00:41:10,566 --> 00:41:12,966
JOEL CAPLAN:
Understanding what it is about
these places
861
00:41:12,966 --> 00:41:15,666
that enable
crime problems
862
00:41:15,666 --> 00:41:18,400
to emerge and/or persist.
863
00:41:18,400 --> 00:41:21,433
NARRATOR:
At Rutgers University,
864
00:41:21,433 --> 00:41:22,666
the researchers who invented
865
00:41:22,666 --> 00:41:26,366
the crime mapping platform
called Risk Terrain Modeling,
866
00:41:26,366 --> 00:41:28,366
or RTM,
867
00:41:28,366 --> 00:41:30,866
bristle at the term
"predictive policing."
868
00:41:30,866 --> 00:41:36,100
CAPLAN (voiceover):
We don't want to predict,
we want to prevent.
869
00:41:37,333 --> 00:41:40,000
I worked as a police officer a
long time ago,
870
00:41:40,000 --> 00:41:41,266
in the early 2000s.
871
00:41:41,266 --> 00:41:46,866
Police have collected data for as
long as police have existed.
872
00:41:46,866 --> 00:41:50,100
Now there's a greater
recognition
873
00:41:50,100 --> 00:41:52,600
that data can have value.
874
00:41:52,600 --> 00:41:55,300
But it's not just
about the data.
875
00:41:55,300 --> 00:41:56,766
It's about how you analyze it,
how you use those results.
876
00:41:56,766 --> 00:42:00,400
There's only two data sets that
risk terrain modeling uses.
877
00:42:00,400 --> 00:42:02,666
These data sets are local,
878
00:42:02,666 --> 00:42:07,866
current information about crime
incidents within a given area
879
00:42:07,866 --> 00:42:11,033
and information about
environmental features
880
00:42:11,033 --> 00:42:12,566
that exist in that landscape,
881
00:42:12,566 --> 00:42:13,733
such as bars, fast food
restaurants,
882
00:42:13,733 --> 00:42:17,900
convenience stores,
schools, parks, alleyways.
883
00:42:19,400 --> 00:42:20,866
KENNEDY:
The algorithm is basically
884
00:42:20,866 --> 00:42:24,100
the relationship between these
environmental features
885
00:42:24,100 --> 00:42:27,066
and the, the outcome data,
which in this case is crime.
886
00:42:27,066 --> 00:42:28,966
The algorithm provides you with
a map
887
00:42:28,966 --> 00:42:32,000
of the distribution of
the risk values.
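The published RTM method involves formal tests of spatial association, but the core idea of weighting environmental features by their past co-location with incidents and summing them into a per-cell risk map can be sketched as follows; all coordinates and weights are invented, and this is not the Rutgers software.

import numpy as np

GRID = 10                       # a 10 x 10 grid of city blocks (hypothetical)

# Data set 1: past crime incidents as (x, y) cell coordinates, invented.
incidents = [(2, 3), (2, 4), (3, 3), (7, 7)]

# Data set 2: environmental features and where they sit, invented.
features = {"bar": [(2, 3), (2, 5)], "school": [(7, 6)], "alley": [(3, 4), (8, 8)]}

def density(points):
    grid = np.zeros((GRID, GRID))
    for x, y in points:
        grid[y, x] += 1
    return grid

crime = density(incidents)

# Weight each feature type by how often crime co-occurs in the same cell
# (a crude stand-in for RTM's statistical tests of spatial association).
risk = np.zeros((GRID, GRID))
for name, locs in features.items():
    layer = density(locs)
    weight = (crime * (layer > 0)).sum() / max(layer.sum(), 1)
    risk += weight * layer

# Higher values = cells whose features have historically co-located with crime.
print(risk.max(), np.unravel_index(risk.argmax(), risk.shape))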
888
00:42:33,266 --> 00:42:35,600
ALEJANDRO GIMÉNEZ-SANTANA:
This is the highest-risk area,
889
00:42:35,600 --> 00:42:37,000
on this commercial corridor on
Bloomfield Avenue.
890
00:42:37,000 --> 00:42:41,733
NARRATOR:
But the algorithm isn't intended
for use just by police.
891
00:42:41,733 --> 00:42:44,866
Criminologist Alejandro
Giménez-Santana
892
00:42:44,866 --> 00:42:47,233
leads the Newark Public Safety
Collaborative,
893
00:42:47,233 --> 00:42:51,033
a collection of 40 community
organizations.
894
00:42:51,033 --> 00:42:53,266
They use RTM as a diagnostic
tool
895
00:42:53,266 --> 00:42:58,266
to understand not just where
crime may happen next,
896
00:42:58,266 --> 00:43:00,833
but why.
897
00:43:00,833 --> 00:43:03,466
GIMÉNEZ-SANTANA:
Through RTM, we identify this
commercial corridor
898
00:43:03,466 --> 00:43:05,000
on Bloomfield Avenue, which is
where we are right now,
899
00:43:05,000 --> 00:43:08,733
as a risky area for auto theft
due to car idling.
900
00:43:08,733 --> 00:43:09,866
So why is this space
901
00:43:09,866 --> 00:43:12,633
particularly problematic when it
comes to auto theft?
902
00:43:14,166 --> 00:43:16,033
One is because we're in a
commercial corridor,
903
00:43:16,033 --> 00:43:17,533
where there's a high density of people
904
00:43:17,533 --> 00:43:20,900
who go to the beauty salon or to a restaurant.
905
00:43:20,900 --> 00:43:22,533
Uber delivery and Uber Eats,
906
00:43:22,533 --> 00:43:24,766
delivery people who come to grab orders
907
00:43:24,766 --> 00:43:26,833
and leave their cars
running
908
00:43:26,833 --> 00:43:28,433
create the conditions for this
crime
909
00:43:28,433 --> 00:43:31,233
to be concentrated in this
particular area.
910
00:43:31,233 --> 00:43:33,166
What the data showed us was,
911
00:43:33,166 --> 00:43:35,966
there was a tremendous rise in
auto thefts.
912
00:43:35,966 --> 00:43:39,233
But we convinced the police
department
913
00:43:39,233 --> 00:43:41,500
to take a more
social service approach.
914
00:43:41,500 --> 00:43:43,966
NARRATOR:
Community organizers convinced
police
915
00:43:43,966 --> 00:43:46,833
not to ticket idling cars,
916
00:43:46,833 --> 00:43:48,366
and let organizers create
917
00:43:48,366 --> 00:43:51,666
an effective public awareness
poster campaign instead.
918
00:43:51,666 --> 00:43:54,166
And we put it out to the Newark
students
919
00:43:54,166 --> 00:43:57,166
to submit in this flyer
campaign,
920
00:43:57,166 --> 00:44:00,200
and have their artwork on the
actual flyer.
921
00:44:00,200 --> 00:44:02,866
GIMÉNEZ-SANTANA:
As you can see, this is the
commercial corridor
922
00:44:02,866 --> 00:44:04,533
on Bloomfield Avenue.
923
00:44:04,533 --> 00:44:05,766
The site score shows a six,
924
00:44:05,766 --> 00:44:07,333
which means that we are at the
highest risk of auto theft
925
00:44:07,333 --> 00:44:09,533
in this particular location.
926
00:44:09,533 --> 00:44:11,100
And as I move closer to the end
927
00:44:11,100 --> 00:44:14,800
of the commercial corridor, the
site risk score is coming down.
928
00:44:14,800 --> 00:44:17,066
NARRATOR:
This is the first time
in Newark
929
00:44:17,066 --> 00:44:19,500
that police data for
crime occurrences
930
00:44:19,500 --> 00:44:23,300
have been shared widely with
community members.
931
00:44:23,300 --> 00:44:26,233
ELVIS PEREZ:
The kind of data we share
is incident-related data--
932
00:44:26,233 --> 00:44:29,333
sort of time, location,
that sort of information.
933
00:44:29,333 --> 00:44:31,466
We don't discuss any private
arrest information.
934
00:44:31,466 --> 00:44:35,533
We're trying to avoid a crime.
935
00:44:37,900 --> 00:44:39,633
NARRATOR:
In 2019,
936
00:44:39,633 --> 00:44:42,600
Caplan and Kennedy formed a
start-up at Rutgers
937
00:44:42,600 --> 00:44:45,533
to meet the rising demand for
their technology.
938
00:44:45,533 --> 00:44:49,133
Despite the many possible
applications for RTM,
939
00:44:49,133 --> 00:44:51,866
from tracking public health
issues
940
00:44:51,866 --> 00:44:53,566
to understanding vehicle
crashes,
941
00:44:53,566 --> 00:44:59,733
law enforcement continues to be
its principal application.
942
00:44:59,733 --> 00:45:01,433
Like any other technology,
943
00:45:01,433 --> 00:45:04,333
risk terrain modeling can be
used for the public good
944
00:45:04,333 --> 00:45:06,433
when people use it wisely.
945
00:45:08,833 --> 00:45:14,866
♪ ♪
946
00:45:14,866 --> 00:45:17,833
We as academics and scientists,
we actually need to be critical,
947
00:45:17,833 --> 00:45:20,333
because it could be the
best model in the world,
948
00:45:20,333 --> 00:45:21,733
it could make very good predictions,
949
00:45:21,733 --> 00:45:22,900
but how you use those
predictions
950
00:45:22,900 --> 00:45:24,766
matters, in some ways,
even more.
951
00:45:24,766 --> 00:45:26,100
REPORTER:
The police department
952
00:45:26,100 --> 00:45:28,800
had revised the SSL numerous
times...
953
00:45:28,800 --> 00:45:30,100
NARRATOR:
In 2019,
954
00:45:30,100 --> 00:45:34,666
Chicago's inspector general
contracted the RAND Corporation
955
00:45:34,666 --> 00:45:38,033
to evaluate the Strategic
Subject List,
956
00:45:38,033 --> 00:45:39,566
the predictive policing platform
957
00:45:39,566 --> 00:45:45,733
that incorporated Papachristos's
research on social networks.
958
00:45:45,733 --> 00:45:47,400
PAPACHRISTOS:
I never wanted to go down this
path
959
00:45:47,400 --> 00:45:51,100
of who was the person that
was the potential suspect.
960
00:45:51,100 --> 00:45:53,066
And that problem is not
necessarily
961
00:45:53,066 --> 00:45:55,033
with the statistical model,
it's the fact that someone
962
00:45:55,033 --> 00:45:57,166
took a victim and made him an offender.
963
00:45:57,166 --> 00:46:00,133
You've criminalized
someone who is at risk,
964
00:46:00,133 --> 00:46:01,666
that you should be prioritizing
saving their life.
965
00:46:01,666 --> 00:46:07,266
NARRATOR:
It turned out that some 400,000
people were included on the SSL.
966
00:46:07,266 --> 00:46:13,600
Of those,
77% were Black or Hispanic.
967
00:46:15,133 --> 00:46:18,100
The inspector general's audit
revealed
968
00:46:18,100 --> 00:46:20,766
that SSL scores were unreliable.
969
00:46:20,766 --> 00:46:23,633
The RAND Corporation found the
program had no impact
970
00:46:23,633 --> 00:46:28,233
on homicide or victimization
rates.
971
00:46:28,233 --> 00:46:31,366
(protesters chanting)
972
00:46:31,366 --> 00:46:34,300
NARRATOR:
The program was shut down.
973
00:46:37,166 --> 00:46:38,266
But data collection continues
974
00:46:38,266 --> 00:46:41,666
to be essential to law
enforcement.
975
00:46:41,666 --> 00:46:45,300
♪ ♪
976
00:46:45,300 --> 00:46:47,933
O'NEIL:
There are things
about us that we might not even
977
00:46:47,933 --> 00:46:51,300
be aware of that are sort of
being collected
978
00:46:51,300 --> 00:46:52,933
by the data brokers
979
00:46:52,933 --> 00:46:55,066
and will be held against us for
the rest of our lives--
980
00:46:55,066 --> 00:46:58,600
held against people forever,
digitally.
981
00:46:59,733 --> 00:47:03,133
NARRATOR:
Data is produced and collected.
982
00:47:03,133 --> 00:47:05,500
But is it accurate?
983
00:47:05,500 --> 00:47:08,500
And can the data be
properly vetted?
984
00:47:08,500 --> 00:47:09,633
PAPACHRISTOS:
And that was one of the
critiques
985
00:47:09,633 --> 00:47:12,866
of not just the
Strategic Subjects List,
986
00:47:12,866 --> 00:47:14,166
but the gang database in
Chicago.
987
00:47:14,166 --> 00:47:18,066
Any data source that treats data
as a stagnant, forever condition
988
00:47:18,066 --> 00:47:20,366
is a problem.
989
00:47:23,400 --> 00:47:25,433
WOMAN:
The gang database has been
around for four years.
990
00:47:25,433 --> 00:47:28,266
It'll be five in January.
991
00:47:28,266 --> 00:47:30,900
We want to get rid of
surveillance
992
00:47:30,900 --> 00:47:33,000
in Black and brown communities.
993
00:47:33,000 --> 00:47:34,833
BENJAMIN:
In places like Chicago,
994
00:47:34,833 --> 00:47:37,366
in places like L.A.,
where I grew up,
995
00:47:37,366 --> 00:47:40,666
there are gang databases
with tens of thousands
996
00:47:40,666 --> 00:47:43,766
of people listed, their names
listed in these databases.
997
00:47:43,766 --> 00:47:45,900
Simply having a certain name
998
00:47:45,900 --> 00:47:47,900
and coming from a certain
ZIP code
999
00:47:47,900 --> 00:47:50,500
could land you in these
databases.
1000
00:47:50,500 --> 00:47:53,033
Do you all feel safe
in Chicago?
1001
00:47:53,033 --> 00:47:54,333
DARRELL DACRES:
The cops pulled up out of
nowhere.
1002
00:47:54,333 --> 00:47:59,233
Didn't ask any questions, just
immediately started beating on us.
1003
00:47:59,233 --> 00:48:00,466
And basically were saying,
like,
1004
00:48:00,466 --> 00:48:02,833
what are, what are we doing over
here, you know, like,
1005
00:48:02,833 --> 00:48:04,500
in this, in this gangbang area?
1006
00:48:04,500 --> 00:48:07,033
I was already labeled as a
gangbanger
1007
00:48:07,033 --> 00:48:08,866
from that area because of where
I lived.
1008
00:48:08,866 --> 00:48:10,733
I, I just happened to live
there.
1009
00:48:12,733 --> 00:48:13,900
NARRATOR:
The Chicago gang database
1010
00:48:13,900 --> 00:48:17,700
is shared with hundreds
of law enforcement agencies.
1011
00:48:17,700 --> 00:48:19,533
Even if someone is wrongly
included,
1012
00:48:19,533 --> 00:48:24,733
there is no mechanism
to have their name removed.
1013
00:48:24,733 --> 00:48:27,166
If you try to
apply for an apartment,
1014
00:48:27,166 --> 00:48:29,633
or if you try to
apply for a job or a college,
1015
00:48:29,633 --> 00:48:34,833
or even for a house, it will show
1016
00:48:34,833 --> 00:48:37,800
that you are in this record of
a gang database.
1017
00:48:37,800 --> 00:48:39,600
I was arrested for peacefully
protesting.
1018
00:48:39,600 --> 00:48:42,100
And they told me that, "Well,
1019
00:48:42,100 --> 00:48:43,966
you're in the gang database."
1020
00:48:43,966 --> 00:48:46,333
But I was never in no gang.
1021
00:48:46,333 --> 00:48:47,466
MAN:
Because you have a gang
designation,
1022
00:48:47,466 --> 00:48:49,500
you're a security threat group,
1023
00:48:49,500 --> 00:48:51,200
right?
1024
00:48:51,200 --> 00:48:52,500
NARRATOR:
Researchers and activists
1025
00:48:52,500 --> 00:48:54,900
have been instrumental in
dismantling
1026
00:48:54,900 --> 00:48:57,366
some of these systems.
1027
00:48:57,366 --> 00:48:58,466
And so we continue to push
back.
1028
00:48:58,466 --> 00:48:59,333
I mean, the fight is not going
to finish
1029
00:48:59,333 --> 00:49:01,200
until we get rid of
the database.
1030
00:49:01,200 --> 00:49:02,600
♪ ♪
1031
00:49:02,600 --> 00:49:05,233
FERGUSON:
I think what we're seeing now
1032
00:49:05,233 --> 00:49:07,766
is not a move away from data.
1033
00:49:07,766 --> 00:49:11,300
It's just a move away from this
term "predictive policing."
1034
00:49:11,300 --> 00:49:14,233
But we're seeing big
companies,
1035
00:49:14,233 --> 00:49:15,666
big tech, enter the policing
space.
1036
00:49:15,666 --> 00:49:19,566
We're seeing the reality that
almost all policing now
1037
00:49:19,566 --> 00:49:21,900
is data-driven.
1038
00:49:21,900 --> 00:49:23,633
You're seeing these same police
departments
1039
00:49:23,633 --> 00:49:25,266
invest heavily in the
technology,
1040
00:49:25,266 --> 00:49:28,400
including other forms of
surveillance technology,
1041
00:49:28,400 --> 00:49:30,600
including other forms
of databases
1042
00:49:30,600 --> 00:49:32,433
to sort of manage policing.
1043
00:49:32,433 --> 00:49:33,833
(chanting):
We want you out!
1044
00:49:33,833 --> 00:49:37,366
NARRATOR:
More citizens are calling for
regulations
1045
00:49:37,366 --> 00:49:38,566
to audit algorithms
1046
00:49:38,566 --> 00:49:42,200
and guarantee they're
accomplishing what they promise
1047
00:49:42,200 --> 00:49:43,433
without harm.
1048
00:49:43,433 --> 00:49:46,900
BRAYNE:
Ironically, there is very
little data
1049
00:49:46,900 --> 00:49:49,300
on police use of big data.
1050
00:49:49,300 --> 00:49:52,333
And there is no systematic data
1051
00:49:52,333 --> 00:49:54,766
at a national level
on how these tools are used.
1052
00:49:54,766 --> 00:49:57,800
The deployment of these tools
1053
00:49:57,800 --> 00:50:00,100
so far outpaces
1054
00:50:00,100 --> 00:50:02,633
legal and regulatory responses
to them.
1055
00:50:02,633 --> 00:50:03,600
What you have happening
1056
00:50:03,600 --> 00:50:06,633
is essentially this regulatory
Wild West.
1057
00:50:08,000 --> 00:50:09,333
O'NEIL:
And we're, like, "Well,
it's an algorithm,
1058
00:50:09,333 --> 00:50:11,366
let's, let's just throw it
into production."
1059
00:50:11,366 --> 00:50:13,066
Without testing whether
1060
00:50:13,066 --> 00:50:18,966
it "works" sufficiently,
um, at all.
1061
00:50:20,333 --> 00:50:22,833
NARRATOR:
Multiple requests for comment
1062
00:50:22,833 --> 00:50:23,966
from police agencies and
law enforcement officials
1063
00:50:23,966 --> 00:50:27,766
in several cities, including
Chicago and New York,
1064
00:50:27,766 --> 00:50:31,933
were either declined or went
unanswered.
1065
00:50:31,933 --> 00:50:36,966
♪ ♪
1066
00:50:36,966 --> 00:50:40,533
Artificial intelligence must
serve people,
1067
00:50:40,533 --> 00:50:42,666
and therefore artificial
intelligence
1068
00:50:42,666 --> 00:50:44,333
must always comply
with people's rights.
1069
00:50:44,333 --> 00:50:49,533
NARRATOR:
The European Union is preparing
to implement legislation
1070
00:50:49,533 --> 00:50:51,433
to regulate artificial
intelligence.
1071
00:50:51,433 --> 00:50:56,833
In 2021, bills to regulate data
science algorithms
1072
00:50:56,833 --> 00:50:59,666
were introduced in 17 states,
1073
00:50:59,666 --> 00:51:03,500
and enacted in Alabama,
Colorado,
1074
00:51:03,500 --> 00:51:05,466
Illinois, and Mississippi.
1075
00:51:05,466 --> 00:51:07,666
SWEENEY:
If you look carefully on
electrical devices,
1076
00:51:07,666 --> 00:51:10,766
you'll see "U.L.," for
Underwriters Laboratory.
1077
00:51:10,766 --> 00:51:11,933
That's a process that came
about
1078
00:51:11,933 --> 00:51:13,533
so that things, when you plugged
them in,
1079
00:51:13,533 --> 00:51:14,900
didn't blow up in your hand.
1080
00:51:14,900 --> 00:51:16,366
That's the same kind of idea
1081
00:51:16,366 --> 00:51:18,966
that we need in these
algorithms.
1082
00:51:21,300 --> 00:51:24,266
O'NEIL:
We can adjust it to make
it better than the past,
1083
00:51:24,266 --> 00:51:25,900
and we can do it carefully,
1084
00:51:25,900 --> 00:51:28,100
and we can do it with, with
precision
1085
00:51:28,100 --> 00:51:30,833
in an ongoing conversation about
what it means to us
1086
00:51:30,833 --> 00:51:34,066
that it is, it's biased in the
right way.
1087
00:51:34,066 --> 00:51:35,400
I don't think you remove bias,
1088
00:51:35,400 --> 00:51:37,900
but you get to a bias that you
can live with,
1089
00:51:37,900 --> 00:51:40,633
that you, you think is moral.
1090
00:51:40,633 --> 00:51:43,000
To be clear, like, I, I think we
can do better,
1091
00:51:43,000 --> 00:51:44,566
but often doing better
1092
00:51:44,566 --> 00:51:47,566
would look like we don't use
this at all.
1093
00:51:47,566 --> 00:51:48,766
(radio running)
1094
00:51:48,766 --> 00:51:51,066
FARID:
There's nothing fundamentally
wrong
1095
00:51:51,066 --> 00:51:52,333
with trying to predict the
future,
1096
00:51:52,333 --> 00:51:54,500
as long as you understand how
the algorithms are working,
1097
00:51:54,500 --> 00:51:55,533
how they're being deployed.
1098
00:51:55,533 --> 00:51:59,133
What is the consequence
of getting it right?
1099
00:51:59,133 --> 00:52:00,466
And most importantly is,
1100
00:52:00,466 --> 00:52:03,133
what is the consequence of
getting it wrong?
1101
00:52:03,133 --> 00:52:04,333
OFFICER:
Keep your hands on
the steering wheel!
1102
00:52:04,333 --> 00:52:06,000
MAN:
My hands haven't moved
off the steering wheel!
1103
00:52:06,000 --> 00:52:07,833
MAN 2:
Are you gonna arrest me?
1104
00:52:07,833 --> 00:52:08,700
MAN 1:
Officer, what are we here for?
1105
00:52:08,700 --> 00:52:09,500
OFFICER:
We just want to talk with...
1106
00:52:31,333 --> 00:52:34,533
♪ ♪
1107
00:52:55,500 --> 00:52:58,566
ANNOUNCER:
This program is available
with PBS Passport
1108
00:52:58,566 --> 00:53:00,866
and on Amazon Prime Video.
1109
00:53:00,866 --> 00:53:04,833
♪ ♪
1110
00:53:15,366 --> 00:53:20,933
♪ ♪