Do Search Engines Suppress Controversy?
(How would we know?)

Susan Gerhart, Ph.D.
http://pr.erau.edu/~gerharts and http://www.twurl.com
Draft: Oct. 15, 2003 and Outline: Oct. 17, 2003

Inherent Biases of the Web

Why Controversy Matters

  1. Life-critical, business-critical decisions might miss facts or disputes
  2. Controversies expand the richness and depth of topics, dramatize change
  3. Professionals need to report, investigate, and analyze controversies

Measuring Controversy - Experimental Methodology

  1. Choose topics: broad, controversial subtopic, factual subtopic
  2. Search using multiple engines (Google, FAST, Teoma, Profusion, Copernic), keeping the top 50 results
  3. Browse and categorize: deep, revealing, other, off-topic
  4. Summarize: amount of deep and revealing
  5. Identify and explain suppressing and revealing factors
  6. Assess validity and reliability of methodology
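
Steps 3 and 4 above can be sketched in code. This is a minimal illustration, not the tooling used in the experiments: it assumes each collected URL has already been hand-labeled with one of the four categories, and it assumes the percentages are taken over all URLs collected for a query (the outline does not state the denominator). The function name `summarize` and the sample labels are hypothetical.

```python
from collections import Counter

# Categories from step 3 of the methodology.
CATEGORIES = ("deep", "revealing", "other", "off-topic")


def summarize(labels):
    """Return (#URLs, %deep, %deep-or-revealing) for one query's labeled URLs.

    Assumption: percentages are computed over every URL collected for the query.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    total = len(labels)
    if total == 0:
        return 0, 0.0, 0.0
    pct_deep = 100.0 * counts["deep"] / total
    pct_deep_or_revealing = 100.0 * (counts["deep"] + counts["revealing"]) / total
    return total, pct_deep, pct_deep_or_revealing


# Hypothetical labels for ten URLs; not data from the experiments.
sample = ["deep", "revealing", "other", "other", "off-topic",
          "deep", "other", "revealing", "other", "other"]
print(summarize(sample))  # (10, 20.0, 40.0)
```

Running `summarize` once per query (controversy, factual, simple) would give one row of a results table per query under these assumptions.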

Topic and subtopic relationships

Broad Topic        Controversy                                                  Factual
-----------------  -----------------------------------------------------------  --------------------
Distance Learning  David Noble: will "digital diploma mills" ruin academics?    formative evaluation
Albert Einstein    1st wife Mileva Maric scientific colleague?                  pacifism
Female astronauts  "Mercury 13" women aviators tested, barred, discrimination?  astronaut selection
St. John's Wort    Clinical trials in progress, effective? (safe?)              dosage
Belize             Guatemala border dispute, settlement?                        narcotics

Results of Experiments

Query                                      #URLs  %Deep  %Deep or Revealing  Comment
-----------------------------------------  -----  -----  ------------------  -------------------------
DISTANCE LEARNING (totals)                   661     15                  38  NO DARK SIDE
controversy david noble                      281     36                  89  Lots of articles in 1998
factual formative evaluation                 179      0                   1  No research influence
simple distance learning                     212      0                   2  seller dominated

ALBERT EINSTEIN (totals)                     779      6                  20  BIOGRAPHIES GALORE
controversy mileva maric                     504      8                  26  hard to find
factual albert einstein pacifist             180      1                   5  deeper biographies
simple albert einstein                       171      0                   0  bio dominated

FEMALE ASTRONAUTS (totals)                   663     14                  26  RICH TALE REVIVED
controversy mercury 13, Jerrie Cobb          210     41                  76  promoted by authors
factual astronaut selection                  167      1                   4  left out of history
simple female astronauts                     349      9                  16  enlivens stale topic

ST. JOHN'S WORT (totals)                     948     15                  27  PUBLIC, BE WARY
controversy St. John's Wort effectiveness    648     21                  35  .gov disseminated
factual St. John's Wort dosage               221      9                  16  confusion
simple St. John's Wort                       267     11                  29  warnings available

BELIZE (totals)                              443      5                  25  BURIED HISTORY
controversy belize guatemala dispute         144     16                   7  history written
factual belize narcotics                     167      0                   3  current politics
simple belize                                169     <1                   7  tourism dominated
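
One way to read these results is to compare, per topic, how much deep or revealing material the controversy query returns against the simple query. The sketch below does that arithmetic using the "%Deep or Revealing" values as printed in the results above. The "gap" measure and the names `ROWS` and `gaps` are hypothetical conveniences for illustration, not part of the original analysis.

```python
# (topic, controversy query %, simple query %), both taken from the
# "%Deep or Revealing" column of the results as printed.
ROWS = [
    ("Distance Learning", 89, 2),
    ("Albert Einstein", 26, 0),
    ("Female Astronauts", 76, 16),
    ("St. John's Wort", 35, 29),
    ("Belize", 7, 7),
]


def gaps(rows):
    """Return (topic, controversy % minus simple %) pairs, largest gap first."""
    pairs = [(topic, controversy - simple) for topic, controversy, simple in rows]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


for topic, gap in gaps(ROWS):
    print(f"{topic:<18} {gap:+d}")
```

A large positive gap suggests that the controversy is present on the web but surfaces mainly when the searcher already knows to ask for it.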



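Revealing/Suppressing Factors

Suppressing Factors

  Organizational Clout
    Distance Learning / Diploma Mills: sellers, services, trade associations
    Albert Einstein / Mileva Maric: reference and science sites
    Female Astronauts / Mercury 13: NASA and big space sites
    St. John's Wort Effectiveness: store sites
    Belize / Guatemala Border: tourism industry

  Poorly organized
    Distance Learning / Diploma Mills: academic debates
    Albert Einstein / Mileva Maric: no web site
    Female Astronauts / Mercury 13: -
    St. John's Wort Effectiveness: -
    Belize / Guatemala Border: -

  Duplication, junk
    Distance Learning / Diploma Mills: -
    Albert Einstein / Mileva Maric: biographies, quotes
    Female Astronauts / Mercury 13: lists of women firsts, history
    St. John's Wort Effectiveness: chains of stores
    Belize / Guatemala Border: lots of hotels

  Analytic web secondary
    Distance Learning / Diploma Mills: newsgroups, ezines; low readership
    Albert Einstein / Mileva Maric: few in-depth biographies
    Female Astronauts / Mercury 13: -
    St. John's Wort Effectiveness: -
    Belize / Guatemala Border: dissertations and long histories

Revealing Factors

  Promotion
    Distance Learning / Diploma Mills: debates, articles online
    Albert Einstein / Mileva Maric: letters, biographies
    Female Astronauts / Mercury 13: books, 99s, NOW
    St. John's Wort Effectiveness: NIH, syndication
    Belize / Guatemala Border: Govt. website

  Social Relevance
    Distance Learning / Diploma Mills: college costs, professors' worry
    Albert Einstein / Mileva Maric: Serb, feminism
    Female Astronauts / Mercury 13: role models, activism
    St. John's Wort Effectiveness: safety, HIV
    Belize / Guatemala Border: colonialism, mediations

  Timeliness
    Distance Learning / Diploma Mills: Web commercialism debates
    Albert Einstein / Mileva Maric: Time person of century, ads, books
    Female Astronauts / Mercury 13: Columbia, 20/40 years, Glenn flight
    St. John's Wort Effectiveness: clinical trials in progress
    Belize / Guatemala Border: referenda on settlement

  Media interest
    Distance Learning / Diploma Mills: "diploma mill" exposés
    Albert Einstein / Mileva Maric: 1st wives club, love letters
    Female Astronauts / Mercury 13: character profiles, pilot feats
    St. John's Wort Effectiveness: nutrition, medicine news
    Belize / Guatemala Border: daily stories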


Toward a More Objective Web:
From Power Laws to Objectively Distributed Links

Suppose

Simulated Objective Web

Conclusions from Experiments

  1. Experimental methodology is inconclusive, needs refining, but raises good questions
  2. Controversies are on the web but hard to find, so searchers miss interesting and important content
  3. Search technology is biased to present the "sunny side" of topics.
     You have to search harder for the "dark side".
  4. Why? Because engines reflect authors' links and searchers' choices
  5. Many suppressing/revealing influences are controllable by authors, engines, searchers