The oddly high impact of 'security' conferences

While researching something unrelated, I stumbled across an interesting feature of CiteSeerX: "estimated venue impact factors." That is, it attempts to rank CS-related conferences and journals in terms of their 'impact.' However, something seems to be wrong with their algorithm-- there is no way that a single sub-specialty (security) can contain eight of the top 25 conferences.

For those of you with no idea what I'm talking about: There are a number of sites that attempt to catalog and organize the vast, inter-connected web of academic papers in Computer Science. For example, DBLP, CiteSeer and CiteSeerX all allow you to search for papers by venue, papers by author, find papers that cite a particular paper, etc. etc. etc. (I'm sure that there are similar sites for other specialties, but I'm not familiar with them.)

Once you've got all this information, though, it becomes tempting to do something interesting with it. DBLP, for example, will list aggregate information about authors such as their co-authors, the venues they publish at, and how much they published each year. Their page on me, for example, reveals some decidedly uncomfortable truths about my recent academic output.¹

But before I dwell too much on that depressing thought, let me quickly return to the venue-impact estimates I mentioned earlier. CiteSeerX attempts to estimate 'Garfield's traditional impact factor' for each conference, and thus produces a ranked list. At the time of this writing, the list contains 581 conferences and the top 25 are as follows:

POPL 0.45
OSDI 0.43
PLDI 0.4
ACM Conference on Computer and Communications Security 0.39
S&P 0.37
NSDI 0.37
CSFW 0.33
ASPLOS 0.32
SIGCOMM 0.31
RAID 0.31
EuroSys 0.3
FAST 0.3
TCC 0.26
IPTPS 0.26
CGO 0.25
CRYPTO 0.25
VMCAI 0.25
TACAS 0.25
SAS 0.23
CAV 0.22
ESOP 0.22
LCTES 0.2
USENIX Annual Technical Conference, General Track 0.19
EUROCRYPT 0.17
Public Key Cryptography 0.17

Eight of these are security related: ACM CCS, S&P, CSFW, RAID, TCC, CRYPTO, EUROCRYPT, and Public Key Cryptography. To which I say: really? Nearly a third? There's obviously something wrong here, and a quick glance at Wikipedia reveals what:

In a given year, the impact factor of a journal is the average number of citations to those papers that were published during the two preceding years. For example, the 2008 impact factor of a journal would be calculated as follows:

A = the number of times articles published in 2006 and 2007 were cited by indexed journals during 2008

B = the total number of "citable items" published in 2006 and 2007. ("Citable items" are usually articles, reviews, proceedings, or notes; not editorials or Letters-to-the-Editor.)

2008 impact factor = A/B

Ah, that explains things. The statistic counts how often a venue's recent papers get cited, so a high score does not necessarily mean that security has a particularly high impact. It may just mean that we cite each other's recent work more frequently than other sub-disciplines do. There could be a number of reasons for this, but I suspect that it is mostly just a cultural thing. And if this is the case, it is probably a mistake to use the same impact-estimate statistic to compare conferences across different sub-specialties. That is, this list might be useful to compare CSF to CRYPTO, for example,² but not CSF to POPL.
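
To make the formula quoted above concrete, here is a minimal Python sketch of the A/B calculation. It follows the Wikipedia definition only; I don't know how CiteSeerX actually computes its estimates. The venue names (SecConf, PLConf), the record layout, and all the numbers are made up for illustration. The two hypothetical venues publish the same number of papers, but one community cites its recent work more often.

```python
from collections import namedtuple

# One record per citation: the year of the citing paper, plus the venue and
# year of the paper being cited. (Hypothetical layout, not CiteSeerX's.)
Citation = namedtuple("Citation", ["citing_year", "cited_venue", "cited_year"])


def impact_factor(venue, year, citations, items_published):
    """Garfield-style impact factor A/B for `venue` in `year`."""
    window = (year - 2, year - 1)
    # A: citations made in `year` to the venue's papers from the two prior years.
    a = sum(
        1
        for c in citations
        if c.cited_venue == venue
        and c.citing_year == year
        and c.cited_year in window
    )
    # B: citable items the venue published in those two years.
    b = sum(items_published.get((venue, y), 0) for y in window)
    if b == 0:
        raise ValueError("no citable items in the two preceding years")
    return a / b


# Made-up data: two venues of equal size with different citation habits.
items_published = {
    ("SecConf", 2006): 2, ("SecConf", 2007): 2,
    ("PLConf", 2006): 2, ("PLConf", 2007): 2,
}
citations = [
    Citation(2008, "SecConf", 2006),
    Citation(2008, "SecConf", 2007),
    Citation(2008, "SecConf", 2007),
    Citation(2008, "SecConf", 2004),  # cited paper is too old, not counted
    Citation(2007, "SecConf", 2006),  # citing year is not 2008, not counted
    Citation(2008, "PLConf", 2007),
]

sec_score = impact_factor("SecConf", 2008, citations, items_published)
pl_score = impact_factor("PLConf", 2008, citations, items_published)
print(sec_score, pl_score)
```

With this toy data the first venue scores three times higher than the second, even though nothing in the data says its papers matter more. The only difference is how often recent papers get cited.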

¹ But I hasten to note that that's the page for "Jonathan C. Herzog." Most annoyingly, they seem to regard "Jonathan C. Herzog" and just "Jonathan Herzog" as different people.
² Well, not really. But the comparison between EUROCRYPT and Financial Cryptography is probably fair.