This is the first of a series of blog posts about the outcome of the CHI 2018 Program Committee meeting and elements of the wider submission-handling process. In this short post, we focus on the conditional acceptance and rejection of submissions to the Papers track.
Future posts will consider, amongst other things, the role of rebuttals in conditional acceptance, the effect of virtual and physical subcommittees on conditional acceptance, and year-on-year reviewer turnover. We hope these analyses will give you more insight into how the process works.
The TPC team
- 667 papers were conditionally accepted, for a conditional acceptance rate of 25.7%. This is close to the average 2012-16 acceptance rate for Papers (not including Notes) of 25.1%.
- The mean of mean scores for accepted papers was 3.73.
- The lowest scoring paper with a conditional accept had an average score of 2.25. (So there’s always hope!)
- Only two papers with a mean score of 3.5 were rejected. (Above-average scores are not a guarantee of conditional acceptance.)
Overall distribution of scores
The mean of mean scores across all papers was 2.56, which was unchanged from the average prior to the rebuttal period. (A full analysis of rebuttals will follow in a separate blog post.) The distribution is bimodal, with a relatively small number of means at or closely around 3.0.
Mean score frequency by paper decision
Apart from submissions that were quick or desk rejected, each manuscript received at least four reviews: two external and two internal. In total, 10,391 individual scores were given to papers. The modal score was 2.0, and 49% of reviews gave a submission a score of 2.0 or 2.5. Just under 16% of review scores were 4.0 or greater.
Table of final score frequency for all reviews
Conditionally accepted papers
The 667 conditionally accepted papers ranged in their average score from 2.25 to 5 (M=3.73). Twenty-one papers (3% of conditionally accepted submissions and 0.8% of all submissions) have been conditionally accepted with an average score of less than 3. After rebuttals, three papers had a mean score of 5. There were 213 submissions (11% of all submissions, 32% of conditional accepts) with a mean score of 4 or more.
Each paper had several reviews, and the score of any one review usually differs from the paper's average score, so a standard deviation of scores can be calculated for each submission. The mean of these standard deviations for conditionally accepted papers was 0.46, so scores on conditionally accepted papers were generally within about half a point of the paper's mean.
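As a rough illustration of how a "mean of mean scores" and a "mean of standard deviations" are put together, here is a minimal Python sketch. The paper IDs and scores are invented for the example and are not CHI 2018 data. The post does not say whether a population or a sample standard deviation was used, so the choice of `pstdev` here is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical review scores for two made-up papers (not CHI 2018 data).
reviews = {
    "paper-a": [4.0, 3.5, 4.0, 3.0],
    "paper-b": [3.0, 4.5, 3.5, 4.0],
}


def per_paper_stats(scores_by_paper):
    """Return {paper_id: (mean score, population standard deviation)}."""
    return {
        paper_id: (mean(scores), pstdev(scores))
        for paper_id, scores in scores_by_paper.items()
    }


stats = per_paper_stats(reviews)

# Average the per-paper summaries across papers.
mean_of_means = mean(m for m, _ in stats.values())
mean_of_sds = mean(sd for _, sd in stats.values())

print(f"mean of means: {mean_of_means:.2f}")
print(f"mean of standard deviations: {mean_of_sds:.2f}")
```

Swapping `pstdev` for `stdev` gives the sample standard deviation instead, which is noticeably larger when a paper has only four or five reviews.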
Rejected papers had a mean of mean scores of 2.2. 1267 rejected submissions had an average between 2 and 3. That’s 49% of all submissions and fully 66% of all rejected papers.
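The percentages above can be sanity-checked with a little arithmetic. The post does not state the total number of submissions, so the sketch below estimates it from the 667 conditional accepts and the 25.7% rate. That estimate, and the assumption that every submission not conditionally accepted counts as rejected, are ours rather than figures from the committee.

```python
# Figures quoted in the post.
accepted = 667
acceptance_rate = 0.257
rejected_between_2_and_3 = 1267

# Assumption: total submissions estimated from the quoted rate.
estimated_total = round(accepted / acceptance_rate)
# Assumption: everything not conditionally accepted counts as rejected.
estimated_rejected = estimated_total - accepted

share_of_all = rejected_between_2_and_3 / estimated_total
share_of_rejected = rejected_between_2_and_3 / estimated_rejected

print(f"estimated submissions: {estimated_total}")
print(f"share of all submissions: {share_of_all:.0%}")
print(f"share of rejected papers: {share_of_rejected:.0%}")
```

Under these assumptions the two shares round to the 49% and 66% quoted above.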
The mean of the standard deviations of scores for each rejected paper was 0.28. This implies less variance in the scores for each rejected submission than for conditionally accepted submissions. This might be an artefact of scale compression: if a reviewer is positive about a paper they might give it anything between 3.5 and 5.0, whereas if they are not, they might give it a 2.5, a 2.0 or a 1.5. Only 12 papers were rejected (not quick or desk rejected) with a mean score of 1.0, and only 85 papers averaged less than 1.5. Indeed, as we saw earlier, 49% of all scores given to submissions were a 2.0 or a 2.5.
As you can see from the histogram, a mean score between 2.75 and 3.25 is the cross-over point between rejection and conditional acceptance. Only 14 papers with a score of 3.25 or over have been rejected. Only seven papers with a score of less than 2.75 have been conditionally accepted. Predicting outcomes solely on mean score is therefore difficult between 2.75 and 3.25.
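The cross-over region can be written down as a simple banding of mean scores. The sketch below is only an illustration: the function name and band labels are hypothetical, and the band is treated as inclusive at both 2.75 and 3.25. It describes where outcomes were hard to predict, not a rule the committee applied.

```python
BAND_LOW = 2.75
BAND_HIGH = 3.25


def score_band(mean_score):
    """Place a paper's mean score relative to the cross-over region."""
    if mean_score < BAND_LOW:
        return "below band"   # rarely conditionally accepted
    if mean_score > BAND_HIGH:
        return "above band"   # rarely rejected
    return "borderline"       # outcome hard to predict from score alone


for score in (2.25, 2.75, 3.0, 3.25, 3.5):
    print(score, score_band(score))
```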
There were 345 papers (13%) with a mean score between 2.75 and 3.25 inclusive. On average, these papers received 4.28 reviews, versus 4.01 for all papers, so additional reviewers (e.g., a 3AC) were often brought to bear on these ‘borderline’ papers. Of these 345 papers, 29 had one score (either by an AC or external reviewer) of less than 2.0. Of these, 12 were conditionally accepted and 17 were rejected. This year, significant preparatory work was undertaken to ensure that 3ACs, where requested, were assigned in advance of the PC meeting. This was to ensure sufficient time to thoroughly read papers, reviews, and rebuttals and to produce high-quality reviews.
This year, 1ACs were given a more explicit editorial role than in previous years. 2ACs and potentially 3ACs at the meeting would of course also have had an influence, but what was the effect of the 1AC’s score on the overall decision? Well, 214 of this subset of 345 papers had a 1AC score that was less than 3.0. Of these 214 papers, only 11 were subsequently conditionally accepted. Of the 131 submissions in this subset with a 1AC score of 3.0 or more, 90 were conditionally accepted. This suggests that in this borderline region between 2.75 and 3.25, having the 1AC on-side and positive is close to essential for success. But then those of you reading this who have submitted in the past will already know that.
It is also plausible that 1ACs simply uprated their scores to match a positive outcome, but we know from analyses that we have already conducted that this doesn’t seem to be the case. We will explore this in more detail in our next blog post on the effect of rebuttals on outcomes.
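To put the 1AC split for the 345 borderline papers into proportions, the counts quoted above can be turned into conditional acceptance rates. This is a minimal sketch using only those counts; the variable and function names are our own.

```python
def rate(accepted, total):
    """Fraction of papers in a group that were conditionally accepted."""
    return accepted / total


# Counts quoted in the post for the 345 borderline papers.
low_1ac_total, low_1ac_accepted = 214, 11     # 1AC score below 3.0
high_1ac_total, high_1ac_accepted = 131, 90   # 1AC score of 3.0 or more

low_rate = rate(low_1ac_accepted, low_1ac_total)
high_rate = rate(high_1ac_accepted, high_1ac_total)
overall_rate = rate(
    low_1ac_accepted + high_1ac_accepted,
    low_1ac_total + high_1ac_total,
)

print(f"1AC below 3.0:   {low_rate:.1%}")
print(f"1AC 3.0 or more: {high_rate:.1%}")
print(f"all borderline:  {overall_rate:.1%}")
```

On these figures, roughly one in twenty borderline papers with a 1AC score below 3.0 was conditionally accepted, against roughly two in three of those with a 1AC score of 3.0 or more.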
Anna Cox and Mark Perry
Technical Programme Chairs, ACM CHI 2018
Analytics Chair, ACM CHI 2018