Information prioritization plays an important role in how we perceive and understand the world. Homepage layouts, which expert human news editors curate manually every day, serve as a tangible proxy for this prioritization. In this work, we present NewsHomepages, a novel, large-scale dataset of over 3,000 news website homepages, including local, national, and topic-specific outlets, captured twice daily over a five-year period. We develop a scalable pairwise preference model to capture ranked preferences between news items and confirm that these preferences are stable and learnable: our models infer editorial preference with an F1 score above 0.7 (based on human trials). To demonstrate the importance of these learned preferences, we (1) perform a novel analysis showing that outlets across the political spectrum share surprising agreement in their preferences and (2) apply our models to rank-order a collection of local city council policies passed over a ten-year period in San Francisco, assessing their “newsworthiness”. Our findings lay the groundwork for leveraging implicit cues to deepen our understanding of human informational preferences.