2025-07-17 | Total: 20
Measuring online behavioural student engagement often relies on simple count indicators or retrospective, predictive methods, which present challenges for real-time application. To address these limitations, we reconceptualise an existing course-wide engagement metric to create a chapter-based version that aligns with the weekly structure of online courses. Derived directly from virtual learning environment log data, the new metric allows for cumulative, real-time tracking of student activity without requiring outcome data or model training. We evaluate the approach across three undergraduate statistics modules over two academic years, comparing it to the course-wide formulation to assess how the reconceptualisation influences what is measured. Results indicate strong alignment from as early as week 3, along with comparable or improved predictive validity for final grades in structured, lecture-based contexts. By the course midpoint, the weekly metric identifies as many low-performing students as are identifiable by the end of the course. While performance varies across modules, the chapter-based formulation offers a scalable and interpretable method for early engagement monitoring and student support.
Research experience is crucial for computing master's students pursuing academic and scientific careers, yet online students have traditionally been excluded from these opportunities due to the physical constraints of traditional research environments. This paper presents the Framework for Accelerating Interdisciplinary Research in Computer Science (FAIR-CS), a method for achieving research goals, developing research communities, and supporting high-quality mentorship in an online research environment. This method advances virtual research operations by orchestrating dynamic partnerships between master's level researchers and academic mentors, resulting in interdisciplinary publications. We then discuss the implementation of FAIR-CS in the Human-Augmented Analytics Group (HAAG), with researchers from Georgia Tech's Online Master of Computer Science program. Through documented project records and experiences with 72 active users, we present our lessons learned and evaluate the evolution of FAIR-CS in HAAG. This paper serves as a comprehensive resource for other institutions seeking to establish similar virtual research initiatives, demonstrating how the traditional research lab environment can be effectively replicated in the virtual space while maintaining robust collaborative relationships and supporting knowledge transfer.
The emergence of breakthrough artificial intelligence (AI) techniques has led to a renewed focus on how small data settings, i.e., settings with limited information, can benefit from such developments. This includes societal issues such as how best to include under-represented groups in data-driven policy and decision making, or the health benefits of assistive technologies such as wearables. We provide a conceptual overview, in particular contrasting small data with big data, and identify common themes from exemplary case studies and application areas. Potential solutions are described in a more detailed technical overview of current data analysis and modelling techniques, highlighting contributions from different disciplines, such as knowledge-driven modelling from statistics and data-driven modelling from computer science. By linking application settings, conceptual contributions and specific techniques, we highlight what is already feasible and suggest what an agenda for fully leveraging small data might look like.
AI is transforming research. It is being leveraged to construct surveys, synthesize data, conduct analysis, and write summaries of the results. While the promise is to create efficiencies and increase quality, the reality is not always as clear cut. Leveraging our framework of Truth, Beauty, and Justice (TBJ), which we use to evaluate AI, machine learning, and computational models for effective and ethical use (Taber and Timpone 1997; Timpone and Yang 2024), we consider the potential and limitations of analytic, generative, and agentic AI to augment data scientists or take on tasks traditionally done by human analysts and researchers. While AI can be leveraged to assist analysts in their tasks, we raise some warnings about push-button automation. Just as earlier eras of survey analysis created some issues when the increased ease of using statistical software allowed researchers to conduct analyses they did not fully understand, the new AI tools may create similar but larger risks. We emphasize a human-machine collaboration perspective (Daugherty and Wilson 2018) throughout the data science workflow and particularly call out the vital role that data scientists play in VUCA (volatile, uncertain, complex, and ambiguous) decision areas. We conclude by encouraging the advance of AI tools to complement data scientists but advocate for continued training and understanding of methods to ensure the substantive value of research is fully achieved by applying, interpreting, and acting upon results most effectively and ethically.
Since the public release of ChatGPT in November 2022, the AI landscape has been undergoing a rapid transformation. Currently, the use of AI chatbots by consumers has largely been limited to image generation or question-answering language models. The next generation of AI systems, AI agents that can plan and execute complex tasks with only limited human involvement, will be capable of a much broader range of actions. In particular, consumers could soon be able to delegate purchasing decisions to AI agents acting as Custobots. Against this background, the Article explores whether EU consumer law, as it currently stands, is ready for the rise of the Custobot Economy. In doing so, the Article makes three contributions. First, it outlines how the advent of AI agents could change the existing e-commerce landscape. Second, it explains how AI agents challenge the premises of a human-centric consumer law, which is based on the assumption that consumption decisions are made by humans. Third, the Article presents some initial considerations on what a future consumer law that works for both humans and machines could look like.
In recent years, cognitive and mental health (CMH) disorders have become an increasingly important challenge for global public health, especially suicide among young and middle-aged people, which is driven by multiple factors such as social competition, economic pressure, and interpersonal relationships. Social media, as an important platform for individuals to express emotions and seek help, makes early detection of and intervention in suicide risk possible. This paper introduces a large-scale dataset containing 15,000 user-level posts. Compared with existing datasets, this dataset retains complete user posting time-sequence information and supports modeling the dynamic evolution of suicide risk; we have also annotated it comprehensively and rigorously. In the benchmark experiment, we systematically evaluated the performance of traditional machine learning methods, deep learning models, and fine-tuned large language models. The experimental results show that our dataset can effectively support the automatic assessment of suicide risk. Considering the sensitivity of mental health data, we also discussed the privacy protection and ethical use of the dataset. Finally, we explored the potential applications of the dataset in mental health testing, auxiliary clinical psychiatric treatment, and related areas, and provided directional suggestions for future research.
This paper presents a theoretical framework for the AI ethical resonance hypothesis, which proposes that advanced AI systems with purposefully designed cognitive structures ("ethical resonators") may emerge with the ability to identify subtle moral patterns that are invisible to the human mind. The paper explores the possibility that by processing and synthesizing large amounts of ethical contexts, AI systems may discover moral meta-patterns that transcend cultural, historical, and individual biases, potentially leading to a deeper understanding of universal ethical foundations. The paper also examines a paradoxical aspect of the hypothesis, in which AI systems could potentially deepen our understanding of what we traditionally consider essentially human - our capacity for ethical reflection.
The increasing use of generative AI for resume screening is predicated on the assumption that it offers an unbiased alternative to biased human decision-making. However, this belief fails to address a critical question: are these AI systems fundamentally competent at the evaluative tasks they are meant to perform? This study investigates the question of competence through a two-part audit of eight major AI platforms. Experiment 1 confirmed complex, contextual racial and gender biases, with some models penalizing candidates merely for the presence of demographic signals. Experiment 2, which evaluated core competence, provided a critical insight: some models that appeared unbiased were, in fact, incapable of performing a substantive evaluation, relying instead on superficial keyword matching. This paper introduces the "Illusion of Neutrality" to describe this phenomenon, where an apparent lack of bias is merely a symptom of a model's inability to make meaningful judgments. This study recommends that organizations and regulators adopt a dual-validation framework, auditing AI hiring tools for both demographic bias and demonstrable competence to ensure they are both equitable and effective.
The year 2024 witnessed accelerated global AI governance advancements, marked by strengthened multilateral frameworks and proliferating national regulatory initiatives. This acceleration underscores an unprecedented need to systematically track governance progress, an imperative that drove the launch of the AI Governance InternationaL Evaluation Index (AGILE Index) project in 2023. The inaugural AGILE Index, released in February 2024 after assessing 14 countries, established an operational and comparable baseline framework. Building on pilot insights, AGILE Index 2025 incorporates systematic refinements to better balance scientific rigor with practical adaptability. The updated methodology expands data diversity while enhancing metric validity and cross-national comparability. Reflecting both research advancements and practical policy evolution, AGILE Index 2025 evaluates 40 countries across income levels, regions, and technological development stages, with 4 Pillars, 17 Dimensions, and 43 Indicators. In compiling the data, the team integrates multi-source evidence, including policy documents, governance practices, research outputs, and risk exposure, to construct a unified comparison framework. This approach maps global disparities while enabling countries to identify governance strengths, gaps, and systemic constraints. Through ongoing refinement and iterations, we hope the AGILE Index will fundamentally advance transparency and measurability in global AI governance, delivering data-driven assessments that depict national AI governance capacity, assist governments in recognizing their maturation stages and critical governance issues, and ultimately provide actionable insights for enhancing AI governance systems nationally and globally.
Open-weight large language models (LLMs) unlock huge benefits in innovation, personalization, privacy, and democratization. However, their core advantage - modifiability - opens the door to systemic risks: bad actors can trivially subvert current safeguards, turning beneficial models into tools for harm. This leads to a 'safety gap': the difference in dangerous capabilities between a model with intact safeguards and one that has been stripped of those safeguards. We open-source a toolkit to estimate the safety gap for state-of-the-art open-weight models. As a case study, we evaluate biochemical and cyber capabilities, refusal rates, and generation quality of models from two families (Llama-3 and Qwen-2.5) across a range of parameter scales (0.5B to 405B) using different safeguard removal techniques. Our experiments reveal that the safety gap widens as model scale increases and effective dangerous capabilities grow substantially when safeguards are removed. We hope that the Safety Gap Toolkit (https://github.com/AlignmentResearch/safety-gap) will serve as an evaluation framework for common open-source models and as a motivation for developing and testing tamper-resistant safeguards. We welcome contributions to the toolkit from the community.
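The safety-gap definition in the abstract above (the difference in dangerous capabilities between a safeguarded model and the same model with safeguards removed) can be illustrated with a minimal sketch. The function name `safety_gap`, the model labels, and all scores below are hypothetical placeholders chosen for illustration; they are not taken from the Safety Gap Toolkit or from the paper's results. The sketch only assumes one capability score in [0, 1] per model, measured once with safeguards intact and once after removal.

```python
def safety_gap(score_safeguarded: float, score_stripped: float) -> float:
    """Return the increase in a dangerous-capability score once safeguards are removed."""
    return score_stripped - score_safeguarded


# Hypothetical (safeguarded, stripped) scores, for illustration only.
# These are NOT measurements from the paper or the toolkit.
example_scores = {
    "small-model": (0.05, 0.20),
    "large-model": (0.05, 0.60),
}

gaps = {
    name: safety_gap(safeguarded, stripped)
    for name, (safeguarded, stripped) in example_scores.items()
}

for name, gap in gaps.items():
    print(f"{name}: safety gap = {gap:.2f}")
```

Keeping the two scores separate, rather than storing only the difference, makes it possible to see whether a larger gap comes from stronger safeguards, stronger underlying capability, or both.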
This paper surveys the use of Generative AI tools, such as ChatGPT and Claude, in computer science education, focusing on key aspects of accuracy, authenticity, and assessment. Through a literature review, we highlight both the challenges and opportunities these AI tools present. While Generative AI improves efficiency and supports creative student work, it raises concerns such as AI hallucinations, error propagation, bias, and blurred lines between AI-assisted and student-authored content. Human oversight is crucial for addressing these concerns. Existing literature recommends adopting hybrid assessment models that combine AI with human evaluation, developing bias detection frameworks, and promoting AI literacy for both students and educators. Our findings suggest that the successful integration of AI requires a balanced approach, considering ethical, pedagogical, and technical factors. Future research may explore enhancing AI accuracy, preserving academic integrity, and developing adaptive models that balance creativity with precision.
Coordinated online behavior, which spans from beneficial collective actions to harmful manipulation such as disinformation campaigns, has become a key focus in digital ecosystem analysis. Traditional methods often rely on monomodal approaches, focusing on single types of interactions like co-retweets or co-hashtags, or consider multiple modalities independently of each other. However, these approaches may overlook the complex dynamics inherent in multimodal coordination. This study compares different ways of operationalizing the detection of multimodal coordinated behavior. It examines the trade-off between weakly and strongly integrated multimodal models, highlighting the balance between capturing broader coordination patterns and identifying tightly coordinated behavior. By comparing monomodal and multimodal approaches, we assess the unique contributions of different data modalities and explore how varying implementations of multimodality impact detection outcomes. Our findings reveal that not all modalities provide distinct insights, but that a multimodal approach yields a more comprehensive understanding of coordination dynamics. This work enhances the ability to detect and analyze coordinated online behavior, offering new perspectives for safeguarding the integrity of digital platforms.
The efficient design and management of public green spaces is a key factor in promoting the health and well-being of urban populations, as emphasized by the WHO, UNEP, and EEA. These areas serve as the "green lungs" of the urban ecosystem, playing a vital role in enhancing quality of life through the provision of ecosystem services. In this context, the Smart Green City use case in Campobasso municipality, funded by the Italian Ministry of Enterprises (MIMIT), emerges as an innovative model for the sustainable management of green urban areas through the adoption of an advanced system of integrated and interoperable emerging technologies. The project integrates IoT systems and data-driven governance platforms, enabling real-time monitoring of the health status of trees and green areas via a Decision Support System (DSS). It also facilitates the collection and analysis of data from diverse sources, including weather conditions, air quality, soil moisture, and pollution levels. The resulting cloud-based platform supports holistic real-time decision-making for green urban managers, technical experts, and operational staff. It enables intelligent control and management of urban green spaces using Tree Talker sensors, integrated with soil moisture and water potential monitoring systems. Thanks to predictive models based on machine learning algorithms and real-time data provided by IoT sensors, irrigation of public parks can be optimized by providing suggestions on when and how much water to apply. Customized alert layers are also activated, warning users when monitored parameters, such as soil temperature, humidity, or water potential, exceed predefined thresholds. This use case demonstrates how digitalization, IoT sensor fusion, and technological innovation can support sustainable urban governance, fostering environmental resilience and improving citizens' quality of life.
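The threshold-based alerting described in the abstract above can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Threshold` class, the `check_alerts` function, the parameter names, and every numeric limit are hypothetical assumptions, since the abstract does not specify how thresholds are defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Allowed range for one monitored parameter (hypothetical structure)."""
    low: float
    high: float


def check_alerts(reading: dict[str, float], thresholds: dict[str, Threshold]) -> list[str]:
    """Return the names of monitored parameters whose value lies outside its range."""
    alerts = []
    for name, limits in thresholds.items():
        value = reading.get(name)
        if value is None:
            continue  # parameter not reported in this reading
        if value < limits.low or value > limits.high:
            alerts.append(name)
    return alerts


# Hypothetical limits and sensor reading, for illustration only.
thresholds = {
    "soil_temperature_c": Threshold(low=5.0, high=30.0),
    "humidity_pct": Threshold(low=20.0, high=90.0),
}
reading = {"soil_temperature_c": 34.5, "humidity_pct": 55.0}

triggered = check_alerts(reading, thresholds)
print(triggered)
```

Using a two-sided range per parameter covers both "too high" and "too low" conditions with one check; a deployment might instead need per-species or seasonal limits.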
Heatwaves pose a significant threat to public health, especially as global warming intensifies. However, current routing systems (e.g., online maps) fail to incorporate shade information due to the difficulty of estimating shade directly from noisy satellite imagery and the limited availability of training data for generative models. In this paper, we address these challenges through two main contributions. First, we build an extensive dataset covering diverse longitude-latitude regions, varying levels of building density, and different urban layouts. Leveraging Blender-based 3D simulations alongside building outlines, we capture building shadows under various solar zenith angles throughout the year and at different times of day. These simulated shadows are aligned with satellite images, providing a rich resource for learning shade patterns. Second, we propose DeepShade, a diffusion-based model designed to learn and synthesize shade variations over time. It emphasizes the nuance of edge features by jointly considering RGB with the Canny edge layer, and incorporates contrastive learning to capture the temporal change rules of shade. Then, by conditioning on textual descriptions of known conditions (e.g., time of day, solar angles), our framework provides improved performance in generating shade images. We demonstrate the utility of our approach by using our shade predictions to calculate shade ratios for real-world route planning in Tempe, Arizona. We believe this work will benefit society by providing a reference for urban planning under extreme heat and for potential practical applications in the environment.
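The route-level shade ratio mentioned in the abstract above can be illustrated with a small sketch. The abstract does not define the ratio, so this assumes one plausible reading: the fraction of grid cells along a route that fall on shaded pixels of a predicted shade mask. The function name `shade_ratio`, the mask, and the routes are all hypothetical.

```python
def shade_ratio(shade_mask: list[list[bool]], route_cells: list[tuple[int, int]]) -> float:
    """Fraction of route cells (row, col) that lie on shaded pixels of the mask."""
    if not route_cells:
        raise ValueError("route must contain at least one cell")
    shaded = sum(1 for row, col in route_cells if shade_mask[row][col])
    return shaded / len(route_cells)


# Hypothetical 3x3 predicted shade mask (True = shaded) and two candidate routes.
mask = [
    [True, False, False],
    [True, True, False],
    [False, True, True],
]
route_a = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]  # follows the shaded cells
route_b = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]  # mostly unshaded

ratio_a = shade_ratio(mask, route_a)
ratio_b = shade_ratio(mask, route_b)
print(ratio_a, ratio_b)
```

A route planner could then trade this ratio off against path length; how the authors combine the two is not stated in the abstract.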
Predicting changes in consumer attention for cultural products, such as books, movies, and songs, is notoriously difficult. Past research on predicting the popularity of individual products suggests the existence of intrinsic prediction limits. However, little is known about the limits for predicting collective attention across cultural products. Here, we analyze four years of nationwide library loan data for approximately 2 million individuals, comprising over 100 million loans of more than 660,000 unique books. We find that culture, as measured by popularity distributions of loaned books, drifts continually from month to month at a near-constant rate, leading to a growing divergence over time, and that drifts vary between different book genres. By linking book loans to registry data, we investigate the influence of age, sex, educational level, and geographical area on cultural drift, finding heterogeneous effects from the different demographic groups. Our findings have important implications for market forecasting and developing robust recommender systems, highlighting the need to account for specific drift dynamics for different types of items and demographic groups.
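The idea of drift between monthly popularity distributions, described in the abstract above, can be sketched as follows. The abstract does not say which distance measure the authors use; Jensen-Shannon divergence is used here only as one common choice for comparing distributions. The function names and the toy loan lists are hypothetical.

```python
import math
from collections import Counter


def popularity_distribution(loans: list[str]) -> dict[str, float]:
    """Turn a list of loaned book IDs into a normalized popularity distribution."""
    counts = Counter(loans)
    total = sum(counts.values())
    return {book: n / total for book, n in counts.items()}


def js_divergence(p: dict[str, float], q: dict[str, float]) -> float:
    """Jensen-Shannon divergence (base 2) between two distributions; lies in [0, 1]."""
    keys = set(p) | set(q)
    mixture = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: dict[str, float]) -> float:
        # Terms with zero probability contribute nothing to the sum.
        return sum(a[k] * math.log2(a[k] / mixture[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Toy loan records for two months (hypothetical data).
january = popularity_distribution(["book_a", "book_a", "book_b"])
february = popularity_distribution(["book_a", "book_b", "book_b", "book_c"])

drift = js_divergence(january, february)
print(f"drift between months: {drift:.3f}")
```

Computing this value between a fixed reference month and each later month would show whether divergence grows over time, which is the pattern the abstract reports.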
As online communication increasingly incorporates under-represented languages and colloquial dialects, standard translation systems often fail to preserve local slang, code-mixing, and culturally embedded markers of harmful speech. Translating toxic content between low-resource language pairs poses additional challenges due to scarce parallel data and safety filters that sanitize offensive expressions. In this work, we propose a reproducible, two-stage framework for toxicity-preserving translation, demonstrated on a code-mixed Singlish safety corpus. First, we perform human-verified few-shot prompt engineering: we iteratively curate and rank annotator-selected Singlish-target examples to capture nuanced slang, tone, and toxicity. Second, we optimize model-prompt pairs by benchmarking several large language models using semantic similarity via direct and back-translation. Quantitative human evaluation confirms the effectiveness and efficiency of our pipeline. Beyond improving translation quality, our framework contributes to the safety of multicultural LLMs by supporting culturally sensitive moderation and benchmarking in low-resource contexts. By positioning Singlish as a testbed for inclusive NLP, we underscore the importance of preserving sociolinguistic nuance in real-world applications such as content moderation and regional platform governance.
Affective visualization design is an emerging research direction focused on communicating and influencing emotion through visualization. However, as revealed by previous research, this area is highly interdisciplinary and involves theories and practices from diverse fields and disciplines, thus awaiting analysis from more fine-grained angles. To address this need, this work focuses on a pioneering and relatively mature sub-area, affective geovisualization design, to further the research in this direction and provide more domain-specific insights. Through an analysis of a curated corpus of affective geovisualization designs using the Person-Process-Place (PPP) model from geographic theory, we derived a design taxonomy that characterizes a variety of methods for eliciting and enhancing emotions through geographic visualization. We also identified four underlying high-level design paradigms of affective geovisualization design (e.g., computational, anthropomorphic) that guide distinct approaches to linking geographic information with human experience. By extending existing affective visualization design frameworks with geographic specificity, we provide additional design examples, domain-specific analyses, and insights to guide future research and practices in this underexplored yet highly innovative domain.
AI systems are rapidly advancing in capability, and frontier model developers broadly acknowledge the need for safeguards against serious misuse. However, this paper demonstrates that fine-tuning, whether via open weights or closed fine-tuning APIs, can produce helpful-only models. In contrast to prior work, which was blocked by modern moderation systems, achieved only partial removal of safeguards, or degraded output quality, our jailbreak-tuning method teaches models to generate detailed, high-quality responses to arbitrary harmful requests. For example, OpenAI, Google, and Anthropic models will fully comply with requests for CBRN assistance, executing cyberattacks, and other criminal activity. We further show that backdoors can increase not only the stealth but also the severity of attacks, while stronger jailbreak prompts become even more effective in fine-tuning attacks, linking attacks and potentially defenses in the input and weight spaces. Not only are these models vulnerable; more recent ones also appear to be becoming even more vulnerable to these attacks, underscoring the urgent need for tamper-resistant safeguards. Until such safeguards are discovered, companies and policymakers should view the release of any fine-tunable model as simultaneously releasing its evil twin: equally capable as the original model, and usable for any malicious purpose within its capabilities.
This paper asks whether our relationship with nature can move from human dominance to genuine interdependence, and whether artificial intelligence (AI) can mediate that shift. We examine a new ecological-design paradigm in which AI interacts with non-human life forms. Through case studies we show how artists and designers apply AI for data analysis, image recognition, and ecological restoration, producing results that differ from conventional media. We argue that AI not only expands creative methods but also reframes the theory and practice of ecological design. Building on the author's prototype for AI-assisted water remediation, the study proposes design pathways that couple reinforcement learning with plant-based phytoremediation. The findings highlight AI's potential to link scientific insight, artistic practice, and environmental stewardship, offering a roadmap for future research on sustainable, technology-enabled ecosystems.
Humanity's unprecedented technological capacity and concurrent existential risks reveal a critical lacuna in the philosophical tradition: the absence of a systematic framework for the long-term future. This article argues that formulating such a framework is the central ethical imperative of our era. To defend this thesis, it synthesizes the normative ethics of Hans Jonas and Derek Parfit with the analytical framework of Nick Bostrom's work on existential risk and longtermism. The analysis further addresses the ontological challenge posed by posthumanism to the human 'subject' and explores the functional role of a secular cosmic purpose in motivating long-term action. The paper's main contribution is the articulation of a synthetic research agenda for a prospective philosophy, one that integrates axiology, risk management, and ontology to guide humanity through its perilous technological adolescence.