2025-06-19 | | Total: 31
Phishing websites continue to pose a significant cybersecurity threat, often leveraging deceptive structures, brand impersonation, and social engineering tactics to evade detection. While recent advances in large language models (LLMs) have enabled improved phishing detection through contextual understanding, most existing approaches rely on single-agent classification facing the risks of hallucination and lack interpretability or robustness. To address these limitations, we propose PhishDebate, a modular multi-agent LLM-based debate framework for phishing website detection. PhishDebate employs four specialized agents to independently analyze different textual aspects of a webpage--URL structure, HTML composition, semantic content, and brand impersonation--under the coordination of a Moderator and a final Judge. Through structured debate and divergent thinking, the framework delivers more accurate and interpretable decisions. Extensive evaluations on commercial LLMs demonstrate that PhishDebate achieves 98.2% recall and 98.2% True Positive Rate (TPR) on a real-world phishing dataset, and outperforms single-agent and Chain of Thought (CoT) baselines. Additionally, its modular design allows agent-level configurability, enabling adaptation to varying resource and application requirements.
Although Rust ensures memory safety by default, it also permits the use of unsafe code, which can introduce memory safety vulnerabilities if misused. Unfortunately, existing tools for detecting memory bugs in Rust typically exhibit limited detection capabilities, inadequately handle Rust-specific types, or rely heavily on manual intervention. To address these limitations, we present deepSURF, a tool that integrates static analysis with Large Language Model (LLM)-guided fuzzing harness generation to effectively identify memory safety vulnerabilities in Rust libraries, specifically targeting unsafe code. deepSURF introduces a novel approach for handling generics by substituting them with custom types and generating tailored implementations for the required traits, enabling the fuzzer to simulate user-defined behaviors within the fuzzed library. Additionally, deepSURF employs LLMs to augment fuzzing harnesses dynamically, facilitating exploration of complex API interactions and significantly increasing the likelihood of exposing memory safety vulnerabilities. We evaluated deepSURF on 27 real-world Rust crates, successfully rediscovering 20 known memory safety bugs and uncovering 6 previously unknown vulnerabilities, demonstrating clear improvements over state-of-the-art tools.
Randomness extractors are algorithms that distill weak random sources into near-perfect random numbers. Two-source extractors enable this distillation process by combining two independent weak random sources. Raz's extractor (STOC '05) was the first to achieve this in a setting where one source has linear min-entropy (i.e., proportional to its length), while the other has only logarithmic min-entropy in its length. However, Raz's original construction is impractical due to a polynomial computation time of at least degree 4. Our work solves this problem by presenting an improved version of Raz's extractor with quasi-linear computation time, as well as a new analytic theorem with reduced entropy requirements. We provide comprehensive analytical and numerical comparisons of our construction with others in the literature, and we derive strong and quantum-proof versions of our efficient Raz extractor. Additionally, we offer an easy-to-use, open-source code implementation of the extractor and a numerical parameter calculation module.
Dataflow neural network accelerators efficiently process AI tasks on FPGAs, with deployment simplified by ready-to-use frameworks and pre-trained models. However, this convenience makes them vulnerable to malicious actors seeking to reverse engineer valuable Intellectual Property (IP) through Side-Channel Attacks (SCA). This paper proposes a methodology to recover the hardware configuration of dataflow accelerators generated with the FINN framework. Through unsupervised dimensionality reduction, we reduce the computational overhead compared to the state-of-the-art, enabling lightweight classifiers to recover both folding and quantization parameters. We demonstrate an attack phase requiring only 337 ms to recover the hardware parameters with an accuracy of more than 95% and 421 ms to fully recover these parameters with an averaging of 4 traces for a FINN-based accelerator running a CNN, both using a random forest classifier on side-channel traces, even with the accelerator dataflow fully loaded. This approach offers a more realistic attack scenario than existing methods, and compared to SoA attacks based on tsfresh, our method requires 940x and 110x less time for preparation and attack phases, respectively, and gives better results even without averaging traces.
Software-exploitable Hardware Trojans (HTs) enable attackers to execute unauthorized software or gain illicit access to privileged operations. This manuscript introduces a hardware-based methodology for detecting runtime HT activations using Error Correction Codes (ECCs) on a RISC-V microprocessor. Specifically, it focuses on HTs that inject malicious instructions, disrupting the normal execution flow by triggering unauthorized programs. To counter this threat, the manuscript introduces a Hardware Security Checker (HSC) leveraging Hamming Single Error Correction (HSEC) architectures for effective HT detection. Experimental results demonstrate that the proposed solution achieves a 100% detection rate for potential HT activations, with no false positives or undetected attacks. The implementation incurs minimal overhead, requiring only 72 #LUTs, 24 #FFs, and 0.5 #BRAM while maintaining the microprocessor's original operating frequency and introducing no additional time delay.
Digitalization in the medical world provides major benefits while making it a target for attackers and thus hard to secure. To deal with network intruders we propose an anomaly detection system on hardware to detect malicious clients in real-time. We meet real-time and power restrictions using FPGAs. Overall system performance is achieved via the presented holistic system evaluation.
The rapid deployment of Large language model (LLM) agents in critical domains like healthcare and finance necessitates robust security frameworks. To address the absence of standardized evaluation benchmarks for these agents in dynamic environments, we introduce RAS-Eval, a comprehensive security benchmark supporting both simulated and real-world tool execution. RAS-Eval comprises 80 test cases and 3,802 attack tasks mapped to 11 Common Weakness Enumeration (CWE) categories, with tools implemented in JSON, LangGraph, and Model Context Protocol (MCP) formats. We evaluate 6 state-of-the-art LLMs across diverse scenarios, revealing significant vulnerabilities: attacks reduced agent task completion rates (TCR) by 36.78% on average and achieved an 85.65% success rate in academic settings. Notably, scaling laws held for security capabilities, with larger models outperforming smaller counterparts. Our findings expose critical risks in real-world agent deployments and provide a foundational framework for future security research. Code and data are available at https://github.com/lanzer-tree/RAS-Eval.
In this paper, we introduce an adaptation of the facility location problem and analyze it within the framework of local differential privacy (LDP). Under this model, we ensure the privacy of client presence at specific locations. When n is the number of points, Gupta et al. established a lower bound of $\Omega(\sqrt{n})$ on the approximation ratio for any differentially private algorithm applied to the original facility location problem. As a result, subsequent works have adopted the super-set assumption, which may, however, compromise user privacy. We show that this lower bound does not apply to our adaptation by presenting an LDP algorithm that achieves a constant approximation ratio with a relatively small additive factor. Additionally, we provide experimental results demonstrating that our algorithm outperforms the straightforward approach on both synthetically generated and real-world datasets.
With the rapid advancements in Natural Language Processing (NLP), large language models (LLMs) like GPT-4 have gained significant traction in diverse applications, including security vulnerability scanning. This paper investigates the efficacy of GPT-4 in identifying software vulnerabilities compared to traditional Static Application Security Testing (SAST) tools. Drawing from an array of security mistakes, our analysis underscores the potent capabilities of GPT-4 in LLM-enhanced vulnerability scanning. We unveiled that GPT-4 (Advanced Data Analysis) outperforms SAST by an accuracy of 94% in detecting 32 types of exploitable vulnerabilities. This study also addresses the potential security concerns surrounding LLMs, emphasising the imperative of security by design/default and other security best practices for AI.
Large language models (LLMs) are rapidly evolving from single-modal systems to multimodal LLMs and intelligent agents, significantly expanding their capabilities while introducing increasingly severe security risks. This paper presents a systematic survey of the growing complexity of jailbreak attacks and corresponding defense mechanisms within the expanding LLM ecosystem. We first trace the developmental trajectory from LLMs to MLLMs and Agents, highlighting the core security challenges emerging at each stage. Next, we categorize mainstream jailbreak techniques from both the attack impact and visibility perspectives, and provide a comprehensive analysis of representative attack methods, related datasets, and evaluation metrics. On the defense side, we organize existing strategies based on response timing and technical approach, offering a structured understanding of their applicability and implementation. Furthermore, we identify key limitations in existing surveys, such as insufficient attention to agent-specific security issues, the absence of a clear taxonomy for hybrid jailbreak methods, a lack of detailed analysis of experimental setups, and outdated coverage of recent advancements. To address these limitations, we provide an updated synthesis of recent work and outline future research directions in areas such as dataset construction, evaluation framework optimization, and strategy generalization. Our study seeks to enhance the understanding of jailbreak mechanisms and facilitate the advancement of more resilient and adaptive defense strategies in the context of ever more capable LLMs.
In recent years, the widespread application of large language models has inspired us to consider using inference for communication encryption. We therefore propose CipherMind, which utilizes intermediate results from deterministic fine-tuning of large model inferences as transmission content. The semantic parameters of large models exhibit characteristics like opaque underlying implementations and weak interpretability, thus enabling their use as an encryption method for data transmission. This communication paradigm can be applied in scenarios like intra-gateway transmission, and theoretically, it can be implemented using any large model as its foundation.
Decentralized learning is vulnerable to poison attacks, where malicious clients manipulate local updates to degrade global model performance. Existing defenses mainly detect and filter malicious models, aiming to prevent a limited number of attackers from corrupting the global model. However, restoring an already compromised global model remains a challenge. A direct approach is to remove malicious clients and retrain the model using only the benign clients. Yet, retraining is time-consuming, computationally expensive, and may compromise model consistency and privacy. We propose PDLRecover, a novel method to recover a poisoned global model efficiently by leveraging historical model information while preserving privacy. The main challenge lies in protecting shared historical models while enabling parameter estimation for model recovery. By exploiting the linearity of approximate Hessian matrix computation, we apply secret sharing to protect historical updates, ensuring local models are not leaked during transmission or reconstruction. PDLRecover introduces client-side preparation, periodic recovery updates, and a final exact update to ensure robustness and convergence of the recovered model. Periodic updates maintain accurate curvature information, and the final step ensures high-quality convergence. Experiments show that the recovered global model achieves performance comparable to a fully retrained model but with significantly reduced computation and time cost. Moreover, PDLRecover effectively prevents leakage of local model parameters, ensuring both accuracy and privacy in recovery.
Privacy-preserving neural network training in vertically partitioned scenarios is vital for secure collaborative modeling across institutions. This paper presents \textbf{EVA-S2PMLP}, an Efficient, Verifiable, and Accurate Secure Two-Party Multi-Layer Perceptron framework that introduces spatial-scale optimization for enhanced privacy and performance. To enable reliable computation under real-number domain, EVA-S2PMLP proposes a secure transformation pipeline that maps scalar inputs to vector and matrix spaces while preserving correctness. The framework includes a suite of atomic protocols for linear and non-linear secure computations, with modular support for secure activation, matrix-vector operations, and loss evaluation. Theoretical analysis confirms the reliability, security, and asymptotic complexity of each protocol. Extensive experiments show that EVA-S2PMLP achieves high inference accuracy and significantly reduced communication overhead, with up to $12.3\times$ improvement over baselines. Evaluation on benchmark datasets demonstrates that the framework maintains model utility while ensuring strict data confidentiality, making it a practical solution for privacy-preserving neural network training in finance, healthcare, and cross-organizational AI applications.
As AI capabilities advance rapidly, flexible hardware-enabled guarantees (flexHEGs) offer opportunities to address international security challenges through comprehensive governance frameworks. This report examines how flexHEGs could enable internationally trustworthy AI governance by establishing standardized designs, robust ecosystem defenses, and clear operational parameters for AI-relevant chips. We analyze four critical international security applications: limiting proliferation to address malicious use, implementing safety norms to prevent loss of control, managing risks from military AI systems, and supporting strategic stability through balance-of-power mechanisms while respecting national sovereignty. The report explores both targeted deployments for specific high-risk facilities and comprehensive deployments covering all AI-relevant compute. We examine two primary governance models: verification-based agreements that enable transparent compliance monitoring, and ruleset-based agreements that automatically enforce international rules through cryptographically-signed updates. Through game-theoretic analysis, we demonstrate that comprehensive flexHEG agreements could remain stable under reasonable assumptions about state preferences and catastrophic risks. The report addresses critical implementation challenges including technical thresholds for AI-relevant chips, management of existing non-flexHEG hardware, and safeguards against abuse of governance power. While requiring significant international coordination, flexHEGs could provide a technical foundation for managing AI risks at the scale and speed necessary to address emerging threats to international security and stability.
As artificial intelligence systems become increasingly powerful, they pose growing risks to international security, creating urgent coordination challenges that current governance approaches struggle to address without compromising sensitive information or national security. We propose flexible hardware-enabled guarantees (flexHEGs), that could be integrated with AI accelerators to enable trustworthy, privacy-preserving verification and enforcement of claims about AI development. FlexHEGs consist of an auditable guarantee processor that monitors accelerator usage and a secure enclosure providing physical tamper protection. The system would be fully open source with flexible, updateable verification capabilities. FlexHEGs could enable diverse governance mechanisms including privacy-preserving model evaluations, controlled deployment, compute limits for training, and automated safety protocol enforcement. In this first part of a three part series, we provide a comprehensive introduction of the flexHEG system, including an overview of the governance and security capabilities it offers, its potential development and adoption paths, and the remaining challenges and limitations it faces. While technically challenging, flexHEGs offer an approach to address emerging regulatory and international security challenges in frontier AI development.
In the ever-expanding domain of 5G-NR wireless cellular networks, over-the-air jamming attacks are prevalent as security attacks, compromising the quality of the received signal. We simulate a jamming environment by incorporating additive white Gaussian noise (AWGN) into the real-world In-phase and Quadrature (I/Q) OFDM datasets. A Convolutional Autoencoder (CAE) is exploited to implement a jamming detection over various characteristics such as heterogenous I/Q datasets; extracting relevant information on Synchronization Signal Blocks (SSBs), and fewer SSB observations with notable class imbalance. Given the characteristics of datasets, balanced datasets are acquired by employing a Conv1D conditional Wasserstein Generative Adversarial Network-Gradient Penalty(CWGAN-GP) on both majority and minority SSB observations. Additionally, we compare the performance and detection ability of the proposed CAE model on augmented datasets with benchmark models: Convolutional Denoising Autoencoder (CDAE) and Convolutional Sparse Autoencoder (CSAE). Despite the complexity of data heterogeneity involved across all datasets, CAE depicts the robustness in detection performance of jammed signal by achieving average values of 97.33% precision, 91.33% recall, 94.08% F1-score, and 94.35% accuracy over CDAE and CSAE.
The exponential growth of Internet of Things (IoT) applications has intensified the demand for efficient, high-throughput, and energy-efficient data processing at the edge. Conventional CPU-centric encryption methods suffer from performance bottlenecks and excessive data movement, especially in latency-sensitive and resource-constrained environments. In this paper, we present SPiME, a lightweight, scalable, and FPGA-compatible Secure Processor-in-Memory Encryption architecture that integrates the Advanced Encryption Standard (AES-128) directly into a Processing-in-Memory (PiM) framework. SPiME is designed as a modular array of parallel PiM units, each combining an AES core with a minimal control unit to enable distributed in-place encryption with minimal overhead. The architecture is fully implemented in Verilog and tested on multiple AMD UltraScale and UltraScale+ FPGAs. Evaluation results show that SPiME can scale beyond 4,000 parallel units while maintaining less than 5\% utilization of key FPGA resources on high-end devices. It delivers over 25~Gbps in sustained encryption throughput with predictable, low-latency performance. The design's portability, configurability, and resource efficiency make it a compelling solution for secure edge computing, embedded cryptographic systems, and customizable hardware accelerators.
Advancements in the defense industry are paramount for ensuring the safety and security of nations, providing robust protection against emerging threats. Among these threats, hypersonic missiles pose a significant challenge due to their extreme speeds and maneuverability, making accurate trajectory prediction a critical necessity for effective countermeasures. This paper addresses this challenge by employing a novel hybrid deep learning approach, integrating Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs). By leveraging the strengths of these architectures, the proposed method successfully predicts the complex trajectories of hypersonic missiles with high accuracy, offering a significant contribution to defense strategies and missile interception technologies. This research demonstrates the potential of advanced machine learning techniques in enhancing the predictive capabilities of defense systems.
This paper presents a multithread and efficient cryptographic hardware access (MECHA) for efficient and fast cryptographic operations that eliminates the need for context switching. Utilizing a UNIX domain socket, MECHA manages multiple requests from multiple applications simultaneously, resulting in faster processing and improved efficiency. We comprise several key components, including the Server thread, Client thread, Transceiver thread, and a pair of Sender and Receiver queues. MECHA design is portable and can be used with any communication protocol, with experimental results demonstrating a 83% increase in the speed of concurrent cryptographic requests compared to conventional interface design. MECHA architecture has significant potential in the field of secure communication applications ranging from cloud computing to the IoT, offering a faster and more efficient solution for managing multiple cryptographic operation requests concurrently.
The integration of AI/ML into medical devices is rapidly transforming healthcare by enhancing diagnostic and treatment facilities. However, this advancement also introduces serious cybersecurity risks due to the use of complex and often opaque models, extensive interconnectivity, interoperability with third-party peripheral devices, Internet connectivity, and vulnerabilities in the underlying technologies. These factors contribute to a broad attack surface and make threat prevention, detection, and mitigation challenging. Given the highly safety-critical nature of these devices, a cyberattack on these devices can cause the ML models to mispredict, thereby posing significant safety risks to patients. Therefore, ensuring the security of these devices from the time of design is essential. This paper underscores the urgency of addressing the cybersecurity challenges in ML-enabled medical devices at the pre-market phase. We begin by analyzing publicly available data on device recalls and adverse events, and known vulnerabilities, to understand the threat landscape of AI/ML-enabled medical devices and their repercussions on patient safety. Building on this analysis, we introduce a suite of tools and techniques designed by us to assist security analysts in conducting comprehensive premarket risk assessments. Our work aims to empower manufacturers to embed cybersecurity as a core design principle in AI/ML-enabled medical devices, thereby making them safe for patients.
We study the problem of differentially private continual counting in the unbounded setting where the input size $n$ is not known in advance. Current state-of-the-art algorithms based on optimal instantiations of the matrix mechanism cannot be directly applied here because their privacy guarantees only hold when key parameters are tuned to $n$. Using the common `doubling trick' avoids knowledge of $n$ but leads to suboptimal and non-smooth error. We solve this problem by introducing novel matrix factorizations based on logarithmic perturbations of the function $\frac{1}{\sqrt{1-z}}$ studied in prior works, which may be of independent interest. The resulting algorithm has smooth error, and for any $\alpha > 0$ and $t\leq n$ it is able to privately estimate the sum of the first $t$ data points with $O(\log^{2+2\alpha}(t))$ variance. It requires $O(t)$ space and amortized $O(\log t)$ time per round, compared to $O(\log(n)\log(t))$ variance, $O(n)$ space and $O(n \log n)$ pre-processing time for the nearly-optimal bounded-input algorithm of Henzinger et al. (SODA 2023). Empirically, we find that our algorithm's performance is also comparable to theirs in absolute terms: our variance is less than $1.5\times$ theirs for $t$ as large as $2^{24}$.
Confidential Virtual Machines (CVMs) provide isolation guarantees for data in use, but their threat model does not include physical level protection and side-channel attacks. Therefore, current deployments rely on trusted cloud providers to host the CVMs' underlying infrastructure. However, TEE attestations do not provide information about the operator hosting a CVM. Without knowing whether a Trusted Execution Environment (TEE) runs within a provider's infrastructure, a user cannot accurately assess the risks of physical attacks. We observe a misalignment in the threat model where the workloads are protected against other tenants but do not offer end-to-end security assurances to external users without relying on cloud providers. The attestation should be extended to bind the CVM with the provider. A possible solution can rely on the Protected Platform Identifier (PPID), a unique CPU identifier. However, the implementation details of various TEE manufacturers, attestation flows, and providers vary. This makes verification of attestations, ease of migration, and building applications without relying on a trusted party challenging, highlighting a key limitation that must be addressed for the adoption of CVMs. We discuss two points focusing on hardening and extensions of TEEs' attestation.
The Fair Data Exchange (FDE) protocol introduced at CCS 2024 offers atomic pay-per-file transfers with constant-size proofs, but its prover and verifier runtimes still scale linearly with the file length n. We collapse these costs to essentially constant by viewing the file as a rate-1 Reed-Solomon (RS) codeword, extending it to a lower-rate RS code with constant redundancy, encrypting this extended vector, and then proving correctness for only a small random subset of the resulting ciphertexts; RS decoding repairs any corrupted symbols with negligible failure probability. Our protocol preserves full client- and server-fairness, and adds only a tunable communication redundancy overhead. Finally, we patch the elliptic-curve mismatch in the Bitcoin instantiation of FDE with a compact zk-SNARK, enabling the entire exchange to run off-chain and falling back to just two on-chain transactions when channels are unavailable.
The pre-training of large language models (LLMs) relies on massive text datasets sourced from diverse and difficult-to-curate origins. Although membership inference attacks and hidden canaries have been explored to trace data usage, such methods rely on memorization of training data, which LM providers try to limit. In this work, we demonstrate that indirect data poisoning (where the targeted behavior is absent from training data) is not only feasible but also allow to effectively protect a dataset and trace its use. Using gradient-based optimization prompt-tuning, we make a model learn arbitrary secret sequences: secret responses to secret prompts that are absent from the training corpus. We validate our approach on language models pre-trained from scratch and show that less than 0.005% of poisoned tokens are sufficient to covertly make a LM learn a secret and detect it with extremely high confidence ($p < 10^{-55}$) with a theoretically certifiable scheme. Crucially, this occurs without performance degradation (on LM benchmarks) and despite secrets never appearing in the training set.
We study privacy leakage in the reasoning traces of large reasoning models used as personal agents. Unlike final outputs, reasoning traces are often assumed to be internal and safe. We challenge this assumption by showing that reasoning traces frequently contain sensitive user data, which can be extracted via prompt injections or accidentally leak into outputs. Through probing and agentic evaluations, we demonstrate that test-time compute approaches, particularly increased reasoning steps, amplify such leakage. While increasing the budget of those test-time compute approaches makes models more cautious in their final answers, it also leads them to reason more verbosely and leak more in their own thinking. This reveals a core tension: reasoning improves utility but enlarges the privacy attack surface. We argue that safety efforts must extend to the model's internal thinking, not just its outputs.