NDSS.2025 - Fall

| Total: 137

#1 ”Who is Trying to Access My Account?” Exploring User Perceptions and Reactions to Risk-based Authentication Notifications [PDF²] [Copy] [Kimi¹] [REL]

Authors: Tongxin Wei, Ding Wang, Yutong Li, Yuehuan Wang

Risk-based authentication (RBA) is gaining popularity and RBA notifications promptly alert users to protect their accounts from unauthorized access. Recent research indicates that users can identify legitimate login notifications triggered by themselves. However, little attention has been paid to whether RBA notifications triggered by non-account holders can effectively raise users' awareness of crises and prevent potential attacks. In this paper, we invite 258 online participants and 15 offline participants to explore users' perceptions, reactions, and expectations for three types of RBA notifications (i.e., RBA notifications triggered by correct passwords, incorrect passwords, and password resets). The results show that over 90% of participants consider RBA notifications important. Users do not show significant differences in their feelings and behaviors towards the three types of RBA notifications, but they have distinct expectations for each type. Most participants feel suspicious, nervous, and anxious upon receiving the three types of RBA notifications not triggered by themselves. Consequently, users immediately review the full content of the notification. 46% of users suspect that RBA notifications might be phishing attempts, while categorizing them as potential phishing attacks or spam may lead to ineffective account protection. Despite these suspicions, 65% of users still log into their accounts to check for suspicious activities and take no further action if no abnormalities are found. Additionally, the current format of RBA notifications fails to gain users' trust and meet their expectations. Our findings indicate that RBA notifications need to provide more detailed information about suspicious access, offer additional security measures, and clearly explain the risks involved. Finally, we offer five design recommendations for RBA notifications to better mitigate potential risks and enhance account security.

Subject: NDSS.2025 - Fall

#2 “Where Are We On Cyber?” – A Qualitative Study On Boards' Cybersecurity Risk Decision Making [PDF²] [Copy] [Kimi] [REL]

Authors: Jens Christian Opdenbusch, Jonas Hielscher, M. Angela Sasse

Boards are increasingly required to oversee the cybersecurity risks of their organizations. To make informed decisions, board members have to rely on the information given to them, which could come from their Chief Information Security Officers (CISOs), the reports of executives, audits, and regulations. However, little is known about how boards decide after receiving such information and how their relationship with other stakeholders shapes those decisions. Here, we present the results of an in-depth interview study with n=18 C-level managers, board members, CISOs, and C-level consultants of some of the largest UK-based companies. Our findings suggest that a power imbalance exists: board members will often not ask the right questions to executives and CISOs since they fear being exposed as IT novices. This ultimately makes boards highly dependent on those providing them with cybersecurity information, leading to losing their oversight function. Furthermore, cybersecurity risk is abstracted to budget decisions with no further involvement in cybersecurity strategies through boards. We discuss possible ways to strengthen boards' oversight functions, such as releasing industry benchmarks through public cyber agencies or implementing support structures within the company - such as standing (cybersecurity) risk and audit committees.

Subject: NDSS.2025 - Fall

#3 A Comprehensive Memory Safety Analysis of Bootloaders [PDF¹] [Copy] [Kimi] [REL]

Authors: Jianqiang Wang, Meng Wang, Qinying Wang, Nils Langius, Li Shi, Ali Abbasi, Thorsten Holz

The bootloader plays an important role during the boot process, as it connects two crucial components: the firmware and the operating system. After powering on, the bootloader takes control from the firmware, prepares the early boot environment, and then hands control over to the operating system. Modern computers often use a feature called secure boot to prevent malicious software from loading at startup. As a key part of the secure boot chain, the bootloader is responsible for verifying the operating system, loading its image into memory, and launching it. Therefore, the bootloader must be designed and implemented in a secure manner. However, bootloaders have increasingly provided more features and functionalities for end users. As the code base grows, bootloaders inevitably expose more attack surfaces. In recent years, vulnerabilities, particularly memory safety violations, have been discovered in various bootloaders. Some of these vulnerabilities can lead to denial of service or even bypass secure boot protections. Despite the bootloader’s critical role in the secure boot chain, a comprehensive memory safety analysis of bootloaders has yet to be conducted. In this paper, we present the first comprehensive and systematic memory safety analysis of bootloaders, based on a survey of previous bootloader vulnerabilities. We examine the potential attack surfaces of various bootloaders and how these surfaces lead to vulnerabilities. We observe that malicious input from peripherals such as storage devices and networks is a primary method attackers use to exploit bootloader vulnerabilities. To assist bootloader developers in detecting vulnerabilities at scale, we designed and implemented a bootloader fuzzing framework based on our analysis. In our experiments, we discovered 39 vulnerabilities in nine bootloaders, of which 38 are new vulnerabilities. In particular, 14 vulnerabilities were found in the widely used Linux standard bootloader GRUB, some of which can even lead to secure boot bypass if properly exploited. So far, five CVEs have been assigned to our findings.

Subject: NDSS.2025 - Fall

#4 A Formal Approach to Multi-Layered Privileges for Enclaves [PDF] [Copy] [Kimi¹] [REL]

Authors: Ganxiang Yang, Chenyang Liu, Zhen Huang, Guoxing Chen, Hongfei Fu, Yuanyuan Zhang, Haojin Zhu

Trusted Execution Environments (TEE) have been widely adopted as a protection approach for security-critical applications. Although feature extensions have been previously proposed to improve the usability of enclaves, their provision patterns are still confronted with security challenges. This paper presents Palantir, a verifiable multi-layered inter-enclave privilege model for secure feature extensions to enclaves. Specifically, a parent-children inter-enclave relationship, with which a parent enclave is granted two privileged permissions, the Execution Control and Spatial Control, over its children enclaves to facilitate secure feature extensions, is introduced. Moreover, by enabling nesting parent-children relationships, Palantir achieves multi-layered privileges (MLP) that allow feature extensions to be placed in various privilege layers following the Principle of Least Privilege. To prove the security of Palantir, we verified that our privilege model does not break or weaken the security guarantees of enclaves by building and verifying a formal model named $text{TAP}^{infty}$. Furthermore, We implemented a prototype of Palantir on Penglai, an open-sourced RISC-V TEE platform. The evaluation demonstrates the promising performance of Palantir in runtime overhead $(<5%)$ and startup latencies.

Subject: NDSS.2025 - Fall

#5 A Large-Scale Measurement Study of the PROXY Protocol and its Security Implications [PDF] [Copy] [Kimi] [REL]

Authors: Stijn Pletinckx, Christopher Kruegel, Giovanni Vigna

Reverse proxy servers play a critical role in optimizing Internet services, offering benefits ranging from load balancing to Denial of Service (DoS) protection. A known shortcoming of such proxies is that the backend server becomes oblivious to the IP address of the client who initiated the connection since all requests are forwarded by the proxy server. For HTTP, this issue is trivially solved by the X-Forwarded-For header, which allows the proxy server to pass to the backend server the IP address of the client that originated the request. Unfortunately, no such equivalent exists for many other protocols. To solve this issue, HAProxy created the PROXY protocol, which communicates client information from a proxy server to a backend server at a lower level in the network stack (Layer 4), making it protocol-agnostic. In this work, we are the first to study the use of the PROXY protocol at Internet scale and investigate the security impact of its misconfigurations. We launched a measurement study on the full IPv4 address range and found that, over HTTP, more than 170,000 hosts accept PROXY protocol data from arbitrary sources. We demonstrate how to abuse this protocol to bypass on-path proxies (and their protections) and leak sensitive information from backend infrastructures. We discovered over 10,000 servers that are vulnerable to an access bypass, triggered by injecting a (spoofed) PROXY protocol header. Using this technique, we obtained access to over 500 internal servers providing control over IoT monitoring platforms and smart home automation devices, allowing us to, for example, regulate remote controlled window blinds or control security cameras and alarm systems. Beyond HTTP, we demonstrate how the PROXY protocol can be used to turn over 350 SMTP servers into open relays, enabling an attacker to send arbitrary emails from any email address. In sum, our study exposes how PROXY protocol misconfigurations lead to severe security issues that affect multiple protocols prominently used in the wild.

Subject: NDSS.2025 - Fall

#6 A Multifaceted Study on the Use of TLS and Auto-detect in Email Ecosystems [PDF] [Copy] [Kimi] [REL]

Authors: Ka Fun Tang, Che Wei Tu, Sui Ling Angela Mak, Sze Yiu Chau

Various email protocols, including IMAP, POP3, and SMTP, were originally designed as “plaintext” protocols without inbuilt confidentiality and integrity guarantees. To protect the communication traffic, TLS can either be used implicitly before the start of those email protocols, or introduced as an opportunistic upgrade in a post-hoc fashion. In order to improve user experience, many email clients nowadays provide a so-called “auto-detect” feature to automatically determine a functional set of configuration parameters for the users. In this paper, we present a multifaceted study on the security of the use of TLS and auto-detect in email clients. First, to evaluate the design and implementation of client-side TLS and auto-detect, we tested 49 email clients and uncovered various flaws that can lead to covert security downgrade and exposure of user credentials to attackers. Second, to understand whether current deployment practices adequately avoid the security traps introduced by opportunistic TLS and auto-detect, we collected and analyzed 1102 email setup guides from academic institutes across the world, and observed problems that can drive users to adopt insecure email settings. Finally, with the server addresses obtained from the setup guides, we evaluate the sever-side support for implicit and opportunistic TLS, as well as the characteristics of their certificates. Our results suggest that many users suffer from an inadvertent loss of security due to careless handling of TLS and auto-detect, and organizations in general are better off prescribing concrete and detailed manual configuration to their users.

Subject: NDSS.2025 - Fall

#7 A New PPML Paradigm for Quantized Models [PDF] [Copy] [Kimi] [REL]

Authors: Tianpei Lu, Bingsheng Zhang, Xiaoyuan Zhang, Kui Ren

Model quantization has become a common practice in machine learning (ML) to improve efficiency and reduce computational/communicational overhead. However, adopting quantization in privacy-preserving machine learning (PPML) remains challenging due to the complex internal structure of quantized operators, which leads to inefficient protocols under the existing PPML frameworks. In this work, we propose a new PPML paradigm that is tailor-made for and can benefit from quantized models. Our main observation is that look-up tables can ignore the complex internal constructs of any functions which can be used to simplify the quantized operator evaluation. We view the model inference process as a sequence of quantized operators, and each operator is implemented by a look-up table. We then develop an efficient private look-up table evaluation protocol, and its online communication cost is only $log n$, where $n$ is the size of the look-up table. On a single CPU core, our protocol can evaluate $2^{26}$ tables with 8-bit input and 8-bit output per second. The resulting PPML framework for quantized models offers extremely fast online performance. The experimental results demonstrate that our quantization strategy achieves substantial speedups over SOTA PPML solutions, improving the online performance by $40sim 60 times$ w.r.t. convolutional neural network (CNN) models, such as AlexNet, VGG16, and ResNet18, and by $10sim 25 times$ w.r.t. large language models (LLMs), such as GPT-2, GPT-Neo, and Llama2.

Subject: NDSS.2025 - Fall

#8 Alba: The Dawn of Scalable Bridges for Blockchains [PDF] [Copy] [Kimi] [REL]

Authors: Giulia Scaffino, Lukas Aumayr, Mahsa Bastankhah, Zeta Avarikioti, Matteo Maffei

Over the past decade, cryptocurrencies have garnered attention from academia and industry alike, fostering a diverse blockchain ecosystem and novel applications. The inception of bridges improved interoperability, enabling asset transfers across different blockchains to capitalize on their unique features. Despite their surge in popularity and the emergence of Decentralized Finance (DeFi), trustless bridge protocols remain inefficient, either relaying too much information (e.g., light-client-based bridges) or demanding expensive computation (e.g., zk-based bridges). These inefficiencies arise because existing bridges securely prove a transaction's on-chain inclusion on another blockchain. Yet this is unnecessary as off-chain solutions, like payment and state channels, permit safe transactions without on-chain publication. However, existing bridges do not support the verification of off-chain payments. This paper fills this gap by introducing the concept of Pay2Chain bridges that leverage the advantages of off-chain solutions like payment channels to overcome current bridges' limitations. Our proposed Pay2Chain bridge, named Alba, facilitates the efficient, secure, and trustless execution of conditional payments or smart contracts on a target blockchain based on off-chain events. Alba, besides its technical advantages, enriches the source blockchain's ecosystem by facilitating DeFi applications, multi-asset payment channels, and optimistic stateful off-chain computation. We formalize the security of Alba against Byzantine adversaries in the UC framework and complement it with a game theoretic analysis. We further introduce formal scalability metrics to demonstrate Alba's efficiency. Our empirical evaluation confirms Alba's efficiency in terms of communication complexity and on-chain costs, with its optimistic case incurring only twice the cost of a standard Ethereum transaction of token ownership transfer.

Subject: NDSS.2025 - Fall

#9 All your (data)base are belong to us: Characterizing Database Ransom(ware) Attacks [PDF] [Copy] [Kimi] [REL]

Authors: Kevin van Liebergen, Gibran Gomez, Srdjan Matic, Juan Caballero

We present the first systematic study of database ransom(ware) attacks, a class of attacks where attackers scan for database servers, log in by leveraging the lack of authentication or weak credentials, drop the database contents, and demand a ransom to return the deleted data. We examine 23,736 ransom notes collected from 60,427 compromised database servers over three years, and set up database honeypots to obtain a first-hand view of current attacks. Database ransom(ware) attacks are prevalent with 6K newly infected servers in March 2024, a 60% increase over a year earlier. Our honeypots get infected in 14 hours since they are connected to the Internet. Weak authentication issues are two orders of magnitude more frequent on Elasticsearch servers compared to MySQL servers due to slow adoption of the latest Elasticsearch versions. To analyze who is behind database ransom(ware) attacks we implement a clustering approach that first identifies campaigns using the similarity of the ransom notes text. Then, it determines which campaigns are run by the same group by leveraging indicator reuse and information from the Bitcoin blockchain. For each group, it computes properties such as the number of compromised servers, the lifetime, the revenue, and the indicators used. Our approach identifies that the 60,427 database servers are victims of 91 campaigns run by 32 groups. It uncovers a dominant group responsible for 76% of the infected servers and 90% of the financial impact. We find links between the dominant group, a nation-state, and a previous attack on Git repositories.

Subject: NDSS.2025 - Fall

#10 ASGARD: Protecting On-Device Deep Neural Networks with Virtualization-Based Trusted Execution Environments [PDF²] [Copy] [Kimi¹] [REL]

Authors: Myungsuk Moon, Minhee Kim, Joonkyo Jung, Dokyung Song

On-device deep learning, increasingly popular for enhancing user privacy, now poses a serious risk to the privacy of deep neural network (DNN) models. Researchers have proposed to leverage Arm TrustZone's trusted execution environment (TEE) to protect models from attacks originating in the rich execution environment (REE). Existing solutions, however, fall short: (i) those that fully contain DNN inference within a TEE either support inference on CPUs only, or require substantial modifications to closed-source proprietary software for incorporating accelerators; (ii) those that offload part of DNN inference to the REE either leave a portion of DNNs unprotected, or incur large run-time overheads due to frequent model (de)obfuscation and TEE-to-REE exits. We present ASGARD, the first virtualization-based TEE solution designed to protect on-device DNNs on legacy Armv8-A SoCs. Unlike prior work that uses TrustZone-based TEEs for model protection, ASGARD's TEEs remain compatible with existing proprietary software, maintain the trusted computing base (TCB) minimal, and incur near-zero run-time overhead. To this end, ASGARD (i) securely extends the boundaries of an existing TEE to incorporate an SoC-integrated accelerator via secure I/O passthrough, (ii) tightly controls the size of the TCB via our aggressive yet security-preserving platform- and application-level TCB debloating techniques, and (iii) mitigates the number of costly TEE-to-REE exits via our exit-coalescing DNN execution planning. We implemented ASGARD on RK3588S, an Armv8.2-A-based commodity Android platform equipped with a Rockchip NPU, without modifying Rockchip- nor Arm-proprietary software. Our evaluation demonstrates that ASGARD effectively protects on-device DNNs in legacy SoCs with a minimal TCB size and negligible inference latency overhead.

Subject: NDSS.2025 - Fall

#11 Attributing Open-Source Contributions is Critical but Difficult: A Systematic Analysis of GitHub Practices and Their Impact on Software Supply Chain Security [PDF] [Copy] [Kimi¹] [REL]

Authors: Jan-Ulrich Holtgrave, Kay Friedrich, Fabian Fischer, Nicolas Huaman, Niklas Busch, Jan H. Klemmer, Marcel Fourné, Oliver Wiese, Dominik Wermke, Sascha Fahl

Critical open-source projects form the basis of many large software systems. They provide trusted and extensible implementations of important functionality for cryptography, compatibility, and security. Verifying commit authorship authenticity in open-source projects is essential and challenging. Git users can freely configure author details such as names and email addresses. Platforms like GitHub use such information to generate profile links to user accounts. We demonstrate three attack scenarios malicious actors can use to manipulate projects and profiles on GitHub to appear trustworthy. We designed a mixed-research study to assess the effect on critical open-source software projects and evaluated countermeasures. First, we conducted a large-scale measurement among 50,328 critical open-source projects on GitHub and demonstrated that contribution workflows can be abused in 85.9% of the projects. We identified 573,043 email addresses that a malicious actor can claim to hijack historic contributions and improve the trustworthiness of their accounts. When looking at commit signing as a countermeasure, we found that the majority of users (95.4%) never signed a commit, and for the majority of projects (72.1%), no commit was ever signed. In contrast, only 2.0% of the users signed all their commits, and for 0.2% of the projects all commits were signed. Commit signing is not associated with projects’ programming languages, topics, or other security measures. Second, we analyzed online security advice to explore the awareness of contributor spoofing and identify recommended countermeasures. Most documents exhibit awareness of the simple spoofing technique via Git commits but no awareness of problems with GitHub’s handling of email addresses.

Subject: NDSS.2025 - Fall

#12 Automated Mass Malware Factory: The Convergence of Piggybacking and Adversarial Example in Android Malicious Software Generation [PDF²] [Copy] [Kimi] [REL]

Authors: Heng Li, Zhiyuan Yao, Bang Wu, Cuiying Gao, Teng Xu, Wei Yuan, Xiapu Luo

Adversarial example techniques have been demonstrated to be highly effective against Android malware detection systems, enabling malware to evade detection with minimal code modifications. However, existing adversarial example techniques overlook the process of malware generation, thus restricting the applicability of adversarial example techniques. In this paper, we investigate piggybacked malware, a type of malware generated in bulk by piggybacking malicious code into popular apps, and combine it with adversarial example techniques. Given a malicious code segment (i.e., a rider), we can generate adversarial perturbations tailored to it and insert them into any carrier, enabling the resulting malware to evade detection. Through exploring the mechanism by which adversarial perturbation affects piggybacked malware code, we propose an adversarial piggybacked malware generation method, which comprises three modules: Malicious Rider Extraction, Adversarial Perturbation Generation, and Benign Carrier Selection. Extensive experiments have demonstrated that our method can efficiently generate a large volume of malware in a short period, and significantly increase the likelihood of evading detection. Our method achieved an average attack success rate (ASR) of 88.3% on machine learning-based detection models (e.g., Drebin and MaMaDroid), and an ASR of 76% and 92% on commercial engines Microsoft and Kingsoft, respectively. Furthermore, we have explored potential defenses against our adversarial piggybacked malware.

Subject: NDSS.2025 - Fall

#13 Automatic Insecurity: Exploring Email Auto-configuration in the Wild [PDF] [Copy] [Kimi] [REL]

Authors: Shushang Wen, Yiming Zhang, Yuxiang Shen, Bingyu Li, Haixin Duan, Jingqiang Lin

Email clients that support auto-configuration mechanisms automatically retrieve server configuration information, such as the hostname, port number, and connection type, allowing users to log in by simply entering email addresses and passwords. Auto-configuration mechanisms are being increasingly adopted. However, the security implications of these mechanisms, both in terms of implementation and deployment, have not yet been thoroughly studied. In this paper, we present the first systematic analysis of security threats associated with email auto-configuration and evaluate their impacts. We summarize 10 attack scenarios, covering 17 defects (including 8 newly identified ones), along with 4 inadequate client UI notifications. These attack scenarios can either cause a victim to connect to an attacker-controlled server or establish an insecure connection, putting the victim’s credentials at risk. Moreover, our large-scale measurements and in-depth analysis revealed serious insecurity of auto-configuration applications in the wild. On the server-side, we discovered 49,013 domains, including 19 of the Top-1K popular domains, were misconfigured. On the client-side, 22 out of 29 clients were vulnerable to those threats. Moreover, 27 out of 29 clients exhibited at least one UI-notification defect that facilitates silent attacks. These defects arise from misconfiguration, mismanagement, flawed implementation and compatibility. We hope this paper raises attention to email auto-configuration security.

Subject: NDSS.2025 - Fall

#14 Automatic Library Fuzzing through API Relation Evolvement [PDF] [Copy] [Kimi] [REL]

Authors: Jiayi Lin, Qingyu Zhang, Junzhe Li, Chenxin Sun, Hao Zhou, Changhua Luo, Chenxiong Qian

Software libraries are foundational components in modern software ecosystems. Vulnerabilities within these libraries pose significant security threats. Fuzzing is a widely used technique for uncovering software vulnerabilities. However, its application to software libraries poses considerable challenges, necessitating carefully crafted drivers that reflect diverse yet correct API usages. Existing works on automatic library fuzzing either suffer from high false positives due to API misuse caused by arbitrarily generated API sequences, or fail to produce diverse API sequences by overly relying on existing code snippets that express restricted API usages, thus missing deeper API vulnerabilities. This work proposes NEXZZER, a new fuzzer that automatically detects vulnerabilities in libraries. NEXZZER employs a hybrid relation learning strategy to continuously infer and evolve API relations, incorporating a novel driver architecture to augment the testing coverage of libraries and facilitate deep vulnerability discovery. We evaluated NEXZZER across 18 libraries and the Google Fuzzer Test Suite. The results demonstrate its considerable advantages in code coverage and vulnerability-finding capabilities compared to prior works. NEXZZER can also automatically identify and filter out most API misuse crashes. Moreover, NEXZZER discovered 27 previously unknown vulnerabilities in well-tested libraries, including OpenSSL and libpcre2. At the time of writing, developers have confirmed 24 of them, and 9 were fixed because of our reports.

Subject: NDSS.2025 - Fall

#15 Balancing Privacy and Data Utilization: A Comparative Vignette Study on User Acceptance of Data Trustees in Germany and the US [PDF] [Copy] [Kimi] [REL]

Authors: Leona Lassak, Hanna Püschel, Oliver D. Reithmaier, Tobias Gostomzyk, Markus Dürmuth

In times of big data, connected devices, and increasing self-measurement, protecting consumer privacy remains a challenge despite ongoing technological and legislative efforts. Data trustees present a promising solution, aiming to balance data utilization with privacy concerns by facilitating secure data sharing and ensuring individual control. However, successful implementation hinges on user acceptance and trust. We conducted a large-scale, vignette-based, census-representative online study examining factors influencing the acceptance of data trustees for medical, automotive, IoT, and online data. With n=714 participants from Germany and n=1036 from the US, our study reveals varied willingness to use data trustees across both countries, with notable skepticism and outright rejection from a significant portion of users. We also identified significant domain-specific differences, including the influence of user anonymity, perceived personal and societal benefits, and the recipients of the data. Contrary to common beliefs, organizational and regulatory decisions such as the storage location, the operator, and supervision appeared less relevant to users' decisions. In conclusion, while there exists a potential user base for data trustees, achieving widespread acceptance will require explicit and targeted implementation strategies tailored to address diverse user expectations. Our findings underscore the importance of understanding these nuances for effectively deploying data trustee frameworks that meet both regulatory requirements and user preferences while upholding highest security and privacy standards.

Subject: NDSS.2025 - Fall

#16 BARBIE: Robust Backdoor Detection Based on Latent Separability [PDF] [Copy] [Kimi] [REL]

Authors: Hanlei Zhang, Yijie Bai, Yanjiao Chen, Zhongming Ma, Wenyuan Xu

Backdoor attacks are an essential risk to deep learning model sharing. Fundamentally, backdoored models are different from benign models considering latent separability, i.e., distinguishable differences in model latent representations. However, existing methods quantify latent separability by clustering latent representations or computing distances between latent representations, which are easy to be compromised by adaptive attacks. In this paper, we propose BARBIE, a backdoor detection approach that can pinpoint latent separability under adaptive backdoor attacks. To achieve this goal, we propose a new latent separability metric, named relative competition score (RCS), by characterizing the dominance of latent representations over model output, which is robust against various backdoor attacks and is hard to compromise. Without the need to access any benign or backdoored sample, we invert two sets of latent representations of each label, reflecting the normal latent representations of benign models and intensifying the abnormal ones of backdoored models, to calculate RCS. We compute a series of RCS-based indicators to comprehensively reflect the differences between backdoored models and benign models. We validate the effectiveness of BARBIE on more than 10,000 models on 4 datasets against 14 types of backdoor attacks, including the adaptive attacks against latent separability. Compared with 7 baselines, BARBIE improves the average true positive rate by 17.05% against source-agnostic attacks, 27.72% against source-specific attacks, 43.17% against sample-specific attacks and 11.48% against clean-label attacks. BARBIE also maintains lower false positive rates than baselines. The source code is available at: https://github.com/Forliqr/BARBIE.

Subject: NDSS.2025 - Fall

#17 Beyond Classification: Inferring Function Names in Stripped Binaries via Domain Adapted LLMs [PDF] [Copy] [Kimi] [REL]

Authors: Linxi Jiang, Xin Jin, Zhiqiang Lin

Function name inference in stripped binaries is an important yet challenging task for many security applications, such as malware analysis and vulnerability discovery, due to the need to grasp binary code semantics amidst diverse instruction sets, architectures, compiler optimizations, and obfuscations. While machine learning has made significant progress in this field, existing methods often struggle with unseen data, constrained by their reliance on a limited vocabulary-based classification approach. In this paper, we present SymGen, a novel framework employing an autoregressive generation paradigm powered by domain-adapted generative large language models (LLMs) for enhanced binary code interpretation. We have evaluated SymGen on a dataset comprising 2,237,915 binary functions across four architectures (x86-64, x86-32, ARM, MIPS) with four levels of optimizations (O0-O3) where it surpasses the state-of-the-art with up to 409.3%, 553.5%, and 489.4% advancement in precision, recall, and F1 score, respectively, showing superior effectiveness and generalizability. Our ablation and case studies also demonstrate the significant performance boosts achieved by our design, e.g., the domain adaptation approach, alongside showcasing SymGen’s practicality in analyzing real-world binaries, e.g., obfuscated binaries and malware executables.

Subject: NDSS.2025 - Fall

#18 BinEnhance: An Enhancement Framework Based on External Environment Semantics for Binary Code Search [PDF] [Copy] [Kimi] [REL]

Authors: Yongpan Wang, Hong Li, Xiaojie Zhu, Siyuan Li, Chaopeng Dong, Shouguo Yang, Kangyuan Qin

Binary code search plays a crucial role in applications like software reuse detection, and vulnerability identification. Currently, existing models are typically based on either internal code semantics or a combination of function call graphs (CG) and internal code semantics. However, these models have limitations. Internal code semantic models only consider the semantics within the function, ignoring the inter-function semantics, making it difficult to handle situations such as function inlining. The combination of CG and internal code semantics is insufficient for addressing complex real-world scenarios. To address these limitations, we propose BINENHANCE, a novel framework designed to leverage the inter-function semantics to enhance the expression of internal code semantics for binary code search. Specifically, BINENHANCE constructs an External Environment Semantic Graph (EESG), which establishes a stable and analogous external environment for homologous functions by using different inter-function semantic relation (textit{e.g.}, textit{call}, textit{location}, textit{data-co-use}). After the construction of EESG, we utilize the embeddings generated by existing internal code semantic models to initialize EESG nodes. Finally, we design a Semantic Enhancement Model (SEM) that uses Relational Graph Convolutional Networks (RGCNs) and a residual block to learn valuable external semantics on the EESG for generating the enhanced semantics embedding. In addition, BinEnhance utilizes data feature similarity to refine the cosine similarity of semantic embeddings. We conduct experiments under six different tasks (textit{e.g.}, under textit{function inlining} scenario) and the results illustrate the performance and robustness of BINENHANCE. The application of BinEnhance to HermesSim, Asm2vec, TREX, Gemini, and Asteria on two public datasets results in an improvement of Mean Average Precision (MAP) from 53.6% to 69.7%. Moreover, the efficiency increases fourfold.

Subject: NDSS.2025 - Fall

#19 BitShield: Defending Against Bit-Flip Attacks on DNN Executables [PDF] [Copy] [Kimi] [REL]

Authors: Yanzuo Chen, Yuanyuan Yuan, Zhibo Liu, Sihang Hu, Tianxiang Li, Shuai Wang

Recent research has demonstrated the severity and prevalence of bit-flip attacks (BFAs; e.g., with Rowhammer techniques) on deep neural networks (DNNs). BFAs can manipulate DNN prediction and completely deplete DNN intelligence, and can be launched against both DNNs running on deep learning (DL) frameworks like PyTorch, as well as those compiled into standalone executables by DL compilers. While BFA defenses have been proposed for models on DL frameworks, we find them incapable of protecting DNN executables due to the new attack vectors on these executables. This paper proposes the first defense against BFA for DNN executables. We first present a motivating study to demonstrate the fragility and unique attack surfaces of DNN executables. Specifically, attackers can flip bits in the `.text` section to alter the computation logic of DNN executables and consequently manipulate DNN predictions; previous defenses guarding model weights can also be easily evaded when implemented in DNN executables. Subsequently, we propose BitShield, a full-fledged defense that detects BFAs targeting both data and `.text` sections in DNN executables. We novelly model BFA on DNN executables as a process to corrupt their semantics, and base BitShield on semantic integrity checks. Moreover, by deliberately fusing code checksum routines into a DNN’s semantics, we make BitShield highly resilient against BFAs targeting itself. BitShield is integrated in a popular DL compiler (Amazon TVM) and is compatible with all existing compilation and optimization passes. Unlike prior defenses, BitShield is designed to protect more vulnerable full-precision DNNs and does not assume specific attack methods, exhibiting high generality. BitShield also proactively detects ongoing BFA attempts instead of passively hardening DNNs. Evaluations show that BitShield provides strong protection against BFAs (average mitigation rate 97.51%) with low performance overhead (2.47% on average) even when faced with fully white-box, powerful attackers.

Subject: NDSS.2025 - Fall

#20 Blackbox Fuzzing of Distributed Systems with Multi-Dimensional Inputs and Symmetry-Based Feedback Pruning [PDF¹] [Copy] [Kimi] [REL]

Authors: Yonghao Zou, Jia-Ju Bai, Zu-Ming Jiang, Ming Zhao, Diyu Zhou

This paper presents DistFuzz, which, to our knowledge, is the first feedback-guided blackbox fuzzing framework for distributed systems. The novelty of DistFuzz comes from two conceptual contributions on key aspects of distributed system fuzzing: the input space and feedback metrics. Specifically, unlike prior work that focuses on systematically mutating faults, exploiting the request-driven and timing-dependence nature of distributed systems, DistFuzz proposes a multi-dimensional input space by incorporating regular events and relative timing among events as the other two dimensions. Furthermore, observing that important state changes in distributed systems can be indicated by network messages among nodes, DistFuzz utilizes the sequences of network messages with symmetry-based pruning as program feedback, which departs from the conventional wisdom that effective feedback requires code instrumentation/analysis and/or user inputs. DistFuzz finds 52 real bugs in ten popular distributed systems in C/C++, Go, and Java. Among these bugs, 28 have been confirmed by the developers, 20 were unknown before, and 4 have been assigned with CVEs.

Subject: NDSS.2025 - Fall

#21 Blindfold: Confidential Memory Management by Untrusted Operating System [PDF] [Copy] [Kimi] [REL]

Authors: Caihua Li, Seung-seob Lee, Lin Zhong

Confidential Computing (CC) has received increasing attention in recent years as a mechanism to protect user data from untrusted operating systems (OSes). Existing CC solutions hide confidential memory from the OS and/or encrypt it to achieve confidentiality. In doing so, they render OS memory optimization unusable or complicate the trusted computing base (TCB) required for optimization. This paper presents our results toward overcoming these limitations, synthesized in a CC design named Blindfold. Like many other CC solutions, Blindfold relies on a small trusted software component running at a higher privilege level than the kernel, called Guardian. It features three techniques that can enhance existing CC solutions. First, instead of nesting page tables, Blindfold’s Guardian mediates how the OS accesses memory and handles exceptions by switching page and interrupt tables. Second, Blindfold employs a lightweight capability system to regulate the OS’s semantic access to user memory, unifying case-by-case approaches in previous work. Finally, Blindfold provides carefully designed secure ABI for confidential memory management without encryption. We report an implementation of Blindfold that works on ARMv8-A/Linux. Using Blindfold's prototype, we are able to evaluate the cost of enabling confidential memory management by the untrusted Linux kernel. We show Blindfold has a smaller runtime TCB than related systems and enjoys competitive performance. More importantly, we show that the Linux kernel, including all of its memory optimizations except memory compression, can function properly for confidential memory. This requires only about 400 lines of kernel modifications.

Subject: NDSS.2025 - Fall

#22 CASPR: Context-Aware Security Policy Recommendation [PDF] [Copy] [Kimi] [REL]

Authors: Lifang Xiao, Hanyu Wang, Aimin Yu, Lixin Zhao, Dan Meng

Nowadays, SELinux has been widely used to provide flexible mandatory access control and security policies are critical to maintain the security of operating systems. Strictly speaking, all access requests must be restricted by appropriate policy rules to satisfy the functional requirements of the software or application. However, manually configuring security policy rules is an error-prone and time-consuming task that often requires expert knowledge. Therefore, it is a challenging task to recommend policy rules without anomalies effectively due to the numerous policy rules and the complexity of semantics. The majority of previous research mined information from policies to recommend rules but did not apply to the newly defined types without any rules. In this paper, we propose a context-aware security policy recommendation (CASPR) method that can automatically analyze and refine security policy rules. Context-aware information in CASPR includes policy rules, file locations, audit logs, and attribute information. According to these context-aware information, multiple features are extracted to calculate the similarity of privilege sets. Based on the calculation results, CASPR clusters types by the K-means model and then recommends rules automatically. The method automatically detects anomalies in security policy, namely, constraint conflicts, policy inconsistencies, and permission incompleteness. Further, the detected anomalous policies are refined so that the authorization rules can be effectively enforced. The experiment results confirm the feasibility of the proposed method for recommending effective rules for different versions of policies. We demonstrate the effectiveness of clustering by CASPR and calculate the contribution of each context-aware feature based on SHAP. CASPR not only recommends rules for newly defined types based on context-aware information but also enhances the accuracy of security policy recommendations for existing types, compared to other rule recommendation models. CASPR has an average accuracy of 91.582% and F1-score of 93.761% in recommending rules. Further, three kinds of anomalies in the policies can be detected and automatically repaired. We employ CASPR in multiple operating systems to illustrate the universality. The research has significant implications for security policy recommendation and provides a novel method for policy analysis with great potential.

Subject: NDSS.2025 - Fall

#23 CCTAG: Configurable and Combinable Tagged Architecture [PDF] [Copy] [Kimi] [REL]

Authors: Zhanpeng Liu, Yi Rong, Chenyang Li, Wende Tan, Yuan Li, Xinhui Han, Songtao Yang, Chao Zhang

Memory safety violations are a significant concern in real-world programs, prompting the development of various mitigation methods. However, existing cost-efficient defenses provide limited protection and can be bypassed by sophisticated attacks, necessitating the combination of multiple defenses. Unfortunately, combining these defenses often results in performance degradation and compatibility issues. We present CCTAG, a lightweight architecture that simplifies the integration of diverse tag-based defense mechanisms. It offers configurable tag verification and modification rules to build various security policies, acting as basic protection primitives for defense applications. Its policy-centric mask design boosts flexibility and prevents conflicts, enabling multiple defense mechanisms to run concurrently. Our RISC-V prototype on an FPGA board demonstrates that CCTAG incurs minimal hardware overhead, with a slight increase in LUTs (6.77%) and FFs (8.02%). With combined protections including ret address protection, code pointer and vtable pointer integrity, and memory coloring, the SPEC CPU CINT2006 and CINT2017 benchmarks report low runtime overheads of 4.71% and 7.93%, respectively. Security assessments with CVEs covering major memory safety vulnerabilities and various exploitation techniques verify CCTAG’s effectiveness in mitigating real-world threats.

Subject: NDSS.2025 - Fall

#24 Characterizing the Impact of Audio Deepfakes in the Presence of Cochlear Implant [PDF¹] [Copy] [Kimi] [REL]

Authors: Magdalena Pasternak, Kevin Warren, Daniel Olszewski, Susan Nittrouer, Patrick Traynor, Kevin Butler

Cochlear implants (CIs) allow deaf and hard-of-hearing individuals to use audio devices, such as phones or voice assistants. However, the advent of increasingly sophisticated synthetic audio (i.e., deepfakes) potentially threatens these users. Yet, this population's susceptibility to such attacks is unclear. In this paper, we perform the first study of the impact of audio deepfakes on CI populations. We examine the use of CI-simulated audio within deepfake detectors. Based on these results, we conduct a user study with 35 CI users and 87 hearing persons (HPs) to determine differences in how CI users perceive deepfake audio. We show that CI users can, similarly to HPs, identify text-to-speech generated deepfakes. Yet, they perform substantially worse for voice conversion deepfake generation algorithms, achieving only 67% correct audio classification. We also evaluate how detection models trained on a CI-simulated audio compare to CI users and investigate if they can effectively act as proxies for CI users. This work begins an investigation into the intersection between adversarial audio and CI users to identify and mitigate threats against this marginalized group.

Subject: NDSS.2025 - Fall

#25 CounterSEVeillance: Performance-Counter Attacks on AMD SEV-SNP [PDF] [Copy] [Kimi] [REL]

Authors: Stefan Gast, Hannes Weissteiner, Robin Leander Schröder, Daniel Gruss

Confidential virtual machines (VMs) promise higher security by running the VM inside a trusted execution environment (TEE). Recent AMD server processors support confidential VMs with the SEV-SNP processor extension. SEV-SNP provides guarantees for integrity and confidentiality for confidential VMs despite running them in a shared hosting environment. In this paper, we introduce CounterSEVeillance, a new side-channel attack leaking secret-dependent control flow and operand properties from performance counter data. Our attack is the first to exploit performance counter side-channel leakage with single-instruction resolution from SEV-SNP VMs and works on fully patched systems. We systematically analyze performance counter events in SEV-SNP VMs and find that 228 are exposed to a potentially malicious hypervisor. CounterSEVeillance builds on this analysis and records performance counter traces with an instruction-level resolution by single-stepping the victim VM using APIC interrupts in combination with page faults. We match CounterSEVeillance traces against binaries, precisely recovering the outcome of any secret-dependent conditional branch and inferring operand properties. We present four attack case studies, in which we exemplarily showcase concrete exploitable leakage with 6 of the exposed performance counters. First, we use CounterSEVeillance to extract a full RSA-4096 key from a single Mbed TLS signature process in less than 8 minutes. Second, we present the first side-channel attack on TOTP verification running in an AMD SEV-SNP VM, recovering a 6-digit TOTP with only 31.1 guesses on average. Third, we show that CounterSEVeillance can leak the secret key from which the TOTPs are derived from the underlying base32 decoder. Fourth and finally, we show that CounterSEVeillance can also be used to construct a plaintext-checking oracle in a divide-and-surrender-style attack. We conclude that moving an entire VM into a setting with a privileged adversary increases the attack surface, given the vast amounts of code not vetted for this specific security setting.

Subject: NDSS.2025 - Fall