A trigger-action platform (TAP) is a type of distributed system that allows end-users to create programs that stitch their web-based services together to achieve useful automation. For example, a program can be triggered when a new spreadsheet row is added; it can then compute on that data and invoke an action, such as sending a message on Slack. Current TAP architectures require users to place complete trust in their secure operation. Experience has shown that unconditional trust in cloud services is unwarranted: an attacker who compromises the TAP cloud service gains access to sensitive data and devices for millions of users. In this work, we re-architect TAPs so that users have to place minimal trust in the cloud. Specifically, we design and implement TAPDance, a TAP that guarantees confidentiality and integrity of program execution in the presence of an untrustworthy TAP service. We utilize RISC-V Keystone enclaves to enable these security guarantees while minimizing the trusted software and hardware base. Performance results indicate that TAPDance outperforms a baseline TAP implementation using Node.js, with 32% lower latency and 33% higher throughput on average.
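To make the trigger-compute-action pattern concrete, below is a minimal sketch of the kind of program a TAP executes; the handler name, connector stub, and field names are illustrative, not TAPDance's actual API.

```python
def send_slack_message(channel: str, text: str) -> None:
    # Stand-in for the TAP's Slack action connector (hypothetical).
    print(f"[{channel}] {text}")

def on_new_spreadsheet_row(row: dict) -> None:
    """Toy trigger handler: fires when a spreadsheet gains a row,
    computes on the data, and invokes an action on another service."""
    if row.get("amount", 0) > 1000:  # user-defined computation
        send_slack_message("#finance",
                           f"Large expense: {row['amount']} by {row['user']}")

# Example trigger event as the platform might deliver it:
on_new_spreadsheet_row({"user": "alice", "amount": 2500})
```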
Industrial Control Systems (ICS) govern critical infrastructure like power plants and water treatment plants. ICS can be attacked through manipulations of their sensor or actuator values, causing physical harm. A promising technique for detecting such attacks is machine-learning-based anomaly detection, but it does not identify which sensor or actuator was manipulated, making it difficult for ICS operators to diagnose the anomaly's root cause. Prior work has proposed using attribution methods to identify which features caused an ICS anomaly-detection model to raise an alarm, but it is unclear how well these attribution methods work in practice. In this paper, we compare state-of-the-art attribution methods for the ICS domain on real attacks from multiple datasets. We find that attribution methods for ICS anomaly detection do not perform as well as suggested in prior work and identify two main reasons. First, anomaly detectors often detect attacks either immediately or significantly after the attack starts; we find that attributions computed at these detection points are inaccurate. Second, attribution accuracy varies greatly across attack properties, and attribution methods struggle with attacks on categorical-valued actuators. Despite these challenges, we find that ensembles of attributions can compensate for weaknesses in individual attribution methods. Towards practical use of attributions for ICS anomaly detection, we provide recommendations for researchers and practitioners, such as the need to evaluate attributions with diverse datasets and the potential for attributions in non-real-time workflows.
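As a minimal illustration of ensembling attributions, the sketch below averages the normalized per-feature scores from several attribution methods; the simple averaging rule and the random example inputs are our illustration, not the paper's exact ensemble.

```python
import numpy as np

def ensemble_attribution(attr_maps: list) -> np.ndarray:
    """Average normalized attribution magnitudes from several methods;
    top-ranked features are the suspected manipulated sensors/actuators."""
    normalized = [np.abs(a) / (np.abs(a).sum() + 1e-12) for a in attr_maps]
    combined = np.stack(normalized).mean(axis=0)
    return np.argsort(combined)[::-1]  # feature indices, most suspicious first

# e.g., scores from three attribution methods over 50 ICS features:
ranking = ensemble_attribution([np.random.rand(50) for _ in range(3)])
```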
Decoy passwords, or "honeywords," planted in a credential database can alert a site to its breach if ever submitted in a login attempt. To be effective, some honeywords must appear at least as likely to be user-chosen passwords as the real ones, and honeywords must be very difficult to guess without having breached the database, to prevent false breach alarms. These goals have proved elusive, however, for heuristic honeyword generation algorithms. In this paper we explore an alternative strategy in which the defender treats honeyword selection as a Bernoulli process in which each possible password (except the user-chosen one) is selected as a honeyword independently with some fixed probability. We show how Bernoulli honeywords can be integrated into two existing system designs for leveraging honeywords: one based on a honeychecker that stores the secret index of the user-chosen password in the list of account passwords, and another that does not leverage secret state at all. We show that Bernoulli honeywords enable analytic derivation of false breach-detection probabilities irrespective of what information the attacker gathers about the site's users; that their true and false breach-detection probabilities demonstrate compelling efficacy; and that Bernoulli honeywords can even enable performance improvements in modern honeyword system designs.
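One natural way to realize such a Bernoulli process over the entire password space is implicit selection via a keyed pseudorandom function, so the honeyword set never has to be stored; the sketch below illustrates that idea under our assumptions and is not necessarily the paper's exact construction (the account's real password must additionally be excluded).

```python
import hmac
import hashlib

def is_honeyword(key: bytes, password: str, p: float) -> bool:
    """A keyed PRF maps each password to a pseudorandom value in [0, 1);
    comparing it to p selects every possible password independently with
    probability p, without materializing the huge honeyword set."""
    digest = hmac.new(key, password.encode(), hashlib.sha256).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64
    return u < p

# At login: a submitted password that is not the account's real password
# but satisfies is_honeyword(...) signals a likely credential-database breach.
```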
Automatic speech recognition (ASR) provides diverse audio-to-text services for humans to communicate with machines. However, recent research reveals that ASR systems are vulnerable to various malicious audio attacks. In particular, by removing non-essential frequency components, a new spectrum reduction attack can generate adversarial audio that humans can perceive but ASR systems cannot correctly interpret. This raises a new challenge for content moderation solutions that must detect harmful content in audio and video available on social media platforms. In this paper, we propose an acoustic compensation system named ACE to counter spectrum reduction attacks on ASR systems. Our system design is based on two observations, namely frequency component dependencies and perturbation sensitivity. First, since the Discrete Fourier Transform computation inevitably introduces spectral leakage and aliasing effects into the audio frequency spectrum, frequency components with similar frequencies will have a high correlation. Thus, considering the intrinsic dependencies between neighboring frequency components, it is possible to recover more of the original audio by compensating for the removed components based on the remaining ones. Second, since the removed components in spectrum reduction attacks can be regarded as an inverse of adversarial noise, the attack success rate will decrease when the adversarial audio is replayed in an over-the-air scenario. Hence, we can model the acoustic propagation process to add over-the-air perturbations into the attacked audio. We implement a prototype of ACE, and experiments show that ACE can effectively reduce up to 87.9% of the ASR inference errors caused by spectrum reduction attacks. Furthermore, by analyzing the residual errors on real audio samples, we summarize six general types of ASR inference errors and investigate the error causes and potential mitigation solutions.
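A toy version of the compensation idea, assuming the attacked bins can be recognized as (near-)zeroed and filled in from their surviving neighbors; ACE's actual recovery models spectral leakage and aliasing more carefully, and the threshold here is an assumption.

```python
import numpy as np

def compensate_spectrum(audio: np.ndarray, threshold: float = 1e-6) -> np.ndarray:
    """Fill frequency bins zeroed by a spectrum reduction attack by
    interpolating from neighboring surviving bins, which are highly
    correlated due to spectral leakage."""
    spec = np.fft.rfft(audio)
    removed = np.abs(spec) < threshold        # bins the attack stripped out
    kept = ~removed
    idx = np.arange(len(spec))
    real = np.interp(idx, idx[kept], spec.real[kept])
    imag = np.interp(idx, idx[kept], spec.imag[kept])
    spec[removed] = real[removed] + 1j * imag[removed]
    return np.fft.irfft(spec, n=len(audio))
```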
We present a novel approach to developing programmable logic controller (PLC) malware that proves to be more flexible, resilient, and impactful than current strategies. While previous attacks on PLCs infect either the control logic or firmware portions of PLC computation, our proposed malware exclusively infects the web application hosted by the emerging embedded web servers within the PLCs. This strategy allows the malware to stealthily attack the underlying real-world machinery using the legitimate web application program interfaces (APIs) exposed by the admin portal website. Such attacks include falsifying sensor readings, disabling safety alarms, and manipulating physical actuators. Furthermore, this approach has significant advantages over existing PLC malware techniques (control logic and firmware) such as platform independence, ease-of-deployment, and higher levels of persistence. Our research shows that the emergence of web technology in industrial control environments has introduced new security concerns that are not present in the IT domain or consumer IoT devices. Depending on the industrial process being controlled by the PLC, our attack can potentially cause catastrophic incidents or even loss of life. We verified these claims by performing a Stuxnet-style attack using a prototype implementation of this malware on a widely-used PLC model by exploiting zero-day vulnerabilities that we discovered during our research (CVE-2022-45137, CVE-2022-45138, CVE-2022-45139, and CVE-2022-45140). Our investigation reveals that every major PLC vendor (80% of global market share) produces a PLC that is vulnerable to our proposed attack vector. Lastly, we discuss potential countermeasures and mitigations.
The InterPlanetary File System (IPFS) is currently the largest decentralized storage solution in operation, with thousands of active participants and millions of daily content transfers. IPFS is used as remote data storage for numerous blockchain-based smart contracts, Non-Fungible Tokens (NFTs), and decentralized applications. We present a content censorship attack that can be executed with minimal effort and cost, and that prevents the retrieval of any chosen content in the IPFS network. The attack exploits a conceptual issue in a core component of IPFS, the Kademlia Distributed Hash Table (DHT), which is used to resolve content IDs to peer addresses. We provide efficient detection and mitigation mechanisms for this vulnerability. Our mechanisms achieve a 99.6% detection rate and mitigate 100% of the detected attacks with minimal signaling and computational overhead. We followed responsible disclosure procedures, and our countermeasures are scheduled for deployment in future versions of IPFS.
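The root cause is visible in how Kademlia maps content IDs to the peers responsible for them; a minimal sketch follows (SHA-256 as the ID hash and k = 20 are common parameter choices, simplified from IPFS's actual implementation).

```python
import hashlib

def xor_distance(a: bytes, b: bytes) -> int:
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def resolvers_for(cid: bytes, peer_ids: list, k: int = 20) -> list:
    """Kademlia routes a content lookup to the k peers whose IDs are
    closest to the content ID in XOR distance. An attacker who generates
    k Sybil IDs closer to a target CID than any honest peer captures all
    provider lookups for that content, i.e., censors it."""
    target = hashlib.sha256(cid).digest()
    return sorted(peer_ids, key=lambda p: xor_distance(target, p))[:k]
```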
With increased capabilities at the edge (e.g., mobile devices) and more stringent privacy requirements, it has become a recent trend for deep learning-enabled applications to pre-process sensitive raw data at the edge and transmit the features to the backend cloud for further processing. A typical application is to run machine learning (ML) services on facial images collected from different individuals. To prevent identity theft, conventional methods commonly rely on an adversarial game-based approach to shed the identity information from the features. However, such methods cannot defend against adaptive attacks, in which an attacker takes a countermove against a known defense strategy. We propose Crafter, a feature crafting mechanism deployed at the edge, to protect identity information from adaptive model inversion attacks while ensuring the ML tasks are properly carried out in the cloud. The key defense strategy is to mislead the attacker to a non-private prior from which the attacker gains little about the private identity. In this case, the crafted features act like poisoned training samples for attackers with adaptive model updates. Experimental results indicate that Crafter successfully defends against both basic and adaptive attacks, which cannot be achieved by state-of-the-art adversarial game-based methods.
Federated Learning (FL) is a promising approach enabling multiple clients to train Deep Neural Networks (DNNs) collaboratively without sharing their local training data. However, FL is susceptible to backdoor (or targeted poisoning) attacks. These attacks are initiated by malicious clients who seek to compromise the learning process by introducing specific behaviors into the learned model that can be triggered by carefully crafted inputs. Existing FL safeguards have various limitations: they are restricted to specific data distributions; they reduce the global model accuracy by excluding benign models or adding noise; they are vulnerable to adaptive, defense-aware adversaries; or they require the server to access local models, allowing data inference attacks. This paper presents a novel defense mechanism, CrowdGuard, that effectively mitigates backdoor attacks in FL and overcomes the deficiencies of existing techniques. It leverages clients' feedback on individual models, analyzes the behavior of neurons in hidden layers, and eliminates poisoned models through an iterative pruning scheme. CrowdGuard employs a server-located stacked clustering scheme to enhance its resilience to rogue client feedback. The evaluation results demonstrate that CrowdGuard achieves 100% true-positive and true-negative rates across various scenarios, including IID and non-IID data distributions. Additionally, CrowdGuard withstands adaptive adversaries while preserving the original performance of protected models. To ensure confidentiality, CrowdGuard uses a secure and privacy-preserving architecture leveraging Trusted Execution Environments (TEEs) on both the client and server sides.
Audio eavesdropping poses serious threats to user privacy in daily mobile usage scenarios such as phone calls, voice messaging, and confidential meetings. Headphones are thus favored by mobile users, as they provide physical sound isolation to protect audio privacy. However, our paper presents the first proof-of-concept system, Periscope, that demonstrates the vulnerabilities of headphone-plugged mobile devices. The system shows that unintentionally leaked electromagnetic radiation (EMR) from mobile devices' audio amplifiers can be exploited as an effective side channel for recovering a victim's audio. Additionally, plugged headphones act as antennas that enhance the EMR strength, making it easily measurable at long distances. Our feasibility studies and hardware analysis further reveal that EMRs are highly correlated with the device's audio inputs but suffer from signal distortions and ambient noise, making audio recovery extremely challenging. To address this challenge, we develop signal processing techniques with a spectrogram clustering scheme that removes noise and distortions, enabling EMRs to be converted back into audio. Our attack prototype, comparable in size to hidden voice recorders, successfully recovers victims' private audio with a word error rate (WER) as low as 7.44% across 11 mobile devices and 6 headphones. The recovered audio is recognizable to natural human hearing and online speech-to-text tools, and the system is robust against a wide range of attack scenario changes. We have also reported Periscope to 6 leading mobile manufacturers.
Although there has been extensive research on the transferability of adversarial attacks, existing methods for generating adversarial examples suffer from two significant drawbacks: poor stealthiness and low attack efficacy under low-round attacks. To address these issues, we propose an adversarial example generation method that ensembles the class activation maps of multiple models, called the class activation mapping ensemble attack. We first use the class activation mapping method to discover the relationship between the decision of the Deep Neural Network and the image region. Then we calculate the class activation score for each pixel and use it as the weight for perturbation, enhancing the stealthiness of adversarial examples and improving attack performance under low attack rounds. In the optimization process, we also ensemble the class activation maps of multiple models to ensure the transferability of the adversarial attack algorithm. Experimental results show that our method generates adversarial examples with high imperceptibility, transferability, attack performance under low-round attacks, and evasiveness. Specifically, when our attack capability is comparable to that of the most potent attack (VMIFGSM), our imperceptibility is close to that of the best-performing attack (TPGD). For non-targeted attacks, our method outperforms VMIFGSM by an average of 11.69% in attack capability against 13 target models and outperforms TPGD by an average of 37.15%. For targeted attacks, our method achieves the fastest convergence and the most potent attack efficacy, and it significantly outperforms the eight baseline methods in low-round attacks. Furthermore, our method can evade defenses and be used to assess the robustness of models.
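The sketch below shows one attack iteration with the per-pixel weighting idea; for brevity it uses ensemble-averaged input-gradient magnitude as a stand-in for proper class activation maps, so the saliency computation and step size are our assumptions rather than the paper's method.

```python
import torch
import torch.nn.functional as F

def cam_weighted_step(models, x, y, eps=4 / 255):
    """One iteration: scale the sign-gradient perturbation per pixel by an
    ensemble-averaged activation score, concentrating changes on
    decision-relevant regions to preserve stealthiness."""
    x = x.clone().detach().requires_grad_(True)
    loss = sum(F.cross_entropy(m(x), y) for m in models)
    loss.backward()
    g = x.grad
    score = g.abs().mean(dim=1, keepdim=True)  # per-pixel saliency proxy
    score = score / (score.amax(dim=(2, 3), keepdim=True) + 1e-12)
    return (x + eps * score * g.sign()).clamp(0, 1).detach()
```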
In this paper, we uncover a new side-channel vulnerability in the widely used NAT port preservation strategy and the insufficient reverse path validation of Wi-Fi routers, which allows an off-path attacker to infer whether a victim client on the same network is communicating with another host on the Internet using TCP. After detecting the presence of TCP connections between the victim client and the server, the attacker can evict the original NAT mapping and reconstruct a new mapping at the router by sending fake TCP packets, because most routers have for years faithfully implemented a strategy that disables TCP window tracking. In this way, the attacker can intercept TCP packets from the server and obtain the current sequence and acknowledgment numbers, which in turn allows the attacker to forcibly close the connection, poison plaintext traffic, or reroute the server's incoming packets to the attacker. We test 67 widely used routers from 30 vendors and discover that 52 of them are affected by this attack. We also conduct an extensive measurement study on 93 real-world Wi-Fi networks. The experimental results show that 75 of these evaluated Wi-Fi networks (81%) are fully vulnerable to our attack. Our case study shows that it takes about 17.5, 19.4, and 54.5 seconds on average to terminate an SSH connection, download private files from FTP servers, and inject fake HTTP response packets, with success rates of 87.4%, 82.6%, and 76.1%, respectively. We responsibly disclosed the vulnerability and suggested mitigation strategies to all affected vendors, and we have received positive feedback, including acknowledgments, CVEs, rewards, and adoption of our suggestions.
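The port preservation behavior at the heart of the side channel can be sketched in a few lines; the linear fallback and data structures are our simplification of what real routers do.

```python
def allocate_external_port(nat_table: dict, internal_port: int, in_use: set) -> int:
    """Port-preservation NAT: reuse the client's internal source port as
    the external port whenever it is free. Because internal and external
    ports usually coincide, an off-path attacker who can probe which
    external ports are active learns which connections the victim holds."""
    port = internal_port
    while port in in_use:          # fall back linearly if taken (simplified)
        port = port % 65535 + 1
    in_use.add(port)
    nat_table[internal_port] = port
    return port
```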
Although numerous dynamic testing techniques have been developed, they can hardly be directly applied to firmware of deeply embedded (e.g., microcontroller-based) devices due to the tremendously different runtime environment and restricted resources on these devices. This work tackles these challenges by leveraging the unique position of microcontroller devices during firmware development. That is, firmware developers have to rely on a powerful engineering workstation that connects to the target device to program and debug code. Therefore, we develop a decoupled firmware testing framework named IPEA, which shifts the overhead of resource-intensive analysis tasks from the microcontroller to the workstation. Only lightweight “needle probes” are left in the firmware to collect internal execution information without processing it. We also instantiated this framework with a sanitizer based on pointer capability (IPEA-San) and a greybox fuzzer (IPEA-Fuzz). By comparing IPEA-San with a port of AddressSanitizer for microcontrollers, we show that IPEA-San reduces memory overhead by 62.75% in real-world firmware with better detection accuracy. Combining IPEA-Fuzz with IPEA-San, we found 7 zero-day bugs in popular IoT libraries (3) and peripheral driver code (4).
Files are a significant attack vector for security boundary violation, yet a systematic understanding of the vulnerabilities underlying these attacks is lacking. To bridge this gap, we present a comprehensive analysis of File Hijacking Vulnerabilities (FHVulns), a type of vulnerability that enables attackers to breach security boundaries through the manipulation of file content or file paths. We provide an in-depth empirical study on 268 well-documented FHVuln CVE records from January 2020 to October 2022. Our study reveals the origins and triggering mechanisms of FHVulns and highlights that existing detection techniques have overlooked the majority of FHVulns. As a result, we anticipate a significant prevalence of zero-day FHVulns in software. We developed a dynamic analysis tool, JERRY, which effectively detects FHVulns at runtime by simulating hijacking actions during program execution. We applied JERRY to 438 popular software programs from vendors including Microsoft, Google, Adobe, and Intel, and found 339 zero-day FHVulns. We reported all vulnerabilities identified by JERRY to the corresponding vendors, and as of now, 84 of them have been confirmed or fixed, with 51 CVE IDs granted and $83,400 bug bounties earned.
Sharding is a prominent technique for scaling blockchains. By dividing the network into smaller components known as shards, a sharded blockchain can process transactions in parallel without introducing inconsistencies, through the coordination of intra-shard and cross-shard consensus protocols. However, we observe a critical security issue with sharded systems: transaction ordering can be manipulated when coordinating intra-shard and cross-shard consensus protocols, leaving the system vulnerable to attack. Specifically, we identify a novel security issue, the violation of finalization fairness, which can be exploited through a front-running attack. This attack allows an attacker to manipulate the execution order of transactions, even if the victim's transaction has already been processed and added to the blockchain by a fair intra-shard consensus. To address the issue, we propose Haechi, a novel cross-shard protocol that is immune to front-running attacks. Haechi introduces an ordering phase between transaction processing and execution, ensuring that the execution order of transactions is the same as the processing order and thereby achieving finalization fairness. To accommodate different consensus speeds among shards, Haechi incorporates a finalization fairness algorithm that achieves a globally fair order with minimal performance loss. By providing a global order, Haechi ensures strong consistency among shards, enabling better parallelism in handling conflicting transactions across shards. These features make Haechi a promising solution for supporting popular smart contracts in the real world. To evaluate Haechi's performance and effectiveness in preventing the attack, we implemented the protocol using Tendermint and conducted extensive experiments in a geo-distributed AWS environment. Our results demonstrate that Haechi can effectively prevent the presented front-running attack with little performance sacrifice compared to existing cross-shard consensus protocols.
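A minimal sketch of the ordering idea: merge each shard's processing log into one global execution order that respects per-shard processing order. The (height, shard, index) key below is our simplification, not Haechi's exact algorithm.

```python
import heapq

def globally_fair_order(shard_logs):
    """Merge per-shard logs of (finalization_height, tx) pairs into one
    execution order keyed by (height, shard id, intra-shard index), so a
    transaction already processed by a fair intra-shard consensus cannot
    be reordered, and thus front-run, at execution time."""
    heap = []
    for shard_id, log in enumerate(shard_logs):
        for idx, (height, tx) in enumerate(log):
            heapq.heappush(heap, (height, shard_id, idx, tx))
    return [heapq.heappop(heap)[3] for _ in range(len(heap))]

# e.g., two shards finalizing at different heights:
order = globally_fair_order([[(1, "tx_a"), (2, "tx_b")], [(1, "tx_c")]])
```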
Keyboards are the primary peripheral input devices for various critical computer application scenarios. This paper performs a security analysis of keyboard sensing mechanisms and uncovers a new class of vulnerabilities that can be exploited to induce phantom keys: fake keystrokes injected into a keyboard's analog circuits in a contactless way using electromagnetic interference (EMI). Besides normal keystrokes, such phantom keys also include keystrokes that cannot be achieved by human operators, such as rapidly injecting over 10,000 keys per minute and injecting hidden keys that do not exist on the physical keyboard. The underlying principle of phantom key injection is to induce false voltages on keyboard sensing GPIO pins through EMI coupled onto the matrix circuits. We investigate the voltage and timing requirements of injection signals both theoretically and empirically to establish the theory of phantom key injection. To validate the threat of keyboard sensing vulnerabilities, we design GhostType, which can cause denial of service of the keyboard and inject random keystrokes as well as certain targeted keystrokes of the adversary's choice. We have validated GhostType on 48 of 50 off-the-shelf keyboards/keypads from 20 brands, covering both membrane/mechanical structures and USB/Bluetooth protocols. Example consequences of GhostType include completely blocking keyboard operations, crashing and turning off downstream computers, and deleting files on computers. Finally, we glean lessons from our investigations and propose countermeasures including EMI shielding, phantom key detection, and keystroke scanning signal improvement.
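Why EMI on a sense line becomes a keystroke is easiest to see from the matrix scanning loop every keyboard controller runs; the sketch below is a simplified software model with hypothetical drive_row/read_column callbacks.

```python
def scan_matrix(drive_row, read_column, n_rows: int, n_cols: int):
    """Simplified keyboard matrix scan. A key (r, c) is reported whenever
    column c reads high while row r is driven; a voltage EMI couples onto
    the column trace at that instant is indistinguishable from a real
    contact, which is exactly what phantom key injection exploits."""
    pressed = []
    for r in range(n_rows):
        drive_row(r)            # assert the scan pulse on row r
        for c in range(n_cols):
            if read_column(c):  # real keypress OR EMI-induced false voltage
                pressed.append((r, c))
    return pressed
```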
Generating accurate call graphs for large programs, particularly at the operating system (OS) level, poses a well-known challenge. This difficulty stems from the widespread use of indirect calls within large programs, wherein the computation of call targets is deferred until runtime to achieve program polymorphism. Consequently, compilers are unable to statically determine indirect call edges. Recent advancements have attempted to use type analysis to globally match indirect call targets in programs. However, these approaches still suffer from low precision when handling large target programs or generic types. This paper presents GNNIC, a Graph Neural Network (GNN) based Indirect Call analyzer. GNNIC employs a technique called abstract-similarity search to accurately identify indirect call targets in large programs. The approach is based on the observation that although indirect call targets exhibit intricate polymorphic behaviors, they share common abstract characteristics, such as function descriptions, data types, and invoked function calls. We consolidate such information into a representative abstraction graph (RAG) and employ GNNs to learn function embeddings. Abstract-similarity search relies on at least one anchor target to bootstrap. Therefore, we also propose a new program analysis technique to locally identify valid targets of each indirect call. Starting from anchor targets, GNNIC can expand the search scope to find more targets of indirect calls in the whole program. The implementation of GNNIC utilizes LLVM and GNNs, and we evaluated it on multiple OS kernels. The results demonstrate that GNNIC outperforms state-of-the-art type-based techniques by reducing false target functions by 86% to 93%. Moreover, the abstract similarity and precise call graphs generated by GNNIC can enhance security applications by discovering new bugs, alleviating path-explosion issues, and improving the efficiency of static program analysis. The combination of static analysis and GNNIC resulted in the discovery of 97 new bugs in the Linux and FreeBSD kernels.
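Abstract-similarity search can be pictured as nearest-neighbor matching over function embeddings; the sketch below uses cosine similarity with a threshold of 0.9, both of which are our assumptions rather than GNNIC's exact procedure.

```python
import numpy as np

def expand_targets(anchor_embs, candidate_embs, names, threshold=0.9):
    """Admit a candidate function as an indirect-call target if its
    embedding is cosine-close to any known (anchor) target's embedding."""
    def unit(m):
        m = np.asarray(m, dtype=float)
        return m / (np.linalg.norm(m, axis=1, keepdims=True) + 1e-12)
    sims = unit(candidate_embs) @ unit(anchor_embs).T  # candidates x anchors
    keep = sims.max(axis=1) >= threshold
    return [name for name, k in zip(names, keep) if k]
```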
We propose a new compiler framework that automates code generation over multiple fully homomorphic encryption (FHE) schemes. While it was recently shown that algorithms combining multiple FHE schemes (e.g., CKKS and TFHE) achieve high execution efficiency and task utility at the same time, developing fast cross-scheme FHE algorithms for real-world applications generally requires heavy hand-tuned optimizations by cryptographic experts, resulting in either high usability costs or low computational efficiency. To solve this usability-efficiency dilemma, we design and implement HEIR, a compiler framework based on multi-level intermediate representation (IR). To achieve cross-scheme compilation of efficient FHE circuits, we develop a two-stage code-lowering structure based on our custom IR dialects. First, the plaintext program along with the associated data types is converted into FHE-friendly dialects in the transformation stage. Then, in the optimization stage, we apply FHE-specific optimizations to lower the transformed dialect into our bottom-level FHE library operators. In our experiments, we implement the entire software stack for HEIR and demonstrate that complex end-to-end programs, such as homomorphic K-Means clustering and homomorphic data aggregation in databases, can easily be compiled to run 72–179× faster than the programs generated by state-of-the-art FHE compilers.
Remote attestation has received much attention recently due to the proliferation of embedded and IoT devices. Among various solutions, methods based on hardware-software co-design (hybrid) are particularly popular due to their low overhead yet effective approach. Despite their usefulness, hybrid methods still suffer from multiple limitations, such as the strict protections required for the attestation keys and restrictive operational and threat models, e.g., disabling interrupts and neglecting time-of-check-time-of-use (TOCTOU) attacks. In this paper, we propose a new hybrid attestation method called IDA, which removes the requirement for disabling interrupts and for restrictive access control on the secret key and attestation code, thus improving the system's overall security and flexibility. Rather than making use of a secret key to calculate the response, IDA verifies the attestation process with trusted hardware monitoring and certifies its authenticity only if it was followed precisely. Further, to prevent TOCTOU attacks and handle interrupts, we propose IDA+, which monitors program memory between attestation requests or during interrupts and informs the verifier of changes to the program memory. We implement and evaluate IDA and IDA+ on the open-source MSP430 architecture, showing reasonable runtime, memory footprint, and hardware overhead while being robust against various attack scenarios. Comparing our method with the state of the art, we show that it has minimal overhead while achieving important new properties such as support for interrupts and DMA requests and detection of TOCTOU attacks.
Automatic speech recognition (ASR) systems have been shown to be vulnerable to adversarial examples (AEs). Recent successes all assume that users will not notice or disrupt the attack process despite the existence of music/noise-like sounds and spontaneous responses from voice assistants. Nonetheless, in practical user-present scenarios, user awareness may nullify existing attack attempts that launch unexpected sounds or ASR usage. In this paper, we seek to bridge this gap in existing research and extend the attack to user-present scenarios. We propose VRIFLE, an inaudible adversarial perturbation (IAP) attack via ultrasound delivery that can manipulate ASRs as a user speaks. The inherent differences between audible sounds and ultrasounds make IAP delivery face unprecedented challenges such as distortion, noise, and instability. In this regard, we design a novel ultrasonic transformation model to enhance the crafted perturbation so that it is physically effective and even survives long-distance delivery. We further strengthen VRIFLE's robustness by adopting a series of augmentations over user and real-world variations during the generation process. In this way, VRIFLE achieves effective real-time manipulation of the ASR output from different distances and under any user speech, with an alter-and-mute strategy that suppresses the impact of user disruption. Our extensive experiments in both the digital and physical worlds verify VRIFLE's effectiveness under various configurations, its robustness against six kinds of defenses, and its universality in a targeted manner. We also show that VRIFLE can be delivered with a portable attack device and even everyday loudspeakers.
Anonymous communication systems such as mix networks achieve anonymity at the expense of latency that is introduced to alter the flow of packets and hinder their tracing. High latency, however, has a negative impact on usability. In this work, we propose LARMix, a novel latency-aware routing scheme for mixnets that reduces propagation latency with a limited impact on anonymity. LARMix achieves this while also load-balancing the traffic in the network. We additionally show how a network can be configured to maximize anonymity while meeting an average end-to-end latency constraint. Lastly, we perform a security analysis studying various adversarial strategies and conclude that LARMix does not significantly increase the adversarial advantage as long as the adversary is unable to selectively compromise mixnodes after the LARMix routing policy has been computed.
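The latency/anonymity trade-off can be illustrated by a next-hop distribution that interpolates between uniform routing and inverse-latency-weighted routing; the blending rule and parameter tau below are a sketch of the idea, not LARMix's exact routing policy.

```python
import random

def pick_next_mix(latencies, tau):
    """Choose the next mixnode: tau = 1 yields the uniform (maximally
    anonymous) distribution, tau = 0 weights nodes by inverse latency
    (fastest routes, least anonymity)."""
    inv = [1.0 / l for l in latencies]
    prop = [w / sum(inv) for w in inv]          # latency-proportional weights
    uniform = 1.0 / len(latencies)
    weights = [tau * uniform + (1 - tau) * p for p in prop]
    return random.choices(range(len(latencies)), weights=weights)[0]
```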
Trusted execution environments (TEEs), like TrustZone, are pervasively employed to protect security-sensitive programs and data from various attacks. We target compact TEE operating systems like OP-TEE, which implement the minimum TEE internal core APIs. Such a TEE OS often has poor device driver support, and we aim to alleviate this issue by reusing existing Linux drivers inside TEE OSes. An intuitive approach is to port all of a driver's dependency functions into the TEE OS so that the driver can execute directly inside the TEE. But this approach significantly enlarges the trusted computing base (TCB), making the TEE OS no longer compact. In this paper, we propose a TEE driver execution environment, the Linux driver runtime (LDR). A Linux driver needs two types of functions: library functions and Linux kernel subsystem functions, which a compact TEE OS does not have. The LDR reuses the existing TEE OS library functions whenever possible and redirects kernel subsystem function calls to the Linux kernel in the normal world. LDR is realized as a sandbox environment that confines the Linux driver inside the TEE through the ARM domain access control features to address the associated security issues. The sandbox mediates the driver's TEE function calls, sanitizing arguments and return values as well as enforcing forward control-flow integrity. We implement and deploy an LDR prototype on an NXP IMX6Q SABRE-SD evaluation board, adapt 6 existing Linux drivers to LDR, and evaluate their performance. The experimental results show that LDR drivers can achieve performance comparable to their Linux counterparts with negligible overhead. We are the first to reuse functions in both the TEE OS and the normal-world Linux kernel to run a TEE device driver and to address the related security issues.
LiDAR (Light Detection And Ranging) is an indispensable sensor for precise long- and wide-range 3D sensing, which has directly benefited the recent rapid deployment of autonomous driving (AD). Meanwhile, such a safety-critical application strongly motivates its security research. A recent line of research finds that one can manipulate the LiDAR point cloud and fool object detectors by firing malicious lasers against LiDAR. However, these efforts face 3 critical research gaps: (1) considering only one specific LiDAR (VLP-16); (2) assuming unvalidated attack capabilities; and (3) evaluating object detectors with limited spoofing capability modeling and setup diversity. To fill these critical research gaps, we conduct the first large-scale measurement study on LiDAR spoofing attack capabilities on object detectors with 9 popular LiDARs, covering both first- and new-generation LiDARs, and 3 major types of object detectors trained on 5 different datasets. To facilitate the measurements, we (1) identify spoofer improvements that significantly boost the latest spoofing capability, (2) identify a new object removal attack that overcomes the applicability limitation of the latest method to new-generation LiDARs, and (3) perform novel mathematical modeling for both object injection and removal attacks based on our measurement results. Through this study, we are able to uncover a total of 15 novel findings, including not only completely new ones due to the measurement angle novelty, but also many that can directly challenge the latest understandings in this problem space. We also discuss defenses.
*Prompt-tuning* has emerged as an attractive paradigm for deploying large-scale language models due to its strong downstream task performance and efficient multitask serving ability. Despite its wide adoption, we empirically show that prompt-tuning is vulnerable to downstream task-agnostic backdoors, which reside in the pretrained models and can affect arbitrary downstream tasks. The state-of-the-art backdoor detection approaches cannot defend against task-agnostic backdoors since they hardly converge in reversing the backdoor triggers. To address this issue, we propose LMSanitator, a novel approach for detecting and removing task-agnostic backdoors on Transformer models. Instead of directly inverting the triggers, LMSanitator aims to invert the *predefined attack vectors* (pretrained models' output when the input is embedded with triggers) of the task-agnostic backdoors, which achieves much better convergence performance and backdoor detection accuracy. LMSanitator further leverages prompt-tuning’s property of freezing the pretrained model to perform accurate and fast output monitoring and input purging during the inference phase. Extensive experiments on multiple language models and NLP tasks illustrate the effectiveness of LMSanitator. For instance, LMSanitator achieves 92.8% backdoor detection accuracy on 960 models and decreases the attack success rate to less than 1% in most scenarios.
Machine learning (ML) is promising for accurately detecting malicious flows in encrypted network traffic; however, it is challenging to collect a training dataset that contains a sufficient amount of encrypted malicious data with correct labels. When ML models are trained with low-quality training data, they suffer degraded performance. In this paper, we aim to address a real-world low-quality training dataset problem, namely, detecting encrypted malicious traffic generated by continuously evolving malware. We develop RAPIER, which fully utilizes the different distributions of normal and malicious traffic data in the feature space, where normal data is tightly distributed in a certain area and malicious data is scattered over the entire feature space, to augment training data for model training. RAPIER includes two pre-processing modules to convert traffic into feature vectors and correct label noise. We evaluate our system on two public datasets and one combined dataset. With 1000 samples and 45% noise from each dataset, our system achieves F1 scores of 0.770, 0.776, and 0.855, respectively, an average improvement of 352.6%, 284.3%, and 214.9% over existing methods. Furthermore, we evaluate RAPIER with a real-world dataset obtained from a security enterprise. RAPIER effectively detects encrypted malicious traffic with a best F1 score of 0.773, improving the F1 scores of existing methods by an average of 272.5%.