| Total: 24
Contact discovery allows users of mobile messengers to conveniently connect with people in their address book. In this work, we demonstrate that severe privacy issues exist in currently deployed contact discovery methods. Our study of three popular mobile messengers (WhatsApp, Signal, and Telegram) shows that, contrary to expectations, large-scale crawling attacks are (still) possible. Using an accurate database of mobile phone number prefixes and very few resources, we have queried 10% of US mobile phone numbers for WhatsApp and 100% for Signal. For Telegram we find that its API exposes a wide range of sensitive information, even about numbers not registered with the service. We present interesting (cross-messenger) usage statistics, which also reveal that very few users change the default privacy settings. Regarding mitigations, we propose novel techniques to significantly limit the feasibility of our crawling attacks, especially a new incremental contact discovery scheme that strictly improves over Signal's current approach. Furthermore, we show that currently deployed hashing-based contact discovery protocols are severely broken by comparing three methods for efficient hash reversal of mobile phone numbers. For this, we also propose a significantly improved rainbow table construction for non-uniformly distributed inputs that is of independent interest.
Modern blockchains have evolved from cryptocurrency substrates to trust-decentralization platforms, supporting a wider variety of decentralized applications known as DApps. Blockchain remote procedure call (RPC) services emerge as an intermediary connecting the DApps to a blockchain network. In this work, we identify the free contract-execution capabilities that widely exist in blockchain RPCs as a vulnerability of denial of service (DoS) and present the DoERS attack, a Denial of Ethereum RPC service that incurs zero Ether cost to the attacker. To understand the DoERS exploitability in the wild, we conduct a systematic measurement study on nine real-world RPC services which control most DApp clients' connection to the Ethereum mainnet. In particular, we propose a novel measurement technique based on orphan transactions to discover the previously unknown behaviors inside the blackbox RPC services, including load balancing and gas limiting. Further DoERS strategies are proposed to evade the protection intended by these behaviors. We evaluate the effectiveness of DoERS attacks on deployed RPC services with minimal service interruption. The result shows that all the nine services tested (as of Apr. 2020) are vulnerable to DoERS attacks that can result in the service latency increased by $2.1Xsim{}50X$. Some of these attacks require only a single request. In addition, on a local Ethereum node protected by a very restrictive limit of $0.65$ block gas, sending 150 DoERS requests per second can slow down the block synchronization of the victim node by $91%$. We propose mitigation techniques against DoERS without dropping service usability, via unpredictable load balancing, performance anomaly detection, and others. These techniques can be integrated into a RPC service transparently to its clients.
Service workers are a powerful technology supported by all major modern browsers that can improve users' browsing experience by offering capabilities similar to those of native applications. While they are gaining significant traction in the developer community, they have not received much scrutiny from security researchers. In this paper, we explore the capabilities and inner workings of service workers and conduct the first comprehensive large-scale study of their API use in the wild. Subsequently, we show how attackers can exploit the strategic placement of service workers for history-sniffing in most major browsers, including Chrome and Firefox. We demonstrate two novel history-sniffing attacks that exploit the lack of appropriate isolation in these browsers, including a non-destructive cache-based version. Next, we present a series of use cases that illustrate how our techniques enable privacy-invasive attacks that can infer sensitive application-level information, such as a user's social graph. We have disclosed our techniques to all vulnerable vendors, prompting the Chromium team to explore a redesign of their site isolation mechanisms for defending against our attacks. We also propose a countermeasure that can be incorporated by websites to protect their users, and develop a tool that streamlines its deployment, thus facilitating adoption at a large scale. Overall, our work presents a cautionary tale on the severe risks of browsers deploying new features without an in-depth evaluation of their security and privacy implications.
Android's application framework plays a crucial part in protecting users' private data and the system integrity. Consequently, it has been the target of various prior works that analyzed its security policy and enforcement. Those works uncovered different security problems, including incomplete documentation, permission re-delegation within the framework, and inconsistencies in access control. However, all but one of those prior works were based on static code analysis. Thus, their results provide a one-sided view that inherits the limitations and drawbacks of applying static analysis to the vast, complex code base of the application framework. Even more, the performances of different security applications---including malware classification and least-privileged apps---depend on those analysis results, but those applications are currently tarnished by imprecise and incomplete results as a consequence of this imbalanced analysis methodology. To complement and refine this methodology and consequently improve the applications that are dependent on it, we add dynamic analysis of the application framework to the current research landscape and demonstrate the necessity of this move for improving the quality of prior results and advancing the field. Applying our solution, called Dynamo, to four prominent use-cases from the literature and taking a synoptical view on the results, we verify but also refute and extend the existing results of prior static analysis solutions. From the manual investigation of the root causes of discrepancies between results, we draw new insights and expert knowledge that can be valuable in improving both static and dynamic testing of the application framework.
Cybercrime scene reconstruction that aims to reconstruct a previous execution of the cyber attack delivery process is an important capability for cyber forensics (e.g., post mortem analysis of the cyber attack executions). Unfortunately, existing techniques such as log-based forensics or record-and-replay techniques are not suitable to handle complex and long-running modern applications for cybercrime scene reconstruction and post mortem forensic analysis. Specifically, log-based cyber forensics techniques often suffer from a lack of inspection capability and do not provide details of how the attack unfolded. Record-and-replay techniques impose significant runtime overhead, often require significant modifications on end-user systems, and demand to replay the entire recorded execution from the beginning. In this paper, we propose C^2SR, a novel technique that can reconstruct an attack delivery chain (i.e., cybercrime scene) for post-mortem forensic analysis. It provides a highly desired capability: interactable partial execution reconstruction. In particular, it reproduces a partial execution of interest from a large execution trace of a long-running program. The reconstructed execution is also interactable, allowing forensic analysts to leverage debugging and analysis tools that did not exist on the recorded machine. The key intuition behind C^2SR is partitioning an execution trace by resources and reproducing resource accesses that are consistent with the original execution. It tolerates user interactions required for inspections that do not cause inconsistent resource accesses. Our evaluation results on 26 real-world programs show that C^2SR has low runtime overhead (less than 5.47%) and acceptable space overhead. We also demonstrate with four realistic attack scenarios that C^2SR successfully reconstructs partial executions of long-running applications such as web browsers, and it can remarkably reduce the user's efforts to understand the incident.
Users can improve the security of remote communications by using Trusted Execution Environments (TEEs) to protect against direct introspection and tampering of sensitive data. This can even be done with applications coded in high-level languages with complex programming stacks such as R, Python, and Ruby. However, this creates a trade-off between programming convenience versus the risk of attacks using microarchitectural side channels. In this paper, we argue that it is possible to address this problem for important applications by instrumenting a complex programming environment (like R) to produce a Data-Oblivious Transcript (DOT) that is explicitly designed to support computation that excludes side channels. Such a transcript is then evaluated on a Trusted Execution Environment (TEE) containing the sensitive data using a small trusted computing base called the Data-Oblivious Virtual Environment (DOVE). To motivate the problem, we demonstrate a number of subtle side-channel vulnerabilities in the R language. We then provide an illustrative design and implementation of DOVE for R, creating the first side-channel resistant R programming stack. We demonstrate that the two-phase architecture provided by DOT generation and DOVE evaluation can provide practical support for complex programming languages with usable performance and high security assurances against side channels.
The controller area network (CAN) is widely adopted in modern automobiles to enable communications among in-vehicle electronic control units (ECUs). Lacking mainstream network security capabilities due to resource constraints, the CAN is susceptible to the ECU masquerade attack in which a compromised (attacker) ECU impersonates an uncompromised (victim) ECU and spoofs the latter’s CAN messages. A cost-effective state-of-the-art defense against such attacks is the CAN bus voltage-based intrusion detection system (VIDS), which identifies the source of each message using its voltage fingerprint on the bus. Since the voltage fingerprint emanates from an ECU's hardware characteristics, an attacker ECU by itself cannot controllably modify it. As such, VIDS has been proved effective in detecting masquerade attacks that each involve a single attacker. In this paper, we discover a novel voltage corruption tactic that leverages the capabilities of two compromised ECUs (i.e., an attacker ECU working in tandem with an accomplice ECU) to corrupt the bus voltages recorded by the VIDS. By exploiting this tactic along with the fundamental deficiencies of the CAN protocol, we propose a novel masquerade attack called DUET, which evades all existing VIDS irrespective of the features and classification algorithms employed in them. DUET follows a two-stage attack strategy to first manipulate a victim ECU’s voltage fingerprint during VIDS retraining mode, and then impersonate the manipulated fingerprint during VIDS operation mode. Our evaluation of DUET on real CAN buses (including three in two real cars) demonstrates an impersonation success rate of at least 90% in evading two state-of-the-art VIDS. Finally, to mitigate ECU masquerade attacks, we advocate the development of cost-effective defenses that break away from the "attack vs. IDS" arms race. We propose a lightweight defense called RAID, which enables each ECU to make protocol-compatible modifications in its frame format generating a unique dialect (spoken by ECUs) during VIDS retraining mode. RAID prevents corruption of ECUs’ voltage fingerprints, and re-enables VIDS to detect all ECU masquerade attacks including DUET.
BGP route leaks frequently precipitate serious disruptions to inter-domain routing. These incidents have plagued the Internet for decades while deployment and usability issues cripple efforts to mitigate the problem. Peerlock, introduced in 2016, addresses route leaks with a new approach. Peerlock enables filtering agreements between transit providers to protect their own networks without the need for broad cooperation or a trust infrastructure. We outline the Peerlock system and one variant, Peerlock-lite, and conduct live Internet experiments to measure their deployment on the control plane. Our measurements find evidence for significant Peerlock protection between Tier 1 networks in the peering clique, where 48% of potential Peerlock filters are deployed, and reveal that many other networks also deploy filters against Tier 1 leaks. To guide further deployment, we also quantify Peerlock’s impact on route leaks both at currently observed levels and under hypothetical future deployment scenarios via BGP simulation. These experiments reveal present Peerlock deployment restricts Tier 1 leak export to 10% or fewer networks for 40% of simulated leaks. Strategic additional Peerlock-lite deployment at all large ISPs (<1% of all networks), in tandem with Peerlock within the peering clique as deployed, completely mitigates about 80% of simulated Tier 1 route leaks.
Dynamic searchable symmetric encryption (SSE) supports updates and keyword searches in tandem on outsourced symmetrically encrypted data, while aiming to minimize the information revealed to the (untrusted) host server. The literature on dynamic SSE has identified two crucial security properties in this regard - emph{forward} and emph{backward} privacy. Forward privacy makes it hard for the server to correlate an update operation with previously executed search operations. Backward privacy limits the amount of information learnt by the server about documents that have already been deleted from the database. To date, work on forward and backward private SSE has focused mainly on single keyword search. However, for any SSE scheme to be truly practical, it should at least support conjunctive keyword search. In this setting, most prior SSE constructions with sub-linear search complexity do not support dynamic databases. The only exception is the scheme of Kamara and Moataz (EUROCRYPT'17); however it only achieves forward privacy. Achieving emph{both} forward and backward privacy, which is the most desirable security notion for any dynamic SSE scheme, has remained open in the setting of conjunctive keyword search. In this work, we develop the first forward and backward private SSE scheme for conjunctive keyword searches. Our proposed scheme, called Oblivious Dynamic Cross Tags (or ODXT in short), scales to very large arbitrarily-structured databases (including both attribute-value and free-text databases). ODXT provides a realistic trade-off between performance and security by efficiently supporting fast updates and conjunctive keyword searches over very large databases, while incurring only moderate access pattern leakages to the server that conform to existing notions of forward and backward privacy. We precisely define the leakage profile of ODXT, and present a detailed formal analysis of its security. We then demonstrate the practicality of ODXT by developing a prototype implementation and evaluating its performance on real world databases containing millions of documents.
When a domain is registered, information about the registrants and other related personnel is recorded by WHOIS databases owned by registrars or registries (called WHOIS providers jointly), which are open to public inquiries. However, due to the enforcement of the European Union’s General Data Protection Regulation (GDPR), certain WHOIS data (i.e., the records about EEA, or the European Economic Area, registrants) needs to be redacted before being released to the public. Anecdotally, it was reported that actions have been taken by some WHOIS providers. Yet, so far there is no systematic study to quantify the changes made by the WHOIS providers in response to the GDPR, their strategies for data redaction and impact on other applications relying on WHOIS data. In this study, we report the first large-scale measurement study to answer these questions, in hopes of guiding the enforcement of the GDPR and identifying pitfalls during compliance. This study is made possible by analyzing a collection of 1.2 billion WHOIS records spanning two years. To automate the analysis tasks, we build a new system GCChecker based on unsupervised learning, which assigns a compliance score to a provider. Our findings of WHOIS GDPR compliance are multi-fold. To highlight a few, we discover that the GDPR has a profound impact on WHOIS, with over 85% surveyed large WHOIS providers redacting EEA records at scale. Surprisingly, over 60% large WHOIS data providers also redact non-EEA records. A variety of compliance flaws like incomplete redaction are also identified. The impact on security applications is prominent and redesign might be needed. We believe different communities (security, domain and legal) should work together to solve the issues for better WHOIS privacy and utility.
Amazon's voice-based assistant, Alexa, enables users to directly interact with various web services through natural language dialogues. It provides developers with the option to create third-party applications (known as Skills) to run on top of Alexa. While such applications ease users' interaction with smart devices and bolster a number of additional services, they also raise security and privacy concerns due to the personal setting they operate in. This paper aims to perform a systematic analysis of the Alexa skill ecosystem. We perform the first large-scale analysis of Alexa skills, obtained from seven different skill stores totaling to 90,194 unique skills. Our analysis reveals several limitations that exist in the current skill vetting process. We show that not only can a malicious user publish a skill under any arbitrary developer/company name, but she can also make backend code changes after approval to coax users into revealing unwanted information. We, next, formalize the different skill-squatting techniques and evaluate the efficacy of such techniques. We find that while certain approaches are more favorable than others, there is no substantial abuse of skill squatting in the real world. Lastly, we study the prevalence of privacy policies across different categories of skill, and more importantly the policy content of skills that use the Alexa permission model to access sensitive user data. We find that around 23.3% of such skills do not fully disclose the data types associated with the permissions requested. We conclude by providing some suggestions for strengthening the overall ecosystem, and thereby enhance transparency for end-users.
Decision trees are popular machine-learning classification models due to their simplicity and effectiveness. Tai et al. (ESORICS '17) propose a privacy-preserving decision-tree evaluation protocol purely based on additive homomorphic encryption, without introducing dummy nodes for hiding the tree structure, but it runs a secure comparison for each decision node, resulting in linear complexity. Later protocols (DBSEC '18, PETS '19) achieve sublinear (client-side) complexity, yet the server-side path evaluation requires oblivious transfer among $2^d$ real and dummy nodes even for a sparse tree of depth $d$ to hide the tree structure. This paper aims for the best of both worlds and hence the most lightweight protocol to date. Our complete-tree protocol can be easily extended to the sparse-tree setting and the reusable outsourcing setting: a model owner (resp. client) can outsource the decision tree (resp. attributes) to two non-colluding servers for classifications. The outsourced extension supports multi-client joint evaluation, which is the first of its kind without using multi-key fully-homomorphic encryption (TDSC '19). We also extend our protocol for achieving privacy against malicious adversaries. Our experiments compare in various network settings our offline and online communication costs and the online computation time with the prior sublinear protocol of Tueno et al. (PETS '19) and $O(1)$-round linear protocols of Kiss et al. (PETS '19), which can be seen as garbled circuit variants of Tai et al.'s. Our protocols are shown to be desirable for IoT-like scenarios with weak clients and big-data scenarios with high-dimensional feature vectors.
Searchable Symmetric Encryption (SSE) allows a data owner to securely outsource its encrypted data to a cloud server while maintaining the ability to search over it and retrieve matched documents. Most existing SSE schemes leak which documents are accessed per query, i.e., the so-called access pattern, and thus are vulnerable to attacks that can recover the database or the queried keywords. Current techniques that fully hide access patterns, such as ORAM or PIR, suffer from heavy communication or computational costs, and are not designed with search capabilities in mind. Recently, Chen et al. (INFOCOM'18) proposed an obfuscation framework for SSE that protects the access pattern in a differentially private way with a reasonable utility cost. However, this scheme always produces the same obfuscated access pattern when querying for the same keyword, and thus leaks the so-called search pattern, i.e., how many times a certain query is performed. This leakage makes the proposal vulnerable to certain database and query recovery attacks. In this paper, we propose OSSE (Obfuscated SSE), an SSE scheme that obfuscates the access pattern independently for each query performed. This in turn hides the search pattern and makes our scheme resistant against attacks that rely on this leakage. Given certain reasonable assumptions on the database and query distribution, our scheme has smaller communication overhead than ORAM-based SSE. Furthermore, our scheme works in a single communication round and requires very small constant client-side storage. Our empirical evaluation shows that OSSE is highly effective at protecting against different query recovery attacks while keeping a reasonable utility level. Our protocol provides significantly more protection than the proposal by Chen et al. against some state-of-the-art attacks, which demonstrates the importance of hiding search patterns in designing effective privacy-preserving SSE schemes.
Apple devices (e.g., iPhone, MacBook, iPad, and Apple Watch) are high value targets for attackers. Although these devices use different operating systems (e.g., iOS, macOS, iPadOS, watchOS, and tvOS), they are all based on a hybrid kernel called XNU. Existing attacks demonstrated that vulnerabilities in XNU could be exploited to escalate privileges and jailbreak devices. To mitigate these threats, multiple security mechanisms have been deployed in latest systems. In this paper, we first perform a systematic assessment of deployed mitigations by Apple, and demonstrate that most of them can be bypassed through corrupting a special type of kernel objects, i.e., Mach port objects. We summarize this type of attack as (Mach) Port Object-Oriented Programming (POP). Accordingly, we define multiple attack primitives to launch the attack and demonstrate realistic scenarios to achieve full memory manipulation on recently released systems (i.e., iOS 13 and macOS 10.15). To defend against POP, we propose the Port Ultra-SHield (PUSH) system to reduce the number of unprotected Mach port objects. Specifically, PUSH automatically locates potential POP primitives and instruments related system calls to enforce the integrity of Mach port kernel objects. It does not require system modifications and only introduces 2% runtime overhead. The PUSH framework has been deployed on more than 40,000 macOS devices in a leading company. The evaluation of 18 public exploits and one zero-day exploit detected by our system demonstrated the effectiveness of PUSH. We believe that the proposed framework will facilitate the design and implementation of a more secure XNU kernel.
PDF is the de-facto standard for document exchange. It is common to open PDF files from potentially untrusted sources such as email attachments or downloaded from the Internet. In this work, we perform an in-depth analysis of the capabilities of malicious PDF documents. Instead of focusing on implementation bugs, we abuse legitimate features of the PDF standard itself by systematically identifying dangerous paths in the PDF file structure. These dangerous paths lead to attacks that we categorize into four generic classes: (1) Denial-of-Service attacks affecting the host that processes the document. (2) Information disclosure attacks leaking personal data out of the victim’s computer. (3) Data manipulation on the victim’s system. (4) Code execution on the victim’s machine. An evaluation of 28 popular PDF processing applications shows that 26 of them are vulnerable at least one attack. Finally, we propose a methodology to protect against attacks based on PDF features systematically.
Over the years, browsers have adopted an ever-increasing number of client-enforced security policies deployed by means of HTTP headers. Such mechanisms are fundamental for web application security, and usually deployed on a per-page basis. This, however, enables inconsistencies, as different pages within the same security boundaries (in form of origins or sites) can express conflicting security requirements. In this paper, we formalize inconsistencies for cookie security attributes, CSP, and HSTS, and then quantify the magnitude and impact of inconsistencies at scale by crawling 15,000 popular sites. We show numerous sites endanger their own security by omission or misconfiguration of the aforementioned mechanisms, which lead to unnecessary exposure to XSS, cookie theft and HSTS deactivation. We then use our data to analyse to which extent the recent *Origin Policy* proposal can fix the problem of inconsistencies. Unfortunately, we conclude that the current Origin Policy design suffers from major shortcomings which limit its practical applicability to address security inconsistencies, while catering to the need of real-world sites. Based on these insights, we propose Site Policy, an extension of Origin Policy designed to overcome the shortcomings of Origin Policy and to make any insecurity explicit.
Since their introduction over two decades ago, side-channel attacks have presented a serious security threat. While many ciphers’ implementations employ masking techniques to protect against such attacks, they often leak secret information due to unintended interactions in the hardware. We present Rosita, a code rewrite engine that uses a leakage emulator which we amend to correctly emulate the micro-architecture of a target system. We use Rosita to automatically protect masked implementations of AES, ChaCha, and Xoodoo. For AES and Xoodoo, we show the absence of observable leakage at 1000000 traces with less than 21% penalty to the performance. For ChaCha, which has significantly more leakage, Rosita eliminates over 99% of the leakage, at a performance cost of 64%
We introduce emph{screen gleaning}, a TEMPEST attack in which the screen of a mobile device is read without a visual line of sight, revealing sensitive information displayed on the phone screen. The screen gleaning attack uses an antenna and a software-defined radio (SDR) to pick up the electromagnetic signal that the device sends to the screen to display, e.g., a message with a security code. This special equipment makes it possible to recreate the signal as a gray-scale image, which we refer to as an emph{emage}. Here, we show that it can be used to read a security code. The screen gleaning attack is challenging because it is often impossible for a human viewer to interpret the emage directly. We show that this challenge can be addressed with machine learning, specifically, a deep learning classifier. Screen gleaning will become increasingly serious as SDRs and deep learning continue to rapidly advance. In this paper, we demonstrate the security code attack and we propose a testbed that provides a standard setup in which screen gleaning could be tested with different attacker models. Finally, we analyze the dimensions of screen gleaning attacker models and discuss possible countermeasures with the potential to address them.
Mobile ad fraud is a significant threat that victimizes app publishers and their users, thereby undermining the ecosystem of app markets. Prior works on detecting mobile ad fraud have focused on constructing predefined test scenarios that preclude user involvement in identifying ad fraud. However, due to their dependence on contextual testing environments, these works have neglected to track which app modules and which user interactions are responsible for observed ad fraud. To address these shortcomings, this paper presents the design and implementation of FraudDetective, a dynamic testing framework that identifies ad fraud activities. FraudDetective focuses on identifying fraudulent activities that originate without any user interactions. FraudDetective computes a full stack trace from an observed ad fraud activity to a user event by connecting fragmented multiple stack traces, thus generating the causal relationships between user inputs and the observed fraudulent activity. We revised an Android Open Source Project (AOSP) to emit detected ad fraud activities along with their full stack traces, which help pinpoint the app modules responsible for the observed fraud activities. We evaluate FraudDetective on 48,172 apps from Google Play Store. FraudDetective reports that 74 apps are responsible for 34,453 ad fraud activities and find that 98.6% of the fraudulent behaviors originate from embedded third-party ad libraries. Our evaluation demonstrates that FraudDetective is capable of accurately identifying ad fraud via reasoning based on observed suspicious behaviors without user interactions. The experimental results also yield the new insight that abusive ad service providers harness their ad libraries to actively engage in committing ad fraud.
Package managers have become a vital part of the modern software development process. They allow developers to reuse third-party code, share their own code, minimize their codebase, and simplify the build process. However, recent reports showed that package managers have been abused by attackers to distribute malware, posing significant security risks to developers and end-users. For example, eslint-scope, a package with millions of weekly downloads in Npm, was compromised to steal credentials from developers. To understand the security gaps and the misplaced trust that make recent supply chain attacks possible, we propose a comparative framework to qualitatively assess the functional and security features of package managers for interpreted languages. Based on qualitative assessment, we apply well-known program analysis techniques such as metadata, static, and dynamic analysis to study registry abuse. Our initial efforts found 339 new malicious packages that we reported to the registries for removal. The package manager maintainers confirmed 278 (82%) from the 339 reported packages where three of them had more than 100,000 downloads. For these packages we were issued official CVE numbers to help expedite the removal of these packages from infected victims. We outline the challenges of tailoring program analysis tools to interpreted languages and release our pipeline as a reference point for the community to build on and help in securing the software supply chain.
Accurate and robust disassembly of stripped binaries is challenging. The root of the difficulty is that high-level structures, such as instruction and function boundaries, are absent in stripped binaries and must be recovered based on incomplete information. Current disassembly approaches rely on heuristics or simple pattern matching to approximate the recovery, but these methods are often inaccurate and brittle, especially across different compiler optimizations. We present XDA, a transfer-learning-based disassembly framework that learns different contextual dependencies present in machine code and transfers this knowledge for accurate and robust disassembly. We design a self-supervised learning task motivated by masked Language Modeling to learn interactions among byte sequences in binaries. The outputs from this task are byte embeddings that encode sophisticated contextual dependencies between input binaries' byte tokens, which can then be finetuned for downstream disassembly tasks. We evaluate XDA's performance on two disassembly tasks, recovering function boundaries and assembly instructions, on a collection of 3,121 binaries taken from SPEC CPU2017, SPEC CPU2006, and the BAP corpus. The binaries are compiled by GCC, ICC, and MSVC on x86/x64 Windows and Linux platforms over 4 optimization levels. XDA achieves 99.0% and 99.7% F1 score at recovering function boundaries and instructions, respectively, surpassing the previous state-of-the-art on both tasks. It also maintains speed on par with the fastest ML-based approach and is up to 38x faster than hand-written disassemblers like IDA Pro. We release the code of XDA at https://github.com/CUMLSec/XDA.
Due to recent world events, video calls have become the new norm for both personal and professional remote communication. However, if a participant in a video call is not careful, he/she can reveal his/her private information to others in the call. In this paper, we design and evaluate an attack framework to infer one type of such private information from the video stream of a call -- keystrokes, i.e., text typed during the call. We evaluate our video-based keystroke inference framework using different experimental settings, such as different webcams, video resolutions, keyboards, clothing, and backgrounds. Our high keystroke inference accuracies under commonly occurring experimental settings highlight the need for awareness and countermeasures against such attacks. Consequently, we also propose and evaluate effective mitigation techniques that can automatically protect users when they type during a video call.