Along with the broad deployment of deep learning (DL) systems, their lack of trustworthiness, in aspects such as robustness, fairness, and numerical reliability, is raising serious social concerns, especially in safety-critical scenarios such as autonomous driving and aircraft navigation. Hence, a rigorous and accurate evaluation of DL trustworthiness is essential and a prerequisite for improving it. The first part of the talk will be an overview of certified methods for DL trustworthiness. These methods provide computable guarantees on worst-case trustworthiness under realistic conditions, such as a lower bound on accuracy under arbitrary small perturbations. Based on our taxonomy and systematization, we illustrate key methodologies, specifically semantic randomized smoothing and branch-and-bound, and their implications for certified DL trustworthiness.

As representatives of recent DL breakthroughs, large language models (LLMs) are transforming our lives while posing new challenges to trustworthiness. For example, LLMs can be jailbroken with adversarial prompts into producing harmful content, including bias, harassment, and misinformation. The second part of the talk will be an overview of LLM trustworthiness. We will start by sharing hands-on experience in developing frontier LLMs, illustrate common LLM trustworthiness issues with examples, discuss evaluation challenges using one benchmark as an example, and conclude by envisioning certifiable trustworthiness for LLMs.
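To make the notion of a computable guarantee concrete, below is a minimal sketch of the classic randomized-smoothing certificate (in the style of Cohen et al.), not the semantic smoothing method covered in the talk; the function name certify_smoothed, its parameters, and the placeholder classify function are illustrative assumptions.

    import numpy as np
    from scipy.stats import norm, beta

    def certify_smoothed(classify, x, sigma=0.25, n=1000, alpha=0.001):
        """Sketch of a randomized-smoothing certificate at input x.

        classify: maps a batch of inputs to predicted class labels
                  (stands in for any base DL classifier).
        Returns (predicted class, certified L2 radius), or (None, 0.0)
        if no prediction can be certified at confidence 1 - alpha.
        """
        # Sample Gaussian perturbations around x and collect the base
        # classifier's votes on the noisy copies.
        noisy = x[None, ...] + sigma * np.random.randn(n, *x.shape)
        votes = np.bincount(classify(noisy), minlength=2)
        top = int(votes.argmax())
        k = int(votes[top])

        # Clopper-Pearson lower confidence bound on the top-class probability.
        p_lower = beta.ppf(alpha, k, n - k + 1)
        if p_lower <= 0.5:
            return None, 0.0  # abstain: the vote is not decisive enough
        # Certified radius: the smoothed prediction provably stays the same
        # for any L2 perturbation of x smaller than this value.
        return top, sigma * norm.ppf(p_lower)

With classify set to the argmax of a trained network's logits over the noisy batch, the returned radius is the guaranteed perturbation region within which the smoothed prediction cannot change, which is exactly the kind of worst-case guarantee the talk refers to.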