Electrical Engineering and Systems Science

Date: Thu, 9 May 2024 | Total: 41

#1 Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models [PDF4] [Copy] [Kimi5]

Authors: Hongjie Wang ; Difan Liu ; Yan Kang ; Yijun Li ; Zhe Lin ; Niraj K. Jha ; Yuchen Liu

Diffusion Models (DMs) have exhibited superior performance in generating high-quality and diverse images. However, this exceptional performance comes at the cost of expensive architectural design, particularly due to the attention module heavily used in leading models. Existing works mainly adopt a retraining process to enhance DM efficiency. This is computationally expensive and not very scalable. To this end, we introduce the Attention-driven Training-free Efficient Diffusion Model (AT-EDM) framework that leverages attention maps to perform run-time pruning of redundant tokens, without the need for any retraining. Specifically, for single-denoising-step pruning, we develop a novel ranking algorithm, Generalized Weighted Page Rank (G-WPR), to identify redundant tokens, and a similarity-based recovery method to restore tokens for the convolution operation. In addition, we propose a Denoising-Steps-Aware Pruning (DSAP) approach to adjust the pruning budget across different denoising timesteps for better generation quality. Extensive evaluations show that AT-EDM performs favorably against prior art in terms of efficiency (e.g., 38.8% FLOPs saving and up to 1.53x speed-up over Stable Diffusion XL) while maintaining nearly the same FID and CLIP scores as the full model. Project webpage: https://atedm.github.io.

#2 Bridging the Gap Between Saliency Prediction and Image Quality Assessment [PDF1] [Copy] [Kimi]

Authors: Kirillov Alexey ; Andrey Moskalenko ; Dmitriy Vatolin

Over the past few years, deep neural models have made considerable advances in image quality assessment (IQA). However, the underlying reasons for their success remain unclear, owing to the complex nature of deep neural networks. IQA aims to describe how the human visual system (HVS) works and to create its efficient approximations. On the other hand, Saliency Prediction task aims to emulate HVS via determining areas of visual interest. Thus, we believe that saliency plays a crucial role in human perception. In this work, we conduct an empirical study that reveals the relation between IQA and Saliency Prediction tasks, demonstrating that the former incorporates knowledge of the latter. Moreover, we introduce a novel SACID dataset of saliency-aware compressed images and conduct a large-scale comparison of classic and neural-based IQA methods. All supplementary code and data will be available at the time of publication.

#3 Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification [PDF1] [Copy] [Kimi]

Authors: Mukaffi Bin Moin ; Fatema Tuj Johora Faria ; Swarnajit Saha ; Bushra Kamal Rafa ; Mohammad Shafiul Alam

Lung and colon cancer are serious worldwide health challenges that require early and precise identification to reduce mortality risks. However, diagnosis, which is mostly dependent on histopathologists' competence, presents difficulties and hazards when expertise is insufficient. While diagnostic methods like imaging and blood markers contribute to early detection, histopathology remains the gold standard, although time-consuming and vulnerable to inter-observer mistakes. Limited access to high-end technology further limits patients' ability to receive immediate medical care and diagnosis. Recent advances in deep learning have generated interest in its application to medical imaging analysis, specifically the use of histopathological images to diagnose lung and colon cancer. The goal of this investigation is to use and adapt existing pre-trained CNN-based models, such as Xception, DenseNet201, ResNet101, InceptionV3, DenseNet121, DenseNet169, ResNet152, and InceptionResNetV2, to enhance classification through better augmentation strategies. The results show tremendous progress, with all eight models reaching impressive accuracy ranging from 97% to 99%. Furthermore, attention visualization techniques such as GradCAM, GradCAM++, ScoreCAM, Faster Score-CAM, and LayerCAM, as well as Vanilla Saliency and SmoothGrad, are used to provide insights into the models' classification decisions, thereby improving interpretability and understanding of malignant and benign image classification.

#4 Dichotic harmony for the musical practice [PDF] [Copy] [Kimi]

Author: Vadim R. Madgazin

The dichotic method of hearing sound adapts in the region of musical harmony. The algorithm of the separation of the being dissonant voices into several separate groups is proposed. For an increase in the pleasantness of chords the different groups of voices are heard out through the different channels of headphones. Is created two demonstration program for PC. Keywords: music, harmony, chord, dichotic listening, dissonance, consonance, headphones, pleasantness, midi.

#5 Image Classification for CSSVD Detection in Cacao Plants [PDF] [Copy] [Kimi]

Authors: Atuhurra Jesse ; N'guessan Yves-Roland Douha ; Pabitra Lenka

The detection of diseases within plants has attracted a lot of attention from computer vision enthusiasts. Despite the progress made to detect diseases in many plants, there remains a research gap to train image classifiers to detect the cacao swollen shoot virus disease or CSSVD for short, pertinent to cacao plants. This gap has mainly been due to the unavailability of high quality labeled training data. Moreover, institutions have been hesitant to share their data related to CSSVD. To fill these gaps, we propose the development of image classifiers to detect CSSVD-infected cacao plants. Our proposed solution is based on VGG16, ResNet50 and Vision Transformer (ViT). We evaluate the classifiers on a recently released and publicly accessible KaraAgroAI Cocoa dataset. Our best image classifier, based on ResNet50, achieves 95.39\% precision, 93.75\% recall, 94.34\% F1-score and 94\% accuracy on only 20 epochs. There is a +9.75\% improvement in recall when compared to previous works. Our results indicate that the image classifiers learn to identify cacao plants infected with CSSVD.

#6 Some variation of COBRA in sequential learning setup [PDF] [Copy] [Kimi]

Authors: Aryan Bhambu ; Arabin Kumar Dey

This research paper introduces innovative approaches for multivariate time series forecasting based on different variations of the combined regression strategy. We use specific data preprocessing techniques which makes a radical change in the behaviour of prediction. We compare the performance of the model based on two types of hyper-parameter tuning Bayesian optimisation (BO) and Usual Grid search. Our proposed methodologies outperform all state-of-the-art comparative models. We illustrate the methodologies through eight time series datasets from three categories: cryptocurrency, stock index, and short-term load forecasting.

#7 Detecting and Refining HiRISE Image Patches Obscured by Atmospheric Dust [PDF] [Copy] [Kimi]

Author: Kunal Sunil Kasodekar

HiRISE (High-Resolution Imaging Science Experiment) is a camera onboard the Mars Reconnaissance orbiter responsible for photographing vast areas of the Martian surface in unprecedented detail. It can capture millions of incredible closeup images in minutes. However, Mars suffers from frequent regional and local dust storms hampering this data-collection process, and pipeline, resulting in loss of effort and crucial flight time. Removing these images manually requires a large amount of manpower. I filter out these images obstructed by atmospheric dust automatically by using a Dust Image Classifier fine-tuned on Resnet-50 with an accuracy of 94.05%. To further facilitate the seamless filtering of Images I design a prediction pipeline that classifies and stores these dusty patches. I also denoise partially obstructed images using an Auto Encoder-based denoiser and Pix2Pix GAN with 0.75 and 0.99 SSIM Index respectively.

#8 ATDM:An Anthropomorphic Aerial Tendon-driven Manipulator with Low-Inertia and High-Stiffness [PDF] [Copy] [Kimi]

Authors: Quman Xu ; Zhan Li ; Hai Li ; Xinghu Yu ; Yipeng Yang

Aerial Manipulator Systems (AMS) have garnered significant interest for their utility in aerial operations. Nonetheless, challenges related to the manipulator's limited stiffness and the coupling disturbance with manipulator movement persist. This paper introduces the Aerial Tendon-Driven Manipulator (ATDM), an innovative AMS that integrates a hexrotor Unmanned Aerial Vehicle (UAV) with a 4-degree-of-freedom (4-DOF) anthropomorphic tendon-driven manipulator. The design of the manipulator is anatomically inspired, emulating the human arm anatomy from the shoulder joint downward. To enhance the structural integrity and performance, finite element topology optimization and lattice optimization are employed on the links to replicate the radially graded structure characteristic of bone, this approach effectively reduces weight and inertia while simultaneously maximizing stiffness. A novel tensioning mechanism with adjustable tension is introduced to address cable relaxation, and a Tension-amplification tendon mechanism is implemented to increase the manipulator's overall stiffness and output. The paper presents a kinematic model based on virtual coupled joints, a comprehensive workspace analysis, and detailed calculations of output torques and stiffness for individual arm joints. The prototype arm has a total weight of 2.7 kg, with the end effector contributing only 0.818 kg. By positioning all actuators at the base, coupling disturbance are minimized. The paper includes a detailed mechanical design and validates the system's performance through semi-physical multi-body dynamics simulations, confirming the efficacy of the proposed design.

#9 Enhancing Data Integrity and Traceability in Industry Cyber Physical Systems (ICPS) through Blockchain Technology: A Comprehensive Approach [PDF] [Copy] [Kimi]

Authors: Mohammad Ikbal Hossain ; Dr. Tanja Steigner ; Muhammad Imam Hussain ; Afroja Akther

Blockchain technology, heralded as a transformative innovation, has far-reaching implications beyond its initial application in cryptocurrencies. This study explores the potential of blockchain in enhancing data integrity and traceability within Industry Cyber-Physical Systems (ICPS), a crucial aspect in the era of Industry 4.0. ICPS, integrating computational and physical components, is pivotal in managing critical infrastructure like manufacturing, power grids, and transportation networks. However, they face challenges in security, privacy, and reliability. With its inherent immutability, transparency, and distributed consensus, blockchain presents a groundbreaking approach to address these challenges. It ensures robust data reliability and traceability across ICPS, enhancing transaction transparency and facilitating secure data sharing. This research unearths various blockchain applications in ICPS, including supply chain management, quality control, contract management, and data sharing. Each application demonstrates blockchain's capacity to streamline processes, reduce fraud, and enhance system efficiency. In supply chain management, blockchain provides real-time auditing and compliance. For quality control, it establishes tamper-proof records, boosting consumer confidence. In contract management, smart contracts automate execution, enhancing efficiency. Blockchain also fosters secure collaboration in ICPS, which is crucial for system stability and safety. This study emphasizes the need for further research on blockchain's practical implementation in ICPS, focusing on challenges like scalability, system integration, and security vulnerabilities. It also suggests examining blockchain's economic and organizational impacts in ICPS to understand its feasibility and long-term advantages.

#10 Regime Learning for Differentiable Particle Filters [PDF] [Copy] [Kimi]

Authors: John-Joseph Brady ; Yuhui Luo ; Wenwu Wang ; Victor Elvira ; Yunpeng Li

Differentiable particle filters are an emerging class of models that combine sequential Monte Carlo techniques with the flexibility of neural networks to perform state space inference. This paper concerns the case where the system may switch between a finite set of state-space models, i.e. regimes. No prior approaches effectively learn both the individual regimes and the switching process simultaneously. In this paper, we propose the neural network based regime learning differentiable particle filter (RLPF) to address this problem. We further design a training procedure for the RLPF and other related algorithms. We demonstrate competitive performance compared to the previous state-of-the-art algorithms on a pair of numerical experiments.

#11 The Codecfake Dataset and Countermeasures for the Universally Detection of Deepfake Audio [PDF] [Copy] [Kimi]

Authors: Yuankun Xie ; Yi Lu ; Ruibo Fu ; Zhengqi Wen ; Zhiyong Wang ; Jianhua Tao ; Xin Qi ; Xiaopeng Wang ; Yukun Liu ; Haonan Cheng ; Long Ye ; Yi Sun

With the proliferation of Audio Language Model (ALM) based deepfake audio, there is an urgent need for effective detection methods. Unlike traditional deepfake audio generation, which often involves multi-step processes culminating in vocoder usage, ALM directly utilizes neural codec methods to decode discrete codes into audio. Moreover, driven by large-scale data, ALMs exhibit remarkable robustness and versatility, posing a significant challenge to current audio deepfake detection (ADD) models. To effectively detect ALM-based deepfake audio, we focus on the mechanism of the ALM-based audio generation method, the conversion from neural codec to waveform. We initially construct the Codecfake dataset, an open-source large-scale dataset, including two languages, millions of audio samples, and various test conditions, tailored for ALM-based audio detection. Additionally, to achieve universal detection of deepfake audio and tackle domain ascent bias issue of original SAM, we propose the CSAM strategy to learn a domain balanced and generalized minima. Experiment results demonstrate that co-training on Codecfake dataset and vocoded dataset with CSAM strategy yield the lowest average Equal Error Rate (EER) of 0.616% across all test conditions compared to baseline models.

#12 RF-based Energy Harvesting: Nonlinear Models, Applications and Challenges [PDF] [Copy] [Kimi]

Author: Ruihong Jiang

So far, various aspects associated with wireless energy harvesting (EH) have been investigated from diverse perspectives, including energy sources and models, usage protocols, energy scheduling and optimization, and EH implementation in different wireless communication systems. However, a comprehensive survey specifically focusing on models of radio frequency (RF)-based EH behaviors has not yet been presented. To address this gap, this article provides an overview of the mainstream mathematical models that capture the nonlinear behavior of practical EH circuits, serving as a valuable handbook of mathematical models for EH application research. Moreover, we summarize the application of each nonlinear EH model, including the associated challenges and precautions. We also analyze the impact and advancements of each EH model on RF-based EH systems in wireless communication, utilizing artificial intelligence (AI) techniques. Additionally, we highlight emerging research directions in the context of nonlinear RF-based EH. This article aims to contribute to the future application of RF-based EH in novel communication research domains to a significant extent.

#13 TGTM: TinyML-based Global Tone Mapping for HDR Sensors [PDF] [Copy] [Kimi]

Authors: Peter Todorov ; Julian Hartig ; Jan Meyer-Siemon ; Martin Fiedler ; Gregor Schewior

Advanced driver assistance systems (ADAS) relying on multiple cameras are increasingly prevalent in vehicle technology. Yet, conventional imaging sensors struggle to capture clear images in conditions with intense illumination contrast, such as tunnel exits, due to their limited dynamic range. Introducing high dynamic range (HDR) sensors addresses this issue. However, the process of converting HDR content to a displayable range via tone mapping often leads to inefficient computations, when performed directly on pixel data. In this paper, we focus on HDR image tone mapping using a lightweight neural network applied on image histogram data. Our proposed TinyML-based global tone mapping method, termed as TGTM, operates at 9,000 FLOPS per RGB image of any resolution. Additionally, TGTM offers a generic approach that can be incorporated to any classical tone mapping method. Experimental results demonstrate that TGTM outperforms state-of-the-art methods on real HDR camera images by up to 5.85 dB higher PSNR with orders of magnitude less computations.

#14 Leveraging AES Padding: dBs for Nothing and FEC for Free in IoT Systems [PDF] [Copy] [Kimi]

Authors: Jongchan Woo ; Vipindev Adat Vasudevan ; Benjamin D. Kim ; Rafael G. L. D'Oliveira ; Alejandro Cohen ; Thomas Stahlbuhk ; Ken R. Duffy ; Muriel Médard

The Internet of Things (IoT) represents a significant advancement in digital technology, with its rapidly growing network of interconnected devices. This expansion, however, brings forth critical challenges in data security and reliability, especially under the threat of increasing cyber vulnerabilities. Addressing the security concerns, the Advanced Encryption Standard (AES) is commonly employed for secure encryption in IoT systems. Our study explores an innovative use of AES, by repurposing AES padding bits for error correction and thus introducing a dual-functional method that seamlessly integrates error-correcting capabilities into the standard encryption process. The integration of the state-of-the-art Guessing Random Additive Noise Decoder (GRAND) in the receiver's architecture facilitates the joint decoding and decryption process. This strategic approach not only preserves the existing structure of the transmitter but also significantly enhances communication reliability in noisy environments, achieving a notable over 3 dB gain in Block Error Rate (BLER). Remarkably, this enhanced performance comes with a minimal power overhead at the receiver - less than 15% compared to the traditional decryption-only process, underscoring the efficiency of our hardware design for IoT applications. This paper discusses a comprehensive analysis of our approach, particularly in energy efficiency and system performance, presenting a novel and practical solution for reliable IoT communications.

#15 Exploring Speech Pattern Disorders in Autism using Machine Learning [PDF] [Copy] [Kimi]

Authors: Chuanbo Hu ; Jacob Thrasher ; Wenqi Li ; Mindi Ruan ; Xiangxu Yu ; Lynn K Paul ; Shuo Wang ; Xin Li

Diagnosing autism spectrum disorder (ASD) by identifying abnormal speech patterns from examiner-patient dialogues presents significant challenges due to the subtle and diverse manifestations of speech-related symptoms in affected individuals. This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues. Utilizing a dataset of recorded dialogues, we extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs), and balance. These features encompass various aspects of speech such as intonation, volume, rhythm, and speech rate, reflecting the complex nature of communicative behaviors in ASD. We employed machine learning for both classification and regression tasks to analyze these speech features. The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%. Regression models were developed to predict speech pattern related variables and a composite score from all variables, facilitating a deeper understanding of the speech dynamics associated with ASD. The effectiveness of machine learning in interpreting intricate speech patterns and the high classification accuracy underscore the potential of computational methods in supporting the diagnostic processes for ASD. This approach not only aids in early detection but also contributes to personalized treatment planning by providing insights into the speech and communication profiles of individuals with ASD.

#16 Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data [PDF] [Copy] [Kimi]

Authors: Zhuohong Li ; Wei He ; Jiepan Li ; Hongyan Zhang

Buildings, as fundamental man-made structures in urban environments, serve as crucial indicators for understanding various city function zones. Rapid urbanization has raised an urgent need for efficiently surveying building footprints and functions. In this study, we proposed a semi-supervised framework to identify every building's function in large-scale urban areas with multi-modality remote-sensing data. In detail, optical images, building height, and nighttime-light data are collected to describe the morphological attributes of buildings. Then, the area of interest (AOI) and building masks from the volunteered geographic information (VGI) data are collected to form sparsely labeled samples. Furthermore, the multi-modality data and weak labels are utilized to train a segmentation model with a semi-supervised strategy. Finally, results are evaluated by 20,000 validation points and statistical survey reports from the government. The evaluations reveal that the produced function maps achieve an OA of 82% and Kappa of 71% among 1,616,796 buildings in Shanghai, China. This study has the potential to support large-scale urban management and sustainable urban development. All collected data and produced maps are open access at https://github.com/LiZhuoHong/BuildingMap.

#17 A Dual-Motor Actuator for Ceiling Robots with High Force and High Speed Capabilities [PDF] [Copy] [Kimi]

Authors: Ian Lalonde ; Jeff Denis ; Mathieu Lamy ; Alexandre Girard

Patient transfer devices allow to move patients passively in hospitals and care centers. Instead of hoisting the patient, it would be beneficial in some cases to assist their movement, enabling them to move by themselves. However, patient assistance requires devices capable of precisely controlling output forces at significantly higher speeds than those used for patient transfers alone, and a single motor solution would be over-sized and show poor efficiency to do both functions. This paper presents a dual-motor actuator and control schemes adapted for a patient mobility equipment that can be used to transfer patients, assist patient in their movement, and help prevent falls. The prototype is shown to be able to lift patients weighing up to 318 kg, to assist a patient with a desired force of up to 100 kg with a precision of 7.8%. Also, a smart control scheme to manage falls is shown to be able to stop a patient who is falling by applying a desired deceleration.

#18 Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortions [PDF] [Copy] [Kimi]

Authors: Sijing Xie ; Chengxin Zhao ; Nan Sun ; Wei Li ; Hefei Ling

Digital watermarking is the process of embedding secret information by altering images in a way that is undetectable to the human eye. To increase the robustness of the model, many deep learning-based watermarking methods use the encoder-decoder architecture by adding different noises to the noise layer. The decoder then extracts the watermarked information from the distorted image. However, this method can only resist weak noise attacks. To improve the robustness of the algorithm against stronger noise, this paper proposes to introduce a denoise module between the noise layer and the decoder. The module is aimed at reducing noise and recovering some of the information lost during an attack. Additionally, the paper introduces the SE module to fuse the watermarking information pixel-wise and channel dimensions-wise, improving the encoder's efficiency. Experimental results show that our proposed method is comparable to existing models and outperforms state-of-the-art under different noise intensities. In addition, ablation experiments show the superiority of our proposed module.

#19 An LSTM-Based Chord Generation System Using Chroma Histogram Representations [PDF] [Copy] [Kimi]

Author: Jack Hardwick

This paper proposes a system for chord generation to monophonic symbolic melodies using an LSTM-based model trained on chroma histogram representations of chords. Chroma representations promise more harmonically rich generation than chord label-based approaches, whilst maintaining a small number of dimensions in the dataset. This system is shown to be suitable for limited real-time use. While it does not meet the state-of-the-art for coherent long-term generation, it does show diatonic generation with cadential chord relationships. The need for further study into chroma histograms as an extracted feature in chord generation tasks is highlighted.

#20 Visually Guided Swarm Motion Coordination via Insect-inspired Small Target Motion Reactions [PDF] [Copy] [Kimi]

Authors: Md Arif Billah ; Imraan A. Faruque

Despite progress developing experimentally-consistent models of insect in-flight sensing and feedback for individual agents, a lack of systematic understanding of the multi-agent and group performance of the resulting bio-inspired sensing and feedback approaches remains a barrier to robotic swarm implementations. This study introduces the small-target motion reactive (STMR) swarming approach by designing a concise engineering model of the small target motion detector (STMD) neurons found in insect lobula complexes. The STMD neuron model identifies the bearing angle at which peak optic flow magnitude occurs, and this angle is used to design an output feedback switched control system. A theoretical stability analysis provides bi-agent stability and state boundedness in group contexts. The approach is simulated and implemented on ground vehicles for validation and behavioral studies. The results indicate despite having the lowest connectivity of contemporary approaches (each agent instantaneously regards only a single neighbor), collective group motion can be achieved. STMR group level metric analysis also highlights continuously varying polarization and decreasing heading variance.

#21 An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution [PDF] [Copy] [Kimi]

Authors: Naveed Sultan ; Amir Hajian ; Supavadee Aramvith

In recent years, convolutional neural networks (CNNs) have achieved remarkable advancement in the field of remote sensing image super-resolution due to the complexity and variability of textures and structures in remote sensing images (RSIs), which often repeat in the same images but differ across others. Current deep learning-based super-resolution models focus less on high-frequency features, which leads to suboptimal performance in capturing contours, textures, and spatial information. State-of-the-art CNN-based methods now focus on the feature extraction of RSIs using attention mechanisms. However, these methods are still incapable of effectively identifying and utilizing key content attention signals in RSIs. To solve this problem, we proposed an advanced feature extraction module called Channel and Spatial Attention Feature Extraction (CSA-FE) for effectively extracting the features by using the channel and spatial attention incorporated with the standard vision transformer (ViT). The proposed method trained over the UCMerced dataset on scales 2, 3, and 4. The experimental results show that our proposed method helps the model focus on the specific channels and spatial locations containing high-frequency information so that the model can focus on relevant features and suppress irrelevant ones, which enhances the quality of super-resolved images. Our model achieved superior performance compared to various existing models.

#22 SingIt! Singer Voice Transformation [PDF] [Copy] [Kimi]

Authors: Amit Eliav ; Aaron Taub ; Renana Opochinsky ; Sharon Gannot

In this paper, we propose a model which can generate a singing voice from normal speech utterance by harnessing zero-shot, many-to-many style transfer learning. Our goal is to give anyone the opportunity to sing any song in a timely manner. We present a system comprising several available blocks, as well as a modified auto-encoder, and show how this highly-complex challenge can be achieved by tailoring rather simple solutions together. We demonstrate the applicability of the proposed system using a group of 25 non-expert listeners. Samples of the data generated from our model are provided.

#23 ResNCT: A Deep Learning Model for the Synthesis of Nephrographic Phase Images in CT Urography [PDF] [Copy] [Kimi]

Authors: Syed Jamal Safdar Gardezi ; Lucas Aronson ; Peter Wawrzyn ; Hongkun Yu ; E. Jason Abel ; Daniel D. Shapiro ; Meghan G. Lubner ; Joshua Warner ; Giuseppe Toia ; Lu Mao ; Pallavi Tiwari ; Andrew L. Wentland

Purpose: To develop and evaluate a transformer-based deep learning model for the synthesis of nephrographic phase images in CT urography (CTU) examinations from the unenhanced and urographic phases. Materials and Methods: This retrospective study was approved by the local Institutional Review Board. A dataset of 119 patients (mean $\pm$ SD age, 65 $\pm$ 12 years; 75/44 males/females) with three-phase CT urography studies was curated for deep learning model development. The three phases for each patient were aligned with an affine registration algorithm. A custom model, coined Residual transformer model for Nephrographic phase CT image synthesis (ResNCT), was developed and implemented with paired inputs of non-contrast and urographic sets of images trained to produce the nephrographic phase images, that were compared with the corresponding ground truth nephrographic phase images. The synthesized images were evaluated with multiple performance metrics, including peak signal to noise ratio (PSNR), structural similarity index (SSIM), normalized cross correlation coefficient (NCC), mean absolute error (MAE), and root mean squared error (RMSE). Results: The ResNCT model successfully generated synthetic nephrographic images from non-contrast and urographic image inputs. With respect to ground truth nephrographic phase images, the images synthesized by the model achieved high PSNR (27.8 $\pm$ 2.7 dB), SSIM (0.88 $\pm$ 0.05), and NCC (0.98 $\pm$ 0.02), and low MAE (0.02 $\pm$ 0.005) and RMSE (0.042 $\pm$ 0.016). Conclusion: The ResNCT model synthesized nephrographic phase CT images with high similarity to ground truth images. The ResNCT model provides a means of eliminating the acquisition of the nephrographic phase with a resultant 33% reduction in radiation dose for CTU examinations.

#24 System Identification of the Upgraded LHPOST6 Reaction Mass at the University of California San Diego [PDF] [Copy] [Kimi]

Authors: Andres Rodriguez-Burneo ; Jose I. Restrepo ; Joel P. Conte

Upon completing the upgrade from one to six degrees of freedom of the Outdoor Shake Table at UCSD in 2019, forced vibration tests were carried out to identify the dynamic characteristics of the reaction mass and soil system. This report describes the motivation, execution, and results from such tests, which independently excited the reaction mass in four degrees of freedom: longitudinal, transverse, yaw, and vertical. The report discusses the frequency response curves and deformation patterns from which the natural frequencies, damping ratio, mode shapes, and rigid body motion were determined. The first objective of the study was to investigate if the dynamic properties of the system had dramatically changed after the upgrade by comparing the results to those from forced vibration tests performed 20 years ago, during the construction of the facility. In addition, most recent tests also contributed with results from the vertical degree of freedom, which had never been tested. The second objective was to obtain high-quality response data of the system that will be used to develop a high-fidelity computational model of the reaction mass in future research. A comparison of results showed a slight difference of 0.5Hz in the natural frequency of 2 degrees of freedom. Moreover, maximum displacements in the recent tests were overall larger than the previous ones with few exceptions. The report thoroughly discusses the several sources of discrepancy between the past and most recent results. Finally, test results allowed us to estimate the system's response if the shake table actuators were to be used at their maximum nominal capacity. Small displacement and high damping results were consistent with those of previous tests and further validated the design of the reaction mass.

#25 HILCodec: High Fidelity and Lightweight Neural Audio Codec [PDF] [Copy] [Kimi]

Authors: Sunghwan Ahn ; Beom Jun Woo ; Min Hyun Han ; Chanyeong Moon ; Nam Soo Kim

The recent advancement of end-to-end neural audio codecs enables compressing audio at very low bitrates while reconstructing the output audio with high fidelity. Nonetheless, such improvements often come at the cost of increased model complexity. In this paper, we identify and address the problems of existing neural audio codecs. We show that the performance of Wave-U-Net does not increase consistently as the network depth increases. We analyze the root cause of such a phenomenon and suggest a variance-constrained design. Also, we reveal various distortions in previous waveform domain discriminators and propose a novel distortion-free discriminator. The resulting model, \textit{HILCodec}, is a real-time streaming audio codec that demonstrates state-of-the-art quality across various bitrates and audio types.