Multi-Agent Path Finding (MAPF) is widely needed to coordinate real-world robotic systems. Recent approaches turn to deep learning to solve MAPF instances, primarily using reinforcement learning, which has high computational costs. We propose a supervised learning approach that solves MAPF instances with a smaller, less costly model.
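As a rough illustration of the supervised (imitation-style) alternative to reinforcement learning described above, the hypothetical sketch below trains a small policy network on (observation, action) pairs that would come from a classical MAPF solver; the local-view encoding, dimensions, and action set are assumptions, not the paper's actual model.

```python
# Minimal sketch (not the authors' model): a small policy network trained with
# supervised learning to imitate expert MAPF solver actions, instead of RL.
# Assumes each agent sees a flattened local observation and chooses one of
# 5 discrete moves (stay/up/down/left/right); all sizes are illustrative.
import torch
import torch.nn as nn

OBS_DIM, N_ACTIONS = 11 * 11 * 3, 5  # hypothetical local-view features

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Placeholder batch standing in for (observation, expert action) pairs
# produced by a classical MAPF solver on training instances.
obs = torch.randn(256, OBS_DIM)
expert_actions = torch.randint(0, N_ACTIONS, (256,))

for _ in range(10):  # a few supervised steps; real training iterates over many instances
    optimizer.zero_grad()
    loss = loss_fn(policy(obs), expert_actions)
    loss.backward()
    optimizer.step()
```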
In recent years, social credit systems (SCS) and similar AI-driven mass surveillance systems have been deployed by the Chinese government in various regions. However, discussion of the SCS is ambiguous: some consider them deeply controversial and a breach of human rights, while others argue that they are similar in structure to company rankings or background checks on individuals in the United States. In reality, there is no single monolithic system; different forms of SCS are deployed in different regions of China. In this paper, I review the different models of the Chinese SCS and then compare how these systems uphold or breach China's own AI Ethics guidelines.
Deep Neural Networks have memory and computational demands that often render them difficult to use in low-resource environments. In addition, highly dense networks are over-parameterized and thus prone to overfitting. To address these problems, we introduce a novel algorithm that prunes (sparsifies) weights from the network by taking into account their magnitudes and their gradients computed on a validation dataset. Unlike existing pruning methods, our method does not require the network to be retrained once initial training is complete. On the CIFAR-10 dataset, our method reduced the number of parameters of MobileNet by roughly 9x, from 14 million to 1.5 million, with just a 3.8% drop in accuracy.
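The sketch below illustrates one plausible reading of this magnitude-and-gradient criterion: score each weight by |weight × gradient| computed on a validation batch and zero out the lowest-scoring fraction without any retraining. The scoring rule, model, and pruning ratio are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative sketch only (the paper's exact scoring rule is not given here):
# prune weights with the smallest |weight * gradient|, where gradients are
# computed on a held-out validation batch, then zero them without retraining.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
val_x, val_y = torch.randn(128, 32), torch.randint(0, 10, (128,))

# One backward pass over validation data to obtain gradients.
loss = nn.functional.cross_entropy(model(val_x), val_y)
loss.backward()

prune_fraction = 0.9  # e.g., remove ~90% of weights (roughly MobileNet's 9x reduction)
for module in model.modules():
    if isinstance(module, nn.Linear):
        score = (module.weight.detach() * module.weight.grad).abs()
        threshold = torch.quantile(score.flatten(), prune_fraction)
        mask = (score > threshold).float()
        with torch.no_grad():
            module.weight.mul_(mask)  # zero out low-saliency weights, no fine-tuning
```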
Hyperspectral imaging is used for a wide range of tasks, from medical diagnostics to crop monitoring, but traditional imagers are prohibitively expensive for widespread use. This research strives to democratize hyperspectral imaging by using machine learning to reconstruct hyperspectral volumes from snapshot imagers. I propose a tunable lens with varying amounts of defocus, paired with a 31-channel spectral filter array mounted on a CMOS camera. These images are then fed into a reconstruction network that aims to recover the full 31-channel hyperspectral volume from a few encoded images with different amounts of defocus.
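A minimal sketch of the reconstruction step, under the assumption that the defocus-encoded snapshots are simply stacked as input channels to a small convolutional network; the actual architecture and channel counts may differ.

```python
# A minimal sketch, not the proposed architecture: a small convolutional network
# that maps a few defocus-encoded snapshot images (stacked as channels) to a
# 31-channel hyperspectral volume. Channel counts and sizes are assumptions.
import torch
import torch.nn as nn

N_SNAPSHOTS, SPECTRAL_BANDS = 3, 31  # e.g., 3 defocus settings -> 31 bands

reconstructor = nn.Sequential(
    nn.Conv2d(N_SNAPSHOTS, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, SPECTRAL_BANDS, kernel_size=3, padding=1),
)

encoded = torch.randn(1, N_SNAPSHOTS, 128, 128)   # placeholder sensor measurements
hypercube = reconstructor(encoded)                # (1, 31, 128, 128) reconstruction
```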
This paper explores the importance of using optimisation techniques when tuning a machine learning model. For an Artificial Neural Network (ANN) to work most efficiently, its hyperparameters must be set to values that achieve the highest recognition accuracy in a face recognition application. First, the model was trained with manual tuning of the parameters. The highest recognition accuracy that could be achieved was 96.6% with a specific set of parameters used in the ANN; however, the error rate was 30%, which was not optimal. After utilising Grid Search as the first automated hyperparameter tuning method, the recognition accuracy rose to 96.9% and the error rate fell below 1%. With Random Search, a recognition accuracy of 98.1% was achieved at the same error rate. Further optimisation of the Random Search results yielded an accuracy of 98.2%. Hence, the accuracy of the facial recognition application was increased by 1.6% by applying automated optimisation methods. Furthermore, this paper also deals with common issues in face recognition and potential solutions.
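A hedged sketch of this tuning workflow using scikit-learn; the classifier, parameter grid, and synthetic data below are placeholders rather than the paper's face-recognition setup.

```python
# Grid Search and Random Search over hypothetical ANN hyperparameters.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.neural_network import MLPClassifier

# Stand-in data; a real face-recognition setup would use face embeddings/labels.
X, y = make_classification(n_samples=500, n_features=64, n_classes=5,
                           n_informative=32, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(64,), (128,), (128, 64)],
    "alpha": [1e-4, 1e-3, 1e-2],
    "learning_rate_init": [1e-3, 1e-2],
}

grid = GridSearchCV(MLPClassifier(max_iter=500, random_state=0),
                    param_grid, cv=3).fit(X, y)
rand = RandomizedSearchCV(MLPClassifier(max_iter=500, random_state=0),
                          param_grid, n_iter=8, cv=3, random_state=0).fit(X, y)

print("Grid Search best:", grid.best_params_, grid.best_score_)
print("Random Search best:", rand.best_params_, rand.best_score_)
```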
This paper provides an overview of my contributions to a project to measure and predict students' mental workload when using digital interactive textbooks. The current work focuses on analysis of clickstream data from the textbook in search of viewing patterns among students. It was found that students typically fit one of three viewing patterns. These patterns can be used in further research to inform the creation of new interactive textbooks for improved student success.
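The abstract does not name the analysis method, but one simple way to surface a small number of viewing patterns from clickstream data is to cluster per-student features; the sketch below is purely illustrative, with invented features.

```python
# Hypothetical sketch: k-means over per-student clickstream features to group
# students into three viewing patterns. Features and scales are invented.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Placeholder features per student: [pages viewed, mean dwell time (s), revisit rate]
features = rng.normal(loc=[40, 90, 0.2], scale=[10, 30, 0.1], size=(200, 3))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(features)
print("Students per viewing pattern:", np.bincount(kmeans.labels_))
```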
After criminal recidivism or hiring machine learning models have inflicted harm, participatory machine learning methods are often used as a corrective positioning. However, little guidance exists on how to develop participatory machine learning models throughout stages of the machine learning development life-cycle. Here we demonstrate how to co-design and partner with community groups, in the specific case of feminicide data activism. We co-designed and piloted a machine learning model for the detection of media articles about feminicide. This provides a feminist perspective on practicing participatory methods in a co-creation mindset for the real-world scenario of monitoring violence against women.
Intensive care in hospitals is distributed to different units that care for patient populations reflecting specific comorbidities, treatments, and outcomes. Unit expertise can be shared to potentially improve the quality of methods and outcomes for patients across units. We propose an algorithmic rule pruning approach for use in building short lists of human-interpretable rules that reliably identify patient beneficiaries of expertise transfers in the form of machine learning risk models. Our experimental results, obtained with two intensive care monitoring datasets, demonstrate the potential utility of the proposed method in practice.
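The authors' algorithm is not detailed here, but the following hypothetical sketch conveys the underlying idea: score candidate rules by how much a transferred (source-unit) risk model improves over the local model on the patients each rule selects, and keep a short list of the most beneficial rules. All variables, thresholds, and data are invented for illustration.

```python
# Hedged illustration (not the authors' rule pruning algorithm): keep a short
# list of interpretable rules whose selected patients benefit, on average,
# from the transferred risk model.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(20, 90, n)
lactate = rng.uniform(0.5, 6.0, n)
# Placeholder per-patient gains: (local model error) - (transferred model error)
gain = 0.02 * (lactate - 2.0) + rng.normal(0, 0.05, n)

candidate_rules = {
    "age > 70": age > 70,
    "lactate > 3": lactate > 3,
    "age > 70 and lactate > 3": (age > 70) & (lactate > 3),
}

# Keep only rules whose selected patients benefit on average; report the best ones.
kept = [(name, gain[mask].mean(), mask.sum())
        for name, mask in candidate_rules.items()
        if mask.any() and gain[mask].mean() > 0]
for name, avg_gain, n_patients in sorted(kept, key=lambda r: -r[1])[:2]:
    print(f"{name}: mean benefit {avg_gain:.3f} over {n_patients} patients")
```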
While text generated by current vision-language models may be accurate and syntactically correct, it is often general. Recent work has used optical character recognition to supplement visual information with text extracted from an image. In many cases, using text in the image improves the specificity and usefulness of generated text. We contend that vision-language models can benefit from additional information extracted from an image. We modify previous multimodal frameworks to accept relevant information from a number of auxiliary classifiers. In particular, we focus on person names as an additional set of tokens and create a novel image-caption dataset to facilitate captioning with person names. The dataset, Politicians and Athletes in Captions (PAC), consists of captioned images of well-known people in context. By fine-tuning pretrained models with this dataset, we demonstrate a model that can naturally integrate facial recognition tokens into generated text by training on limited data.
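A minimal, hypothetical sketch of how auxiliary tokens such as recognized person names could be injected alongside image features for caption decoding; the modules and sizes below are placeholders, not the modified framework or the PAC fine-tuning setup.

```python
# Hypothetical sketch: embeddings of auxiliary tokens (e.g., person names from
# a face recognizer) are appended to the visual memory a caption decoder
# attends to. Vocabulary, dimensions, and ids are placeholders.
import torch
import torch.nn as nn

VOCAB, D = 1000, 256
image_features = torch.randn(1, 49, D)          # stand-in for a vision encoder's output
name_token_ids = torch.tensor([[17, 42]])       # stand-in ids for a recognized name

token_embed = nn.Embedding(VOCAB, D)
decoder_layer = nn.TransformerDecoderLayer(d_model=D, nhead=8, batch_first=True)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)
to_vocab = nn.Linear(D, VOCAB)

# Auxiliary name tokens are concatenated with the image features.
memory = torch.cat([image_features, token_embed(name_token_ids)], dim=1)
caption_so_far = token_embed(torch.tensor([[1, 5, 9]]))   # partial caption ids
logits = to_vocab(decoder(caption_so_far, memory))        # next-token predictions
```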
Breast reconstruction surgery requires extensive planning, usually with a CT scan that helps surgeons identify which vessels are suitable for harvest. Currently, there is no quantitative method for preoperative planning. In this work, we successfully develop a Deep Learning algorithm to segment the vessels within the region of interest for breast reconstruction. Ultimately, this information will be used to determine the optimal reconstructive method (choice of vessels, extent of the free flap/harvested tissue) to reduce intra- and postoperative complication rates.
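Illustrative only, since the abstract does not specify the architecture: a tiny encoder-decoder for per-pixel vessel segmentation on 2D slices; clinical work on CT angiography would likely use a 3D network and task-specific preprocessing.

```python
# Toy segmentation sketch (not the paper's model): downsample, upsample, and
# predict a per-pixel vessel logit for a single CT slice.
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.up = nn.Sequential(
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # per-pixel vessel logit
        )

    def forward(self, x):
        return self.up(self.down(x))

ct_slice = torch.randn(1, 1, 128, 128)        # placeholder CT slice
vessel_mask_logits = TinySegNet()(ct_slice)   # (1, 1, 128, 128)
```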
Deep learning models have excelled at solving many problems in Natural Language Processing, but remain susceptible to extensive vulnerabilities. We address this vulnerability by applying random perturbations such as spelling correction, synonym substitution, or word dropping. These perturbations are applied to random words in random sentences to defend NLP models against adversarial attacks. Our defense methods successfully restore attacked models to their original accuracy within statistical significance.
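A toy sketch of the perturbation step (spelling correction is omitted and the synonym table is invented): the idea is to randomly substitute or drop words in the input before it reaches the model.

```python
# Hedged sketch of random-perturbation preprocessing for an attacked input;
# rates and the synonym table are toy placeholders, not the paper's procedure.
import random

SYNONYMS = {"movie": "film", "great": "excellent", "bad": "poor"}  # toy table

def randomly_perturb(sentence, swap_prob=0.2, drop_prob=0.1, seed=None):
    rng = random.Random(seed)
    words = []
    for w in sentence.split():
        if rng.random() < drop_prob:
            continue                                   # drop the word
        if rng.random() < swap_prob and w.lower() in SYNONYMS:
            w = SYNONYMS[w.lower()]                    # synonym substitution
        words.append(w)
    return " ".join(words)

attacked_input = "this movie is not great at all"
print(randomly_perturb(attacked_input, seed=0))  # perturbed text fed to the NLP model
```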
We introduce a novel technique to identify three spectra representing the three primary materials in a hyperspectral image of a scene. We accomplish this using a modified autoencoder. Further research will be conducted to verify the accuracy of these spectra.
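One common way to realize this with an autoencoder, shown as a hedged sketch below, is a linear-mixing design in which the encoder predicts three per-pixel abundances and the decoder's weight matrix holds the three material spectra; the paper's specific modification is not reproduced here, and the band count is an assumption.

```python
# Hedged sketch of a linear-mixing autoencoder for spectral unmixing: the
# decoder's weight columns act as the three endmember (material) spectra.
import torch
import torch.nn as nn

BANDS, MATERIALS = 31, 3   # assumed number of spectral bands and materials

encoder = nn.Sequential(nn.Linear(BANDS, 32), nn.ReLU(),
                        nn.Linear(32, MATERIALS), nn.Softmax(dim=-1))
decoder = nn.Linear(MATERIALS, BANDS, bias=False)   # columns hold candidate spectra

pixels = torch.rand(1024, BANDS)                    # placeholder hyperspectral pixels
recon = decoder(encoder(pixels))
loss = nn.functional.mse_loss(recon, pixels)        # training signal for unmixing

endmember_spectra = decoder.weight.detach().T       # (3, 31) extracted spectra
```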
Readers of novels need to identify and learn about the characters as they develop an understanding of the plot. This paper presents an end-to-end automated pipeline for literary character identification and ongoing work on extracting and comparing character representations for full-length English novels. The character identification pipeline comprises a named entity recognition (NER) module with an F1 score of 0.85, a coreference resolution module with an F1 score of 0.76, and a disambiguation module using both heuristic and algorithmic approaches. Ongoing work compares event extraction and speech extraction pipelines for literary character representations, with case studies. To my knowledge, this paper is the first to combine a modular pipeline for automated character identification, representation extraction, and comparison for full-length English novels.
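A simplified sketch of the identification and disambiguation stages using off-the-shelf tools (spaCy NER plus a substring-alias heuristic); this is not the paper's pipeline or its reported models.

```python
# Hedged sketch: find PERSON mentions with spaCy NER, then fold each mention
# into the longest distinct name that contains it (e.g., "Darcy" -> "Mr. Darcy").
# Requires: python -m spacy download en_core_web_sm
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")
text = ("Elizabeth Bennet walked to Netherfield. Mr. Darcy watched her arrive. "
        "Later, Elizabeth spoke with Darcy at length.")

mentions = [ent.text for ent in nlp(text).ents if ent.label_ == "PERSON"]

# Heuristic disambiguation: map each mention to the longest name containing it.
canonical_names = sorted(set(mentions), key=len, reverse=True)
counts = Counter(next(c for c in canonical_names if m in c) for m in mentions)
print(counts)  # e.g., Counter({'Elizabeth Bennet': 2, 'Mr. Darcy': 2})
```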