2025-04-03 | Total: 10
Machine learning (ML) has become a standard tool for the exploration of chemical space. Much of the performance of such models depends on the database chosen for a given task. Here, this aspect is investigated for "chemical tasks" including the prediction of hybridization, oxidation, substituent effects, and aromaticity, starting from an initial "restricted" database (iRD). Choosing molecules to augment this iRD, including increasing numbers of conformations generated at different temperatures, and retraining the models can improve their predictions on the selected "tasks". Adding a small percentage of conformers (1%) obtained at 300 K improves the performance in almost all cases. On the other hand, and in line with previous studies, redundancy and highly deformed structures in the augmentation set compromise prediction quality. Energy and bond distributions were evaluated by means of the Kullback-Leibler (DKL) and Jensen-Shannon (DJS) divergences and the Wasserstein distance (W1). The findings of this work provide a baseline for the rational augmentation of chemical databases or the creation of synthetic databases.
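The three distribution measures named in the abstract above (DKL, DJS, W1) can be illustrated with a short sketch. This is not the authors' code: the binning grid, the function names, and the toy bond lengths are all assumptions made for illustration, and the W1 estimate uses the simple sorted-sample formula, which only applies to two samples of equal size.

```python
import math


def histogram(samples, lo, hi, n_bins):
    """Normalised histogram on a shared grid so two sets are comparable.

    Assumes at least one sample falls inside [lo, hi].
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for x in samples:
        if lo <= x <= hi:
            i = min(int((x - lo) / width), n_bins - 1)
            counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D_KL(p || q) in nats; infinite if q has no mass where p does."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:
            if qi == 0.0:
                return math.inf
            total += pi * math.log(pi / qi)
    return total


def js_divergence(p, q):
    """Symmetric, bounded D_JS built from two KL terms against the mixture."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def wasserstein_1(a, b):
    """W1 between two equal-size 1D samples via sorted pairing."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


if __name__ == "__main__":
    # Toy, made-up bond lengths (in angstrom) standing in for a base set
    # and an augmented set; they are not taken from the paper.
    base = [1.09, 1.10, 1.08, 1.09, 1.11, 1.09]
    augmented = [1.07, 1.12, 1.09, 1.14, 1.05, 1.10]
    p = histogram(base, 1.0, 1.2, 8)
    q = histogram(augmented, 1.0, 1.2, 8)
    print("D_KL:", kl_divergence(p, q))
    print("D_JS:", js_divergence(p, q))
    print("W1  :", wasserstein_1(base, augmented))
```

DKL is asymmetric and diverges when the reference histogram has empty bins, which is one reason the bounded DJS and the sample-based W1 are often reported alongside it.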
Building upon our previously developed time-dependent Hartree-Fock (TDHF)@vW method, based on many-body perturbation theory and specifically the Bethe-Salpeter Equation (BSE), we introduce a parameterization scheme for the attenuated exchange kernel, vW(|r−r′|). In the original method, vW was determined individually for each system via an efficient stochastic short-time TD Hartree propagation for the screened Coulomb interaction, W(r,r′). The new parameterization leverages photochemical similarities in exciton binding energies (or exchange interaction attenuation) among molecules with comparable static dielectric responses. We parameterize the inverse dielectric function using a low-order polynomial with error function apodization, calibrated on a few representative molecules, each with its own vW. Using only 7 parameters, the parameterized vW is fully grid-independent and broadly applicable within a family of molecules. This enables TDHF@vW that retains BSE-level accuracy, achieving a mean absolute error of ∼0.1 eV compared to experimental optical gaps and representing a five- to ten-fold improvement over conventional TD density functional theory or TDHF while reducing the cost to that of standard TDHF.
Traditional atomistic machine learning (ML) models serve as surrogates for quantum mechanical (QM) properties, predicting quantities such as dipole moments and polarizabilities directly from the compositions and geometries of atomic configurations. With the emergence of ML approaches to predict the "ingredients" of a QM calculation, such as the ground state charge density or the effective single-particle Hamiltonian, it has become possible to obtain multiple properties through analytical physics-based operations on these intermediate ML predictions. We present a framework to seamlessly integrate the prediction of an effective electronic Hamiltonian, for both molecular and condensed-phase systems, with PySCFAD, a differentiable QM workflow that facilitates its indirect training against functions of the Hamiltonian, such as electronic energy levels, dipole moments, and polarizability. We then use this framework to explore various possible choices within the design space of hybrid ML/QM models, examining the influence of incorporating multiple targets on model performance and learning a reduced-basis ML Hamiltonian that can reproduce targets computed from a much larger basis. Our benchmarks evaluate the accuracy and transferability of these hybrid models, compare them against predictions of atomic properties from their surrogate models, and provide indications to guide the design of the interface between the ML and QM components of the model.
Boron, nitrogen and carbon are neighbors in the periodic table and can form strikingly similar twin structures, hexagonal boron nitride (hBN) and graphene, yet nanofluidic experiments demonstrate drastically different water friction on them. We investigate this discrepancy by probing the interfacial water and atomic-scale properties of hBN using surface-specific vibrational spectroscopy, atomic-resolution atomic force microscopy (AFM), and machine learning-based molecular dynamics. Spectroscopy reveals that pristine hBN acquires significant negative charges upon contacting water at neutral pH, unlike hydrophobic graphene, leading to interfacial water alignment and stronger hydrogen bonding. AFM supports that this charging is not defect-induced. pH-dependent measurements suggest OH⁻ chemisorption and physisorption, which simulations validate as two nearly equally stable states undergoing dynamic exchange. These findings challenge the notion of hBN as chemically inert and hydrophobic, revealing its spontaneous surface charging and Janus nature, and providing molecular insights into its higher water friction compared to carbon surfaces.
We present a real-space method for computing the random phase approximation (RPA) correlation energy within Kohn-Sham density functional theory, leveraging the low-rank nature of the frequency-dependent density response operator. In particular, we employ a cubic scaling formalism based on density functional perturbation theory that circumvents the calculation of the response function matrix, instead relying on the ability to compute its product with a vector through the solution of the associated Sternheimer linear systems. We develop a large-scale parallel implementation of this formalism using the subspace iteration method in conjunction with the spectral quadrature method, while employing the Kronecker product-based method for the application of the Coulomb operator and the conjugate orthogonal conjugate gradient method for the solution of the linear systems. We demonstrate convergence with respect to key parameters and verify the method's accuracy by comparing with planewave results. We show that the framework achieves good strong scaling to many thousands of processors, reducing the time to solution for a lithium hydride system with 128 electrons to around 150 seconds on 4608 processors.
Near-infrared (NIR) fluorescence imaging offers improved spatial precision by reducing light scattering and absorption in tissue. Despite this key advantage, the NIR region is limited by the availability of fluorophores, most of which exhibit relatively low quantum yield. In this study, gold nanospheres (AuNSs) with absorption peaks in the visible range were used to enhance the fluorescence intensity of the cyanine NIR fluorophore IRdye 800 in the first NIR window (NIR-I) of the electromagnetic spectrum. AuNSs with diameters ranging from 5 to 25 nm were chosen to investigate the impact of nanoparticle size on fluorescence enhancement, and were functionalized with polyethylene glycol (PEG) of varying molecular weights to optimize the distance between the fluorophore and the nanoparticle surface. Theoretical analyses using finite-difference time-domain simulations and experimental comparisons with non-metallic nanoparticles were performed to identify the factors contributing to the enhancement of fluorescence. PEGylated AuNSs conjugated with IRdye 800 (AuNDs) exhibited decreased photoisomerization, resulting in increased fluorescence intensity and altered fluorescence lifetimes. The observed enhancement in the fluorescence intensity of the AuNDs was attributed to three primary mechanisms: metal-enhanced fluorescence, altered radiative decay rates, and steric stabilization. Among these three mechanisms, two are attributed to the tail-end absorption spectral overlap of the AuNSs with IRdye 800. This study highlights the potential of AuNSs for improving NIR-I fluorescence imaging and opens up new possibilities for applications in biomedical research.
De novo molecular design has extensive applications in drug discovery and materials science. The vast chemical space renders direct molecular searches computationally prohibitive, while traditional experimental screening is both time- and labor-intensive. Efficient molecular generation and screening methods are therefore essential for accelerating drug discovery and reducing costs. Although reinforcement learning (RL) has been applied to optimize molecular properties via reward mechanisms, its practical utility is limited by issues in training efficiency, convergence, and stability. To address these challenges, we adopt Direct Preference Optimization (DPO) from natural language processing (NLP), which uses molecular score-based sample pairs to maximize the likelihood difference between high- and low-quality molecules, effectively guiding the model toward better compounds. Moreover, integrating curriculum learning further boosts training efficiency and accelerates convergence. A systematic evaluation of the proposed method on the GuacaMol Benchmark yielded excellent scores. For instance, the method achieved a score of 0.883 on the Perindopril MPO task, representing a 6% improvement over competing models. Subsequent target protein binding experiments confirmed its practical efficacy. These results demonstrate the strong potential of DPO for molecular design tasks and highlight its effectiveness as a robust and efficient solution for data-driven drug discovery.
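The DPO objective mentioned in the abstract above can be sketched in a few lines. This follows the standard DPO loss for a single preference pair, not necessarily the authors' implementation: the pairing rule in `make_pairs`, the value of `beta`, and the placeholder molecule labels are hypothetical choices made for illustration, and the log-probabilities would in practice come from a trained generator and a frozen reference copy.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_* are sequence log-probabilities under the policy being trained;
    ref_logp_* are the same quantities under the frozen reference model.
    beta=0.1 is an assumed default, not a value from the paper.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def make_pairs(scored):
    """Hypothetical pairing rule: match the top half against the bottom half.

    `scored` is a list of (molecule, score) tuples; ties are skipped because
    they carry no preference signal.
    """
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    half = len(ranked) // 2
    pairs = []
    for (win, s_win), (lose, s_lose) in zip(ranked[:half], ranked[-half:]):
        if s_win > s_lose:
            pairs.append((win, lose))
    return pairs


if __name__ == "__main__":
    # Placeholder labels and scores, invented for the example.
    scored = [("mol_A", 0.9), ("mol_B", 0.2), ("mol_C", 0.5), ("mol_D", 0.7)]
    print(make_pairs(scored))
    # The loss falls as the policy raises the preferred molecule's likelihood
    # relative to the reference model.
    print(dpo_loss(-10.0, -10.0, -10.0, -10.0))
    print(dpo_loss(-8.0, -12.0, -10.0, -10.0))
```

Because the loss depends only on log-probability differences relative to the reference model, no separate reward model or on-policy sampling loop is needed, which is the usual argument for DPO over RL in terms of training stability.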
Confinement influences fluid properties. We show, employing molecular dynamics simulations with explicit solvents, that slit confinement drives a first-order transition for a small nanoparticle between staying at the slit center and binding to the slit surfaces. The transition follows a subcritical pitchfork bifurcation, accompanying a similar transition of the nanoparticle's lateral diffusion, depending on interparticle interactions and confinement interfaces. Our findings underscore the necessity for advancing molecular hydrodynamics under strong confinement.
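For readers unfamiliar with the term, the textbook normal form of a subcritical pitchfork bifurcation is given below. It is offered only as background for the abstract above: identifying the order parameter $x$ with the nanoparticle's displacement from the slit center and $r$ with a control parameter such as interaction strength is an assumption here, and the stabilizing quintic term is the conventional choice rather than something stated by the authors.

```latex
\dot{x} = r x + x^{3} - x^{5},
\qquad
x^{*} = 0
\quad\text{or}\quad
(x^{*})^{2} = \frac{1 \pm \sqrt{1 + 4r}}{2}.
```

The state $x^{*} = 0$ is stable for $r < 0$, while the outer branch (the $+$ sign) exists and is stable for $r \ge -1/4$. In the window $-1/4 < r < 0$ both states are stable, so the system jumps discontinuously between them and shows hysteresis, which is the hallmark of a first-order transition.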
Organosulfur species are potential major carriers of sulfur in the interstellar medium, as well as interesting ingredients in prebiotic chemistry. The most fundamental question regarding these species is under which conditions they reside in the gas versus solid phase. Here, we characterize the thermal desorption kinetics, binding energies, and entrapment of the organosulfur methyl mercaptan (CH3SH, or MeSH) in different ice environments, comparing them with those of methanol (CH3OH, or MeOH) ices. The derived multi-layer (pure MeSH-MeSH) and sub-monolayer (layered MeSH-H2O) binding energies are surprisingly similar, corresponding to snow line locations where the disk midplane temperature is ~105 K. In both H2O-dominated and more realistic H2O:CO2-dominated ices, 100% of the MeSH is entrapped, almost exclusively desorbing at the molecular volcano desorption peak, indicating that MeSH is retained at the water snow line if initially mixed with water ice during formation. Additionally, the presence of MeSH in an ice mixture enhances the entrapment of CO2 and MeOH (up to 100%) until the onset of volcano desorption; without MeSH, both desorb at their respective pure desorption temperatures and also co-desorb with water. Compared to MeOH, MeSH binds less well to water, explaining why MeSH escapes during water ice crystallization rather than co-desorbing with water. These results show the larger relative size of MeSH compared to MeOH significantly impacts its ability to bind to water and its entrapment efficiency. Therefore, molecular size plays an important role in the adsorption and retention of S-bearing organics and, in turn, other volatiles in ices.
Water drops spontaneously accumulate charges when they move on hydrophobic dielectric surfaces, a process known as slide electrification. On the one hand, slide electrification generates electricity, with possible applications in small devices. On the other hand, the potential of up to 1 kV generated by slide electrification alters wetting and drop motion. Therefore, it is important to know the factors that affect slide electrification. To find out how surfactants affect slide electrification, we measured drop charges of aqueous drops containing cationic CTAB, anionic SDS and neutral C8E3 sliding on different hydrophobic surfaces. We find that the addition of surfactant significantly reduces the spontaneous charging of moving water drops. Based on zeta potential measurements, confocal microscopy of deposited surface-active dyes and drop impact studies, we propose that several factors contribute to this suppression of charge separation: (1) Surfactants tend to lower the contact angles, which reduces charge separation. (2) Surfactant adsorption at the solid-liquid interface can reduce the density of primary ions, particularly for anionic surfactants. (3) Anionic and neutral surfactants are mostly transferred to the liquid-air interface at the rear of the sliding drop, retaining primary ions within the drop. (4) Deposited cationic surfactant directly reduces the charge of the drop.