Dataset Persistent ID
|
doi:10.26165/JUELICH-DATA/TOBXWP |
Publication Date
|
2024-03-11 |
Title
|
Replication Data and Code for: 'Machine learning isotropic g values of radical polymers'
|
Author
|
Daniel, Davis Thomas (IEK-9 Forschungszentrum Jülich, ITMC RWTH Aachen) - ORCID: 0000-0001-7035-3416
Mitra, Souvik (University of Münster) - ORCID: 0009-0005-9476-980X
Eichel, Rüdiger-A (IEK-9 Forschungszentrum Jülich, IPC RWTH Aachen) - ORCID: 0000-0002-0013-6325
Diddens, Diddo (IEK-12 Forschungszentrum Jülich, University of Münster) - ORCID: 0000-0002-2137-1332
Granwehr, Josef (IEK-9 Forschungszentrum Jülich, ITMC RWTH Aachen) - ORCID: 0000-0002-9307-1101
|
Contact
|
Granwehr, Josef (IEK-9 Forschungszentrum Jülich, ITMC RWTH Aachen)
|
Description
|
This data repository contains the data sets and Python scripts associated with the manuscript 'Machine learning isotropic g values of radical polymers'.
Electron paramagnetic resonance measurements provide experimental g values of radical polymers. Analogous to chemical shifts, g values give insight into the identity and environment of the paramagnetic center. In this work, machine-learning-based prediction of g values is explored as a viable alternative to computationally expensive density functional theory (DFT) methods.
Description of folder contents (switch to tree view):
- Datasets : Contains PTMA polymer structures from the TR, TE-1, and TE-2 data sets, transformed using a molecular descriptor (SOAP, MBTR or DAD), and the corresponding DFT-calculated g values. Filenames contain 'PTMA_X', where X denotes the number of monomers which are radicals. Structure data sets have 'structure_data' in the filename; DFT-calculated g values have 'giso_DFT_data' in the filename. The files are in .npy (NumPy) format.
- Models : ERT models trained on SOAP, MBTR and DAD feature vectors.
- Scripts : Contains scripts which can be used to predict g values from XYZ files of PTMA structures with 6 monomer units and varying radical density. The script 'prediction_functions.py' contains the functions which transform the XYZ coordinates into the feature vector that the trained model uses for prediction. Descriptions of the individual functions are given as docstrings (Python documentation strings) in the code. The folder also contains additional files in .pkl format needed for the ERT-DAD model.
- XYZ_files : Contains atomic coordinates of PTMA structures in XYZ format. The two subfolders, WSD and TE-2, correspond to structures present in the whole structure data set and the TE-2 test data set (see main text in the manuscript for details). Filenames in the folder 'XYZ_files/TE-2/PTMA-X/' are of the type 'chainlength_6ptma_Y'_Y''.xyz', where 'chainlength_6ptma' denotes the length of the polymer chain (6 monomers), Y' denotes the proportion of monomers which are radicals (for instance, Y' = 50 means 3 out of 6 monomers are radicals) and Y'' denotes the order of the MD time frame. The actual time frame values of Y'' in ps are given in the manuscript.
- PTMA-ML.ipynb : Jupyter notebook detailing the workflow for generating the trained model. The file includes steps to load the data sets, transform XYZ files using molecular descriptors, optimise hyperparameters, train the model, cross-validate using the training data set and evaluate the model.
- PTMA-ML.pdf : PTMA-ML.ipynb in PDF format.
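The naming scheme described for 'XYZ_files/TE-2/PTMA-X/' can be decoded programmatically. The following is a minimal sketch, not part of the deposited scripts: the helper names (PtmaFrame, parse_xyz_name) and the example filename are hypothetical, and the number of radicals is derived by assuming that Y' is a rounded percentage of the chain length.

```python
import re
from dataclasses import dataclass

# Pattern for names of the type 'chainlength_6ptma_Y'_Y''.xyz'
# (Y' = proportion of radical monomers, Y'' = order of the MD time frame).
_PATTERN = re.compile(r"^chainlength_(\d+)ptma_(\d+)_(\d+)\.xyz$")


@dataclass(frozen=True)
class PtmaFrame:
    """Information encoded in one XYZ filename (hypothetical helper)."""
    chain_length: int      # number of monomers in the chain
    radical_percent: int   # Y': proportion of monomers that are radicals
    frame_index: int       # Y'': order of the MD time frame (not a time in ps)

    @property
    def n_radicals(self) -> int:
        # Assumption: Y' is a rounded percentage, e.g. 50 -> 3 of 6 monomers.
        return round(self.chain_length * self.radical_percent / 100)


def parse_xyz_name(filename: str) -> PtmaFrame:
    """Split a filename into chain length, radical proportion and frame order."""
    match = _PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename!r}")
    chain_length, radical_percent, frame_index = map(int, match.groups())
    return PtmaFrame(chain_length, radical_percent, frame_index)


# Hypothetical example name that follows the documented pattern.
example = parse_xyz_name("chainlength_6ptma_50_3.xyz")
```

For the example name, the parsed record describes a 6-monomer chain with 3 radical monomers at MD frame order 3. The frame order still has to be mapped to a time in ps using the values given in the manuscript.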
List of abbreviations :
- PTMA : poly(2,2,6,6-tetramethyl-1-piperidinyloxy-4-yl methacrylate)
- TR : Training data set
- TE-1 : Test data set 1
- TE-2 : Test data set 2
- ERT : Extremely randomized trees
- WSD : Whole structure data set
- SOAP : Smooth overlap of atomic positions
- MBTR : Many-body tensor representation
- DAD : Distances-Angles-Dihedrals
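As an illustration of the ERT approach named above, the following minimal sketch trains an extremely randomized trees regressor and cross-validates it. It assumes that scikit-learn's ExtraTreesRegressor is a suitable stand-in for the ERT models in this repository (the record lists Scikit-learn under Software but does not name the estimator). The feature matrix and target values are random placeholders, not the deposited descriptors or DFT g values, and the hyperparameters are arbitrary rather than the optimised ones from PTMA-ML.ipynb.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Placeholder data: 200 "structures" with 20 descriptor features each.
# In the repository, these would be the .npy structure and g value arrays.
X = rng.normal(size=(200, 20))
# Synthetic targets near 2.006, chosen only to resemble a g value scale.
y = 2.006 + 1e-4 * X[:, 0] + 1e-5 * rng.normal(size=200)

# Arbitrary hyperparameters; the notebook describes a separate optimisation step.
model = ExtraTreesRegressor(n_estimators=100, random_state=0)

# 5-fold cross-validation on the training data (negative mean absolute error).
scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_absolute_error")

# Fit on all training data and predict for a few feature vectors.
model.fit(X, y)
predictions = model.predict(X[:5])
```

With the deposited data, X and y would instead be loaded with numpy.load from the matching 'structure_data' and 'giso_DFT_data' files in the Datasets folder.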
|
Subject
|
Chemistry; Computer and Information Science
|
Keyword
|
Machine Learning
Radical polymers
EPR
|
Related Publication
|
Journal of Chemical Theory and Computation, 2024, doi: https://doi.org/10.1021/acs.jctc.3c01252
|
Language
|
English
|
Contributor
|
Data Collector : Daniel, Davis Thomas
Data Collector : Mitra, Souvik
Project Leader : Granwehr, Josef
Supervisor : Eichel, Rüdiger-A
Supervisor : Diddens, Diddo
|
Grant Information
|
DFG: SPP2248
RWTH Aachen University: rwth1253
|
Depositor
|
Daniel, Davis Thomas
|
Deposit Date
|
2023-11-03
|
Software
|
ORCA, Version: 5.0.2
GROMACS, Version: 2019
Python, Version: 3.9.18
Scikit-learn, Version: 1.2.2
DScribe, Version: 1.2.2
|