Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures
Authors: Pedro Hermosilla, Marco Schäfer, Matej Lang, Gloria Fackelmann, Pere-Pau Vázquez, Barbora Kozlikova, Michael Krone, Tobias Ritschel, Timo Ropinski
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further evaluate the accuracy of our algorithms on common downstream tasks, where we outperform state-of-the-art protein learning algorithms. |
| Researcher Affiliation | Academia | Pedro Hermosilla Ulm University Marco Schäfer Tübingen University Matěj Lang Masaryk University Gloria Fackelmann Ulm University Pere-Pau Vázquez University of Catalonia Barbora Kozlíková Masaryk University Michael Krone Tübingen University Tobias Ritschel University College London Timo Ropinski Ulm University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. It describes the methods in narrative text and mathematical formulas. |
| Open Source Code | Yes | Code and data of our approach is available at https://github.com/phermosilla/IEConv_proteins. |
| Open Datasets | Yes | We use the training/validation/test splits of the SCOPe 1.75 data set of Hou et al. (2018). This data set consolidated 16,712 proteins from 1,195 folds. We obtained the 3D structures of the proteins from the SCOPe 1.75 database (Murzin et al., 1995). |
| Dataset Splits | Yes | The data was then split into 29,215 instances for training, 2,562 instances for validation, and 5,651 for testing. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running the experiments are provided in the paper. It only mentions 'high performance computing' in the context of a baseline method (Elnaggar et al. (2020)). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for its own model. It mentions tools like DSSP (Kabsch & Sander, 1983) and various optimizers, but without specific versioning for reproducibility. |
| Experiment Setup | Yes | We trained our model with Momentum optimizer with momentum equal to 0.98 for 600 epochs for the FOLD task and 300 epochs for the REACTION. We used an initial learning rate of 0.001, which was multiplied by 0.5 every 50 epochs with the allowed minimum learning rate of 1e-6. Moreover, we used L2 regularization scaled by 0.001 and we clipped the norm of the gradients to 10.0. We used a batch size equal to eight for both tasks. We represented the convolution kernel with a single layer MLP with 16 hidden neurons for the FOLD task and 32 for the REACTION task. To further regularize our model, we applied dropout with a probability of 0.2 before each 1×1 convolution, and 0.5 in the final MLP. Moreover, we set to zero all features of an atom before the Intrinsic/Extrinsic convolution with a probability of 0.05 for the FOLD task and 0.0 for the REACTION task. Lastly, we added Gaussian noise to the features before each convolution with a standard deviation of 0.025. |
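
The training recipe quoted in the Experiment Setup row can be wired together concretely. The sketch below is a minimal PyTorch illustration, assuming a placeholder linear model and random tensors in place of the authors' IEConv network and protein data; it only shows how the reported hyperparameters (momentum 0.98, learning rate 0.001 halved every 50 epochs with a 1e-6 floor, L2 regularization 0.001, gradient-norm clipping at 10.0, batch size 8, atom-feature masking, and Gaussian feature noise) fit into a training loop, and is not the authors' implementation.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in data and model; the real network is the authors' intrinsic-
# extrinsic convolution architecture, which is not reproduced here.
features = torch.randn(64, 128)
labels = torch.randint(0, 1195, (64,))  # FOLD task: 1,195 fold classes
loader = DataLoader(TensorDataset(features, labels), batch_size=8)  # batch size 8
model = torch.nn.Linear(128, 1195)

# Momentum optimizer (momentum 0.98) with L2 regularization scaled by 0.001.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.98, weight_decay=1e-3)

def lr_at(epoch: int, base_lr: float = 1e-3, min_lr: float = 1e-6) -> float:
    """Multiply the learning rate by 0.5 every 50 epochs, floored at 1e-6."""
    return max(base_lr * 0.5 ** (epoch // 50), min_lr)

for epoch in range(600):  # 600 epochs for FOLD (300 for REACTION)
    for group in optimizer.param_groups:
        group["lr"] = lr_at(epoch)
    for x, y in loader:
        # Feature-level regularization reported above: zero all features of an
        # "atom" with probability 0.05 and add Gaussian noise with std 0.025.
        keep = (torch.rand(x.shape[0], 1) > 0.05).float()
        x = x * keep + torch.randn_like(x) * 0.025

        loss = torch.nn.functional.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        # Clip the gradient norm to 10.0, as described in the setup.
        torch.nn.utils.clip_grad_norm_(model.parameters(), 10.0)
        optimizer.step()
```

In this sketch the per-atom feature masking is applied to whole samples because the placeholder data has one feature vector per sample; in the actual model it would be applied per atom before each intrinsic/extrinsic convolution.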