Group Equivariant Conditional Neural Processes
Authors: Makoto Kawano, Wataru Kumagai, Akiyoshi Sannai, Yusuke Iwasawa, Yutaka Matsuo
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that EquivCNP with translation equivariance achieves comparable performance to conventional CNPs on a 1D regression task. Moreover, we demonstrate that, by incorporating an appropriate Lie group equivariance, EquivCNP is capable of zero-shot generalization in an image-completion task. |
| Researcher Affiliation | Academia | Makoto Kawano, The University of Tokyo, Tokyo, Japan (kawano@weblab.t.u-tokyo.ac.jp); Wataru Kumagai, The University of Tokyo / RIKEN AIP, Tokyo, Japan (kumagai@weblab.t.u-tokyo.ac.jp); Akiyoshi Sannai, RIKEN AIP, Tokyo, Japan (akiyoshi.sannai@riken.jp); Yusuke Iwasawa & Yutaka Matsuo, The University of Tokyo, Tokyo, Japan ({iwasawa, matsuo}@weblab.t.u-tokyo.ac.jp) |
| Pseudocode | Yes | Algorithm 1 Prediction of Group Equivariant Conditional Neural Process |
| Open Source Code | Yes | Code and dataset are available on https://github.com/makora9143/EquivCNP. |
| Open Datasets | Yes | To train all NPs, the GPs generate the context and target points; the numbers of context points and target points are each sampled uniformly at random from [3, 50]. All NPs were trained for 200 epochs with 256 batches per epoch, each batch of size 16. We used the Adam optimizer (Kingma & Ba, 2014) with learning rate 10^-3. The architecture of CNP was based on the original code. We visualize the result of periodic kernel regression in Figure 8. The original image of the digital clock number is shown in Figure 10. We first inverted the black and white colors of the image. Then, we cropped the image so that each cropped image contains one digit and resized them to 64x64. Note that the vertical size of each number is set to 56, while the horizontal size is not fixed. The values of all pixels are divided by 255 to rescale them to the [0, 1] range. |
| Dataset Splits | No | The paper describes how context and target points are sampled for tasks but does not specify validation splits for the overall dataset used to train and evaluate the meta-learning model (e.g., percentages or counts for training, validation, and testing sets of tasks/data). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU model, CPU model, memory, cloud instance type) used to run the experiments. |
| Software Dependencies | No | The paper mentions the use of 'Adam optimizer' but does not specify any software dependencies with their version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | All NPs were trained for 200 epochs with 256 batches per epoch, each batch of size 16. We used the Adam optimizer (Kingma & Ba, 2014) with learning rate 10^-3. For 1D regression tasks, we use a 4-layer LieConv architecture with ReLU activations. The average fraction of those LieConvs is 5/32 and the number of MC samples is 25. The LieConv channels are [16, 32, 16, 8]. For the 2D image-completion task, we use a LieConv Convθ instead of RBF kernels as ψ. This LieConv has 128 channels, an average fraction of 1/10, and 121 MC samples. The batch size is 4, the number of epochs is 100, and the optimizer is Adam (Kingma & Ba, 2014) with learning rate 5e-4. |
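The image preprocessing quoted in the Open Datasets row (invert black and white, then rescale pixel values to [0, 1]) can be sketched as below. This is a minimal illustration, not the authors' code: `preprocess_digit` is a hypothetical helper name, and the crop/resize-to-64x64 step is omitted (any image library's resize would serve there).

```python
import numpy as np

def preprocess_digit(img):
    """Invert a uint8 grayscale digit image and rescale it to [0, 1].

    `img` is assumed to be an already-cropped, already-resized 64x64
    uint8 array, matching the pipeline described in the paper excerpt.
    """
    inverted = 255 - img          # swap black and white
    return inverted / 255.0       # divide by 255 to reach [0, 1]

# Example on a synthetic mid-gray 64x64 "image".
digit = np.full((64, 64), 128, dtype=np.uint8)
out = preprocess_digit(digit)
```

After this step, every pixel lies in [0, 1], as the paper's description requires.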