Learning with Previously Unseen Features

Authors: Yuan Shi, Craig A. Knoblock

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present an efficient optimization algorithm for learning the model parameters and empirically evaluate the approach on several regression and classification tasks. Experimental results show that our approach can achieve on average 11.2% improvement over baselines.
Researcher Affiliation | Academia | Yuan Shi, Computer Science Department, University of Southern California, yuanshi@usc.edu; Craig A. Knoblock, Information Sciences Institute, University of Southern California, knoblock@isi.edu
Pseudocode | Yes | Algorithm 1: Optimization algorithm for LUF
Open Source Code | Yes | Our algorithms and datasets can be accessed from https://github.com/yuanshi/UnseenFeatures
Open Datasets | Yes | We experiment with four regression datasets: Abalone, for predicting the age of abalone, contains 4,177 samples with 8 features. ... Bank, which predicts the fraction of bank customers that are turned away due to queuing, contains 8,192 samples with 8 features. ... CPU, for CPU running time prediction, contains 8,192 samples with 12 features. ... House, for housing price prediction, contains 20,640 samples with 9 features. ... We experiment with three classification datasets: USPS, which recognizes handwriting digits from images, contains 9,298 samples from 10 classes [Hull, 1994]. Books, which performs sentiment analysis on book reviews from Amazon, contains 4,000 samples from 2 classes [Blitzer et al., 2006]. Webcam, which recognizes objects in low-resolution images taken by web cameras, contains 795 samples from 10 classes [Kulis et al., 2011]. ... We conduct experiments with data from Weather Underground (http://www.wunderground.com), which contains sensor data from a large number of personal weather stations worldwide.
Dataset Splits | Yes | We apply ten-fold cross validation on the target domain and report the average error. ... To tune the weight γ and regularization parameter λ, we apply a leave-one-out cross validation strategy on the source domain to simulate our problem setting. ... In each trial, we randomly split the dataset into the source/target domain, each with half the number of samples. (A minimal sketch of this split-and-evaluate protocol is shown after the table.)
Hardware Specification | No | The paper does not provide specific details on the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies | No | The paper mentions model types such as "kernel regression" and "logistic regression" but does not specify the software libraries or version numbers used (e.g., Python, scikit-learn, or PyTorch versions).
Experiment Setup | Yes | The hyper-parameters of the above methods, including c and d in the polynomial kernel, k in k-NN regression, and the regularization parameter λ, are tuned on the source domain. ... For all datasets, we scale each feature to [0,1], and then use principal component analysis (PCA) to reduce the dimensionality to 100, which reduces computational cost and feature noise. ... To prevent overfitting, we adopt an early stopping strategy: train a model on {(x_t, ŷ_t)} and apply it to source-domain data. If the prediction error on the source domain is larger than a certain threshold, we stop the learning process. We also terminate the optimization when the objective function decreases very slowly, which not only saves computational time but also reduces overfitting. (A minimal sketch of the preprocessing steps is shown after the table.)
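
The evaluation protocol quoted in the Dataset Splits row (a random half/half source-target split followed by ten-fold cross validation on the target domain) can be illustrated with a minimal sketch. Python and scikit-learn are assumptions here, since the paper does not name its software stack, and LinearRegression is only a stand-in for the paper's LUF model.

```python
# Minimal sketch of the quoted evaluation protocol: a random half/half
# source/target split followed by ten-fold cross validation on the target
# domain. scikit-learn and LinearRegression are assumptions, not the paper's
# actual LUF implementation.
import numpy as np
from sklearn.model_selection import KFold, train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

X, y = np.random.rand(200, 8), np.random.rand(200)  # stand-in for a real dataset

# In each trial, randomly split the dataset into source/target domains,
# each with half the number of samples.
X_src, X_tgt, y_src, y_tgt = train_test_split(X, y, test_size=0.5, random_state=0)

# Ten-fold cross validation on the target domain; report the average error.
errors = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True, random_state=0).split(X_tgt):
    model = LinearRegression().fit(X_tgt[train_idx], y_tgt[train_idx])
    errors.append(mean_squared_error(y_tgt[test_idx], model.predict(X_tgt[test_idx])))
print("average target-domain error:", np.mean(errors))
```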
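
The preprocessing quoted in the Experiment Setup row (scaling each feature to [0,1], then PCA to 100 dimensions) corresponds to a standard pipeline; a sketch assuming scikit-learn follows, with an illustrative data shape only.

```python
# Minimal sketch of the quoted preprocessing: scale each feature to [0, 1],
# then reduce dimensionality to 100 with PCA. scikit-learn is an assumption;
# the paper does not specify its software dependencies.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

X = np.random.rand(500, 150)                    # stand-in for raw features
n_components = min(100, X.shape[1])             # target dimensionality of 100
preprocess = make_pipeline(MinMaxScaler(), PCA(n_components=n_components))
X_reduced = preprocess.fit_transform(X)
print(X_reduced.shape)                          # (500, 100)
```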