Dino Oglic, R. Garnett, T. Gärtner
2012.

Learning to Construct Novel Structures

We investigate the problem of constructing novel representatives of a fixed but unknown concept over a structured, discrete domain, such as the set of unlabeled graphs. In particular, we consider an online scenario in which the learning algorithm immediately observes the true label of each proposed structure. This problem is highly relevant in many application domains and has so far been tackled mostly by special-purpose algorithms. We propose a general algorithm based on iteratively sampling novel structures from the domain and updating a conditional probability model of the concept based on the observed labels. The conditional probability model is used by the sampling procedure to focus on representatives of the desired concept. An empirical study of our approach demonstrates its effectiveness on a variety of problems.

In this paper, we investigate an iterative process for de novo construction of structured concept representatives, focusing on the construction of graphs with desirable properties. In each step, we first sample a candidate from the posterior distribution p(x | y) of instances x given the desired label y and obtain its true label from the oracle. We then update the conditional distribution p(y | x) by fitting a probability model to the current set of examples. The posterior distribution is initially equal to some given instrumental distribution and is updated in subsequent iterations by the learned conditional probability model. The procedure for obtaining samples from the updated posterior resembles a Metropolis-Hastings chain in which the instrumental distribution serves as the proposal distribution and the acceptance probability is computed from the conditional probability model. A simple argument shows that the stationary distribution of this chain is exactly the posterior p(x | y). To be able to discover concept representatives, the support of the instrumental distribution must contain the desired concept.
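The iterative loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the bit-vector "structures", the stand-in oracle, and the naive-Bayes-style conditional model are all assumptions made for the sake of a runnable example. The Metropolis-Hastings step uses the uniform instrumental distribution as an independence proposal, so the acceptance ratio reduces to p(y | x') / p(y | x).

```python
import math
import random

def propose_uniform(n_bits=16):
    """Instrumental distribution: uniform over bit-vector 'structures'."""
    return tuple(random.randint(0, 1) for _ in range(n_bits))

def oracle_label(x):
    """Stand-in oracle: the (unknown) concept is 'more than half the bits set'."""
    return 1 if sum(x) > len(x) / 2 else 0

class ConditionalModel:
    """Tiny conditional model p(y = 1 | x): smoothed per-bit frequency estimates."""
    def __init__(self, n_bits):
        self.pos = [1.0] * n_bits   # Laplace-smoothed counts of bit = 1 per class
        self.neg = [1.0] * n_bits
        self.n_pos = 2.0
        self.n_neg = 2.0

    def update(self, x, y):
        counts = self.pos if y == 1 else self.neg
        for i, b in enumerate(x):
            counts[i] += b
        if y == 1:
            self.n_pos += 1
        else:
            self.n_neg += 1

    def prob(self, x):
        # Naive-Bayes-style log-odds score, squashed to (0, 1).
        score = 0.0
        for i, b in enumerate(x):
            p1 = min(max(self.pos[i] / self.n_pos, 1e-6), 1 - 1e-6)
            p0 = min(max(self.neg[i] / self.n_neg, 1e-6), 1 - 1e-6)
            score += math.log(p1 / p0) if b else math.log((1 - p1) / (1 - p0))
        return 1.0 / (1.0 + math.exp(-score))

def construct(n_iters=200, n_bits=16, seed=0):
    random.seed(seed)
    model = ConditionalModel(n_bits)
    current = propose_uniform(n_bits)
    positives = []
    for _ in range(n_iters):
        # Independence MH step: uniform proposal, acceptance ratio
        # p(y = 1 | candidate) / p(y = 1 | current).
        candidate = propose_uniform(n_bits)
        ratio = model.prob(candidate) / max(model.prob(current), 1e-12)
        if random.random() < min(1.0, ratio):
            current = candidate
        y = oracle_label(current)   # query the oracle for the true label
        model.update(current, y)    # refit the conditional model
        if y == 1:
            positives.append(current)
    return positives

positives = construct()
```

As the conditional model improves, proposals that it scores poorly are rejected more often, so the chain concentrates on structures it believes belong to the desired concept.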
The uniform distribution is generally a safe choice, as it guarantees that the entire space will eventually be discovered. In the absence of any other information about the desired concept, we advocate the uniform distribution from a maximum-entropy standpoint. However, when domain-specific knowledge is available, the instrumental distribution can be modified appropriately. This can be especially useful when the concept class is relatively small compared to the entire search space, in which case a uniform sampler may require a very large number of samples before it encounters a positive candidate.
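The stationarity claim made above can be spelled out in a few lines. Writing $g$ for the instrumental distribution (my notation, used here as the prior over structures and as the independence proposal) and taking the target to be the posterior $\pi(x) = p(x \mid y) \propto g(x)\, p(y \mid x)$, the standard Metropolis-Hastings acceptance probability simplifies:

\[
\alpha(x \to x') \;=\; \min\!\left(1,\; \frac{\pi(x')\, g(x)}{\pi(x)\, g(x')}\right)
\;=\; \min\!\left(1,\; \frac{g(x')\, p(y \mid x')\, g(x)}{g(x)\, p(y \mid x)\, g(x')}\right)
\;=\; \min\!\left(1,\; \frac{p(y \mid x')}{p(y \mid x)}\right).
\]

The chain therefore satisfies detailed balance with respect to $\pi$, so its stationary distribution is exactly the posterior $p(x \mid y)$, with the acceptance probability computed from the conditional probability model alone.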

