








Meaningful uncertainty quantification in computer vision requires reasoning about semantic information, such as the hair color of the person in a photo or the location of a car on the street. To this end, recent breakthroughs in generative modeling allow us to represent semantic information in disentangled latent spaces, but providing uncertainties on the semantic latent variables has remained challenging. In this work, we provide principled uncertainty intervals that are guaranteed to contain the true semantic factors for any underlying generative model. The method has two steps: (1) it uses quantile regression to output a heuristic uncertainty interval for each element in the latent space, and (2) it calibrates these intervals so that they contain the true value of the latent for a new, unseen input. The endpoints of these calibrated intervals can then be propagated through the generator to produce interpretable uncertainty visualizations for each semantic factor. This technique reliably communicates semantically meaningful, principled, and instance-adaptive uncertainty in inverse problems like image super-resolution and image completion.
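As a rough illustration of the quantile regression step, the sketch below shows the pinball (quantile) loss that is commonly used to train lower and upper quantile heads. The function names and the choice of the alpha/2 and 1 - alpha/2 quantile levels are assumptions made for this example, not details taken from this work.

```python
def pinball_loss(prediction, target, quantile):
    """Pinball (quantile) loss for one scalar prediction."""
    error = target - prediction
    # Under-prediction is weighted by `quantile`,
    # over-prediction by (1 - quantile).
    return max(quantile * error, (quantile - 1.0) * error)


def interval_loss(lower, upper, target, alpha=0.1):
    """Sum of the pinball losses of the two interval endpoints.

    Assumption for this sketch: the lower head targets the alpha/2
    quantile and the upper head targets the 1 - alpha/2 quantile.
    """
    return (pinball_loss(lower, target, alpha / 2)
            + pinball_loss(upper, target, 1 - alpha / 2))
```

Minimizing this loss over a training set pushes each head toward the corresponding conditional quantile, which is why the resulting interval is only heuristic until it is calibrated.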
Our work addresses the problem of directly giving uncertainty estimates on semantically meaningful image properties. We make progress on this problem by bringing techniques from quantile regression and distribution-free uncertainty quantification together with a disentangled latent space learned by a generative adversarial network (GAN). We call the coordinates of this latent space semantic factors, as each controls one meaningful aspect of the image, like age or hair color. Our method takes a corrupted image as input and predicts each semantic factor along with an uncertainty interval that is guaranteed to contain the true semantic factor. When the model is unsure, the intervals are large, and vice versa. By propagating these intervals through the GAN coordinate-wise, we can visualize uncertainty directly in image space (shown left) without resorting to per-pixel intervals. The result of our procedure is a rich form of uncertainty quantification directly on the estimates of semantic properties of the image.
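To make the calibration idea concrete, here is a minimal sketch of one standard distribution-free technique, split conformal calibration of quantile intervals, applied to a single semantic factor. It is a stand-in for illustration: the exact calibration procedure and the precise form of the guarantee used in this work may differ, and the names `conformal_margin` and `calibrate` are hypothetical.

```python
import math


def conformal_margin(lowers, uppers, truths, alpha=0.1):
    """Additive margin computed on a held-out calibration set.

    Widening a heuristic interval to [lower - m, upper + m] with this
    margin m covers a fresh exchangeable point with probability at
    least 1 - alpha (a marginal guarantee).
    """
    n = len(truths)
    # Score: how far the truth falls outside the heuristic interval
    # (negative when the truth lies inside it).
    scores = sorted(max(lo - z, z - hi)
                    for lo, hi, z in zip(lowers, uppers, truths))
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]


def calibrate(lower, upper, margin):
    """Apply the margin to a new heuristic interval."""
    return lower - margin, upper + margin
```

Because the margin is added to instance-specific heuristic endpoints, the calibrated intervals remain wide where the model is unsure and narrow where it is confident.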
Generative architectures such as StyleGAN and its variants have been shown to contain a disentangled latent space that factors into interpretable features such as hair color and expression. We use this property to obtain semantically meaningful uncertainty in the latent space of a generative model. Since the latent space is disentangled, the uncertainty intervals naturally factor into different interpretable dimensions. Thus, given an image of a face, for instance, we can obtain meaningful uncertainty estimates separately for different aspects of the face, such as hair shape or color. Our training consists of three core steps:


The image on the left shows an uncertainty prediction for a sample drawn from a generative model trained on the CLEVR dataset. The uncertainty factors naturally into the latent factors; we visualize shape and color here. The lower and upper quantile images yield similar colors, since color is predictable from the blurry input. The model predicts that both a cylinder and a sphere would be consistent with this blurry input. The calibrated quantiles cover the ground truth color value while the uncalibrated ones do not.
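The lower and upper quantile images in such a visualization can be produced by coordinate-wise propagation: for each semantic factor, the point estimate is decoded twice, once with that factor set to its lower endpoint and once with it set to its upper endpoint. The sketch below illustrates this with a hypothetical `toy_generator` (a fixed linear map) standing in for a pretrained GAN generator; none of these names come from this work.

```python
def propagate_coordinatewise(generator, point, lowers, uppers):
    """Decode lower/upper endpoint images for each semantic factor.

    For factor j, all other coordinates stay at the point estimate and
    only coordinate j is replaced by its interval endpoint.
    Returns a list of (lower_image, upper_image) pairs, one per factor.
    """
    visualizations = []
    for j, (lo, hi) in enumerate(zip(lowers, uppers)):
        z_low = list(point)
        z_high = list(point)
        z_low[j] = lo
        z_high[j] = hi
        visualizations.append((generator(z_low), generator(z_high)))
    return visualizations


def toy_generator(z):
    """Hypothetical stand-in for a generator: two latent factors are
    mapped linearly to a three-'pixel' image."""
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    return [sum(w * v for w, v in zip(row, z)) for row in weights]


pairs = propagate_coordinatewise(
    toy_generator, point=[0.0, 0.0], lowers=[-1.0, -2.0], uppers=[1.0, 2.0]
)
```

Each pair shows how the output changes when a single factor moves across its interval, so a wide interval on one factor appears as a visible difference in only the corresponding attribute.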


For the image super-resolution case, the corruption intensity is varied across each set: the input image in the top row is not corrupted, while the input in the bottom row is undersampled by 16x. In both cases,
we can observe that the most diverse prediction is in the bottom row, where the input is corrupted the most.
For the image inpainting case, a random mask is applied to the same input image in each row. When there is no mask (first row), both quantiles are extremely close to the pointwise prediction.
As we increase the size of the masked regions, the predicted intervals expand, as indicated by the variability in the quantile predictions.


Acknowledgements
SS and PI acknowledge the MIT SuperCloud and Lincoln Laboratory Supercomputing Center for providing HPC resources that have contributed to the research results reported within this paper.