Earth Observation Image Semantic Bias: A Collaborative User Annotation Approach
Originally Published In
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Correctly annotated image datasets are important for developing and validating image mining methods. However, there is some doubt regarding the generalizability of models trained and validated on the available datasets. This is due to dataset biases, which occur when the same semantic label is used in different ways across datasets and/or when identical object categories are labeled differently across datasets. In this paper, we demonstrate the existence of dataset biases with a sample of eight remote sensing image datasets, first showing that the datasets are readily discriminable from a feature perspective, and then demonstrating that a model trained on one dataset is not always valid on others. Past approaches to reducing dataset biases have relied on crowdsourcing; however, crowdsourcing is not always an option (e.g., due to public-accessibility restrictions on the images), which raises the question: how can annotation tasks be structured so that a limited number of nonexpert annotators can annotate images efficiently and accurately? We propose a collaborative annotation methodology, conducting image annotation experiments in which users are placed in either a collaborative or an individual condition, and we analyze their annotation performance. The results show that the collaborators produce more thorough and precise annotations while requiring less time than the individuals. The collaborators' labels also show less variance around the consensus point, meaning their assigned labels are more predictable and more likely to be accepted by other users. Collaborative image annotation is therefore a promising methodology for creating reliable datasets with a reduced number of nonexpert annotators, which in turn has implications for the creation of less biased image datasets.
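The "discriminable from a feature perspective" test mentioned above can be illustrated with a minimal sketch: if a simple classifier can tell which dataset a feature vector came from, the datasets carry a bias. The synthetic features, distribution shifts, and nearest-centroid classifier below are illustrative assumptions, not the paper's actual features or model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two mock "datasets": same nominal content, shifted feature statistics
# (a stand-in for the capture/annotation biases discussed in the paper).
feats_a = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
feats_b = rng.normal(loc=0.8, scale=1.2, size=(200, 16))

X = np.vstack([feats_a, feats_b])
y = np.array([0] * 200 + [1] * 200)

# Shuffle, then split into train/test halves.
idx = rng.permutation(len(y))
X, y = X[idx], y[idx]
X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]

# Nearest-centroid classifier: predict the dataset whose mean training
# feature vector is closest to the query.
centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    return int(np.argmin(np.linalg.norm(centroids - x, axis=1)))

acc = np.mean([predict(x) == t for x, t in zip(X_te, y_te)])
print(f"dataset-discrimination accuracy: {acc:.2f}")
```

Accuracy well above the 0.5 chance level on this "name that dataset" task is the signature of dataset bias; for unbiased datasets the classifier should be near chance.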