Tomas Bubenicek
Supervisor(s): Jiri Bittner
Czech Technical University
Abstract: Datasets for machine learning in computer vision are often challenging to acquire; they are typically created either by hand-labeling or through expensive measurements. In this paper, we characterize the different types of augmented image data used in computer vision machine learning tasks and propose a method for generating such data synthetically with a game engine. We implement a Unity plugin that produces these augmented image outputs and can be used in existing Unity projects. The implementation provides RGB lit output and several ground-truth outputs, such as depth and normal information, object and category segmentation, motion segmentation, forward and backward optical flow and occlusions, 2D and 3D bounding boxes, and camera parameters. We also explore the possibility of added realism by using an external path-tracing renderer instead of the rasterization pipeline that is currently standard in most game engines. We demonstrate our tool by creating configurable example scenes designed specifically for training machine learning algorithms.
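As a rough illustration of one of the ground-truth outputs mentioned above (2D bounding boxes), the following minimal C# sketch projects each renderer's world-space bounds into screen space from a Unity camera. This is an assumption-laden example for orientation only, not the paper's actual plugin code; the class name BoundingBoxExporter and the overall approach are hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Illustrative sketch only: approximates 2D bounding boxes by projecting the
// eight corners of each renderer's world-space axis-aligned bounds into
// screen space and taking the screen-space extremes.
public class BoundingBoxExporter : MonoBehaviour
{
    public Camera targetCamera;

    public List<Rect> ComputeScreenBoxes()
    {
        var boxes = new List<Rect>();
        foreach (var renderer in FindObjectsOfType<Renderer>())
        {
            Bounds b = renderer.bounds; // world-space AABB of the object
            Vector3 min = Vector3.positiveInfinity;
            Vector3 max = Vector3.negativeInfinity;
            bool inFront = false;

            // Project all eight corners of the AABB and keep the extremes.
            for (int i = 0; i < 8; i++)
            {
                Vector3 corner = new Vector3(
                    (i & 1) == 0 ? b.min.x : b.max.x,
                    (i & 2) == 0 ? b.min.y : b.max.y,
                    (i & 4) == 0 ? b.min.z : b.max.z);
                Vector3 screen = targetCamera.WorldToScreenPoint(corner);
                if (screen.z > 0f) inFront = true; // corner lies in front of the camera
                min = Vector3.Min(min, screen);
                max = Vector3.Max(max, screen);
            }

            // Skip objects entirely behind the camera; clipping against the
            // view frustum is omitted in this sketch.
            if (inFront)
                boxes.Add(Rect.MinMaxRect(min.x, min.y, max.x, max.y));
        }
        return boxes;
    }
}
```

Because the boxes are derived from axis-aligned world bounds rather than the actual rendered silhouette, they are conservative (slightly larger than the visible object); a production tool would likely derive tighter boxes from the segmentation masks it already renders.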
Keywords: Computer Vision, Rendering, Video Games
Year: 2020