3D Insights from a 2D Photo
Aug 31, 2022
By John Passarelli – Senior Machine Learning Engineer
Introduction
Photos and other visual media of a property contain a wealth of information about the space. For example, a photo can reveal whether there is a bathtub, the condition of the kitchen, or whether the living room has hardwood floors. Beyond these more obvious examples, the objects within the photos can be mapped to their measurements, providing additional insights such as the square footage of a wood-floored living room. FoxyAI uses artificial intelligence and computer vision to help automate these insights.
What is LiDAR?
LiDAR (light detection and ranging) uses laser light and its reflection from objects to measure the distance, shape, and orientation of three-dimensional (3D) objects. LiDAR has made object measurements from images more accessible, easier, and more accurate.
Consumer hardware devices, including LiDAR-equipped iPhones, can use LiDAR in conjunction with photos to provide measurements and 3D models of objects in the photos. Without LiDAR, measurements can still be obtained from two or more photos taken from different angles, together with the corresponding camera parameters.
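To make the two-photo route concrete, here is a minimal Python sketch of midpoint triangulation, one simple way to recover a 3D point from two views. It is not FoxyAI's method. The focal lengths, principal point, camera positions, and pixel coordinates are made-up values, and both cameras are assumed to share the same orientation so that no rotation is needed.

```python
import math

def pixel_ray(u, v, fx, fy, cx, cy):
    """Viewing-ray direction through pixel (u, v) for a pinhole camera.

    Assumption: camera axes are aligned with the world axes (no rotation).
    """
    return ((u - cx) / fx, (v - cy) / fy, 1.0)

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2):
    """Return the 3D point halfway between the closest points of two rays.

    c1, c2 are camera centers; d1, d2 are ray directions (any length).
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = _dot(d1, d1), _dot(d1, d2), _dot(d2, d2)
    d, e = _dot(d1, w), _dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; depth cannot be recovered")
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(c1, d1))
    p2 = tuple(o + t * k for o, k in zip(c2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))

# Hypothetical camera parameters (illustrative values, not from a real device).
FX = FY = 1000.0          # focal lengths in pixels
CX = CY = 500.0           # principal point in pixels
CAM1 = (0.0, 0.0, 0.0)    # first camera center, meters
CAM2 = (0.5, 0.0, 0.0)    # second camera center, 0.5 m to the right

def locate(pix1, pix2):
    """Triangulate one object point seen at pix1 in photo 1 and pix2 in photo 2."""
    ray1 = pixel_ray(pix1[0], pix1[1], FX, FY, CX, CY)
    ray2 = pixel_ray(pix2[0], pix2[1], FX, FY, CX, CY)
    return triangulate_midpoint(CAM1, ray1, CAM2, ray2)

# Two corners of an object, each matched across both photos (synthetic pixels).
corner_a = locate((600, 550), (350, 550))
corner_b = locate((800, 550), (550, 550))
print("distance between corners (m):", math.dist(corner_a, corner_b))
```

With these synthetic inputs the two corners come out about 0.4 meters apart. A real pipeline would also need each camera's rotation and a reliable way to match the same point across photos, which is the work LiDAR removes.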
To better understand the LiDAR capabilities, let us visualize the image and 3D model. Below is a single photo taken with an iPhone and the corresponding 3D reconstruction of that image, made possible by the iPhone’s built-in LiDAR scanner.
Polygon Predictions + LiDAR = Object Measurements
AI segmentation algorithms can predict polygons around objects. In the visual example below, a FoxyAI model predicts the presence of mold on a fence.
Given the FoxyAI-predicted Mold/Mildew/Moss polygon and the LiDAR information, the size of the mold is determined to be 0.90 meters (2.95 feet), a linear measurement consistent with the diagonal of the predicted height and width reported below. To validate this prediction, we measured the mold the tried-and-true way, with a tape measure. By tape measure, the height and width of the mold were 76.20 by 48.26 centimeters (30.00 by 19.00 inches). The FoxyAI model predicted a height and width of 76.38 by 48.39 centimeters (30.07 by 19.05 inches), a difference of less than 2 millimeters (under 0.3 percent) in each dimension. This demonstrates that insights extracted from a photo, with the help of LiDAR, can include accurate measurements.
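As a rough illustration of how a predicted polygon and per-pixel depth can combine into measurements, the Python sketch below back-projects polygon vertices into 3D with a pinhole camera model and then reads off width, height, diagonal, and area. This is an assumed approach, not FoxyAI's actual pipeline. The function names, camera intrinsics, depths, and pixel coordinates are all hypothetical, and the surface is assumed to be flat and facing the camera.

```python
import math

def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole back-projection: pixel (u, v) with depth in meters -> 3D point
    in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)

def polygon_area_3d(points):
    """Area of a planar polygon in 3D: half the length of the summed cross
    products of consecutive vertices."""
    sx = sy = sz = 0.0
    n = len(points)
    for i in range(n):
        ax, ay, az = points[i]
        bx, by, bz = points[(i + 1) % n]
        sx += ay * bz - az * by
        sy += az * bx - ax * bz
        sz += ax * by - ay * bx
    return 0.5 * math.sqrt(sx * sx + sy * sy + sz * sz)

def measure_polygon(pixels, depths, fx, fy, cx, cy):
    """Measure a polygon given its pixel vertices and a depth for each vertex.

    Assumption: the surface roughly faces the camera, so width and height can
    be read off the camera-frame x and y extents.
    """
    pts = [backproject(u, v, d, fx, fy, cx, cy)
           for (u, v), d in zip(pixels, depths)]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return {
        "width_m": width,
        "height_m": height,
        "diagonal_m": math.hypot(width, height),
        "area_m2": polygon_area_3d(pts),
    }

# Synthetic example: a rectangular patch on a wall 2 m from the camera.
pixels = [(400, 300), (700, 300), (700, 700), (400, 700)]
depths = [2.0, 2.0, 2.0, 2.0]  # meters; would come from the LiDAR depth map
result = measure_polygon(pixels, depths, fx=1000.0, fy=1000.0, cx=500.0, cy=500.0)
print(result)
```

For this synthetic patch the result is roughly 0.6 by 0.8 meters, with a 1.0 meter diagonal and an area of about 0.48 square meters. Real depth maps are noisy, so a production version would likely smooth or fit a plane to the depth values before measuring.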
Polygon Labels
An important aspect of building this algorithm, and I would argue the most important one, is the human labeling of the image data. An example video of this labeling process is shown below.
As the video shows, human annotators meticulously draw polygons around objects so that the algorithms can learn to predict these polygons. This process is of the utmost importance because the quality of the labels sets the ceiling on the quality the predicted polygons can hope to achieve. It is even more crucial if the predicted polygons will be used for measurements. How tightly a polygon wraps around an object determines the accuracy of the object's measurements, and loosely drawn polygons lead to less accurate measurements. Simply put, accurate polygons are paramount for precise measurements.
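A small Python sketch can show how quickly a loose polygon inflates a measurement. It computes polygon area with the shoelace formula and compares a tight outline with one drawn 5 percent too wide in every direction. The polygon coordinates and the 5 percent figure are illustrative assumptions, not measured annotation data.

```python
def shoelace_area(poly):
    """Area of a simple 2D polygon from its vertices (shoelace formula)."""
    total = 0.0
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def loosen(poly, factor):
    """Push every vertex away from the vertex centroid by the given factor,
    mimicking an annotator who draws the outline too generously."""
    cx = sum(x for x, _ in poly) / len(poly)
    cy = sum(y for _, y in poly) / len(poly)
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in poly]

# Hypothetical tight label: a 0.6 m by 0.8 m patch, coordinates in meters.
tight = [(0.0, 0.0), (0.6, 0.0), (0.6, 0.8), (0.0, 0.8)]
loose = loosen(tight, 1.05)  # outline drawn 5% too large in each direction

tight_area = shoelace_area(tight)
loose_area = shoelace_area(loose)
print("tight area (m^2):", tight_area)
print("loose area (m^2):", loose_area)
print("area overestimate (%):", 100.0 * (loose_area / tight_area - 1.0))
```

Because area scales with the square of the linear size, an outline that is 5 percent too large overstates the area by roughly 10 percent. An error of that kind in the labels carries straight through to whatever the model learns to predict.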
Conclusion
Many useful insights can be extracted from a photo. With the addition of LiDAR capabilities, the information is no longer limited to the 2D realm. A photo now opens up exciting possibilities in the 3D realm, such as object measurements.