Visual Data Labeling Is Crucial To Your Computer Vision Project

Aug 31, 2018 Reetika Fleming

Competing in the age of AI means that you’re only as good as your ability to wrangle the new “oil” or “currency”—data. To derive meaning from data for applications, AI solutions rely on a tremendous undertaking in data labeling and annotation. Even though the industry has made remarkable progress analyzing structured text data, most visual data in the form of images and video is untapped. Data labeling is a crucial part of the computer vision branch of AI, where programs observe raw data, label key zones, and capture relevant comments and tags in a structured format. Tech giants used to be the only companies with enough resources to sift through and catalog volumes of visual data to create unique products and services. As the technology for visual data labeling advances, however, computer vision will be thrown open to every business looking for AI-driven competitive advantages – including yours. Consider your options for visual data labeling carefully as you get started with computer vision projects.
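To make "label key zones, and capture relevant comments and tags in a structured format" concrete, here is a minimal sketch in Python of what one labeled image record might look like. The class names, field names, and sample values are hypothetical and chosen only for illustration; real labeling tools each define their own schemas.

```python
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class Region:
    """One labeled zone in an image: a pixel bounding box plus a tag."""
    x: int
    y: int
    width: int
    height: int
    tag: str
    comment: str = ""  # optional free-text note from the labeler


@dataclass
class ImageAnnotation:
    """All labels captured for a single image (hypothetical schema)."""
    image_id: str
    regions: List[Region] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into nested dataclasses, so regions become dicts.
        return json.dumps(asdict(self), sort_keys=True)


# Illustrative values only: two labeled zones in one camera frame.
annotation = ImageAnnotation(
    image_id="frame_0001.jpg",
    regions=[
        Region(x=120, y=80, width=60, height=90, tag="deer",
               comment="partially hidden by guardrail"),
        Region(x=300, y=60, width=40, height=110, tag="pedestrian"),
    ],
)

# Round-trip through JSON to show the structured form a model pipeline would read.
record = json.loads(annotation.to_json())
```

Whatever the exact schema, the point is the same: each raw image ends up paired with machine-readable zones and tags that a model can train on.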


Why you need to pay attention to visual data labeling


Autonomous vehicles (AV) have stolen the spotlight as the use case of choice for getting data labeling right. After all, the risks of non-performance are potentially life-threatening. Unsurprisingly, the emerging AV industry (including Waymo, Voyage, Lyft, and Embark) has invested significantly in data labeling and annotation startups in the last two to three years. It is easy to imagine labeling camera, radar, and lidar data so that it can be used to train AI models to drive more safely. For example, labeled data could teach a model to identify a stray deer on the highway or a person on a poorly lit backroad. But the applicability goes far beyond identifying things on roads; any AI solution that uses unstructured image and video data requires a rigorous data labeling exercise to provide its foundational training data. Applications in other industry verticals include insurance companies using drone images for roof inspections to calculate hail or wind damage, retailers analyzing images of shelf usage and layout to strategize product placement in stores, and manufacturers conducting product quality control through visual inspection. Whatever the application, the key challenge is that most AI models need hundreds of thousands of annotated records before they can learn to make reliable predictions.


Tech firms are meeting the challenge of visual data labeling with humans and algorithms



  • Startup Scale is trying to solve the onerous task of data labeling for AI developers the old-fashioned way: with access to trained human labelers. Scale recently secured Series B funding of $18 million to expand its base of 10,000 contractors that create AI training data for AV and other businesses. This is a pragmatic approach, because manual data labeling is currently the predominant method, which also makes it a significant bottleneck for enterprise AI development. Scale is helping Airbnb to discover nuanced reasons for its repeat customers’ property preferences and to make smarter recommendations based on qualitative details in property images in addition to the standard search filters.

  • Using marketplace-based startups like Scale is one approach to sorting these masses of data, with specialized managed services firms cropping up in industries like manufacturing and AV. But enterprises are also undertaking visual data labeling internally, often due to the sensitivity, level of regulation, and complexity of their data. Instead of spending months developing training data systems, AI developers are starting to pick from several annotation products available off the shelf. Newly launched Labelbox is one such promising player, with a focus on helping clients create and manage visual training data in a centralized manner. Its platform scales from small R&D teams to thousands of labelers and is available as either a SaaS or an on-premises implementation, depending on the project. The ability to incorporate a wide variety of visual data and customize annotation rules makes Labelbox a highly flexible platform. The Valley-based startup just secured seed funding of $3.9 million from Kleiner Perkins, First Round, and Google's new AI fund, Gradient Ventures.

  • Clarifai is a startup that is taking a different approach, casting a wide net across computer vision technology. Clarifai has come a long way since its founding in 2013, when it launched an image and video recognition platform. In the last two years, it released offerings that specifically target visual content curation, organization, and analysis for enterprise AI. Clarifai’s image (and now video) recognition relies on convolutional neural networks that allow computers to learn from example data and start to automatically predict tags and data labels. Its platform can label data in this way using pre-built visual recognition models for general categories such as travel. Clients can also create custom models with their own data, although Clarifai’s broader experience seems to skew toward working with generalized datasets. With the launch of Clarifai’s mobile SDK, the company is now leading research into AI visual recognition at the localized device level.

Choose the right path for your next computer vision project


Visual data labeling is a crushing bottleneck for any type of computer vision project. The approaches that these AI tech firms are taking are diverse for that reason: no single solution has yet emerged to advance AI in this field. Having said that, we will see predominantly labor-based approaches diminish in the next two years as tech-led automation of visual data labeling advances.

For now, enterprise AI leaders will need to select visual data labeling approaches based on their projects. Consider the following questions as a starting point:

  • Is the data too sensitive or specialized? Or could it be shared via a platform like Scale and understood by generalists?

  • Do internal teams have the bandwidth to create training data labels? Could they enhance their productivity with collaborative labeling tools like Labelbox?

  • Will auto-generated content tagging from ML-based companies like Clarifai be sufficient to get a pilot off the ground, given that the quality of auto-generated tagging is usually lower than that of manual labeling?
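One way to approach the last question is to score auto-generated tags against a small, manually labeled sample before committing to a pilot. The sketch below, in Python, computes micro-averaged precision and recall over sets of tags per image. The function name, image names, and tags are hypothetical and serve only as illustration; they do not reflect any vendor's API or real results.

```python
from typing import Dict, Set, Tuple


def tag_precision_recall(
    manual: Dict[str, Set[str]],
    auto: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Micro-averaged precision and recall of auto tags against manual labels.

    Manual labels are treated as ground truth. Images missing from `auto`
    count as having no auto-generated tags.
    """
    true_pos = false_pos = false_neg = 0
    for image_id, manual_tags in manual.items():
        auto_tags = auto.get(image_id, set())
        true_pos += len(auto_tags & manual_tags)   # tags both sources agree on
        false_pos += len(auto_tags - manual_tags)  # auto tags humans did not apply
        false_neg += len(manual_tags - auto_tags)  # human tags the model missed
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Made-up sample: two roof-inspection images labeled by people and by a model.
manual_labels = {
    "roof_01.jpg": {"hail_damage", "shingle"},
    "roof_02.jpg": {"wind_damage"},
}
auto_tags = {
    "roof_01.jpg": {"hail_damage", "tree"},
    "roof_02.jpg": {"wind_damage"},
}

precision, recall = tag_precision_recall(manual_labels, auto_tags)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Low precision means the automated tagger adds labels people would not; low recall means it misses labels people would apply. Either number, measured on your own data, gives a rough basis for judging whether automated tagging is good enough for a pilot.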

The key takeaway is that the tech industry is finding ways to make progress on the data challenge of computer vision—led by, but certainly not limited to, the autonomous vehicle market.

Bottom line: Don’t let visual data labeling halt your computer vision projects—you have multiple options to explore how to harness visual data for enterprise applications. 

Companies like Scale are quickly finding ways to acquire and manage data labelers across the globe at compelling price points. Meanwhile, if your project already has an internal team, product vendors like Labelbox are establishing best practices in training data creation and management. Finally, neural-network-based startups like Clarifai are cropping up to automatically tag data through object recognition, albeit with varying levels of quality compared with human labelers. Consider working with tech firms to speed up your computer vision projects and get them out of proof-of-concept (POC) limbo!
