Building Meaningful AI: The Big Data Problem

Last month, Innotech Capitals, an international VC and private equity fund, invested an undisclosed amount of seed capital into Glever, an AI-assisted resume templates platform that turns a handful of hints into a unique resume, all in a matter of mouse clicks.

In a world where Alphabet GOOGL, Amazon AMZN, and institutional investors are making big plays in AI, these seemingly minute developments continue to serve as subtle reminders of how AI is systematically easing its way into everyday life.

But while Xiaoxin Yin, Founder and CEO of Glever’s parent company, Resure Technology Inc, is excited about the prospects of artificial intelligence, he’s quick to point out that it isn’t all a bed of roses.

Of Noisy, Unstructured Data

Yin, like many startup founders and entrepreneurs dabbling in AI, had to deal – and still deals – with dozens of obstacles on the way to a finished product, chief among them how their intelligent business writing platform handles data.

In 2017, the global volume of data hit the 20-zettabyte mark, a number that is growing by the second. Unfortunately, about 90 percent of that data is unstructured, meaning machines can readily make sense of only the remaining 10 percent. This creates a huge gap between the amount of data available and what AI systems can actually do with it, making noisy or low-quality data one of the biggest obstacles to building useful AI platforms.

Noisy, unstructured data means more data scientists working on collecting and cleaning data instead of identifying and teaching patterns to machines. And because AI systems, even the most basic ones, utilize data-hungry algorithms to help solve simple problems, the data quality problem is one that continues to keep quality AI out of reach for many fintech startups.

Data Privacy and Security

And then there’s the “small” matter of data privacy and security.

If there’s anything we’ve learned from the recent Facebook FB data scandal, it’s the potential for misuse of personal data. In an effort to keep up with the evolving marketing landscape – and micro-targeting in particular – corporations have morphed into data-hungry reservoirs that store every small detail about our lives.

Google, for instance, keeps a record of practically every encounter you’ve had with its platform, including location data, personal preferences, and even your favorite YouTube videos from years ago.

So, with so much personal data floating around, fair use becomes an important topic for enterprises looking to bring in AI. Regulators are slowly moving to rein in the use of big data, the latest example being the Federal Trade Commission’s investigation into Facebook’s privacy practices. With the General Data Protection Regulation (GDPR), to cite another example, the EU hopes to give individuals more control over their personal data, something that is almost non-existent in most of the current digital landscape.

This and many other pieces of upcoming regulation will put a strain on existing AI systems that depend heavily on such data – which is still a good thing for data privacy and AI in general. They’ll force enterprises running AI platforms to address emerging issues, including the biases and discriminatory profiling that often occur when AI systems are fed biased data.

But while these regulations are all well-meaning, they present additional overhead for smaller startups. For instance, an alternative lending startup must take extra measures to ensure its AI-assisted loan disbursement algorithms do not disqualify an applicant simply because of race. In most cases, these extra measures require advanced data analytics, something that may be out of reach for many startups and small businesses.

In addition to data privacy concerns and the roadblocks presented by unstructured data, many other big data barriers stand in the way of building meaningful AI systems. Fintech companies still have to build the infrastructure necessary to support AI, not to mention staff it with data scientists and other professionals with the skills to help with implementation.

Still, it’s not all hopeless. Identifying the barriers to building valuable AI is the first important step into the future. Once we understand the data challenges that dog AI, we can really begin making progress in the right direction.
