The sheer number of backlogs and delays across the public sector is unsettling for an industry designed to serve constituents. Last summer, the four-month wait to receive a passport made the news, up substantially from the pre-pandemic norm of six to eight weeks. More recently, the Internal Revenue Service (IRS) announced that it entered the 2022 tax season with 15 times the usual filing backlog, alongside its plan for moving forward.

These frequently publicized backlogs don’t exist due to a lack of effort. The sector has made strides with technological advancements over the last decade. Yet, legacy technology and outdated processes still plague some of our nation’s most prominent departments. Today’s agencies must adopt digital transformation efforts designed to reduce data backlogs, improve citizen response times and drive better agency outcomes.

By embracing machine learning (ML) solutions and incorporating advances in natural language processing (NLP), agencies can make backlogs a thing of the past.

How ML and AI can bridge the physical and digital worlds

Whether tax documents or passport applications, processing items manually takes time and is prone to errors on the sending and receiving sides. For example, a sender may mistakenly check an incorrect box or the receiver may interpret the number “5” as the letter “S.” This creates unforeseen processing delays or, worse, inaccurate outcomes.

But managing the growing government document and data backlog is not as simple and clean-cut as uploading information to processing systems. The sheer number of documents and citizens’ information entering agencies in varied, unstructured formats and states, often with poor readability, makes it nearly impossible to reliably and efficiently extract data for downstream decision-making.

Embracing artificial intelligence (AI) and machine learning in daily government operations, just as other industries have done in recent years, can provide the intelligence, agility and edge needed to streamline operations and enable end-to-end automation of document-centric processes.

Government agencies must understand that, given the vast amount of inbound data, real change and lasting success will not come from quick patches built on legacy optical character recognition (OCR) or alternative automation solutions.

Bridging the physical and digital worlds can be attained with intelligent document processing (IDP), which leverages proprietary ML models and human intelligence to classify and convert complex, human-readable document formats. PDFs, images, emails and scanned forms can all be converted into structured, machine-readable information using IDP. It does so with greater accuracy and efficiency than legacy alternatives or manual approaches. 

In the case of the IRS, inundated with millions of documents such as 1099 forms and individuals’ W-2s, sophisticated ML models and IDP can automatically identify the digitized document, extract printed and handwritten text, and structure it into a machine-readable format. This automated approach speeds up processing times, incorporates human support where needed and is highly effective and accurate. 
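The "human support where needed" step can be illustrated with a minimal Python sketch. It assumes an upstream extraction model that returns a confidence score for each field. The field names, sample values and the 0.90 threshold are all hypothetical and not taken from any real IRS system or vendor product.

```python
from dataclasses import dataclass

# Hypothetical cutoff: fields scored below this go to a human reviewer.
REVIEW_THRESHOLD = 0.90


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # assumed model confidence in [0, 1]


def route_fields(fields, threshold=REVIEW_THRESHOLD):
    """Split extracted fields into auto-accepted values and a review queue."""
    accepted = {}
    needs_review = []
    for field in fields:
        if field.confidence >= threshold:
            accepted[field.name] = field.value
        else:
            needs_review.append(field.name)
    return accepted, needs_review


# Made-up extraction output for a W-2, for illustration only.
fields = [
    ExtractedField("employer_ein", "12-3456789", 0.99),
    # A "5" possibly misread as "S", so the model reports low confidence.
    ExtractedField("wages", "S2,310.00", 0.61),
]

accepted, needs_review = route_fields(fields)
print(accepted)      # high-confidence fields, ready for downstream systems
print(needs_review)  # field names queued for a human reviewer
```

In this sketch, only the uncertain field reaches a person, while the rest flows straight through. Where to set the threshold is a trade-off between throughput and accuracy that each agency would need to tune for itself.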

Advancing ML efforts with NLP

Alongside automation and IDP, introducing ML and NLP technologies can significantly support the sector’s quest to improve processes and reduce backlogs. NLP is an area of computer science that aims to enable computers to process and understand text and spoken words much as humans do. It is traditionally grounded in computational linguistics, statistics and data science.

The field has experienced significant advancements, like the introduction of complex language models that contain more than 100 billion parameters. These models could power many complex text processing tasks, such as classification, speech recognition and machine translation. These advancements could support even greater data extraction in a world overrun by documents.

Looking ahead, NLP is on course to reach a level of text understanding similar to that of a human knowledge worker, thanks to technological advancements driven by deep learning. Similar advancements in deep learning also enable computers to understand and process other human-readable content such as images.

For the public sector specifically, this could be images included in disability claims or other forms or applications consisting of more than just text. These advancements could also improve downstream stages of public sector processes, such as ML-powered decision-making for agencies determining unemployment assistance, Medicaid insurance and other invaluable government services. 

Failure to modernize is no longer an option

Though we’ve seen a handful of promising digital transformation improvements, the call for systemic change has yet to be fully answered. 

To move forward, agencies must go beyond patching and investing in legacy systems. Patchwork fixes and investments in outdated processes fail to support new use cases, break easily when requirements change and cannot handle unexpected surges in volume. Instead, introducing a flexible solution that can take the most complex, difficult-to-read documents from input to outcome should be a no-brainer.

Why? Citizens deserve more from the agencies that serve them.

CF Su is VP of machine learning at Hyperscience.
