Internet marketers of all shades might add a website address, a URL, to a graphic or photo that might then be found through an image search engine.
The user finding such an image may be interested in visiting said site, but will have to type out the URL into their browser's address bar to do so.
Alternatively, the URL might point to illicit content such as pornography, gambling sites, illegal drugs or terrorist propaganda, the researchers said.
Now, Nikolay Neshov of the Technical University of Sofia, Bulgaria, and colleagues at the University of Karlstad, Sweden, and the University of Belgrade, Serbia, have developed a computer algorithm that can detect the presence of text overlaid onto an image or a still from a video, extract the text and convert it into an active URL for accessing or blocking a website.
Simple optical character recognition (OCR) does not work well with text overlaid on images: the background is usually complex, and the text is likely to be of lower resolution, intensity and contrast than text in a scanned document or page.
The team's algorithm removes the details surrounding the anomalies it detects in the image, leaving just the area occupied by any text; the team calls this the binarisation process.
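As a rough illustration of what binarisation means in general, the sketch below reduces a greyscale image to pure black and white using Otsu's threshold, a standard textbook method. It is not the team's algorithm, which the report does not describe in detail. The function names, the list-of-lists image format and the assumption that text is brighter than its background are all hypothetical choices made for this example.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that best splits pixels into two classes.

    `pixels` is a list of rows, each a list of integers from 0 to 255
    (an assumed format chosen for this sketch).
    """
    hist = [0] * 256
    for row in pixels:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for level in range(256):
        weight_bg += hist[level]
        if weight_bg == 0:
            continue  # no pixels at or below this level yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is now in the background class
        sum_bg += level * hist[level]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        # Between-class variance (unnormalised); larger means a cleaner split.
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarise(pixels):
    """Map each pixel to 1 (candidate text) or 0 (background).

    Assumes text is brighter than the background; invert the comparison
    for dark text on a light background.
    """
    threshold = otsu_threshold(pixels)
    return [[1 if value > threshold else 0 for value in row] for row in pixels]
```

Calling `binarise` on a small greyscale grid returns a grid of the same shape containing only 0s and 1s. A real system would typically apply this only to regions already flagged as likely text, since a single global threshold struggles with the complex backgrounds the researchers describe.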
This isolated text image can then be fed into an OCR system to convert the image of the text into actual text in the computer.
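Once OCR has produced plain text, turning it into a usable URL is largely a pattern-matching exercise. The sketch below shows one plausible way to do that with a regular expression. The pattern, the `extract_urls` name and the choice of `http://` as the default scheme are assumptions made for this example, not details taken from the research.

```python
import re

# Hypothetical pattern: optional scheme, optional "www.", a dotted host name
# ending in a top-level domain of two or more letters, and an optional path.
URL_PATTERN = re.compile(
    r"(?:https?://)?(?:www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,}(?:/[^\s]*)?",
    re.IGNORECASE,
)


def extract_urls(ocr_text):
    """Return URL-like strings found in OCR output, each with a scheme."""
    urls = []
    for match in URL_PATTERN.finditer(ocr_text):
        # Drop punctuation that commonly trails a URL in running text.
        candidate = match.group(0).rstrip(".,;:!?)")
        # Assumed default: prepend http:// when the overlay omitted a scheme.
        if not candidate.lower().startswith(("http://", "https://")):
            candidate = "http://" + candidate
        urls.append(candidate)
    return urls
```

For instance, `extract_urls("Visit www.example.com today")` yields a single entry with the scheme added. OCR output often contains misread characters, so a production system would likely need extra cleanup that this sketch does not attempt.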
The team has successfully tested its algorithm on thousands of images with overlaid URLs. Using their approach, the researchers were able to identify 619 URLs from a random selection of 1,000 test images at a rate of three per second.
The researchers' initial motivation was to assist computer forensic investigations in which tens of thousands of illegal and illicit photos must be scanned and any associated websites identified quickly.
This is critical in investigations of child pornography and child sexual abuse, the team said, but such work is often stymied by the vast numbers of images involved.
The research was published in the International Journal of Reasoning-based Intelligent Systems.