Going stealth to preserve content quality

Once upon a time, content publishers did not have to worry about content duplication. With more than 4.5 billion people worldwide now using the internet, nearly 60 per cent of the world's population, that is a luxury content providers can no longer enjoy. Multimedia content is flooding the internet and accounts for 80 per cent of internet traffic. Anyone can showcase fullscreen high-definition videos without waiting for them to buffer. This has led to a considerable shift in the way people use the internet.

Content streaming services such as Netflix, Amazon Prime, Hulu, and the latest Disney+ offer thousands of films and television programs owned by major film studios. Streaming also provides an alternative to file downloading, which is blocked in many countries.

In 2016, Facebook upped its game by allowing any registered user to broadcast live videos for free, a move that came at a hefty price. People could upload videos of live concerts, closed-door events, and fashion shows without acquiring the licenses for them. The piracy of digital content became unstoppable. Content providers had to deal with their content being duplicated and shared for free or sold at a lower price on other platforms.

Data hiding was introduced to combat these issues. It is the process of inserting data into a host to serve specific purposes, such as proof of ownership, verification of genuine content, and linking related content. The data can be information derived from the content itself (e.g., a description), something external to the host content, or a mixture of both. Depending on its purpose, data can be inserted in various ways.

However, distortion in the host content is inevitable with data hiding. Associate Professor Wong Kok Sheik from the School of Information Technology, Monash University Malaysia, and his team aim to ensure that content quality is ultimately preserved by preventing changes, whether unintended or intended.

"We are analysing the coding structure, which is how things are stored in the digital domain. We are looking for ways to encode (insert) additional data while preserving the perceptual quality of the content," Professor Wong shared.

The first breakthrough was to identify two distinct ways to store the same value in video. For example, the value 12 can be represented as either 2 x 6 or 4 x 3. "To the decoder (a piece of software or device that reads the file and produces the corresponding content), it delivers the exact same value, but what's under the hood (e.g., 2 x 6 vs 4 x 3) carries additional data," Professor Wong explained. A similar idea is deployed in other media, including image and document files.
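The 2 x 6 versus 4 x 3 idea can be illustrated with a toy Python sketch. It is an illustration of the principle only, not Professor Wong's actual method or any real video format: the function names (`factor_pairs`, `embed`, `decode`, `extract`) are hypothetical, and the convention that the first factor pair means bit 0 and the second means bit 1 is an assumption made for this example. Pairs are written smaller factor first, so 4 x 3 appears as (3, 4).

```python
import math

def factor_pairs(value):
    """All (a, b) with 1 < a <= b and a * b == value, ordered by a."""
    return [(a, value // a)
            for a in range(2, math.isqrt(value) + 1)
            if value % a == 0]

def embed(value, bit):
    """Store `value` as a factor pair; the choice of pair carries one bit."""
    pairs = factor_pairs(value)
    if len(pairs) < 2:
        raise ValueError(f"{value} has no two distinct representations")
    return pairs[bit]  # assumed convention: bit 0 -> first pair, bit 1 -> second

def decode(representation):
    """What an ordinary decoder sees: just the product."""
    a, b = representation
    return a * b

def extract(representation):
    """Recover the hidden bit from which pair was used."""
    return factor_pairs(decode(representation)).index(representation)

rep0 = embed(12, 0)   # (2, 6)
rep1 = embed(12, 1)   # (3, 4)
print(decode(rep0), decode(rep1))    # both decode to 12
print(extract(rep0), extract(rep1))  # hidden bits: 0 and 1
```

In this sketch the decoder output is identical for both representations, which is why the visible content does not change; only a reader who knows the convention can tell which form was chosen.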

Instead of using traditional methods that directly modify the values, Professor Wong uses different ways to obtain the same values or achieve the same visual impression. With this approach, the quality and appearance of videos, PDFs and photos remain the same even after data hiding. "This is a complete quality preserving type of processing. Existing techniques can't offer this. We are also able to scale according to demands. If you tell me how much data you want to insert, I will prepare the space for you to do so," he explained.

Professor Wong's innovation is also reversible: the process can be undone to obtain an exact copy of the original content, down to the bit level.

The outlook for this research is bright. Whenever a new format is put forward, new techniques specific to that format are required to provide security and management functions for the content. This opens up new opportunities for innovations in data hiding.

Digital content such as video, audio, images and encrypted files can be better managed and linked to related content. The source of leaked content can also be identified easily using a technique called "fingerprinting", and ownership of content can be claimed with the aid of a watermark whenever there is a dispute.
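Fingerprinting can be pictured with a small Python sketch that reuses the factor-pair idea from earlier in the article. Everything here is an assumption for illustration: the names `fingerprint` and `identify`, the use of a numeric recipient ID, and the rule that values with fewer than two factor pairs are stored as (1, value) and carry no data. A real scheme would work on an actual file format and would need to resist tampering, which this toy does not attempt.

```python
import math

def factor_pairs(value):
    """All (a, b) with 1 < a <= b and a * b == value, ordered by a."""
    return [(a, value // a)
            for a in range(2, math.isqrt(value) + 1)
            if value % a == 0]

def fingerprint(values, recipient_id, n_bits):
    """Return a copy of `values` whose representation encodes `recipient_id`."""
    bits = iter((recipient_id >> i) & 1 for i in range(n_bits))
    carriers = sum(1 for v in values if len(factor_pairs(v)) >= 2)
    if carriers < n_bits:
        raise ValueError("not enough values with two representations")
    copy = []
    for v in values:
        pairs = factor_pairs(v)
        if len(pairs) >= 2:
            copy.append(pairs[next(bits, 0)])  # spare carriers default to bit 0
        else:
            copy.append((1, v))                # non-carrier: plain form
    return copy

def identify(copy, n_bits):
    """Read the recipient ID back out of a (possibly leaked) copy."""
    recipient_id, i = 0, 0
    for a, b in copy:
        pairs = factor_pairs(a * b)
        if len(pairs) >= 2 and i < n_bits:
            recipient_id |= pairs.index((a, b)) << i
            i += 1
    return recipient_id

content = [12, 7, 24, 36, 5, 18, 30]
leaked = fingerprint(content, recipient_id=11, n_bits=4)
print([a * b for a, b in leaked] == content)  # decoded content is unchanged
print(identify(leaked, n_bits=4))             # recovers the recipient ID
```

Each recipient's copy decodes to the same content, but the pattern of representations differs from copy to copy, so a leaked copy points back to the ID it was issued under.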

Data hiding can also have another interpretation: obscuring the meaning of the data. This is more commonly known as encryption. Professor Wong is researching in this domain as well, and has produced some pioneering work in the unified domain of encryption and data insertion.