I was invited by Prof Raouf Hamzaoui to give a talk on “When Discrete Optimisation Meets Multimedia Security” for the IEEE UK & Ireland Signal Processing Chapter on 25th May, 2016. The talk was hosted the De Montfort University and was also organised as part of the Faculty of Technology Research Seminar Series (FoT-RSS). The slides of the talk can be found at SlideShare:

The talk started from a problem of ciphertext-only attack of selective encryption of blockwise DCT-transformed images where some DCT coefficients of each block are encrypted. (In case you don’t know what DCT is: it is a mathematical transform that converts spatially correlated pixel values into a linear combination of many frequency components which are caused DCT coefficients. DCT coefficients are largely uncorrelated and the energy is highly concentrated on the low-frequency corner, so can help support more efficient lossy data compression. For image compression purpose, a large image is normally divided into 8×8 blocks and each block is transformed into DCT domain separately.) The concept of selective encryption is very simple: rather than encrypting everything you only selectively encrypt only part of the whole image in order to downgrade the visual quality to a level suitable for the required visual security. In this case, the security will be about recovering the original image with a better visual quality (for ciphertext-only attacks) or recovering the key (for some other attack scenarios). As its name implies, ciphertext-only attacks assume the attacker can only see the encrypted images only, which corresponds to the hardest task from the attacker’s point of view. Traditionally, simple error concealment attacks can be used to recover a selectively encrypted image by setting all missing components to a specific value (e.g. zero). For instance, if we apply DCT to the famous test image Lenna (see below, top), and then selectively encrypted the most important DCT coefficient in each 8×8 block, we will have the following result (see below, bottom):

While the image looks well encrypted in terms of concealing most visual information of the image, a naive method can recover the original image’s sketch not very badly by setting all encrypted DCT coefficients to zeros:

A key research question here is if and how we can recover the original image to get a visual quality better than the above. This was firstly answered by Takeyuki Uehara, Reihaneh Safavi-Naini and Philip Ogunbona in a paper published on *IEEE Transactions on Image Processing* in 2006. They found out two properties of (natural) images can be used to create a better image recovery method. For the above Lenna case, the recovered image is far better than the above one:

Uehara et al.’s method does not always work well. For instance, for another image I took when I was working in Hong Kong (below, top), the result become worse in terms of accuracy of recovered pixel values (below, bottom):

The inaccuracy largely comes from a lot of under-flowed and over-flowed pixel values in Uehara et al.’s method. In 2010 we published a paper at ICIP (International Conference on Image Processing) 2010 based on an iterative method which select a parameter of Uehara et al.’s method to minimize overall under- and over-flow rate. This methods improves visual quality of almost all 200 test images we tried in a database with both standard test images and pictures taken by me. For the above HK shop image, the recovered image is as follows which has a much better quality compared with the result from Uehara et al.’s method:

Our ICIP 2010 work is still not perfect as it still leaves noticeable artifacts for some other images. For instance, for another standard test image cameraman (bottom, left), the recovered result contains some visual artifacts at various places (bottom, right):

Can we further improve the recovery result? The answer is yes as shown by a follow-up paper we published at ICIP 2011. In that paper, we used a completely different but far more general approach: a linear optimisation model allowing *any* set of missing DCT coefficients to be recovered where the DC recovery case is merely the simplest case of the general model. Since the model is linear and all DCT coefficients and pixel values can be treated as continuous values (which can be later rounded), it can be effectively solved using linear programming in polynomial time. To avoid any math in this blog article, I refer interested readers to our paper for the mathematical representation of the model. For DC recovery case, the general model does a better job than the simple optimisation model we reported in our ICIP 2010 paper. For the above cameraman image, the recovered image by the general model is as follows (the improvement in terms of visual quality is obvious):

As said above, the general model is capable of recovering more than missing DC coefficients, which is something *none* of previous methods are capable of. For instance, when 12 lowest (most significant) DCT coefficients (i.e. DC coefficients plus the 11 most significant AC coefficients) are missing from each block of the cameraman image, the general model can recover the following:

While the general model can be solved in polynomial time, it is still quite slow and requires a large chunk of memory especially for large images, so there is a desire to find even faster algorithms. Working with two algorithm experts of the University of Konstanz, we managed to find a fast algorithm based on the min-cost network flow algorithm, which has a complexity of O(*n*^{1.5}) while the general model based on linear programming has a complexity of O(*n*^{2}), where *n* denotes the number of pixels in the image. The reduced complexity can effectively bring the solving time for recovering a 512×512 image from minus to well below 1 second (hundreds of times improvement). The fast algorithm is not approximate, so it produces exactly the same result as the linear programming based method. This paper was published at 14th SIAM Workshop on Algorithm Engineering and Experiments (ALENEX 2012). The fast algorithm cannot be directly generalized to other cases with missing AC coefficients as well, so the general model remains the only method for more complicated cases. We are also currently looking at ways to speed up the general model using some approximate approaches and has finished testing one approach which was just submitted for possible publication.

The general model of recovering DCT coefficients later also found applications in content authentication and self-restoration image watermarking where a similar model can be constructed to help restore a manipulated image based on watermarks embedded. This work was published at the *Signal Processing: Image Communication* journal in 2014.

Based on the above work, since 2015 we have also been working on recovering other types of missing information from DCT-transformed images and have got some promising results which are in the process of being submitted for publications. I will discuss these new results in a separate blog article later after we get the work accepted somewhere.

While our focus in the past is more about recovering missing information, all the methods have profound implications beyond multimedia security. For instance, being able to recover missing DCT coefficients with good quality means we don’t need to encode (all or part of) those coefficients, so this can lead to a more efficient image compression algorithm. In principle all the work on digital images can be applied to digital video as well although in this case the model can become more complicated since video coding involves a lot of steps and integer variables which have to be modeled in a very different way (linear programming => mixed integer programming: the latter is a NP-hard problem in general, i.e., among the most difficult problems which currently do not have a polynomial time algorithm).

The 6 years of research on this line of research has been a very exciting and enjoyable experience for me, and it allows me to work with a number of collaborators from Germany (Junaid Jameel Ahmad, Dietmar Saupe, Andreas Karrenbauer and Sabine Cornelsen), USA (C.-C. Jay Kuo), UK (Hui Wang and Anthony TS Ho) and Malaysia (KokSheik Wong, Simying Ong and Kuan Yew Tan). Two visiting students from City University of Hong Kong also helped: Kadoo (Ching-Hung) Yuen (2012) and Ruiyuan Lin (2015). I look forward to more interesting work ahead!