ORIGINAL ARTICLE
Year: 2017 | Volume: 3 | Issue: 1 | Page: 26-30
A Super-resolution Reconstruction Algorithm for Surveillance Video
Jian Shao, Feng Chao, Mian Luo, Jing Cheng Lin
Image Technology Lab, JingCheng Institute of Forensic Science, Beijing, China
Date of Web Publication: 31-Mar-2017
Correspondence Address: Prof. Jian Shao, JingCheng Institute of Forensic Science, Beijing, China
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/jfsm.jfsm_11_17
Recent technological developments have made surveillance video a primary means of preserving public security. Many urban crimes are captured on surveillance video, and most of the evidence collected by the police also comes from surveillance video sources. Surveillance footage offers very strong support for solving criminal cases; therefore, creating an effective policy and applying useful methods for retrieving additional evidence are becoming increasingly important. However, surveillance video has its shortcomings, namely low resolution (LR) and poor visual quality. In this paper, we discuss the characteristics of surveillance video and combine manual feature registration, maximum a posteriori estimation, and projection onto convex sets into a super-resolution reconstruction method that improves the quality of surveillance video. With this method, we can make optimal use of the information contained in the LR video images while also keeping image edges clear and controlling the convergence of the algorithm. Finally, we suggest how to adjust the adaptability of the algorithm by analyzing the prior information of the target image.
Keywords: Image registration, maximum a posteriori, projection onto convex sets, super-resolution, surveillance video
How to cite this article: Shao J, Chao F, Luo M, Lin JC. A Super-resolution Reconstruction Algorithm for Surveillance Video. J Forensic Sci Med 2017;3:26-30
Introduction
Video surveillance is becoming increasingly important in the preservation of public security, and a great number of surveillance cameras are now used for this purpose. To deal with the large number of suspicious events captured by these cameras, whose images are often dark and blurry, police authorities need new methods that make optimal use of video information for investigative purposes. Because surveillance video is usually highly compressed [1] and of low visual quality, an approach is needed that restores high-resolution (HR) images from compressed surveillance footage so that the information it contains can be used optimally.
To find an effective way of restoring video image quality, a few issues need to be explored first.
Surveillance video
Surveillance video is highly compressed. Many users set a high compression ratio to minimize the use of storage space. This blurs the captured video and obscures information that is very important to the detection of criminal activity. To improve video quality, we must understand how video compression works.
The video compression process model is described in [Figure 1].
Image registration
Image registration is the process of overlaying two or more images of the same scene, taken at different times, from different viewpoints, and/or by different sensors. This process geometrically aligns two images: the reference and sensed image. Differences will be present in the images due to different imaging conditions. Image registration is a crucial step in all image analysis tasks in which the final information is composed from the combination of various data sources, such as image fusion, change detection, and multichannel image restoration.[2]
Maximum a posteriori super-resolution
Maximum a posteriori (MAP) estimation treats super-resolution reconstruction as an optimization problem with respect to the HR image. It provides an evaluation function, usually defined with respect to the HR image in the spatial domain, that is optimized to maximize the a posteriori density.[3]
Projection onto convex sets super-resolution
The projection onto convex sets (POCS) method is easy to implement, fits naturally with spatial-domain modeling of image degradation, and can readily incorporate prior information. The POCS method is used to reconstruct the HR image from aliased LR image sequences. As a result, we find that the reconstruction algorithm has the same precision for image registration as spatial image registration as well as a good effect on super-resolution (SR) image reconstruction. The POCS method has the additional benefit of edge continuity and preservation through edge detection and adaptive filters.[4]
Reconstruction Algorithm
Manual feature registration
In this paper, the manual feature registration (MFR) method is used to register a series of surveillance video frames. The detected features in the reference and sensed images can be matched by means of the image intensity values in their close neighborhoods, the feature spatial distribution, or the feature symbolic description. While looking for the feature correspondence, some methods simultaneously estimate the parameters of the mapping functions and thus merge the feature matching and transform model estimation steps of registration.
What we want to obtain is the facial image captured in the surveillance video. The video is composed of frames that show the face from different angles and under different lighting conditions. Therefore, to fuse these frames into a single HR facial image, we must first register them accurately. We can manually define points that coincide with facial features, so that the frames, registered by these features, can be fused into a single HR image. This method can be described by the formula:

Here $y_k$ is the kth low-resolution (LR) image, $f$ is the HR image, and $R_k$ is the registration (motion) operator of the kth frame.
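As an illustration of how manually marked feature points can yield a registration operator, the following minimal sketch fits a similarity transform (scale, rotation, and shift) to point pairs by least squares. The choice of a similarity transform, the function names, and the landmark coordinates are assumptions made for this example; the paper does not state which motion model is used.

```python
import math

def fit_similarity(src, dst):
    """Least-squares similarity transform mapping src points onto dst points.

    The transform is u = a*x - b*y + tx, v = b*x + a*y + ty.
    """
    n = len(src)
    mx = sum(p[0] for p in src) / n
    my = sum(p[1] for p in src) / n
    mu = sum(q[0] for q in dst) / n
    mv = sum(q[1] for q in dst) / n
    num_a = num_b = den = 0.0
    for (x, y), (u, v) in zip(src, dst):
        xc, yc, uc, vc = x - mx, y - my, u - mu, v - mv
        num_a += xc * uc + yc * vc
        num_b += xc * vc - yc * uc
        den += xc * xc + yc * yc
    a, b = num_a / den, num_b / den
    # The shift maps the source centroid onto the destination centroid.
    tx = mu - (a * mx - b * my)
    ty = mv - (b * mx + a * my)
    return a, b, tx, ty

def apply_similarity(params, point):
    a, b, tx, ty = params
    x, y = point
    return (a * x - b * y + tx, b * x + a * y + ty)

# Hypothetical landmarks marked by hand in the HR template
# (eye corners, nose tip, mouth corners).
template_pts = [(30.0, 40.0), (70.0, 40.0), (50.0, 60.0), (35.0, 80.0), (65.0, 80.0)]

# The same landmarks in one LR frame, simulated here with a known transform:
# half scale, a small rotation, and a shift.
true_params = (0.5 * math.cos(0.1), 0.5 * math.sin(0.1), 3.0, -2.0)
frame_pts = [apply_similarity(true_params, p) for p in template_pts]

est_params = fit_similarity(template_pts, frame_pts)
print(est_params)
```

With noise-free point pairs the fit recovers the simulated transform; with hand-marked points the same least-squares fit averages out small marking errors across the landmarks.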
Maximum a posteriori super-resolution reconstruction algorithm
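The MAP approach to estimating the HR image $f_k$ seeks the estimate $\hat{f}_k$ for which the a posteriori probability $p(f_k \mid Y)$ is a maximum. Formally, we seek $\hat{f}_k$ as:

$$\hat{f}_k = \arg\max_{f_k} \, p(f_k \mid Y)$$

Applying Bayes' rule yields:

$$\hat{f}_k = \arg\max_{f_k} \, \frac{p(Y \mid f_k)\, p(f_k)}{p(Y)}$$

Moreover, since the denominator $p(Y)$ does not depend on $f_k$, we get:

$$\hat{f}_k = \arg\max_{f_k} \, p(Y \mid f_k)\, p(f_k)$$

The corresponding log-likelihood function is:

$$\hat{f}_k = \arg\max_{f_k} \, \left[ \log p(Y \mid f_k) + \log p(f_k) \right]$$

Where $\log p(Y \mid f_k)$ is the log-likelihood function and $\log p(f_k)$ is the log of the a priori density of $f_k$. Assuming the image noise to be Gaussian with zero mean and variance $\sigma^2$, the total probability of the observed images $Y$, given an estimate $f_k$ of the SR image, is: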

In addition, a typical choice for prior knowledge about the HR image is to use the following prior distribution:

Where Q represents a linear high-pass operation that penalizes nonsmooth estimation, λ controls the variance of prior distribution, and a higher value of λ represents a smaller variance in the distribution. When (3-4) and (3-5) are substituted into (3-3), the following function is produced:

Where α is the regularization parameter that controls the balance between the data-fidelity term and the prior term.
Finally, we supply this prior knowledge to the MAP algorithm for iterative computation.
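The iterative MAP computation can be illustrated with a small one-dimensional sketch. It assumes a cost of the form ||Hf − y||² + α||Qf||², where H (averaging neighboring pixel pairs) stands in for the blur-and-decimate degradation and Q is a first-difference high-pass operator; these operators, the step size, and all function names are assumptions chosen for the example, not the operators used in the paper.

```python
def downsample(f):
    """H: average neighbouring pairs (blur plus decimation by 2)."""
    return [(f[2 * i] + f[2 * i + 1]) / 2.0 for i in range(len(f) // 2)]

def downsample_adjoint(r):
    """H^T: spread each LR residual back over the pair it came from."""
    out = []
    for v in r:
        out.extend([v / 2.0, v / 2.0])
    return out

def highpass(f):
    """Q: first difference, large where the estimate is not smooth."""
    return [f[i + 1] - f[i] for i in range(len(f) - 1)]

def highpass_adjoint(d):
    """Q^T for the first-difference operator."""
    out = [0.0] * (len(d) + 1)
    for i, v in enumerate(d):
        out[i] -= v
        out[i + 1] += v
    return out

def cost(f, y, alpha):
    data = sum((a - b) ** 2 for a, b in zip(downsample(f), y))
    prior = sum(v * v for v in highpass(f))
    return data + alpha * prior

def map_estimate(y, alpha=0.1, step=0.2, iters=200):
    """Gradient descent on the assumed MAP cost."""
    f = [v for v in y for _ in range(2)]  # initial guess: pixel replication
    history = [cost(f, y, alpha)]
    for _ in range(iters):
        resid = [a - b for a, b in zip(downsample(f), y)]
        g_data = downsample_adjoint(resid)        # H^T (H f - y)
        g_prior = highpass_adjoint(highpass(f))   # Q^T Q f
        f = [fi - step * 2.0 * (gd + alpha * gp)
             for fi, gd, gp in zip(f, g_data, g_prior)]
        history.append(cost(f, y, alpha))
    return f, history

y_obs = [10.0, 10.0, 50.0, 50.0, 20.0, 20.0]
f_hat, history = map_estimate(y_obs)
print(history[0], history[-1])
```

A larger α weights the smoothness prior more heavily and gives a smoother estimate; a smaller α keeps the estimate closer to the observed LR data.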
Proposed super-resolution reconstruction algorithm
We can define the convex set and projection of the POCS algorithm in the discrete cosine transform (DCT) domain, because the block DCT of an image is a linear transformation from $\mathbb{R}^{N^2}$ to $\mathbb{R}^{N^2}$. The transform coefficients and image pixels are related as follows:

$$F = T f$$

In this formula, $F$ is the coefficient vector obtained by transforming $f$, and $T$ is the $N^2 \times N^2$ DCT transformation matrix.
For an M × N image and n × n blocks, T is a block diagonal matrix composed of many block matrices, each of size n² × n². T⁻¹ is the inverse DCT transformation. Because the DCT is orthogonal, T⁻¹ = T′, where T′ is the transpose of T. Because each quantization step and quantized value of the DCT coefficients F is known in the decoding process, the upper and lower bounds of each DCT coefficient of F can be fixed. To make use of the DCT coefficient information in the compressed video, we can define the convex set as follows:

In this formula, the two bounds are the upper and lower bounds, respectively, of the nth transform coefficient of F, as defined by the quantizer.
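A minimal sketch of such a quantization constraint set, and of one way to project onto it, is given here for a single one-dimensional block of eight samples. It assumes a uniform quantizer with step q, so that each coefficient is known to lie within half a step of its dequantized value; the block length, the step, and the function names are illustrative assumptions.

```python
import math

N = 8  # block length (a 1-D stand-in for an 8 x 8 block)

# Orthonormal DCT-II matrix T, so the inverse transform is the transpose of T.
T = [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]

def dct(f):
    return [sum(T[k][n] * f[n] for n in range(N)) for k in range(N)]

def idct(F):
    return [sum(T[k][n] * F[k] for k in range(N)) for n in range(N)]

def quantization_bounds(f_original, q):
    """Bounds on each coefficient implied by uniform quantization with step q."""
    levels = [round(c / q) for c in dct(f_original)]
    lower = [(level - 0.5) * q for level in levels]
    upper = [(level + 0.5) * q for level in levels]
    return lower, upper

def project_dct(f, lower, upper):
    """Project f onto the set: transform, clip each coefficient, invert."""
    clipped = [min(max(c, lo), hi) for c, lo, hi in zip(dct(f), lower, upper)]
    return idct(clipped)

original = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
lower, upper = quantization_bounds(original, q=10.0)

blurred = [sum(original) / N] * N  # an over-smoothed estimate of the block
projected = project_dct(blurred, lower, upper)
print([round(v, 2) for v in projected])
```

Because T is orthonormal, clipping the coefficients and inverting gives the closest block, in the least-squares sense, whose coefficients agree with the quantized data.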
Projecting an arbitrary vector f of $\mathbb{R}^{N^2}$ onto C with the projection operator P gives the following formula:

T′ is the inverse DCT transformation matrix, and F can be defined as:

The convex set $C_{DCT}$ and its projection operator reflect the DCT transformation and quantization information used during image compression, and thus the particular characteristics of compressed video. A further convex set, C1, is defined as:

E1 is the upper bound on the norm of the difference vector across the block boundary; C1 is therefore a closed convex set.
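To make the block-boundary constraint concrete, the sketch below assumes that C1 bounds the norm of the difference between the two pixel columns on either side of a block boundary by E1. Under that assumption the orthogonal projection has a simple closed form: the two columns are pulled toward each other until the bound is met, while their sum is left unchanged. The column values and function names are hypothetical.

```python
import math

def boundary_difference_norm(left_col, right_col):
    """Norm of the difference between the columns on each side of a boundary."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(left_col, right_col)))

def project_block_boundary(left_col, right_col, bound):
    """Project the column pair onto {norm of (left - right) <= bound}."""
    d = boundary_difference_norm(left_col, right_col)
    if d <= bound:
        # Already inside the set: the projection changes nothing.
        return list(left_col), list(right_col)
    w = 0.5 * (bound / d + 1.0)
    new_left = [w * a + (1.0 - w) * b for a, b in zip(left_col, right_col)]
    new_right = [(1.0 - w) * a + w * b for a, b in zip(left_col, right_col)]
    return new_left, new_right

# Hypothetical pixel columns on either side of a block boundary
left = [100.0, 102.0, 98.0, 101.0]
right = [120.0, 125.0, 118.0, 122.0]
E1 = 10.0

new_left, new_right = project_block_boundary(left, right, E1)
print(boundary_difference_norm(left, right),
      boundary_difference_norm(new_left, new_right))
```

Choosing E1 is the delicate part: a bound that is too small blurs genuine edges that happen to fall on a block boundary, while a bound that is too large leaves the blocking artifact in place.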
The convex constraint set for the block boundaries in the vertical direction may be defined in the same way:

Manual feature registration – maximum a posteriori – projection onto convex sets algorithm
The problem tackled by the SR reconstruction of compressed video is to process a group of compressed video sequences. The frames in a video sequence differ from one another; therefore, the information they provide about the HR image that we want to compose also differs.[5] To use these different pieces of information effectively, we have to register the frames into a single image. In this paper, we use the MFR method to achieve this.
The noise produced by the compressor makes the overall noise model of the imaging system more complex. It also induces the blocking effect and ringing artifacts, which degrade the visual quality.
To address these problems, we use the MFR-MAP-POCS algorithm, which is based on MFR and combines the MAP and POCS spatial SR algorithms. The MAP method ensures the uniqueness of the solution as well as convergence stability, while the POCS method preserves image edge details.[6]
To reconstruct an HR image, we must create an emulated compression model through which we can study SR. The characteristics of the video system were studied and then added to the algorithms as prior knowledge, as discussed in this paper. Finally, the following model is constructed:

In this model, one term is the motion compensation of the first frame. $T_{DCT}$ and $T_{DCT}^{-1}$ are the DCT and the inverse DCT, respectively; $Q[\cdot]$ is the quantization, $C(V_{l,i})$ is the prediction process, and $V_{l,i}$ is the motion vector used in coding.
The MAP iterative algorithm needs two probability distributions. The distribution $p(Y \mid f_k)$ represents the distribution of the noise, and $p(f_k)$ is the prior information on the HR image, which is usually assumed to follow a Gibbs distribution:

$$p(f_k) = \frac{1}{Z} \exp\left(-\frac{U(f_k)}{T}\right)$$

In this formula, T is a constant equal to 1, Z is the normalization factor, which can be omitted, and $U(f_k)$ is the energy function of the Gibbs distribution, defined as follows:

$C_i$ is the operator that translates the image by one pixel in a given direction.
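One common energy of this kind sums the squared differences between the image and its one-pixel translations. The sketch below assumes that form, with four translation directions and a quadratic penalty; the paper's exact energy function may differ, and the function names and test images are hypothetical.

```python
# One-pixel translations: right, down, left, up (an assumed set of directions)
SHIFTS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def gibbs_energy(img):
    """U(f): sum of squared differences between f and its shifted copies."""
    rows, cols = len(img), len(img[0])
    energy = 0.0
    for dr, dc in SHIFTS:
        for r in range(rows):
            for c in range(cols):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    energy += (img[r][c] - img[rr][cc]) ** 2
    return energy

def log_prior(img, temperature=1.0):
    """log p(f) up to the constant -log Z, which does not affect the maximum."""
    return -gibbs_energy(img) / temperature

flat = [[5.0] * 4 for _ in range(4)]
edge = [[0.0, 0.0, 9.0, 9.0] for _ in range(4)]
checker = [[9.0 * ((r + c) % 2) for c in range(4)] for r in range(4)]

for name, img in (("flat", flat), ("edge", edge), ("checker", checker)):
    print(name, gibbs_energy(img), log_prior(img))
```

A flat image has zero energy and the highest prior probability, a single clean edge costs little, and a noisy checkerboard pattern is heavily penalized, which is how this prior suppresses compression noise while tolerating some edges.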
Next, the POCS constraint sets are defined. The convex sets C1 and C2 address the blocking effect; therefore, we choose them as the convex constraint sets for the POCS solution.
Finally, we arrived at a comprehensive formula that gives the estimating equations of the HR image:

In this formula, the term β is used to control the proportion of conditional and a priori probability.
Using the iterative method, we get the following formula:

Here, the first quantity is the estimate obtained at the nth iteration and the second is the MAP estimate computed from it. β and α control not only the proportion of the conditional and a priori probabilities but also the convergence of the algorithm. The compensation of the motion vector is represented by CT (d1,k). The MAP estimate is then projected onto the convex sets, and the following formula is obtained:

In this formula, P1 and P2 are the projection operators of the convex sets C1 and C2, and P3 is the projection operator of the convex set for the registration operator. We can then get the estimated value of the HR image at each iteration.
Iterating in this way, the estimate converges to the desired result.
Finally, we can describe the MFR-MAP-POCS algorithm as follows:
- Initialization: we get the quantization matrix and then the covariance matrix $K_{Q,1}$ of the spatial quantization noise. A similar facial image is fixed as the initial first frame. The iteration parameters α and β are initialized
- Image registration: from this step, we get R, the registration operator
- MAP: impose the LR images on the HR image, fixing the pixel regions according to the R obtained in step 2. Then, we iterate over all LR images to update the HR image. In the end, we get the MAP estimate of the HR image
- POCS: calculate the pixel difference of adjacent columns; then, update it according to prior knowledge
- Repeat steps 2, 3, and 4 until the convergence conditions are met or the maximum number of iterations is reached.
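The alternation of the MAP and POCS steps with a stopping rule can be sketched on a one-dimensional toy problem. The sketch assumes that registration has already been done (the identity), that the degradation averages neighboring pixel pairs, and that the POCS set keeps each pair mean within a tolerance of the observed LR value, in analogy with a quantization interval. All operators, parameter values, and names are assumptions made for illustration.

```python
def pair_means(f):
    """Assumed degradation: average neighbouring pixel pairs."""
    return [(f[2 * i] + f[2 * i + 1]) / 2.0 for i in range(len(f) // 2)]

def map_step(f, y, alpha, step):
    """One gradient step on ||H f - y||^2 + alpha * ||Q f||^2."""
    grad = [0.0] * len(f)
    for i, (m, yi) in enumerate(zip(pair_means(f), y)):
        grad[2 * i] += m - yi        # data term: 2 * H^T (H f - y)
        grad[2 * i + 1] += m - yi
    for i in range(len(f) - 1):
        d = f[i + 1] - f[i]          # prior term: 2 * alpha * Q^T Q f
        grad[i] -= 2.0 * alpha * d
        grad[i + 1] += 2.0 * alpha * d
    return [fi - step * g for fi, g in zip(f, grad)]

def pocs_step(f, y, half_width):
    """Project onto {f : each pair mean lies within half_width of y_i}."""
    out = list(f)
    for i, (m, yi) in enumerate(zip(pair_means(f), y)):
        target = min(max(m, yi - half_width), yi + half_width)
        out[2 * i] += target - m
        out[2 * i + 1] += target - m
    return out

def reconstruct(y, alpha=0.5, step=0.2, half_width=1.0, tol=1e-6, max_iters=500):
    f = [v for v in y for _ in range(2)]          # initial HR estimate
    n = 0
    for n in range(1, max_iters + 1):
        f_map = map_step(f, y, alpha, step)       # MAP update
        f_new = pocs_step(f_map, y, half_width)   # POCS projection
        change = max(abs(a - b) for a, b in zip(f_new, f))
        f = f_new
        if change < tol:                          # convergence condition
            break
    return f, n

y_lr = [10.0, 50.0, 20.0, 40.0]
f_hr, iterations = reconstruct(y_lr)
print(iterations, [round(v, 2) for v in f_hr])
```

The MAP step smooths the estimate and drives convergence, and the projection that follows it keeps the estimate consistent with the observed data, which mirrors the division of labor between MAP and POCS described above.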
Experiment
First, we find a facial image that is similar to the original facial image. Then, the feature points are marked manually, as shown in [Figure 2].
The facial image in the HR template is shown on the left in [Figure 2]. The facial image on the right is acquired from a single frame of the surveillance video.
Next, we use prior knowledge, such as the image registration, to conduct the MAP iterative operation. The MAP method controls the blocking effect and the convergence of the algorithm. Then, we apply the POCS method and the MAP method alternately. Thus, we not only control convergence but also obtain good image edges. The resulting image is shown in [Figure 3].
The original facial image is shown on the left, and the result of applying the MFR-MAP-POCS algorithm is shown on the right in [Figure 3].
Conclusion
The proposition made in this paper was confirmed by experimental results. It was shown that the MAP algorithm can eliminate the blocking effect and ringing artifacts that reduce the visual quality of surveillance video. MAP can additionally produce smoother image detail. The POCS algorithm can make the edges and details of the surveillance video image clearer, but it also strengthens the ringing effect. If we combine the MAP and POCS algorithms, we can get a clearer image.
Surveillance video footage containing clues for forensic investigation is usually highly compressed. Because the intention is to improve video quality for facial recognition purposes, the prior knowledge of the surveillance video system should be studied and every pixel present in the video frames should be used as fully as possible. The approach discussed in this paper is certainly time-consuming; however, it was shown to produce a clearer image. To ensure that the desired result is achieved, we can use MFR to supply prior knowledge of spatial position.
By discussing the MFR-MAP-POCS algorithm, we arrive at the following conclusion: the poor quality of highly compressed surveillance video footage can be improved by implementing the MFR-MAP-POCS algorithm. This method can control the blocking effect as well as the convergence and can also improve image edges by making the best possible use of the information available in the video.
Alternatively, the MAP part of the algorithm can be applied alone to achieve facial recognition from poor-quality, highly compressed surveillance video footage. To be successful, however, the video image should be smooth enough for facial features to be perceptible. If we sharpen it too much, we will get an edge map instead of a facial image.
Declaration of patient consent
The authors certify that they have obtained appropriate permission of animal experiments and patient consent forms. In the form the patient(s) has/have given his/her/their consent for his/her/their images and other clinical information to be reported in the journal. The patients understand that their names and initials will not be published and due efforts will be made to conceal their identity, but anonymity cannot be guaranteed.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References
1. Qian L, Qian N, Wenhao FU, Liang H, Zhuang H, Manjing WU, et al. The status quo investigation and the solution to video monitoring. Comput Knowledge Technol 2015;11:167-70.
2. Zhong-Qiang X, Xiu-Chang Z. Super-resolution reconstruction technology for compressed video. J Electron Inf Technol 2007;29:499-505.
3. Yifei H, Qimin Y, Yixiong Z. MAP-based of super-resolution reconstruction algorithm. Video Eng 2014;38:20-25.
4. Xiao C, Yu J, Su K. Gibbs artifact reduction for POCS super-resolution image reconstruction. Front Comput Sci China 2008;2:87-93.
5. Chen G, Li S. Reconstruction of super-resolution image using MAP and POCS algorithms. Sci Technol Eng 2006;6:396-99.
6. Shao LT, Chen C, Hong-Gang Z, Guo-Hua Z. An Improved Hybrid MAP-POCS Algorithm for Super-Resolution Image Restoration Research. Electron Opt Control 2015;22:41-5.