DSP MATLAB练习

HW1.1 Image Manipulation

A1.1. Load image and determine information of this image.


xxxxxxxxxx
im=imread('pears.jpg');
sz=size(im);
datatype=class(im); % uint8, unsigned integer

im_min=min(im(:));
im_max=max(im(:));
im_stddev=std(double(im(:)));
imhist(im);

A1.2. Convert the data type of image to `doubleimd` $[0,1]$ .

There is no difference when displaying im and imd using imshow().
Because imshow() automatically recognizes the type of image and devises the correct range based on that for display.

$2$ pixel wide black border without changing the size, save the new image to `im_framed`.


xxxxxxxxxx
color_black=0;
im_framed=im;
% rows of matrix / width of image
im_framed([1:2,end-1:end],:)=color_black;
% columns of matrix / height of image
im_framed(:,[1:2,end-1:end])=color_black;

$0.01$ $\sigma=0.005$ to the image. How many pixels do you expect to lie outside the toleranc range?

Given

\begin{matrix} (1) & e r f c (n) = 1 - e r f (n) = \frac{2}{\sqrt{π}} \int_{n}^{\infty} \exp (- t^{2}) d t \end{matrix}

We use

\begin{matrix} (2) & n = \frac{x - μ}{σ \sqrt{2}} = \frac{0.01}{σ \sqrt{2}} \end{matrix}

C1.2. Copy each second row and each third column of `im_framed`, save it to `im_part`.


xxxxxxxxxx
im_part=im_framed(1:2:end,1:3:end);

A1.4. Draw a red frame around the image using `plot()` function.

plot() does not change the image data, and is not constrained to integer pixel positions.

C1.3. Use Kronecker product to create colorized copies of grayscale image `im`

\begin{matrix} (3) & i m k_{i} = c_{i} \otimes i m \end{matrix}

Calculate each color channel separately using

\begin{matrix} (4) & \begin{matrix} C_{R} = [\begin{matrix} 1 & 0 & 1 \\ 0.3 & 1 & 0 \\ 0 & 0.3 & 1 \end{matrix}], C_{G} = [\begin{matrix} 0 & 1 & 1 \\ 0.3 & 0 & 1 \\ 0.7 & 0.3 & 0 \end{matrix}], C_{R} = [\begin{matrix} 0 & 0 & 0 \\ 1 & 1 & 0 \\ 0.7 & 1 & 0 \end{matrix}] \end{matrix} \end{matrix}

Merge these channels to a color image.


xxxxxxxxxx
sz=size(im);
im_R=zeros(3*sz);
im_G=zeros(3*sz);
im_B=zeros(3*sz);

for row=1:3
for col=1:3
    im_R(1+(row-1)*sz(1):row*sz(1),...
            1+(col-1)*sz(2):col*sz(2))=cR(row,col)*im;
    im_G(1+(row-1)*sz(1):row*sz(1),...
            1+(col-1)*sz(2):col*sz(2))=cG(row,col)*im;
    im_B(1+(row-1)*sz(1):row*sz(1),...
            1+(col-1)*sz(2):col*sz(2))=cB(row,col)*im;
end
end

im_rgb=zeros([3*sz 3]);
im_rgb(:,:,1)=im_R;
im_rgb(:,:,2)=im_G;
im_rgb(:,:,3)=im_B;

HW1.2. Periodic Sequences

C2.1. Compute atomic image elements from `im` specified by the periodicity matrix `pmloc_base` $(x,y)$ .

$x$ column/second $y$ in an image corresponds to the row/first index in a matrix.


xxxxxxxxxx
function [atomic,mask]=get_atomic(im,pm,loc_base)
% Define polygon
% 5 points determine a Parallelogramm
poly=zeros(2,5);
poly(:,1)=loc_base;
poly(:,2)=poly(:,1)+pm(:,1);
poly(:,3)=poly(:,1)+pm(:,2);
poly(:,4)=poly(:,1)-pm(:,1);
poly(:,5)=poly(:,1)-pm(:,2);

% Create mask for atomic element (this procedure is given)
poly=poly+repmat([0;0.4]-min(poly,[],2),1,5);
sz_mask=round(max(poly,[],2));
mask=poly2mask(poly(1,:),poly(2,:),sz_mask(2),sz_mask(1));

% Create atomic element
atomic=im(loc_base(2):loc_base(2)+sz_mask(2)-1,...
            loc_base(1):loc_base(1)+sz_mask(1)-1);
atomic=atomic.*mask;
end

C2.2. Implement an inverse operation: generate a periodic image given atomic element.?

C2.3. Calculate NCC between two equal-sized periodic images.

\begin{matrix} (5) & n c c (i_{1}, i_{2}) = \frac{1}{N} \sum_{x, y} \frac{(i_{1} (x, y) - \bar{i_{1}}) (i_{2} (x, y) - \bar{i_{2}})}{σ_{i_{1}} σ_{i_{2}}} \end{matrix}

$N$ is number of elements.


xxxxxxxxxx
sz=size(i1);
i1_nomi=(i1(:)-mean(i1(:)))/std(i1(:));
i2_nomi=(i2(:)-mean(i2(:)))/std(i2(:));
ncc=i1_nomi.*i2_nomi/(sz(1)*sz(2));
ncc=sum(ncc(:));

C2.4. Find the atomic element of a given periodic image using exhausive search and correlation.?

A2.1. Why do you observe several maxima in the NCC values of the whole image?

Because the considered image has partial repetitive parts, which can be matched to the whole.

HW2.1. Image Sampling

$N\times N$ $N=400$ $(1,1)$ $[1,512]$ $x$ $y$ .


xxxxxxxxxx
sz=512;
[gridX,gridY]=meshgrid(linspace(1,sz,N));

C1.2. Given gray scale image `im`, use `interp2()` to find image intensity at non-integer coordinates `(gridX,gridY)`. Try different interpolation options.


xxxxxxxxxx
function [im_linear,im_cubic]=interp(im,gridX,gridY)
sz=size(im,1);
% given image points
[imX,imY]=meshgrid(1:sz);
% interpolation
% imX,imY: sample points
% im: function values at sample points
% gridX, gridY: query points
im_linear=interp2(imX,imY,im,gridX,gridY,'linear');
im_cubic=interp2(imX,imY,im,gridX,gridY,'cubic');
end

A1.1. Describe how above images differ and compare the runtime.

Image interpolated using cubic can better preserve edges and other fine details, but takes more time.
linear interpolation loses some details but runs faster. F=TriScatteredInterp(gridX(:),gridY(:),im_scatter(:),'linear');

C1.3. Given image intensity at non-integer coordinates `(gridX,gridY)`, reconstruct image with size `sz` from `im_scatter` using `TriScatteredInterp()`.


xxxxxxxxxx
function im_rec=reconstruct(im_scatter,sz,gridX,gridY)
% Reconstruct function
% gridX(:),gridY(:): sample points
% im_scatter(:): function values at sample points
F=TriScatteredInterp(gridX(:),gridY(:),im_scatter(:),'linear');
% query image points
[imX,imY]=meshgrid(1:sz);
im_rec=F(imX,imY);
end

$(1,1)$ $d_x$ $d_y$ $x\in[1,511]$ $y\in[1,512]$ , given

\begin{matrix} (6) & d_{x} = \frac{511 - 1}{372 - 1}, d_{y} = \sqrt{3} \cdot 0.5 \cdot d x \end{matrix}

meshgrid() $d_x$ $d_y$
$0.5d_x$ to obtain the hexagonal grid


xxxxxxxxxx
sz=512;
dx=(sz-1-1)/(372-1);
gridX=1:dx:sz-1;
dy=sqrt(3)*0.5*dx;
gridY=1:dy:sz;
% 1. meshgrid
[hexaX,hexaY]=meshgrid(gridX,gridY);
% 2. shift rows
hexaX(2:2:end,:)=hexaX(2:2:end,:)+0.5*dx;

C1.5. Interpolate image `im` using `interp2()` at hexagonal sampling positions above, then reconstruct the image using `TriScatteredInterp()`.


xxxxxxxxxx
function im_rec=inter_recon(im,hexaX,hexaY)
sz=size(im,1);
% 1. Interpolation
[imX,imY]=meshgrid(1:sz);
im_interp=interp2(imX,imY,im,hexaX,hexaY,'linear');
% 2. Reconstruction
F=TriScatteredInterp(hexaX(:),hexaY(:),im_interp(:),'linear');
im_rec=F(imX,imY);

$5$ pixels to neglect border effect.

\begin{matrix} (7) & \begin{matrix} {PSNR}_{[dB]} = 10 \log_{10} \frac{1}{MSE} \\ MSE = \frac{1}{N} \sum_{x, y} (i_{1} (x, y) - i_{2} (x, y))^{2} \end{matrix} \end{matrix}


xxxxxxxxxx
function psnr_diff=psnr_diff(i1,i2)
i_diff=i1-i2;
border=5;
mse=i_diff(border+1;end-border,border+1:end-border).^2;
psnr_diff=10*log10(1/mse);
end

A1.2. Which image has better/higher PSNR?

Hexagonal sampling has better PSNR as it exhibits a better spectral efficiency.

HW2.2. Zone Plate

$N\times N$ image matrix whose entries represent the Euclidean distance in pixel unit to the top-left pixel. Then calculate zone plate image using

\begin{matrix} (8) & I_{z p} (x, y) = \frac{1}{2} + \frac{1}{2} \cos (π k d (x, y)^{2}) \end{matrix}

$k$ $\pi$ .

\begin{matrix} (9) & \begin{matrix} π k (\sqrt{2} (N - 1))^{2} - π k (\sqrt{2} (N - 2))^{2} = π k (4 N - 6) \overset{!}{=} π \\ \Rightarrow k = \frac{1}{4 N - 6} \end{matrix} \end{matrix}


xxxxxxxxxx
function [D,k,Izp]=zone_plate(N)
[x, y] = meshgrid(1:N, 1:N);
D = sqrt((x-1).^2 + (y-1).^2);
k = 1/(4*N-6);
Izp = 1/2+cos(D.^2*pi*k)/2;
end

`Izp` $2:1$ $4:1$ , what effects do you observe?

Higher frequency components are projected to lower frequency after subsampling.
Images after subsampling have low frquency components on three corners, which are similar to the top-left corner.

HW2.3. Image Filters

A3.1. Analyze different filters, plot the filter using `surf()` and analyze its frequency response using `freqz2()`.

Gaussian: low pass
Smaller Gaussian filters in space have a wider frequency spectrum,
Laplacian: high pass
Prewitt/Laplacian of Gaussian: high pass, and reduces sensitivity to noise.

C3.1. Find out how `freqz2()` generates its plot and implement it manually using `fft2()`.


xxxxxxxxxx
filter=fspecial('prewitt');
N=64; % default for freqz2()
freq_resp=fft2(filter,N,N);
freq_plot=abs(fftshift(freq_resp));

$(16\times 16,\sigma=3)$ and `replicate` border processing.


xxxxxxxxxx
function im_filtered=filter_im(im)
filter=fspecial('gaussian',16,3);
im_filtered=imfilter(im,filter,'replicate');
end

A3.2. ….

HW3.2. Separable Block Transforms

C2.1. Implement function `block_splitter`border blocks ${\boxed{\circ \underline\triangle\circ}}$ .


xxxxxxxxxx
function im_out=block_splitter(im,block_sz,block_func,block_arg)
% 1. Pad the input image based on block_sz=16
sz=size(im); % [486,732]
pad=block_sz-mod(sz,block_sz); % [16,16]-[6,12]=[10,4]
pad(pad==block_sz)=0 % if mod=0, no padding needed
im=padarray(im,pad,'replicate','post');
sz=sz+pad; % [486,732]+[10,4]=[496,736]

% 2. Loop over blocks and apply block_func
for x=1:block_sz:sz(2)-block_sz+1
    for y=1:block_sz:sz(1)-block_sz+1
        block_in=im(y:y+block_sz-1,x:x+block_sz-1);
        block_out=block_func(block_in,block_arg);
        im_out(y:y+block_sz-1,x:x+block_sz-1)=block_out;
    end
end

% 3. Cut away padding
im_out(end-pad(1)+1:end,:)=[]; % assign an empty matrix
im_out(:,end-pad(2)+1:end)=[];
end

C2.2. Write a simple `block_func` that flips the image block left to right and decreases the brightness by substracting 0.3 from each pixel.


xxxxxxxxxx
block_out=fliplr(block_in)-0.3;
block_out(block_out<0)=0;

A2.1. Above operation ensures that pixel value is non-negative, so it's nonlinear.

C2.3. Apply given Haar transform matrix to the image.


xxxxxxxxxx
fcnHandleBlockSplitter=@block_splitter;;
block_func=@(block_in,A) A*block_in*A'; % create a function
haar_im=fcnHandleBlockSplitter(im,block_sz,...
            block_func,haar_4);

C2.4. Reorder the transformed image to make it meaningful for human.


xxxxxxxxxx
sz=size(im);
im_reorder=zeros(sz);
rows=1:block_sz:sz(1);
% or rows=1:block_sz:sz(1)-block_sz+1;
cols=1:block_sz:sz(2);
% or cols=1:block_sz:sz(2)-block_sz+1;
lr=length(rows);
lc=length(cols);

for y=0:block_sz-1
    for x=0:block_sz-1
        block=im(rows+y,cols+x);
        im_reorder(1+y*lr:(1+y)*lr,1+x*lc:(1+x)*lc)=block;
    end
end

C2.5. Implement `calculate_statis` to calculate the mean and variance for each coefficient over all blocks.


xxxxxxxxxx
function [im_mean, im_var]=calculate_statis(haar_im, block_sz)

%%%%%%%%%%%%%%%%%%
% After padding: %
%   sz=sz+pad    %
%%%%%%%%%%%%%%%%%%

im_out=zeros(size(im));
all_blocks=zeros(block_sz,block_sz,sz(1)*sz(2)/block_sz^2);

% Loop over blocks
ibl=0;
for x=1:block_sz:sz(2)-block_sz+1
    for y=1:block_sz:sz(1)-block_sz+1
        ibl=ibl+1;
        block=im(y:y+block_sz-1,x:x+block_sz-1);
        all_blocks(:,:,ibl)=block;
    end
end

% Calculate statistics
im_mean=mean(all_blocks,3);
im_var=var(all_blocks,[],3); % []:no weights assigned
end

A2.2. What can be observed by analyzing `im_var`?

For natural image, variances of block elements are similar to each other, also similar to the global statistics. This may change once a transformation is applied.
After Haar transformation, variances decay considerably from (0,0) coefficient into higher coefficients such as (block_sz,block_sz), which means it packs most energy/variance into low-order coefficients.

$[0.25,0.5,0.25]$ to the image horizonally and vertically.


xxxxxxxxxx
blur_A=0.5*diag(ones(1,block_sz),0)+...
        0.25*diag(ones(1,block_sz-1),-1)+...
        0.25*diag(ones(1,block_sz-1),1);
block_func=@(block_in,A) A*block_in*A';
blur_im=fcnHandleBlockSplitter(im,block_sz,block_func,blur_A);

$AXA^T$ along the block borders.

Image is blurred, i.e., low-pass filtered so that fine details cannot be observed anymore.
$AXA^T$ does not have access to neighbor pixels outside the block, so border effect such as dark frame is observed around blocks, thus not "smooth".
In order to mitigate the border effect, one could use overlapping blocks at the cost of performance.

HW4.1. Discrete Cosine Transform

C1.1. Transform images to DCT domain using block processing.


xxxxxxxxxx
block_sz=size(dct_matrix,1);
block_func=@(block_in,A) A*block_in*A';
dct_im=fcnHandleBlkSplitter(im,block_sz,block_func,dct_matrix);

A1.1. What can be observed by analyzing `im_mean` and `im_var` of all blocks?

The mean of the DC-coefficient is proportional to the mean of the untransformed image in the respective area.
The variance of the DC coefficient is the largest, and it decreases with increasing spatial frequency.
Variances in horizontal directions exhibit slightly higher amplitude compared to the vertical direction.
AC-coefficients exhibit zero mean.

$N$ coefficients of each block, others are set to zero. Return in image in spatial domain resulting from compression.


xxxxxxxxxx
% Remove all but the first N coefficients
d=zeros(block_sz,1);
d(1:sqrt(N))=1;
comp_matrix=diag(d);
block_func=@(block_in,A) A*block_in*A';

dct_im_comp=fcnHandleBlkSplitter(dct_im,block_sz,...
                block_func,comp_matrix);

% Back-transform the compressed image
dct_im_comp_rec=fcnHandleBlkSplitter(dct_im_comp,block_sz,...
                block_func,dct_image');

$(N=9,16,64)$ ?

$9$ coefficients are kept, one can observe significant block effects in image regions with much details, such as hair and eyes.
$64$ coefficients are kept, one observes only minor block effects.
Different image regions require different numbers of coefficients to look well for human.

$3$ dimensions. Implement `apply_3ddct` to a block along each dimension subsequently to get the 3D transformed DCT image.


xxxxxxxxxx
block_dct=zeros(size(block_in));

% 1. vertical columns
for i=1:size(block_in,3) 
    for j=1:size(block_in,2)
        vert(:,j,i)=dct_mat*block_in(:,j,i);
    end
end

% 2. horizontal rows
for i=1:size(block_in,3) 
    for j=1:size(block_in,1)
        hor(j,:,i)=dct_mat*vert(j,:,i)'; % transpose!
    end
end

% 3. out-of-plane rotations
for i=1:size(block_in,1) 
    for j=1:size(block_in,2)
        vec=squeeze(hor(i,j,end:-1:1)); % out-of-plane
        block_dct(i,j,end:-1:1)=dct_mat*vec;
    end
end

C1.4. Apply above 3D DCT to any 3D sequence `video`. Ignore the padding. Return the DC and high-frequency components of the resulting sequence.


xxxxxxxxxx
fcnHandle3DDCT=@apply_3ddct;
sz=size(video);

for i=1:block_sz:sz(1)-block_sz+1
for j=1:block_sz:sz(2)-block_sz+1
for k=1:block_sz:sz(3)-block_sz+1
    tmp=video(i:i+block_sz-1,j:j+block_sz-1,k:k+block_sz-1);
    block_dct=fcnHandle3DDCT(tmp,dct_mat);
    video_out(i:i+block_sz-1,j:j+block_sz-1,k:k+block_sz-1)=block_dct;
end
end
end

video_dc=video_out(1:block_sz:end,1:block_sz:end,2:block_sz:end);
video_hf=video_out(2:block_sz:end,2:block_sz:end,1:block_sz:end);

A1.3. Which DCT component of the 3D DCT transformed sequence contains the most energy? What is the best strategy for video compression?

The most energy is stored in DC component.
The energy decreases with increasing coefficient positions.
For sequences with dynamic content but with ares of same color, it is best to preserve more DCT coefficients corresponding to time dimension.
For sequences with static content but with high level of spatial detail, it is best to preserve the coefficients corresponding to the spatial domain.

HW4.2. Karhunen-Loeve-Transform

$32\times 32$ $0.1$ to the vector elements `[529:544,272:32:784]` and reshapes the result back to a block. Perform above block processor to an image.


xxxxxxxxxx
add=zeros(N^2,1);
add([529:544,272:32:784])=0.1;
block_func=@(block,add) reshape(block(:)+add,N,N);
im_reshape=block_splitter(im,N,block_func,add);

$8\times 8$ blocks using following steps:

$8\times 8$ and save results to blocks
Calculate the autocorrelation matrix for those blocks and save results to C_im
Perform EVD of C_im
Sort the resulting eigenvalues with descending order and correspondingly the eigenvectors
$8\times 8$ $64\times 64$ image in column major order


xxxxxxxxxx
N=8;
sz=size(im);

% 1. Restructure im into blocks
ibl=0;
for x=1:N:sz(2)-N+1
    for y=1:N:sz(1)-N+1
        ibl=ibl+1;
        block=im(y:y+N-1,x:x+N-1);
        blocks(ibl,:)=block(:);
        % size(blocks,1)=...
        %   length(1:N:sz(2)-N+1)*length(1:N:sz(1)-N+1)
        % size(blocks,2)=64
    end
end

% 2. Compute autocorrelation over blocks, i.e., rows of blocks
% C_XX=E[XX^H]
C_im=blocks'*blocks/size(blocks,1);

% 3. EVD of C_im
[evec,eval]=eig(C_im);
[eval,eval_idx]=sort(diag(eval),'descend');
% eval is sorted vector
% eval_idx is the permutation
klt_base_v=evec(:,eval_idx);

% 4. Reshape in column major order
idx=0;
for x=0:N-1
    for y=0:N-1
        idx=idx+1;
        klt_base(1+y*N,(1+y)*N,1+x*N:(1+x)*N)=...
                      reshape(klt_base_v(:,idx),N,N);
    end
end

A2.1. Compare KLT basis images when using different images for training.

KLT is the most efficient transform in terms of decorrelation and energy packing.
But it depends on image statistics. As shown above, autocorrelation of a specific image is calculated to derive KLT basis images.
For different images with different statistics, the resulting KLT basis images would be different.

C2.3. Project an image to the KLT basis above. Compare the back-transformed image with the original one and calculte the PSNR.


xxxxxxxxxx
% Projection is obtained by inner product
block_func=@(block_in,base) reshape(base'*block_in(:),N,N);
% Project the image
im_proj=block_splitter(im,N,block_func,klt_base_v);
% Back project, using base*base'=I
im_back_proj=block_splitter(im_proj,N,block_func,klt_base_v');
% Compute PSNR
im_diff=(im_back_proj-im).^2;
klt_psnr=10*log10(1/mean(im_diff(:)));

C2.4. Calculate the autocorrelation matrix of all blocks from KLT transformed image. (same as C2.2.)


xxxxxxxxxx
sz=size(im_proj);
% 1. Restructure im into blocks
ibl=0;
for x=1:N:sz(2)-N+1
    for y=1:N:sz(1)-N+1
        ibl=ibl+1;
        block=im(y:y+N-1,x:x+N-1);
        blocks(ibl,:)=block(:);
        % size(blocks,1)=...
        %   length(1:N:sz(2)-N+1)*length(1:N:sz(1)-N+1)
        % size(blocks,2)=64
    end
end

% 2. Compute autocorrelation over blocks, i.e., rows of blocks
% C_XX=E[XX^H]
autocorr=blocks'*blocks/size(blocks,1);

A2.2. How should the autocorrelation matrix of KLT blocks look like?

When KLT learned from Lenna image is applied to Lenna image, it allows for perfect decorrelation (except for numerical errors).
Thus, the autocorrelation matrix exhibits only elements on its diagonal and zero elsewhere.
If this KLT is applied to other images, the diagonal of the autocorrelation matrix still dominates, but there are significant non-diagonal elements.
KLT is not suitable as a generic compression method.

A2.3. Calculate the statistics (mean and variance) of above KLT transformed image.

KLT of Lenna on its own base exhibits a monotomic decrease of the variance coefficients.
For other images, the energy compaction is not optimal, there is a noisy behavior in the variance plot.

A2.4. Recall the variance plot for Haar transform, plot the variance of DCT and KLT, what do you observe?

The KLT of Lenna on its own base has a monotonic decrease and higher energy compaction efficiency than DCT and Haar.

HW5.1. Autoregressive Models

`I` $11\times 11$ for each pixel as follows:

\begin{matrix} (10) & C_{I I} (k, l) = E [I (x, y) \cdot I (x + k, y + l)] with - 5 \leq k, l \leq 5 \end{matrix}

Don't forget to substract the mean value of the image.
$5$ pixel border.
$\sigma_x^2$ $1$ pixel horizontal and vertical displacement.


xxxxxxxxxx
sz=size(im);
border=5;
y=1+border:sz(1)-border;
x=1+border:sz(2)-border;

imm=im-mean(im(:));
for t_y=-border:border
    for t_x=-border:border
        data_mat=imm(y,x).*imm(t_y+y,t_x+x);
        acov(t_y+border+1,t_x+border+1)=mean(data_mat(:));
    end
end

var=acov(border+1,border+1);
cov_h=acov(border+1,border+1+1);
cov_v=acov(border+1+1,border+1);

A1.1. Discuss above values and explain the structure of the autocovariance matrix.

The center of autocovariance matrix is the average energy of pixels.
Since we measure all pixels, the autocovariance matrix should be point symmetrix around its center point.
(With increasing distance between pixels, the autocovariance drops almost linearly. A slightly higher covariance is observed in vertical direction)

$AR(1)$ process can be used to generate a 2D signal, and has autocovariance

\begin{matrix} (11) & C_{x x} (k, l) = σ_{x}^{2} \cdot ρ_{h}^{| k |} \cdot ρ_{v}^{| l |}, σ_{x}^{2} = \frac{σ_{z}^{2}}{(1 - ρ_{h}^{2}) (1 - ρ_{v}^{2})} \end{matrix}

The model is implemented by a casual recursive filter according to

\begin{matrix} (12) & x (x, y) = ρ_{h} x (x - 1, y) + ρ_{v} x (x, y - 1) - ρ_{h} ρ_{v} x (x - 1, y - 1) + n_{z} (x, y) \end{matrix}

$n_z$ $\sigma_z$ .

$512\times 512$ `im_out` $100$ $\rho_h$ $C_{xx,10}/C_{xx,00}=C_{II,67}/C_{II,66}=$ `cov_h`/`var`.


xxxxxxxxxx
sz=512;
border=100;
rh=cov_h/var;
rv=cov_v/var;
std_z=sqrt(var*(1-rh^2)*(1-rv^2));

im=zeros(border+sz,border+sz);

for i=2:border+sz
for j=2:border+sz
    im(i,j)=rh*im(i,j-1)+rv*im(i-1,j)-rv*rh*im(i-1,j-1)+std_z*randn(1);
end
end

im_out=im(border+1:border+sz,border+1:border+sz);

Compute the autocovariance of above generated image, compare it to the autocovariance obtained by the equation in C1.2.

$AR(1)$ process produces an autocovariance which is piecewise planar in four quadrants.
$k=0$ $l=0$ .
There is a constant offset between the measured autocovairance and expected autocovariance.
This error can be minimized using a larger startup border.

$AR(1)$ process with parameters measured from this image look like?

The resulting autocovariance matrix exhibits a significantly higher correlation in vertical direction, due to the dominant vartical structures in the image.
$AR(1)$ image created with corresponding parameters will exhibit vertically dominating structures.

HW5.2. Pyramid Representations

$3\times 3$ approximation of Gaussian lowpass filter is given

\begin{matrix} (13) & \begin{matrix} F = \frac{1}{8} [\begin{matrix} 0 & 1 & 0 \\ 1 & 4 & 1 \\ 0 & 1 & 0 \end{matrix}] \end{matrix} \end{matrix}

$3\times 3$ reconstruction filter `G` for 1:2 upsampling using linear interpolation.

First, zeros are filled between pixels

\begin{matrix} (14) & [\begin{matrix} a & b \\ c & d \end{matrix}] \overset{upsampling}{⟶} [\begin{matrix} 0 & a & 0 & b & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & c & 0 & d & 0 \\ 0 & 0 & 0 & 0 & 0 \end{matrix}] \end{matrix}

The reconstructed matrix should be

\begin{matrix} (15) & [\begin{matrix} 0 & a & \frac{a + b}{2} & b & 0 \\ 0 & \frac{a + c}{2} & \frac{a + b + c + d}{4} & \frac{b + d}{2} & 0 \\ 0 & c & \frac{c + d}{2} & d & 0 \\ 0 & 0 & 0 & 0 & 0 \end{matrix}] \end{matrix}

We begin from the block at top-left corner, this determines two entries of the filter

\begin{matrix} (16) & [\begin{matrix} 0 & a & 0 \\ 0 & 0 & 0 \\ 0 & c & 0 \end{matrix}] * \frac{1}{4} [\begin{matrix} 0 & 2 & 0 \\ 0 & 0 & 0 \\ 0 & 2 & 0 \end{matrix}] = [\begin{matrix} 0 & a & 0 \\ 0 & \frac{a + c}{2} & 0 \\ 0 & c & 0 \end{matrix}] \end{matrix}

Continuing with this technique, we get the filter

\begin{matrix} (17) & \begin{matrix} G = \frac{1}{4} [\begin{matrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{matrix}] \end{matrix} \end{matrix}

`im` $5$ $1$ $512\times 512$ $5$ $32\times 32$ $2:1$ horizontally and vertically.


xxxxxxxxxx
NL=5;
lap_pyr=cell(1,5);
tmp_im=im;

for level=1:NL-1
    % filter & downsample
    imf=imfilter(tmp_im,F,'replicate');
    imf=imf(1:2:end,1:2:end);
    % upsample and filter
    imr=zeros(2*size(imf));
    imr(1:2:end,1:2:end)=imf;
    imr=imfilter(imr,G,'replicate');
    % difference image
    lap_pyr{level}=tmp_im-imr;
    % for next loop
    tmp_im=imf;
end
lap_pyr{level+1}=tmp_im;

C2.3. Obtain the 2D filter kernel for analysis and synthesis steps given 1D lowpass filter for 9-7 Cohen-Daubechies-Feauveau wavelet.


xxxxxxxxxx
function [F,G]=compute_k(f1d,g1d)
F=f1d*f1d';
G=g1d*g1d';
end

$4$ pixels.


xxxxxxxxxx
function var_pyr=pyr_stat(pyr)
border=4;
var_pyr=zeros(5,1);
for i=1:5
    tmp=pyr{i};
    tmp=tmp(border+1;end-border,border+1:end-border);
    var_pyr(i)=var(tmp(:));
end
end

A2.1. Explain the difference you observe for `var_pyr` when using 9-7 Cohen-Daubechies-Feauveau wavelet and Gaussian filters.

In both cases, the variance/energy of levels 1-4 are lower than the last level.
With CDF filters, a little more energy is packed into level 5.

HW1.1 Image Manipulation

A1.1. Load image and determine information of this image.

A1.2. Convert the data type of image to double and save it to imd, which is MATLAB's default data type required by many built-in functions. Data of this type lies in range [0,1][0,1].

Draw a 22 pixel wide black border without changing the size, save the new image to im_framed.

A1.3. Coursework platform accepts a tolerance of 0.010.01. Add Gaussian noise with σ=0.005\sigma=0.005 to the image. How many pixels do you expect to lie outside the toleranc range?

C1.2. Copy each second row and each third column of im_framed, save it to im_part.

A1.4. Draw a red frame around the image using plot() function.

C1.3. Use Kronecker product to create colorized copies of grayscale image im

Calculate each color channel separately using

Merge these channels to a color image.

HW1.2. Periodic Sequences

C2.1. Compute atomic image elements from im specified by the periodicity matrix pm, given the location loc_base of the topmost pixel (x,y)(x,y).

Notice that index xx in an image corresponds to the column/second index in a matrix, and index yy in an image corresponds to the row/first index in a matrix.

C2.2. Implement an inverse operation: generate a periodic image given atomic element.?

C2.3. Calculate NCC between two equal-sized periodic images.

NN is number of elements.

C2.4. Find the atomic element of a given periodic image using exhausive search and correlation.?

A2.1. Why do you observe several maxima in the NCC values of the whole image?

HW2.1. Image Sampling

C1.1. Generate a regular and rectangular sampling grid of dimensions N×NN\times N, N=400N=400, which starts at (1,1)(1,1) and spans exactly the range [1,512][1,512] for xx and yy.

C1.2. Given gray scale image im, use interp2() to find image intensity at non-integer coordinates (gridX,gridY). Try different interpolation options.

A1.1. Describe how above images differ and compare the runtime.

C1.3. Given image intensity at non-integer coordinates (gridX,gridY), reconstruct image with size sz from im_scatter using TriScatteredInterp().

C1.4. Generate a horizontally aligned hexagonal grid with the first sampling point located at (1,1)(1,1). The horizontal and vertical sampling distance is dxd_x and dyd_y. The sampling locations must span horizontally x∈[1,511]x\in[1,511] and vertically y∈[1,512]y\in[1,512], given

C1.5. Interpolate image im using interp2() at hexagonal sampling positions above, then reconstruct the image using TriScatteredInterp().

C1.6. To know the construction quality, compute PSNR, remove a border of 55 pixels to neglect border effect.

A1.2. Which image has better/higher PSNR?

HW2.2. Zone Plate

C2.1. Generate an N×NN\times N image matrix whose entries represent the Euclidean distance in pixel unit to the top-left pixel. Then calculate zone plate image using

Choose kk such that the maximum frequency/phase change along last two diagonal pixels is π\pi.

A2.1. Subsample Izp with 2:12:1 and 4:14:1, what effects do you observe?

HW2.3. Image Filters

A3.1. Analyze different filters, plot the filter using surf() and analyze its frequency response using freqz2().

C3.1. Find out how freqz2() generates its plot and implement it manually using fft2().

C3.2. Filter the image with a Gaussian filter (16×16,σ=3)(16\times 16,\sigma=3) and replicate border processing.

A3.2. ….

HW3.2. Separable Block Transforms

C2.1. Implement function block_splitter with special support for border blocks. Image is like ∘△―∘{\boxed{\circ \underline\triangle\circ}}.

C2.2. Write a simple block_func that flips the image block left to right and decreases the brightness by substracting 0.3 from each pixel.

A2.1. Above operation ensures that pixel value is non-negative, so it's nonlinear.

C2.3. Apply given Haar transform matrix to the image.

C2.4. Reorder the transformed image to make it meaningful for human.

C2.5. Implement calculate_statis to calculate the mean and variance for each coefficient over all blocks.

A2.2. What can be observed by analyzing im_var?

C2.6. Implement an image blur using a symmetric impulse response of [0.25,0.5,0.25][0.25,0.5,0.25] to the image horizonally and vertically.

A2.3. Observe the blurred image and discuss the limitation of AXATAXA^T along the block borders.

HW4.1. Discrete Cosine Transform

C1.1. Transform images to DCT domain using block processing.

A1.1. What can be observed by analyzing im_mean and im_var of all blocks?

C1.2. Perform compression of a DCT-transformed image by keeping only the first NN coefficients of each block, others are set to zero. Return in image in spatial domain resulting from compression.

A1.2. Compare the compressed image and original image, what effects do you observe when using different compression level (N=9,16,64)(N=9,16,64)?

C1.3. Because of its separable property, DCT transformation can be efficiently extended to 33 dimensions. Implement apply_3ddct to a block along each dimension subsequently to get the 3D transformed DCT image.

C1.4. Apply above 3D DCT to any 3D sequence video. Ignore the padding. Return the DC and high-frequency components of the resulting sequence.

A1.3. Which DCT component of the 3D DCT transformed sequence contains the most energy? What is the best strategy for video compression?

HW4.2. Karhunen-Loeve-Transform

C2.1. Implement a block processor which reshapes a 32×3232\times 32 block into a vector, and adds 0.10.1 to the vector elements [529:544,272:32:784] and reshapes the result back to a block. Perform above block processor to an image.

C2.2. Calculate the KLT basis functions for 8×88\times 8 blocks using following steps:

A2.1. Compare KLT basis images when using different images for training.

C2.3. Project an image to the KLT basis above. Compare the back-transformed image with the original one and calculte the PSNR.

C2.4. Calculate the autocorrelation matrix of all blocks from KLT transformed image. (same as C2.2.)

A2.2. How should the autocorrelation matrix of KLT blocks look like?

A2.3. Calculate the statistics (mean and variance) of above KLT transformed image.

A2.4. Recall the variance plot for Haar transform, plot the variance of DCT and KLT, what do you observe?

HW5.1. Autoregressive Models

C1.1. Calculate the autocovariance of a natural image I, using neighboring 11×1111\times 11 for each pixel as follows:

A1.1. Discuss above values and explain the structure of the autocovariance matrix.

C1.2. A separable AR(1)AR(1) process can be used to generate a 2D signal, and has autocovariance

The model is implemented by a casual recursive filter according to

where nzn_z is a zero-mean Gaussian noise with standard deviation σz\sigma_z.

Compute the autocovariance of above generated image, compare it to the autocovariance obtained by the equation in C1.2.

A1.3. Repeat C1.1. with another image. Explain why this autocovariance matrix looks considerably different. How would an image generated by an AR(1)AR(1) process with parameters measured from this image look like?

HW5.2. Pyramid Representations

C2.1. A 3×33\times 3 approximation of Gaussian lowpass filter is given

Compute an appropriate 3×33\times 3 reconstruction filter G for 1:2 upsampling using linear interpolation.

C2.2. Generate a Laplacian pyramid of given image im using above filter with 55 levels. Level 11 holds the finest detail at full resolution 512×512512\times 512, while level 55 corresponds to the lowest level of detail 32×3232\times 32. The scaling factor is 2:12:1 horizontally and vertically.

C2.3. Obtain the 2D filter kernel for analysis and synthesis steps given 1D lowpass filter for 9-7 Cohen-Daubechies-Feauveau wavelet.

C2.4. Calculate the variances of each level of the pyramid. To neglect border effects, remove a border of 44 pixels.

A2.1. Explain the difference you observe for var_pyr when using 9-7 Cohen-Daubechies-Feauveau wavelet and Gaussian filters.

C2.5. Resonstruct image from lap_pyr. Calculate the PSNR, neglect a border of 44 pixels.

留下评论 取消回复

A1.2. Convert the data type of image to `doubleimd` $[0,1]$ .

$2$ pixel wide black border without changing the size, save the new image to `im_framed`.

$0.01$ $\sigma=0.005$ to the image. How many pixels do you expect to lie outside the toleranc range?

C1.2. Copy each second row and each third column of `im_framed`, save it to `im_part`.

A1.4. Draw a red frame around the image using `plot()` function.

C1.3. Use Kronecker product to create colorized copies of grayscale image `im`

C2.1. Compute atomic image elements from `im` specified by the periodicity matrix `pmloc_base` $(x,y)$ .

$x$ column/second $y$ in an image corresponds to the row/first index in a matrix.

$N$ is number of elements.

$N\times N$ $N=400$ $(1,1)$ $[1,512]$ $x$ $y$ .

C1.2. Given gray scale image `im`, use `interp2()` to find image intensity at non-integer coordinates `(gridX,gridY)`. Try different interpolation options.

C1.3. Given image intensity at non-integer coordinates `(gridX,gridY)`, reconstruct image with size `sz` from `im_scatter` using `TriScatteredInterp()`.

$(1,1)$ $d_x$ $d_y$ $x\in[1,511]$ $y\in[1,512]$ , given

C1.5. Interpolate image `im` using `interp2()` at hexagonal sampling positions above, then reconstruct the image using `TriScatteredInterp()`.

$5$ pixels to neglect border effect.

$N\times N$ image matrix whose entries represent the Euclidean distance in pixel unit to the top-left pixel. Then calculate zone plate image using

$k$ $\pi$ .

`Izp` $2:1$ $4:1$ , what effects do you observe?

A3.1. Analyze different filters, plot the filter using `surf()` and analyze its frequency response using `freqz2()`.

C3.1. Find out how `freqz2()` generates its plot and implement it manually using `fft2()`.

$(16\times 16,\sigma=3)$ and `replicate` border processing.

C2.1. Implement function `block_splitter`border blocks ${\boxed{\circ \underline\triangle\circ}}$ .

C2.2. Write a simple `block_func` that flips the image block left to right and decreases the brightness by substracting 0.3 from each pixel.

C2.5. Implement `calculate_statis` to calculate the mean and variance for each coefficient over all blocks.

A2.2. What can be observed by analyzing `im_var`?

$[0.25,0.5,0.25]$ to the image horizonally and vertically.

$AXA^T$ along the block borders.

A1.1. What can be observed by analyzing `im_mean` and `im_var` of all blocks?

$N$ coefficients of each block, others are set to zero. Return in image in spatial domain resulting from compression.

$(N=9,16,64)$ ?

$3$ dimensions. Implement `apply_3ddct` to a block along each dimension subsequently to get the 3D transformed DCT image.

C1.4. Apply above 3D DCT to any 3D sequence `video`. Ignore the padding. Return the DC and high-frequency components of the resulting sequence.

$32\times 32$ $0.1$ to the vector elements `[529:544,272:32:784]` and reshapes the result back to a block. Perform above block processor to an image.

$8\times 8$ blocks using following steps:

`I` $11\times 11$ for each pixel as follows:

$AR(1)$ process can be used to generate a 2D signal, and has autocovariance

$n_z$ $\sigma_z$ .

$AR(1)$ process with parameters measured from this image look like?

$3\times 3$ approximation of Gaussian lowpass filter is given

$3\times 3$ reconstruction filter `G` for 1:2 upsampling using linear interpolation.

`im` $5$ $1$ $512\times 512$ $5$ $32\times 32$ $2:1$ horizontally and vertically.

$4$ pixels.

A2.1. Explain the difference you observe for `var_pyr` when using 9-7 Cohen-Daubechies-Feauveau wavelet and Gaussian filters.

`lap_pyr` $4$ pixels.

留下评论取消回复