CAP5415: Computer Vision
Lecture 4: More on Sampling, Image Pyramids, Image Statistics, Denoising
Prof. Marshall Tappen, Fall 2009

Image Pyramids
● I've motivated the DFT as a transformation that allows you to see different aspects of the data
● Today we will look at a different transformation that gives you easier access to different kinds of information
● Specifically, we will look at scale information

How about now?
[Figure: zebra image]

New Task
● How many whiskers does the zebra have?
● Now you need high-resolution info

[Figure: 128 x 128 image]
● Each level of the pyramid represents the image as if it had been blurred with a Gaussian. The levels differ in the standard deviation of the Gaussian.

When would it be useful?
● Useful whenever you need to work at multiple scales
– Looking for an object that could be close or far
● Can eliminate distractions

What's wrong with the Gaussian pyramid?
● It is redundant
● Each level contains all of the low frequencies that are available at the coarser (more heavily blurred) levels

Frequency view
● Each level captures a band of spatial frequencies
[Figure: frequency bands for Level 1, Level 2, Level 3]

What about orientation?
[Figure: first component of one layer, shown for the Laplacian Pyramid and the Oriented Pyramid]
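
A minimal NumPy sketch of the two non-oriented pyramids discussed above: a Gaussian pyramid (blur, then subsample by 2) and a Laplacian pyramid built from it, in which each level keeps only the band of frequencies removed between adjacent Gaussian levels. The function names, the choice of sigma = 1, the edge-replicating padding, and the nearest-neighbour upsampling are assumptions made for illustration and are not taken from the lecture. The oriented (steerable) pyramid is not covered here.

```python
import numpy as np

def gaussian_kernel_1d(sigma=1.0):
    """Sampled 1-D Gaussian, truncated at about 3 sigma and normalized to sum to 1."""
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def blur(image, sigma=1.0):
    """Gaussian blur done as two 1-D passes; borders are padded by edge replication."""
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(image, r, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def gaussian_pyramid(image, levels=4, sigma=1.0):
    """Level 0 is the input; each later level is the previous one blurred and subsampled by 2."""
    pyramid = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1], sigma)[::2, ::2])
    return pyramid

def upsample(image, shape):
    """Assumed upsampler: pixel replication by 2, crop to `shape`, then a light blur."""
    up = np.repeat(np.repeat(image, 2, axis=0), 2, axis=1)[:shape[0], :shape[1]]
    return blur(up, 1.0)

def laplacian_pyramid(image, levels=4, sigma=1.0):
    """Each level stores what the next-coarser Gaussian level fails to predict."""
    g = gaussian_pyramid(image, levels, sigma)
    lap = [g[i] - upsample(g[i + 1], g[i].shape) for i in range(len(g) - 1)]
    lap.append(g[-1])  # coarsest level is kept as a low-pass residual
    return lap

def reconstruct(lap):
    """Invert the Laplacian pyramid by upsampling and adding the bands back, coarse to fine."""
    image = lap[-1]
    for band in reversed(lap[:-1]):
        image = upsample(image, band.shape) + band
    return image
```

Because `reconstruct` uses the same `upsample` as `laplacian_pyramid`, the input is recovered up to floating-point rounding, while the Gaussian pyramid stores the low frequencies again at every level.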

Application: Texture Synthesis
● Basic Problem
[Figure: Sample Texture → Larger Sample]

Heeger and Bergen: SIGGRAPH95
● Basic Assumption
– Take two texture images
– Decompose them into a steerable pyramid
– If the histograms of the pyramid coefficients are similar, then the textures will appear similar
● Based on studies of human vision, see (Bergen and Adelson 88) or (Malik and Perona 89)

Basic algorithm
● Decompose current image into steerable pyramid
● Modify each pyramid image so that its histogram matches the histogram of the corresponding image from the sample
● Invert the pyramid to recover the current texture image
● Repeat until the image converges

Results
[Figure: Input and Synthesized textures. From (Heeger and Bergen 95)]

Failures
[Figure: From (Heeger and Bergen 95)]

Texture beyond histograms
● This approach, while interesting, ended up not being used widely in the graphics community
● Researchers found that they could generate more visually pleasing textures by replicating patches of texture in a smooth fashion
– Efros and Leung – 1999
– Efros and Freeman – 2001
● This approach is still very interesting from an analysis point of view

We can use the steerable pyramid
[Figure: From (Adelson and Simoncelli - 1996)]

What do these histograms tell us?
● Pyramid coefficients from images: Usually zero, but big sometimes
● Noise coefficients: Usually close to zero, very rarely big
(From Adelson and Simoncelli - 1996)

Question
● What would my estimate of the coefficient be if:
– It had a big value?
– It had a small value?
(From Adelson and Simoncelli - 1996)

Results
[Figure: From (Adelson and Simoncelli - 1996)]

Denoising
● One of the state-of-the-art algorithms (Portilla, Strela, Wainwright, and Simoncelli) extends this basic idea
– Uses a better estimator of derivative values

Fast Filtering
● All of these operations require a lot of filtering
● The FFT is one trick for making it faster
● We can do more by choosing the filters correctly
● Technique called separable filtering

Now, let's do a convolution
[Figure: Original * Filter]

The result
[Figure: Result from two 1-D convolutions]

What does this mean?
● Using associativity, we can get the same result from two 1-D convolutions with 81-tap filters (one 1x81, one 81x1) as from filtering with the full 81x81 filter
● 81x81 filter = 81*81 = 6561 additions and multiplications per output pixel
● Two 1-D filters of length 81 = 2*81 = 162 operations per output pixel
● Much faster
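
The equivalence above can be checked numerically. This is a small NumPy sketch, assuming the 2-D filter is the outer product of a column filter and a row filter. The function names and the tiny example kernels are hypothetical, and only the "valid" region (no border padding) is computed.

```python
import numpy as np

def convolve2d_direct(image, kernel):
    """Direct 2-D convolution over the 'valid' region: kh*kw multiply-adds per output pixel."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            out += flipped[i, j] * image[i:i + out_h, j:j + out_w]
    return out

def convolve2d_separable(image, col_kernel, row_kernel):
    """Same result for kernel = outer(col_kernel, row_kernel): kh + kw multiply-adds per pixel."""
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, row_kernel, mode="valid"), 1, image)
    return np.apply_along_axis(
        lambda v: np.convolve(v, col_kernel, mode="valid"), 0, rows)

# Example with small, deliberately asymmetric kernels (illustrative values only).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
col_kernel = np.array([1.0, 2.0, 4.0, 2.0, 1.0]) / 10.0
row_kernel = np.array([1.0, 0.0, -1.0])
kernel_2d = np.outer(col_kernel, row_kernel)

direct = convolve2d_direct(image, kernel_2d)
separable = convolve2d_separable(image, col_kernel, row_kernel)
print("max abs difference:", np.abs(direct - separable).max())
```

The trick only applies when the 2-D filter factors into an outer product of two 1-D filters, as a Gaussian does. An arbitrary 81x81 filter generally does not.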
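
Returning to the Heeger and Bergen algorithm from earlier in the lecture, its histogram-matching step can be sketched on a single array of coefficients. This is a rough sketch under stated assumptions: it uses rank-based (quantile) matching, `match_histogram` is a hypothetical helper name, and the steerable pyramid decomposition and inversion are not shown. In the full algorithm this step would be applied to each pyramid image in turn.

```python
import numpy as np

def match_histogram(source, reference):
    """Remap `source` so that its value distribution follows that of `reference`.

    The rank order of the source values is preserved; only the values change.
    """
    src = np.asarray(source, dtype=float)
    ref_sorted = np.sort(np.asarray(reference, dtype=float).ravel())

    # Rank of every source value (0 = smallest), ties broken by position.
    order = np.argsort(src.ravel(), kind="stable")
    ranks = np.empty(src.size, dtype=float)
    ranks[order] = np.arange(src.size)

    # Convert ranks to quantiles in [0, 1] and read off the reference value
    # at the same quantile (linear interpolation handles differing sizes).
    quantiles = ranks / max(src.size - 1, 1)
    ref_quantiles = np.linspace(0.0, 1.0, ref_sorted.size)
    return np.interp(quantiles, ref_quantiles, ref_sorted).reshape(src.shape)
```

When `source` and `reference` have the same number of elements, the sorted output values coincide with the sorted reference values, so the two histograms agree for any choice of bins.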