The new approach I will try consists of the following steps:
1. Compute correspondences using SIFT.
2. Compute a homography using RANSAC.
3. Detect pixels which do not agree with the homography.
4. Apply graph cuts to obtain piecewise contiguous and smooth regions.
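As a rough illustration of the third step, here is a minimal NumPy sketch that warps a second image into the reference frame and flags pixels whose intensity differs by more than a threshold. It is not my actual implementation: the function names `warp_nearest` and `disagreement_mask`, the nearest-neighbour sampling, the threshold of 0.1, and the assumption of grayscale images with values in [0, 1] are all choices made for this example. `H` is assumed to map pixel coordinates of the second image to the reference image.

```python
import numpy as np

def warp_nearest(src, H, out_shape):
    """Warp grayscale `src` into the reference frame by inverse mapping.

    H is assumed to map src (x, y, 1) to reference (x, y, 1) coordinates
    and to be well behaved over the image (no points sent to infinity).
    """
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    ref_pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # 3 x N
    src_pts = np.linalg.inv(H) @ ref_pts
    # Nearest-neighbour lookup in the source image.
    sx = np.rint(src_pts[0] / src_pts[2]).astype(int)
    sy = np.rint(src_pts[1] / src_pts[2]).astype(int)
    valid = (sx >= 0) & (sx < src.shape[1]) & (sy >= 0) & (sy < src.shape[0])
    warped = np.zeros(h * w, dtype=float)
    warped[valid] = src[sy[valid], sx[valid]]
    return warped.reshape(h, w), valid.reshape(h, w)

def disagreement_mask(ref, src, H, thresh=0.1):
    """True where the warped image differs from `ref` by more than `thresh`."""
    warped, valid = warp_nearest(src, H, ref.shape)
    diff = np.abs(ref.astype(float) - warped)
    # Pixels that fall outside the source image are never flagged.
    return (diff > thresh) & valid
```

A plain threshold like this is sensitive to small registration errors, especially near strong edges, which is why a smoothing step such as graph cuts is applied afterwards.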
For step 3 above (detecting pixels that don't agree with the homography), my first idea was simply to compute the difference between the reference image and the warped image. However, the computed homography does not appear to be very precise, so the difference image contains a lot of noise:
As in the WWW paper, I tried a second round of RANSAC with a tighter threshold. The inliers and the difference between the reference image and the warped image are shown below.
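For readers who want to experiment with the idea, the two-round scheme can be sketched in NumPy as follows. This is one possible reading, not necessarily what the paper does: I assume the second round is run only on the inliers of the first round, and the thresholds (3.0 and 1.0 pixels), the iteration count, the unnormalized direct linear transform, and all function names are assumptions made for this example. `src` and `dst` are N x 2 arrays of matched point coordinates.

```python
import numpy as np

def fit_homography(src, dst):
    """Direct linear transform: H such that dst ~ H @ src (needs >= 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # null vector, defined up to scale

def transfer_error(H, src, dst):
    """Distance in pixels between H-projected src points and dst points."""
    with np.errstate(divide="ignore", invalid="ignore"):
        pts = np.column_stack([src, np.ones(len(src))]) @ H.T
        proj = pts[:, :2] / pts[:, 2:3]
        return np.linalg.norm(proj - dst, axis=1)

def ransac_homography(src, dst, thresh, iters=500, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography(src[sample], dst[sample])
        inliers = transfer_error(H, src, dst) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 4:
        raise ValueError("RANSAC found fewer than 4 inliers")
    # Refit on all inliers of the best hypothesis.
    return fit_homography(src[best], dst[best]), best

def two_round_ransac(src, dst, loose=3.0, tight=1.0, rng=None):
    """Round 1 with a loose threshold, round 2 on its inliers with a tight one."""
    _, first = ransac_homography(src, dst, loose, rng=rng)
    idx = np.flatnonzero(first)
    H, second = ransac_homography(src[idx], dst[idx], tight, rng=rng)
    final = np.zeros(len(src), dtype=bool)
    final[idx[second]] = True
    return H, final
```

The returned mask marks the correspondences that survive both rounds, and `H` is refit on those correspondences only. For real pixel coordinates, normalizing the points before the SVD would likely improve numerical stability.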