Skip to content

[GSoC 2019] Final: Information Flow Alphamatting #2245

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 25 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 60 additions & 0 deletions modules/alphamat/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# Designing Effective Inter-Pixel Information Flow for Natural Image Matting:

Alphamatting is the problem of extracting the foreground from an image. Given the input of image and its corresponding trimap, we try to extract the foreground from the background. Following is an example -

Input Image | Input trimap | Ouput Alpha matte
:-------------------------:|:-------------------------:|:-------------------------:
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/net.png" alt="alt text" width="300" height="200"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/trimap/net.png" alt="alt text" width="300" height="200"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_net.png" alt="alt text" width="300" height="200">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Try to use relative URLs for images.
  2. Use markdown for embedding images.


This project is implementation of Information-Flow Matting [Yağız Aksoy, Tunç Ozan Aydın, Marc Pollefeys] [1]. It required implementation of some parts of other papers [2,3].

This is a pixel-affinity based alpha matting algorithm which solves a linear system of equations using preconditioned conjugate gradient method. Affinity-based methods operate by propagating opacity information from known opacity regions(K) into unknown opacity regions(U) using a variety of affinity definitions mentioned as -
* Color mixture information flow - Opacity transitions in a matte occur as a result of the original colors in the image getting mixed with each other due to transparency or intricate parts of an object. They make use of this fact by representing each pixel in U as a mixture of similarly-colored pixels and the difference is the energy term ECM, which is to be reduced. This is coded in **cm.hpp**
* K-to-U information flow - Connections from every pixel in U to both F(foreground pixels) and B(background pixels) are made to facilitate direct information flow from known-opacity regions to even the most remote opacity-transition regions in the image. This is coded in **KtoU.hpp**
* Intra U information flow - They distribute the information inside U effectively by encouraging pixels with similar colors inside U to have similar opacity. This is coded in **intraU.hpp**
* Local information flow - Spatial connectivity is one of the main cues for information flow which is achieved by connecting each pixel in U to its immediate neighbors to ensure spatially smooth mattes. This is coded in **local_info.hpp**

Using these information flow, energy/error(E) is obtained as a weighted local composite of E<sub>CM</sub>, E<sub>KU</sub>(K-to-U information flow), E<sub>UU</sub>(Intra U information flow), E<sub>L</sub>(Local information flow).
E represents the deviation of unknown pixels opacity or colour from what we predict it to be using other pixels. So, the algorithm aims at minimizing this error. This is coded in **alphac.cpp**

Pre-processing and post-processing is implemented in **trimming.hpp**

To run the code -
1. **g++ -std=c++11 alphac.cpp \`pkg-config --cflags --libs opencv\`**
1. **./a.out \<path to image> \<path to corresponding trimap>**

Sample image and trimap are in opencv_contrib/modules/alphamat/src/img and opencv_contrib/modules/alphamat/src/trimap

## Results

Results for input_lowres are available here -
https://docs.google.com/document/d/1BJG4633_U5K-Z0QLp3RTi43q25NI0hrTw-Q4w_85NrA/edit?usp=sharing

Input Image | Ouput Alpha matte
:-------------------------:|:-------------------------:
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/net.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_net.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/doll.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_doll.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/donkey.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_donkey.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/elephant.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_elephant.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/pineapple.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_pineapple.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/plant.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_plant.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/plasticbag.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_plasticbag.png" alt="alt text" width="200" height="155">
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/troll.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_troll.png" alt="alt text" width="200" height="155">

Average time taken to compute the different flows is 40s, but solving of linear equations using preconditioned conjugate gradient method takes another 2-3 min, which can be lessened by allowing lesser iterations.

## TO DO

* Results need to be improved by extensively comparing each flow's matrix with yaksoy MATLAB implementation [4].
* Runtime needs improvement.
* Third part library(Eigen, nanoflann) dependencies can be removed.

## References

[1] Yagiz Aksoy, Tunc Ozan Aydin, Marc Pollefeys, "Designing Effective Inter-Pixel Information Flow for Natural Image Matting", CVPR, 2017. [[link](http://people.inf.ethz.ch/aksoyy/ifm/)]

[2] Roweis, Sam T., and Lawrence K. Saul. "Nonlinear dimensionality reduction by locally linear embedding." science 290.5500 (2000): 2323-2326.[[link](https://science.sciencemag.org/content/290/5500/2323)]

[3] Ehsan Shahrian, Deepu Rajan, Brian Price, Scott Cohen, "Improving Image Matting using Comprehensive Sampling Sets", CVPR 2013 [[paper](http://www.cv-foundation.org/openaccess/content_cvpr_2013/papers/Shahrian_Improving_Image_Matting_2013_CVPR_paper.pdf)]

[4] Affinity Based Matting Toolbox by Yagiz Aksoy[[link](https://github.com/yaksoy/AffinityBasedMattingToolbox)]
Binary file added modules/alphamat/Result/result_doll.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_donkey.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_elephant.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_net.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_pineapple.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_plant.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_plasticbag.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/Result/result_troll.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/doll.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/donkey.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/elephant.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/net.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/pineapple.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/plant.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/plasticbag.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added modules/alphamat/img/troll.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
39 changes: 39 additions & 0 deletions modules/alphamat/perf/perf_infoflow.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
using namespace std;
using namespace cv;
using namespace perf;

#include "perf_precomp.hpp"

namespace opencv_test
{

typedef std::tr1::tuple<Size, MatType, MatDepth> Size_MatType_OutMatDepth_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typedef perf::TestBaseWithParam<Size_MatType_OutMatDepth_t> Size_MatType_OutMatDepth;

/* 2. Declare the testsuite */
PERF_TEST_P( Size_MatType_OutMatDepth, integral1,
testing::Combine(
testing::Values( TYPICAL_MAT_SIZES ),
testing::Values( CV_8UC1, CV_8UC4 ),
testing::Values( CV_32S, CV_32F, CV_64F ) ) )
{
string folder = "cv/alphamat/";
string image_path = folder + "img/elephant.png";
string trimap_path = folder + "trimap/elephant.png";
string reference_path = folder + "reference/elephant.png";

Mat image = imread(getDataPath(image_path), IMREAD_COLOR);
Mat trimap = imread(getDataPath(trimap_path), IMREAD_COLOR);
Mat reference = imread(getDataPath(reference_path), IMREAD_GRAYSCALE);

Size sz = get<0>(GetParam());
int inpaintingMethod = get<1>(GetParam());

Mat result;
declare.in(image, trimap).out(result).time(120);

TEST_CYCLE() infoFlow(image, trimap, result, false, true);

SANITY_CHECK_NOTHING();
}
} // namespace
7 changes: 7 additions & 0 deletions modules/alphamat/perf/perf_main.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
#include "perf_precomp.hpp"

#if defined(HAVE_HPX)
#include <hpx/hpx_main.hpp>
#endif

CV_PERF_TEST_MAIN(stitching)
6 changes: 6 additions & 0 deletions modules/alphamat/perf/perf_precomp.hpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
#ifndef __OPENCV_PERF_PRECOMP_HPP__
#define __OPENCV_PERF_PRECOMP_HPP__

#include "opencv2/ts.hpp"

#endif
116 changes: 116 additions & 0 deletions modules/alphamat/src/KDTreeVectorOfVectorsAdaptor.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,116 @@
/***********************************************************************
* Software License Agreement (BSD License)
*
* Copyright 2011-16 Jose Luis Blanco ([email protected]).
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions
* are met:
*
* 1. Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*************************************************************************/

#pragma once

#include "nanoflann.hpp"

#include <vector>

// ===== This example shows how to use nanoflann with these types of containers: =======
//typedef std::vector<std::vector<double> > my_vector_of_vectors_t;
//typedef std::vector<Eigen::VectorXd> my_vector_of_vectors_t; // This requires #include <Eigen/Dense>
// =====================================================================================


/** A simple vector-of-vectors adaptor for nanoflann, without duplicating the storage.
* The i'th vector represents a point in the state space.
*
* \tparam DIM If set to >0, it specifies a compile-time fixed dimensionality for the points in the data set, allowing more compiler optimizations.
* \tparam num_t The type of the point coordinates (typically, double or float).
* \tparam Distance The distance metric to use: nanoflann::metric_L1, nanoflann::metric_L2, nanoflann::metric_L2_Simple, etc.
* \tparam IndexType The type for indices in the KD-tree index (typically, size_t of int)
*/
template <class VectorOfVectorsType, typename num_t = double, int DIM = -1, class Distance = nanoflann::metric_L2, typename IndexType = size_t>
struct KDTreeVectorOfVectorsAdaptor
{
typedef KDTreeVectorOfVectorsAdaptor<VectorOfVectorsType, num_t, DIM,Distance> self_t;
typedef typename Distance::template traits<num_t, self_t>::distance_t metric_t;
typedef nanoflann::KDTreeSingleIndexAdaptor< metric_t, self_t, DIM, IndexType> index_t;

index_t* index; //! The kd-tree index for the user to call its methods as usual with any other FLANN index.

/// Constructor: takes a const ref to the vector of vectors object with the data points
KDTreeVectorOfVectorsAdaptor(const size_t /* dimensionality */, const VectorOfVectorsType &mat, const int leaf_max_size = 10) : m_data(mat)
{
assert(mat.size() != 0 && mat[0].size() != 0);
const size_t dims = mat[0].size();
if (DIM>0 && static_cast<int>(dims) != DIM)
throw std::runtime_error("Data set dimensionality does not match the 'DIM' template argument");
index = new index_t( static_cast<int>(dims), *this /* adaptor */, nanoflann::KDTreeSingleIndexAdaptorParams(leaf_max_size ) );
index->buildIndex();
}

~KDTreeVectorOfVectorsAdaptor() {
delete index;
}

const VectorOfVectorsType &m_data;

/** Query for the \a num_closest closest points to a given point (entered as query_point[0:dim-1]).
* Note that this is a short-cut method for index->findNeighbors().
* The user can also call index->... methods as desired.
* \note nChecks_IGNORED is ignored but kept for compatibility with the original FLANN interface.
*/
inline void query(const num_t *query_point, const size_t num_closest, IndexType *out_indices, num_t *out_distances_sq, const int nChecks_IGNORED = 10) const
{
nanoflann::KNNResultSet<num_t, IndexType> resultSet(num_closest);
resultSet.init(out_indices, out_distances_sq);
index->findNeighbors(resultSet, query_point, nanoflann::SearchParams());
}

/** @name Interface expected by KDTreeSingleIndexAdaptor
* @{ */

const self_t & derived() const {
return *this;
}
self_t & derived() {
return *this;
}

// Must return the number of data points
inline size_t kdtree_get_point_count() const {
return m_data.size();
}

// Returns the dim'th component of the idx'th point in the class:
inline num_t kdtree_get_pt(const size_t idx, const size_t dim) const {
return m_data[idx][dim];
}

// Optional bounding-box computation: return false to default to a standard bbox computation loop.
// Return true if the BBOX was already computed by the class and returned in "bb" so it can be avoided to redo it again.
// Look at bb.size() to find out the expected dimensionality (e.g. 2 or 3 for point clouds)
template <class BBOX>
bool kdtree_get_bbox(BBOX & /*bb*/) const {
return false;
}

/** @} */
}; // end of KDTreeVectorOfVectorsAdaptor
Loading