[GSoC 2019] Final: Information Flow Alphamatting #2245
Closed
25 commits
17207fa Add files via upload (muskaankularia)
41ceff9 Add files via upload (muskaankularia)
5e3e999 Add files via upload (muskaankularia)
0847836 Delete doll.png (muskaankularia)
c67a32f Delete donkey.png (muskaankularia)
d782987 Delete elephant.png (muskaankularia)
158e938 Delete net.png (muskaankularia)
2477942 Delete pineapple.png (muskaankularia)
3fe589e Delete plant.png (muskaankularia)
b056e35 Delete plasticbag.png (muskaankularia)
f994020 Delete troll.png (muskaankularia)
4359adc Delete elephant.png (muskaankularia)
491b51f Delete net.png (muskaankularia)
fac36ae Delete pineapple.png (muskaankularia)
0b06ed5 Delete plant.png (muskaankularia)
8cb8c13 Delete plasticbag.png (muskaankularia)
8dca8a5 Delete alphamat.bib (muskaankularia)
1a38add Delete summary_Information_Flow.docx (muskaankularia)
2727519 Update README.md (muskaankularia)
9b2e395 Update README.md (muskaankularia)
09db622 Update README.md (muskaankularia)
c5056ca Update README.md (muskaankularia)
0e55f0e Update README.md (muskaankularia)
6096b31 remove bits/stdc++ header (muskaankularia)
aa3c34a Add files via upload (muskaankularia)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# Designing Effective Inter-Pixel Information Flow for Natural Image Matting: | ||
|
||
Alpha matting is the problem of extracting the foreground from an image. Given an input image and its corresponding trimap, the algorithm estimates the opacity (alpha) of every pixel so that the foreground can be separated from the background. An example follows: | ||
|
||
Input Image | Input trimap | Output Alpha matte | ||
:-------------------------:|:-------------------------:|:-------------------------: | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/net.png" alt="alt text" width="300" height="200"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/trimap/net.png" alt="alt text" width="300" height="200"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_net.png" alt="alt text" width="300" height="200"> | ||
|
||
This project is an implementation of Information-Flow Matting [Yağız Aksoy, Tunç Ozan Aydın, Marc Pollefeys] [1]. It also required implementing parts of other papers [2,3]. | ||
|
||
This is a pixel-affinity-based alpha matting algorithm that solves a linear system of equations using the preconditioned conjugate gradient method. Affinity-based methods propagate opacity information from known-opacity regions (K) into unknown-opacity regions (U) using several affinity definitions: | ||
* Color mixture information flow - Opacity transitions in a matte occur because the original colors in the image mix with each other due to transparency or intricate parts of an object. The method exploits this by representing each pixel in U as a mixture of similarly colored pixels; the deviation from that mixture is the energy term E<sub>CM</sub>, which is to be minimized. This is coded in **cm.hpp** | ||
* K-to-U information flow - Every pixel in U is connected to both F (foreground pixels) and B (background pixels) to facilitate direct information flow from known-opacity regions to even the most remote opacity-transition regions in the image. This is coded in **KtoU.hpp** | ||
* Intra U information flow - Information is distributed inside U by encouraging pixels with similar colors inside U to have similar opacity. This is coded in **intraU.hpp** | ||
* Local information flow - Spatial connectivity is one of the main cues for information flow; each pixel in U is connected to its immediate neighbors to ensure spatially smooth mattes. This is coded in **local_info.hpp** | ||
|
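The color mixture and intra U flows both start by finding, for each pixel in U, the pixels that are closest to it in a feature space. The following is a minimal brute-force sketch of that search, not the module's code (which uses a KD-tree through nanoflann). The 5-D feature layout and the names `Feature`, `squaredDistance` and `nearestNeighbors` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical 5-D feature: color (r, g, b) followed by position (x, y).
using Feature = std::array<double, 5>;

// Squared Euclidean distance between two feature vectors.
double squaredDistance(const Feature& a, const Feature& b)
{
    double d = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        d += (a[i] - b[i]) * (a[i] - b[i]);
    return d;
}

// Returns the indices of the k points closest to `query`, nearest first.
// Brute force stand-in for a KD-tree query; requires k <= points.size().
std::vector<std::size_t> nearestNeighbors(const std::vector<Feature>& points,
                                          const Feature& query, std::size_t k)
{
    assert(k <= points.size());
    std::vector<std::size_t> order(points.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [&](std::size_t i, std::size_t j) {
                          return squaredDistance(points[i], query) <
                                 squaredDistance(points[j], query);
                      });
    order.resize(k);
    return order;
}
```

The returned indices would then be the candidates among which the mixture weights of a pixel are computed. A KD-tree gives the same neighbors much faster on full-size images.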
||
Using these information flows, the energy/error (E) is obtained as a weighted composite of E<sub>CM</sub> (color mixture information flow), E<sub>KU</sub> (K-to-U information flow), E<sub>UU</sub> (Intra U information flow) and E<sub>L</sub> (Local information flow). | ||
E represents the deviation of an unknown pixel's opacity or colour from the value predicted from other pixels, so the algorithm aims to minimize this error. This is coded in **alphac.cpp** | ||
|
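To make the energy concrete, here is a small sketch of how one flow's energy could be evaluated and how the four terms could be combined. It is an illustration under assumptions, not the code in **alphac.cpp**: the `Neighbor` struct, the function names, the assumption that each pixel's weights sum to 1, and the choice to leave E<sub>CM</sub> unweighted are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One weighted connection from a pixel to another pixel (hypothetical layout).
struct Neighbor {
    std::size_t index;  // index of the connected pixel
    double weight;      // affinity weight; a pixel's weights are assumed to sum to 1
};

// Energy of a single flow: sum over pixels of
// (alpha_p - sum_q w_pq * alpha_q)^2. Pixels without neighbors contribute nothing.
double flowEnergy(const std::vector<double>& alpha,
                  const std::vector<std::vector<Neighbor>>& neighbors)
{
    assert(alpha.size() == neighbors.size());
    double energy = 0.0;
    for (std::size_t p = 0; p < alpha.size(); ++p) {
        if (neighbors[p].empty())
            continue;
        double predicted = 0.0;
        for (const Neighbor& n : neighbors[p])
            predicted += n.weight * alpha[n.index];
        const double residual = alpha[p] - predicted;
        energy += residual * residual;
    }
    return energy;
}

// Weighted combination of the per-flow energies. The sigma values are
// placeholders supplied by the caller, not the values used by the module.
double totalEnergy(double eCM, double eKU, double eUU, double eL,
                   double sigmaKU, double sigmaUU, double sigmaL)
{
    return eCM + sigmaKU * eKU + sigmaUU * eUU + sigmaL * eL;
}
```

Because every term is quadratic in the alpha values, setting the gradient of the total to zero gives a linear system, which is why a linear solver is enough to minimize E.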
||
Pre-processing and post-processing are implemented in **trimming.hpp** | ||
|
||
To run the code - | ||
1. **g++ -std=c++11 alphac.cpp \`pkg-config --cflags --libs opencv\`** | ||
1. **./a.out \<path to image> \<path to corresponding trimap>** | ||
|
||
Sample image and trimap are in opencv_contrib/modules/alphamat/src/img and opencv_contrib/modules/alphamat/src/trimap | ||
|
||
## Results | ||
|
||
Results for input_lowres are available here - | ||
https://docs.google.com/document/d/1BJG4633_U5K-Z0QLp3RTi43q25NI0hrTw-Q4w_85NrA/edit?usp=sharing | ||
|
||
Input Image | Output Alpha matte | ||
:-------------------------:|:-------------------------: | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/net.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_net.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/doll.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_doll.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/donkey.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_donkey.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/elephant.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_elephant.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/pineapple.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_pineapple.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/plant.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_plant.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/plasticbag.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_plasticbag.png" alt="alt text" width="200" height="155"> | ||
<img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/img/troll.png" alt="alt text" width="200" height="155"> | <img src="https://github.com/muskaankularia/opencv_contrib/blob/alphamatting/modules/alphamat/Result/result_troll.png" alt="alt text" width="200" height="155"> | ||
|
||
Computing the different flows takes 40 s on average, but solving the linear system with the preconditioned conjugate gradient method takes another 2-3 min, which can be reduced by allowing fewer iterations. | ||
|
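The trade-off between iteration count and runtime can be seen in a small Jacobi-preconditioned conjugate gradient sketch. This is a generic textbook version on a dense matrix, written for illustration only; it is not the solver used by this module, and the names `DenseMatrix`, `solvePCG`, `maxIterations` and `tol` are assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;
using DenseMatrix = std::vector<Vec>;  // dense for brevity; the matting system is sparse

double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

Vec multiply(const DenseMatrix& A, const Vec& x)
{
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i) y[i] = dot(A[i], x);
    return y;
}

// Solves A x = b for a symmetric positive-definite A using conjugate gradient
// with a Jacobi (diagonal) preconditioner. Iterates until the residual norm
// drops to `tol` or `maxIterations` is reached; returns the iterations used.
std::size_t solvePCG(const DenseMatrix& A, const Vec& b, Vec& x,
                     std::size_t maxIterations, double tol)
{
    const std::size_t n = b.size();
    assert(A.size() == n && x.size() == n);

    Vec r = b;                                   // r = b - A x
    const Vec Ax = multiply(A, x);
    for (std::size_t i = 0; i < n; ++i) r[i] -= Ax[i];

    Vec z(n);                                    // z = M^-1 r, M = diag(A)
    for (std::size_t i = 0; i < n; ++i) z[i] = r[i] / A[i][i];
    Vec p = z;
    double rz = dot(r, z);

    std::size_t it = 0;
    while (it < maxIterations && std::sqrt(dot(r, r)) > tol) {
        const Vec Ap = multiply(A, p);
        const double step = rz / dot(p, Ap);
        for (std::size_t i = 0; i < n; ++i) {
            x[i] += step * p[i];
            r[i] -= step * Ap[i];
        }
        for (std::size_t i = 0; i < n; ++i) z[i] = r[i] / A[i][i];
        const double rzNew = dot(r, z);
        const double beta = rzNew / rz;
        for (std::size_t i = 0; i < n; ++i) p[i] = z[i] + beta * p[i];
        rz = rzNew;
        ++it;
    }
    return it;
}
```

Lowering `maxIterations` or raising `tol` stops the loop earlier and returns a less accurate solution, which is the runtime versus quality trade-off mentioned above.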
||
## TO DO | ||
|
||
* Results need to be improved by extensively comparing each flow's matrix with the MATLAB implementation by yaksoy [4]. | ||
* Runtime needs improvement. | ||
* Third-party library (Eigen, nanoflann) dependencies can be removed. | ||
|
||
## References | ||
|
||
[1] Yagiz Aksoy, Tunc Ozan Aydin, Marc Pollefeys, "Designing Effective Inter-Pixel Information Flow for Natural Image Matting", CVPR, 2017. [[link](http://people.inf.ethz.ch/aksoyy/ifm/)] | ||
|
||
[2] Roweis, Sam T., and Lawrence K. Saul. "Nonlinear dimensionality reduction by locally linear embedding." Science 290.5500 (2000): 2323-2326. [[link](https://science.sciencemag.org/content/290/5500/2323)] | ||
|
||
[3] Ehsan Shahrian, Deepu Rajan, Brian Price, Scott Cohen, "Improving Image Matting using Comprehensive Sampling Sets", CVPR 2013 [[paper](http://www.cv-foundation.org/openaccess/content_cvpr_2013/papers/Shahrian_Improving_Image_Matting_2013_CVPR_paper.pdf)] | ||
|
||
[4] Affinity Based Matting Toolbox by Yagiz Aksoy [[link](https://github.com/yaksoy/AffinityBasedMattingToolbox)] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
using namespace std; | ||
using namespace cv; | ||
using namespace perf; | ||
|
||
#include "perf_precomp.hpp" | ||
|
||
namespace opencv_test | ||
{ | ||
|
||
typedef std::tr1::tuple<Size, MatType, MatDepth> Size_MatType_OutMatDepth_t; | ||
Not allowed: https://github.com/opencv/opencv/wiki/Coding_Style_Guide#implementing-tests |
||
typedef perf::TestBaseWithParam<Size_MatType_OutMatDepth_t> Size_MatType_OutMatDepth; | ||
|
||
/* 2. Declare the testsuite */ | ||
PERF_TEST_P( Size_MatType_OutMatDepth, integral1, | ||
testing::Combine( | ||
testing::Values( TYPICAL_MAT_SIZES ), | ||
testing::Values( CV_8UC1, CV_8UC4 ), | ||
testing::Values( CV_32S, CV_32F, CV_64F ) ) ) | ||
{ | ||
string folder = "cv/alphamat/"; | ||
string image_path = folder + "img/elephant.png"; | ||
string trimap_path = folder + "trimap/elephant.png"; | ||
string reference_path = folder + "reference/elephant.png"; | ||
|
||
Mat image = imread(getDataPath(image_path), IMREAD_COLOR); | ||
Mat trimap = imread(getDataPath(trimap_path), IMREAD_COLOR); | ||
Mat reference = imread(getDataPath(reference_path), IMREAD_GRAYSCALE); | ||
|
||
Size sz = get<0>(GetParam()); | ||
int inpaintingMethod = get<1>(GetParam()); | ||
|
||
Mat result; | ||
declare.in(image, trimap).out(result).time(120); | ||
|
||
TEST_CYCLE() infoFlow(image, trimap, result, false, true); | ||
|
||
SANITY_CHECK_NOTHING(); | ||
} | ||
} // namespace |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
#include "perf_precomp.hpp" | ||
|
||
#if defined(HAVE_HPX) | ||
#include <hpx/hpx_main.hpp> | ||
#endif | ||
|
||
CV_PERF_TEST_MAIN(stitching) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
#ifndef __OPENCV_PERF_PRECOMP_HPP__ | ||
#define __OPENCV_PERF_PRECOMP_HPP__ | ||
|
||
#include "opencv2/ts.hpp" | ||
|
||
#endif |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
/*********************************************************************** | ||
* Software License Agreement (BSD License) | ||
* | ||
* Copyright 2011-16 Jose Luis Blanco ([email protected]). | ||
* All rights reserved. | ||
* | ||
* Redistribution and use in source and binary forms, with or without | ||
* modification, are permitted provided that the following conditions | ||
* are met: | ||
* | ||
* 1. Redistributions of source code must retain the above copyright | ||
* notice, this list of conditions and the following disclaimer. | ||
* 2. Redistributions in binary form must reproduce the above copyright | ||
* notice, this list of conditions and the following disclaimer in the | ||
* documentation and/or other materials provided with the distribution. | ||
* | ||
* THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR | ||
* IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES | ||
* OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. | ||
* IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, | ||
* INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT | ||
* NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, | ||
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY | ||
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF | ||
* THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
*************************************************************************/ | ||
|
||
#pragma once | ||
|
||
#include "nanoflann.hpp" | ||
|
||
#include <vector> | ||
|
||
// ===== This example shows how to use nanoflann with these types of containers: ======= | ||
//typedef std::vector<std::vector<double> > my_vector_of_vectors_t; | ||
//typedef std::vector<Eigen::VectorXd> my_vector_of_vectors_t; // This requires #include <Eigen/Dense> | ||
// ===================================================================================== | ||
|
||
|
||
/** A simple vector-of-vectors adaptor for nanoflann, without duplicating the storage. | ||
* The i'th vector represents a point in the state space. | ||
* | ||
* \tparam DIM If set to >0, it specifies a compile-time fixed dimensionality for the points in the data set, allowing more compiler optimizations. | ||
* \tparam num_t The type of the point coordinates (typically, double or float). | ||
* \tparam Distance The distance metric to use: nanoflann::metric_L1, nanoflann::metric_L2, nanoflann::metric_L2_Simple, etc. | ||
* \tparam IndexType The type for indices in the KD-tree index (typically, size_t of int) | ||
*/ | ||
template <class VectorOfVectorsType, typename num_t = double, int DIM = -1, class Distance = nanoflann::metric_L2, typename IndexType = size_t> | ||
struct KDTreeVectorOfVectorsAdaptor | ||
{ | ||
typedef KDTreeVectorOfVectorsAdaptor<VectorOfVectorsType, num_t, DIM,Distance> self_t; | ||
typedef typename Distance::template traits<num_t, self_t>::distance_t metric_t; | ||
typedef nanoflann::KDTreeSingleIndexAdaptor< metric_t, self_t, DIM, IndexType> index_t; | ||
|
||
index_t* index; //! The kd-tree index for the user to call its methods as usual with any other FLANN index. | ||
|
||
/// Constructor: takes a const ref to the vector of vectors object with the data points | ||
KDTreeVectorOfVectorsAdaptor(const size_t /* dimensionality */, const VectorOfVectorsType &mat, const int leaf_max_size = 10) : m_data(mat) | ||
{ | ||
assert(mat.size() != 0 && mat[0].size() != 0); | ||
const size_t dims = mat[0].size(); | ||
if (DIM>0 && static_cast<int>(dims) != DIM) | ||
throw std::runtime_error("Data set dimensionality does not match the 'DIM' template argument"); | ||
index = new index_t( static_cast<int>(dims), *this /* adaptor */, nanoflann::KDTreeSingleIndexAdaptorParams(leaf_max_size ) ); | ||
index->buildIndex(); | ||
} | ||
|
||
~KDTreeVectorOfVectorsAdaptor() { | ||
delete index; | ||
} | ||
|
||
const VectorOfVectorsType &m_data; | ||
|
||
/** Query for the \a num_closest closest points to a given point (entered as query_point[0:dim-1]). | ||
* Note that this is a short-cut method for index->findNeighbors(). | ||
* The user can also call index->... methods as desired. | ||
* \note nChecks_IGNORED is ignored but kept for compatibility with the original FLANN interface. | ||
*/ | ||
inline void query(const num_t *query_point, const size_t num_closest, IndexType *out_indices, num_t *out_distances_sq, const int nChecks_IGNORED = 10) const | ||
{ | ||
nanoflann::KNNResultSet<num_t, IndexType> resultSet(num_closest); | ||
resultSet.init(out_indices, out_distances_sq); | ||
index->findNeighbors(resultSet, query_point, nanoflann::SearchParams()); | ||
} | ||
|
||
/** @name Interface expected by KDTreeSingleIndexAdaptor | ||
* @{ */ | ||
|
||
const self_t & derived() const { | ||
return *this; | ||
} | ||
self_t & derived() { | ||
return *this; | ||
} | ||
|
||
// Must return the number of data points | ||
inline size_t kdtree_get_point_count() const { | ||
return m_data.size(); | ||
} | ||
|
||
// Returns the dim'th component of the idx'th point in the class: | ||
inline num_t kdtree_get_pt(const size_t idx, const size_t dim) const { | ||
return m_data[idx][dim]; | ||
} | ||
|
||
// Optional bounding-box computation: return false to default to a standard bbox computation loop. | ||
// Return true if the BBOX was already computed by the class and returned in "bb" so it can be avoided to redo it again. | ||
// Look at bb.size() to find out the expected dimensionality (e.g. 2 or 3 for point clouds) | ||
template <class BBOX> | ||
bool kdtree_get_bbox(BBOX & /*bb*/) const { | ||
return false; | ||
} | ||
|
||
/** @} */ | ||
}; // end of KDTreeVectorOfVectorsAdaptor |