How to deal with… non-submodular and higher-order energies (Part 1)
Carsten Rother
27/06/2014 Machine Learning 2

Advertisement
Theoretical Side:
• Optimization and learning in discrete-domain models (CRFs, higher-order models, continuous label space, loss-based learning, etc.)
Application Side:
• Scene recovery from multiple images
• 3D scene understanding
• Bio imaging
Main Research Theme:
• Combining physics-based vision with machine learning: generative models meet discriminative models

State-of-the-art CRF models
Gibbs distribution: $p(\mathbf{y} \mid \mathbf{x}, \mathbf{w}) = \frac{1}{Z(\mathbf{x}, \mathbf{w})} e^{-E(\mathbf{y}, \mathbf{x}, \mathbf{w})}$
Energy: $E(\mathbf{y}, \mathbf{x}, \mathbf{w}) = \sum_F E_F(y_F, \mathbf{x}, w_F)$
[Figure: factor graph over the variables $y_i$ and its compact form]

Deconvolution
Combine physics and machine learning:
1) Using physics: add a Gaussian “likelihood” $(x - K*y)^2$
2) Put it into a deep learning approach (stacked RTFs)
[Figure: stacked RTFs with x, RTF1, y1, RTF2, y2, …; input x = K*y, output y]
[Schmidt, Rother, Nowozin, Jancsary, Roth, CVPR 2013. Best student paper award]

Scene recovery from multiple images
[Figure: 2 RGBD inputs]

Scene recovery from single images
[NIPS 13, joint work with Oxford University]

BioImaging
Joint work with Myers group (Dagmar, Florian, and others)
[Figure: atlas and instance]

3D Scene Understanding
• Training time: 3D objects
• Test time: [Figure]

Advertisement
• If you are excited about any of these topics, come to us for a “Forschungspraktikum”, master thesis, diploma thesis, etc.
• If you want to collaborate with top industry labs or universities, come to us. Examples:
• BMW, Adobe, Microsoft Research, Daimler, etc.
• Top universities: in Israel, Oxford, Heidelberg, etc.

Advertisement
Joint project with “Institut für Luftfahrt und Logistik”
[Figure: Lidar scanner]
Smart 3D point cloud processing:
- 3D fine-grained recognition: type of aircraft, vehicle, objects, …
- Tracking: 3D models with varying degree of information
- Structured data: how to define a CRF/RTF?
- Combine physics-based vision (generative models) with machine learning
There is an opening for a master project / PhD student. If you are interested, talk to me after the lecture!

Reminder: Pairwise energies
$E(x) = \sum_{i \in V} \theta_i(x_i) + \sum_{(i,j) \in E} \theta_{ij}(x_i, x_j) + \theta_{\text{const}}$
For now, $x_i \in \{0,1\}$; $G = (V, E)$ is an undirected graph.
Visualization of the full energy:
[Figure: unary costs $\theta_i(0)$, $\theta_i(1)$ for $x_i = 0$, $x_i = 1$, and pairwise costs $\theta_{ij}(0,0)$, $\theta_{ij}(0,1)$, $\theta_{ij}(1,0)$, $\theta_{ij}(1,1)$ between $x_i$ and $x_j$]
$\theta_{ij}(0,0)$ is also sometimes written as $\theta_{ij;00}$.
Submodular condition: $\theta_{ij}(0,0) + \theta_{ij}(1,1) \le \theta_{ij}(1,0) + \theta_{ij}(0,1)$
• If all terms are submodular, then the global optimum can be computed in polynomial time with graph cut
• If not… this lecture

How often do we have submodular terms?
$\theta_{ij}(0,0) + \theta_{ij}(1,1) \le \theta_{ij}(1,0) + \theta_{ij}(0,1)$
Label smoothness is often the natural condition: neighboring pixels have the same label more often than not.
We may choose: $\theta_{ij}(0,0) = \theta_{ij}(1,1) = 0$; $\theta_{ij}(1,0) = \theta_{ij}(0,1) \ge 0$
In alpha expansion (reminder later) the energy is often “naturally” submodular:
[Figure: left image (a), right image (b), labelling; plot of cost against $|x_i - x_j|$]

Importance of good optimization
Input: image sequence [Data courtesy of Oliver Woodford]
Output: new view
Problem: minimize a binary 4-connected energy (non-submodular) (choose a colour-mode at each pixel)

Importance of good optimization
[Figure panels: Ground Truth; Graph Cut with truncation [Rother et al ’05]; Belief Propagation; ICM, Simulated Annealing; QPBO [Hammer ’84] (black: unknown); QPBOP [Boros ’06, see Rother ’07], Global Minimum]

Simplest idea to deal with non-submodular terms
• Truncate all non-submodular terms, i.e. those with
$\theta_{ij}(0,0) + \theta_{ij}(1,1) > \theta_{ij}(1,0) + \theta_{ij}(0,1)$
• Shift each entry by $\delta$ so that the condition holds with equality:
$(\theta_{ij}(0,0) - \delta) + (\theta_{ij}(1,1) - \delta) = (\theta_{ij}(1,0) + \delta) + (\theta_{ij}(0,1) + \delta)$
$\delta = \frac{1}{4}\,[\theta_{ij}(0,0) + \theta_{ij}(1,1) - \theta_{ij}(1,0) - \theta_{ij}(0,1)]$
Better techniques to come…
27/06/2014 Machine Learning 2: QPBO and Dual-Decomposition

How often do we have non-submodular terms?
• Learning (unconstrained parameters)
[Figure: MRF and DTF on training data and test data; red: non-submodular, blue: submodular; graph connectivity: 64]

Texture Denoising
[Figure panels: Training images; Test image; Test image (60% noise); Result MRF 4-connected (neighbours); Result MRF 4-connected; Result MRF 9-connected (7 attractive; 2 repulsive)]

How often do we have non-submodular terms?
Deconvolution:
Hand-crafted scenarios:
[Figure panels: Input Image; User Input; Global optimum]
Many more examples later: diagram recognition, fusion move, etc.
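
The truncation rule above (subtract $\delta$ from the two diagonal entries, add $\delta$ to the two off-diagonal entries) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a pairwise term is stored as a 2x2 nested list `theta[a][b]`, and the function names `is_submodular` and `truncate` are hypothetical, not taken from the lecture. Note that the truncated energy is an approximation of the original one, so its minimizer need not minimize the original energy.

```python
def is_submodular(theta):
    """Check theta(0,0) + theta(1,1) <= theta(1,0) + theta(0,1).

    theta[a][b] is the pairwise cost for labels (x_i = a, x_j = b).
    """
    return theta[0][0] + theta[1][1] <= theta[1][0] + theta[0][1]


def truncate(theta):
    """Return a copy of theta that satisfies the submodular condition.

    Submodular terms are returned unchanged. For a non-submodular term,
    delta is a quarter of the violation; it is subtracted from the
    diagonal and added to the off-diagonal entries, so the condition
    then holds with equality.
    """
    if is_submodular(theta):
        return [row[:] for row in theta]
    delta = (theta[0][0] + theta[1][1] - theta[1][0] - theta[0][1]) / 4.0
    return [
        [theta[0][0] - delta, theta[0][1] + delta],
        [theta[1][0] + delta, theta[1][1] - delta],
    ]


# Hypothetical example term that penalizes equal labels (non-submodular).
repulsive = [[4, 0], [0, 4]]
print(is_submodular(repulsive))
print(truncate(repulsive))
```

In this example the violation is 8, so $\delta = 2$ and every entry of the truncated term becomes 2: the term no longer expresses any preference, which shows how much information truncation can discard.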

Reparametrization
Two reparametrizations we need:
• Pairwise transform
• Unary transform
[Figure: diagrams of the pairwise transform and the unary transform, annotated with $+\delta$, $-\delta$ and $\theta_{\text{const}}$]
[Minimizing non-submodular energies with graph cut, Kolmogorov, Rother, PAMI 2007]

Put energies into “normal form”
1) Apply pairwise transformations until, for all pairs of incoming edges,
$\min\{\theta_{pq;0j}, \theta_{pq;1j}\} = 0$ for all directed edges $p \to q$ and all $j \in \{0,1\}$
2) Apply unary transformations until
$\min\{\theta_{p;0}, \theta_{p;1}\} = 0$ for all $p$

Construct the graph
The minimum cut through the graph gives the solution $x^* = \arg\min_x E(x)$
[Figure: graph construction]

QPBO method
$E(\{x_p\}) = \sum_p E_p(x_p) + \sum_{pq\ \text{sub.}} E_{pq}(x_p, x_q) + \sum_{pq\ \text{non-sub.}} E_{pq}(x_p, x_q)$
$E'(\{x_p\}, \{\bar{x}_p\}) = \sum_p \frac{E_p(x_p) + E_p(1 - \bar{x}_p)}{2}$ (unary)
$\quad + \sum_{pq\ \text{sub.}} \frac{E_{pq}(x_p, x_q) + E_{pq}(1 - \bar{x}_p, 1 - \bar{x}_q)}{2}$ (pairwise submodular)
$\quad + \sum_{pq\ \text{non-sub.}} \frac{E_{pq}(x_p, 1 - \bar{x}_q) + E_{pq}(1 - \bar{x}_p, x_q)}{2}$ (pairwise non-submodular)
• Double the number of variables: $x_p \to x_p, \bar{x}_p$
• $E'$ is submodular!
• Construct the graph and solve with graph cut: less than double the runtime of graph cut
• The method is called QPBO: Quadratic Pseudo-Boolean Optimization (not a good name)
[Hammer et al. ’84, Boros et al ’91; see Kolmogorov, Rother ’07]

Read out the solution
• Assign labels based on the minimum cut in the auxiliary graph:
$x_p = 1,\ \bar{x}_p = 0 \Rightarrow x_p = 1$
$x_p = 0,\ \bar{x}_p = 1 \Rightarrow x_p = 0$
$x_p = 0,\ \bar{x}_p = 0 \Rightarrow x_p = {?}$
$x_p = 1,\ \bar{x}_p = 1 \Rightarrow x_p = {?}$

Properties
[Figure: label grids for x (partial), y (any complete), z = FUSE(x,y), and the global optimum]
• Autarky (persistency) property: fusing the partial labeling x into any complete labeling y does not increase the energy, $E(z) \le E(y)$
• Partial optimality: labeled pixels in x belong to a global minimum
• Labeled nodes have the same result as the LP relaxation of the problem E (but QPBO is a very fast solver)
[Hammer et al ’84, Schlesinger ’76, Werner ’07, Kolmogorov, Wainwright ’05; Kolmogorov ’06]

When do we get all nodes labeled?
• The function is submodular
• If there exists a flipping that makes the energy fully submodular, then QPBO will find it
• We can be simply “lucky”
• What to do with unlabelled nodes: run some other method (e.g. BP)

Extension: QPBOP (“P” stands for “Probing”)
[Figure: QPBO result on nodes p, q, r, s, t, and the results of probing node p with $x_p = 0$ and $x_p = 1$]
• $x_q = 0$ for a global minimum: remove node $x_q$ from the energy
• $x_p = x_r$: remove node $x_r$ from the energy
• A one-sided implication between $x_p$ and $x_s$: add a directed link
• Why did QPBO not find this solution? Probing enforces the integer constraint on p (a tighter relaxation)

Two extensions: QPBOP, QPBOI
1. Run QPBO; this gives the set of unlabeled nodes U
2. Probe a node $p \in U$
3. Simplify the energy: remove nodes and add links
4. Run QPBO, update U
5. Stop if the energy stays the same for all $p \in U$, otherwise go to 2.
Properties:
- The new energy preserves global optimality and (sometimes) gives the global minimum
- The probing order may affect the result

QPBO versus QPBOP
[Figure panels: QPBO, 73% unlabeled (0.08 sec); QPBOP, Global Minimum (0.4 sec)]

Extension: QPBOI (“I” stands for “Improve”)
[Figure: label grids for y (e.g. from BP), x (partial), y’ = FUSE(x,y)]
• Property: $E(y') \le E(y)$ [persistency property]

Extension: QPBOI (“I” stands for “Improve”)
[Figure: label grids for y’ and x (partial)]
• Property: $E(y'') \le E(y')$ for y’’ = FUSE(x,y’) [autarky property]
• QPBOI algorithm: choose a sequence of nested sets
• QPBO-stable: no set changes the labelling (sometimes a global minimum)

Results
Three important factors:
• Degree of non-submodularity (NS)
• Unary strength
• Connectivity (average degree of a node)

Results: Diagram Recognition
• 2700 test cases: QPBOP solved all
[Figure panels: Ground truth; QPBOP (0 sec), Global Min.; P+BP+I, BP+I E=0 (0 sec); Sim. Ann. E=0 (0.28 sec); QPBO: 56.3% unlabeled (0 sec); BP E=25 (0 sec); GraphCut E=119 (0 sec); ICM E=999 (0 sec)]

Results: Deconvolution
[Figure panels: Ground Truth; Input; QPBO 45% unlab. (red) (0.01 sec); QPBO-C 43% unlab. (red) (0.4 sec); ICM E=14 (0 sec); GC E=999 (0 sec); BP E=5 (0.5 sec); BP+I E=3.6 (1 sec); C+BP+I, Sim. Ann. E=0 (0.4 sec)]

Move on to multi-label
• Let’s apply QPBO(P/I) methods to multi-label problems
• In particular alpha expansion

Reminder: Alpha expansion
• Variables take label $\alpha$ or retain their current label
[Figure: status after each step: initialize with Tree, expand Ground, expand House, expand Sky]
[Boykov, Veksler and Zabih 2001]

Reminder: Alpha expansion
• Given the original energy $E(x)$
• At each step we have two solutions: $x^0$, $x^1$
• Define the (variable-wise) combination: $x_i^{01} = (1 - x_i')\, x_i^0 + x_i'\, x_i^1$ (where $x_i' \in \{0,1\}$ is the selection variable)
• Construct a new energy $E'$ such that $E'(x') = E(x^{01})$
• The move energy $E'(x')$ is submodular if:
$\theta_{ij}(x_a, x_b) = 0$ iff $x_a = x_b$
$\theta_{ij}(x_a, x_b) = \theta_{ij}(x_b, x_a) \ge 0$
$\theta_{ij}(x_a, x_b) + \theta_{ij}(x_b, x_c) \ge \theta_{ij}(x_a, x_c)$
Examples: Potts model, truncated linear (not truncated quadratic)
Other move strategies: alpha-beta swap, range move, etc.
[Boykov, Veksler and Zabih 2001]

Reminder: Alpha expansion
• What to do if the move energy is non-submodular?
• Run QPBO
• For unlabeled pixels:
  • Choose the solution ($x^0$ or $x^1$) that has the lower energy $E$
  • Replace unlabeled nodes with the chosen solution
• This guarantees that the new solution has equal or better energy than both $E(x^0)$ and $E(x^1)$ (see persistency property)

Fusion Move
• Given the original energy $E(x)$
• At each step we have two arbitrary solutions: $x^0$, $x^1$
• Define the (variable-wise) combination: $x_i^{01} = (1 - x_i')\, x_i^0 + x_i'\, x_i^1$ (where $x_i' \in \{0,1\}$ is the selection variable)
• Construct a new energy $E'$ such that $E'(x') = E(x^{01})$
• Run QPBO and fix unlabeled nodes as above
• Comment: in practice $E'$ is often submodular if both solutions are good (since the energy prefers neighboring nodes to be similar)

Fusion move to make alpha expansion parallel
• One processor needs 7 sequential alpha expansions for 8 labels: 1,2,3,4,5,6,7,8
• Four processors need only 3 sequential steps (still 7 alpha expansions):
  Step 1: p1: ∎(1-2), p2: ∎(3-4), p3: ∎(5-6), p4: ∎(7-8)
  Step 2: ∎(1-4), ∎(5-8)
  Step 3: ∎(1-8)
∎ means fusion

Fusion move for continuous label-spaces
Local gradient cost: $x_i - x_{i+1}$
Victor Lempitsky, Stefan Roth, and Carsten Rother, FusionFlow: Discrete-Continuous Optimization for Optical Flow Estimation, CVPR 2008

FusionFlow: comparisons
[Figure]

LogCut: Dealing efficiently with large label spaces
Optical flow: 1024 discrete labels
[Figure: result and ground truth]
Victor Lempitsky, Carsten Rother, and Andrew Blake, LogCut: Efficient Graph Cut Optimization for Markov Random Fields, in ICCV, 2007

LogCut: basic idea
$E(x) = \sum_p E_p(x_p) + \sum_{p,q} E_{pq}(x_p, x_q)$ with $x_p \in [0, K]$
• Alpha expansion: we need $K-1$ binary decisions to get a labeling out
• Encode the label space $K$ (e.g. $K=64$) with $\log_2 K$ bits (e.g.
6 bits). Example: 44 = 101100
We only need $\log_2 K$ (here 6) binary decisions to get a labeling out

Example stereo matching
Stereo (Tsukuba), 16 labels:
Bit 4: 0xxx (0-7 versus 8-15)
Bit 3: 00xx (0-3 versus 4-7)
Bit 2: 001x (0-1 versus 2-3)
Bit 1: 0010 (2 versus 3)

How to choose the energy?
E.g. bit 3: $E'(x') = \sum_p E'_p(x'_p) + \sum_{p,q} E'_{pq}(x'_p, x'_q)$ with $x'_p \in \{0,1\}$
Unary: $x'_p = 0$ stands for $x_p \in [0,3]$ and $x'_p = 1$ for $x_p \in [4,7]$
$E'_p(0) = \min_{x_p \in [0,3]} E_p(x_p)$
$E'$ is a lower bound of $E$ (tight if there are no pairwise terms)

How to choose the energy?
Pairwise: $E_{p,q}(x_p, x_q) = a \min[(x_p - x_q)^{p}, b]$
[Figure: plot of $E$ against $|x_p - x_q|$ with truncation at $b$; the cost matrix $E_{p,q}(x_p, x_q)$ below, where each 4×4 block has to be summarized by a single value $E'_{p,q}(x'_p, x'_q)$, e.g. $E'_{p,q}(0,0)$]
0 1 2 3 3 3 3 3
1 0 1 2 3 3 3 3
2 1 0 1 2 3 3 3
3 2 1 0 1 2 3 3
3 3 2 1 0 1 2 3
3 3 3 2 1 0 1 2
3 3 3 3 2 1 0 1
3 3 3 3 3 2 1 0
Approximations:
1. Choose One
2. Min
3. Mean
4. Weighted Mean
5. Training

Comparison
Image restoration (2 different models):
[Figure: for each of the two models, results for One, Min, Mean, Weighted Mean, Training and aExp]

LogCut
Iterative LogCut:
1. One sweep: log(K) optimizations
2. Shift labels
3. One sweep: log(K) optimizations
4. Fuse with current solution
5. Go to 2.
Labels: 1,2,3,4,5,6,7,8
Shift by 3: 6,7,8,1,2,3,4,5
[Figure: energy for no shift, ½ shift and full shift]

Results
[Figure panels: Training; Test; Ground Truth; LogCut (2 iter), 8 sec, E=8767; LogCut (64 iter), 150 sec, E=8469; AExp (6 iter), 390 sec, E=8773]
Speed-up factor: 20.7

Results
[Figure: Train (out of 10) and Test (out of 10) for LogCut 1.5 sec; Effic. BP 2.1 sec; AExp 4.7 sec; TRW 90 sec]
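
The bit-wise LogCut construction described above can be illustrated with a small Python sketch. It shows the binary label encoding, the unary term $E'_p$ of one bit decision as the minimum over each half of the current label range, and two ways of summarizing a block of the pairwise cost matrix. Everything here is an assumption made for illustration: the helper names (`label_bits`, `unary_for_bit`, `pairwise_block`), the example costs, and in particular the reading of the “Min” and “Mean” approximations as the minimum and the average over a block. The slides only name these approximations.

```python
def label_bits(label, num_bits):
    """Binary code of a label, most significant bit first."""
    return format(label, "0{}b".format(num_bits))


def unary_for_bit(unary, lo, hi):
    """Unary costs (E'_p(0), E'_p(1)) for splitting labels [lo, hi) in half.

    Each cost is the minimum original unary cost over the labels that
    remain possible after the bit decision.
    """
    mid = (lo + hi) // 2
    return min(unary[lo:mid]), min(unary[mid:hi])


def pairwise_block(cost, lo_p, hi_p, lo_q, hi_q):
    """All pairwise costs for x_p in [lo_p, hi_p) and x_q in [lo_q, hi_q)."""
    return [cost(a, b) for a in range(lo_p, hi_p) for b in range(lo_q, hi_q)]


def truncated_linear(a, b):
    """Truncated linear cost min(|a - b|, 3), as in the 8x8 matrix above."""
    return min(abs(a - b), 3)


# Label 44 out of K = 64 labels needs 6 bits.
print(label_bits(44, 6))

# Hypothetical unary costs of one pixel over 8 labels.
unary = [5, 3, 7, 9, 2, 8, 6, 4]
print(unary_for_bit(unary, 0, 8))

# Assumed "Min" and "Mean" summaries of two blocks of the cost matrix.
block_00 = pairwise_block(truncated_linear, 0, 4, 0, 4)
block_01 = pairwise_block(truncated_linear, 0, 4, 4, 8)
print(min(block_00), sum(block_00) / len(block_00))
print(min(block_01), sum(block_01) / len(block_01))
```

The two summaries differ noticeably even on this small matrix, which gives some intuition for why the choice of approximation is compared experimentally in the slides.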