{"id":1027391,"date":"2023-08-06T16:56:35","date_gmt":"2023-08-06T20:56:35","guid":{"rendered":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/uncategorized\/use-cases-of-stereo-matching-part7machine-learning-ai-medium.php"},"modified":"2023-08-06T16:56:35","modified_gmt":"2023-08-06T20:56:35","slug":"use-cases-of-stereo-matching-part7machine-learning-ai-medium","status":"publish","type":"post","link":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/use-cases-of-stereo-matching-part7machine-learning-ai-medium.php","title":{"rendered":"Use cases of Stereo Matching part7(Machine Learning + AI) &#8211; Medium"},"content":{"rendered":"<p><p>Author: Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud<\/p>\n<p>Abstract: Despite impressive performance on high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo matching or optical flow. The application of self-supervised concepts, such as instance discrimination or masked image modeling, to geometric tasks is an active area of research. In this work, we build on the recent cross-view completion framework, a variation of masked image modeling that leverages a second view from the same scene, which makes it well suited for binocular downstream tasks. The applicability of this concept has so far been limited in at least two ways: (a) by the difficulty of collecting real-world image pairs (in practice, only synthetic data have been used) and (b) by the lack of generalization of vanilla transformers to dense downstream tasks, for which relative position is more meaningful than absolute position. 
We explore three avenues of improvement: first, we introduce a method to collect suitable real-world image pairs at large scale. Second, we experiment with relative positional embeddings and show that they enable vision transformers to perform substantially better. Third, we scale up vision transformer-based cross-completion architectures, which is made possible by the use of large amounts of data. With these improvements, we show for the first time that state-of-the-art results on stereo matching and optical flow can be reached without using any classical task-specific techniques such as correlation volumes, iterative estimation, image warping or multi-scale reasoning, thus paving the way towards universal vision models.<\/p>\n<p>2. Self-Supervised Intensity-Event Stereo Matching (arXiv)<\/p>\n<p>Author: Jinjin Gu, Jinan Zhou, Ringo Sai Wo Chu, Yan Chen, Jiawei Zhang, Xuanye Cheng, Song Zhang, Jimmy S. Ren<\/p>\n<p>Abstract: Event cameras are novel bio-inspired vision sensors that output pixel-level intensity changes with microsecond accuracy, a high dynamic range and low power consumption. Despite these advantages, event cameras cannot be directly applied to computational imaging tasks due to the inability to obtain high-quality intensity images and events simultaneously. This paper aims to connect a standalone event camera and a modern intensity camera so that applications can take advantage of both sensors. We establish this connection through a multi-modal stereo matching task. We first convert events to a reconstructed image and extend existing stereo networks to this multi-modality condition. We propose a self-supervised method to train the multi-modal stereo network without using ground-truth disparity data. A structure loss calculated on image gradients is used to enable self-supervised learning on such multi-modal data. 
Exploiting the internal stereo constraint between views of different modalities, we introduce general stereo loss functions, including a disparity cross-consistency loss and an internal disparity loss, leading to improved performance and robustness compared to existing approaches. Experiments demonstrate the effectiveness of the proposed method, especially the proposed general stereo loss functions, on both synthetic and real datasets. Finally, we shed light on employing the aligned events and intensity images in downstream tasks, e.g., video interpolation.<\/p>\n<p>Read the original:<\/p>\n<p><a target=\"_blank\" rel=\"nofollow noopener\" href=\"https:\/\/medium.com\/@monocosmo77\/use-cases-of-stereo-matching-part7-machine-learning-ai-1fc4cab5baa5\" title=\"Use cases of Stereo Matching part7(Machine Learning + AI) - Medium\">Use cases of Stereo Matching part7(Machine Learning + AI) - Medium<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p> Author: Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud Abstract: Despite impressive performance on high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo matching or optical flow. 
The application of self-supervised concepts, such as instance discrimination or masked image modeling, to geometric tasks is an active area of research <a href=\"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/machine-learning\/use-cases-of-stereo-matching-part7machine-learning-ai-medium.php\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"limit_modified_date":"","last_modified_date":"","_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[1231415],"tags":[],"class_list":["post-1027391","post","type-post","status-publish","format-standard","hentry","category-machine-learning"],"modified_by":null,"_links":{"self":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027391"}],"collection":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/comments?post=1027391"}],"version-history":[{"count":0,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/posts\/1027391\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/media?parent=1027391"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/categories?post=1027391"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.euvolution.com\/futurist-transhuman-news-blog\/wp-json\/wp\/v2\/tags?post=10
27391"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}