Explanipedia

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks Open

Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros · 2017

Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training d…

YOLOv4: Optimal Speed and Accuracy of Object Detection Open

Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao · 2020

Computer science

There are a huge number of features which are said to improve Convolutional Neural Network (CNN) accuracy. Practical testing of combinations of such features on large datasets, and theoretical justification of the result, is required. Some…

Learning Transferable Visual Models From Natural Language Supervision Open

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, et al. · 2021

Computer science Psychology Economics

State-of-the-art computer vision systems are trained to predict a fixed set of predetermined object categories. This restricted form of supervision limits their generality and usability since additional labeled data is needed to specify an…

Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation Open

Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, Hartwig Adam · 2018

Computer science Mathematics Chemistry

Spatial pyramid pooling module or encoder-decoder structure are used in deep neural networks for semantic segmentation task. The former networks are able to encode multi-scale contextual information by probing the incoming features with fil…

Simple online and realtime tracking Open

Alex Bewley, Zongyuan Ge, Lionel Ott, Fábio Ramos, Ben Upcroft · 2016

Computer science Physics Philosophy

This paper explores a pragmatic approach to multiple object tracking where the main focus is to associate objects efficiently for online and realtime applications. To this end, detection quality is identified as a key factor influencing tr…

You Only Look Once: Unified, Real-Time Object Detection Open

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi · 2016

Computer science Mathematics

We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associate…

Deep Learning for Computer Vision: A Brief Review Open

Athanasios Voulodimos, Nikolaos Doulamis, Anastasios Doulamis, Eftychios Protopapadakis · 2018

Computer science Physics

Over the last years deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief…

Scene Parsing through ADE20K Dataset Open

Bolei Zhou, Hang Zhao, Xavier Puig, Sanja Fidler, Adela Barriuso, et al. · 2017

Computer science Business Chemistry

Scene parsing, or recognizing and segmenting objects and stuff in an image, is one of the key problems in computer vision. Despite the community's efforts in data collection, there are still few image datasets covering a wide range of scen…

DOTA: A Large-Scale Dataset for Object Detection in Aerial Images Open

Gui-Song Xia, Xiang Bai, Jian Ding, Zhen Zhu, Serge Belongie, et al. · 2018

Computer science Geography Mathematics

Object detection is an important and challenging problem in computer vision. Although the past decade has witnessed major advances in object detection in natural scenes, such successes have been slow to reach aerial imagery, not only because of …

Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks Open

Aditya Chattopadhay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian · 2018

Computer science Philosophy

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of the…

Random Erasing Data Augmentation Open

Zhun Zhong, Liang Zheng, Guoliang Kang, Shaozi Li, Yi Yang · 2020

Computer science Mathematics Biology

In this paper, we introduce Random Erasing, a new data augmentation method for training the convolutional neural network (CNN). In training, Random Erasing randomly selects a rectangle region in an image and erases its pixels with random v…

Annual Review of Sociology Open

2017

Geography Sociology Political science

Outside of Indigenous studies, sociologists tend to treat land in the United States as governed exclusively by an entrenched private-property regime: Land is a commodity and an object for individual control. This review presents land in th…

Deep Learning for Generic Object Detection: A Survey Open

Li Liu, Wanli Ouyang, Xiaogang Wang, Paul Fieguth, Jie Chen, et al. · 2019

Computer science Geography Mathematics

Object detection, one of the most fundamental and challenging problems in computer vision, seeks to locate object instances from a large number of predefined categories in natural images. Deep learning techniques have emerged as a powerful…

A Review of Yolo Algorithm Developments Open

Peiyuan Jiang, Daji Ergu, Fangyao Liu, Ying Cai, Bo Ma · 2022

Computer science Philosophy Mathematics

Object detection techniques are the foundation for the artificial intelligence field. This research paper gives a brief overview of the You Only Look Once (YOLO) algorithm and its subsequent advanced versions. Through the analysis, we reac…

RU-AI: A Large Multimodal Dataset for Machine Generated Content Detection Open

Tsung-Yi Lin, Michael Maire, Serge Belongie, Lubomir Bourdev, Ross Girshick · 2024

Computer science Geography Geology

This repository contains all the collected and aligned data for RU-AI dataset. It is constructed based on three large publicly available datasets: Flickr8K, COCO, and Places205, by adding their corresponding machine-generated pairs from fi…

PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes Open

Xiang Yu, Tanner Schmidt, Venkatraman Narayanan, Dieter Fox · 2018

Computer science Mathematics Chemistry

Estimating the 6D pose of known objects is important for robots to interact with the real world. The problem is challenging due to the variety of objects as well as the complexity of a scene caused by clutter and occlusions between objects.…

Densely Connected Convolutional Networks Open

Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger · 2016

Computer science Engineering Philosophy

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we emb…

Cantera: An Object-oriented Software Toolkit for Chemical Kinetics, Thermodynamics, and Transport Processes Open

David G. Goodwin, Raymond L. Speth, Harry K. Moffat, Bryan W. Weber · 2018

Computer science Physics

Cantera is a suite of object-oriented software tools for problems involving chemical kinetics, thermodynamics, and/or transport processes. Cantera provides types (or classes) of objects representing phases of matter, interfaces between the…

GOT-10k: A Large High-Diversity Benchmark for Generic Object Tracking in the Wild Open

Lianghua Huang, Xin Zhao, Kaiqi Huang · 2019

Computer science Geography Sociology

We introduce here a large tracking database that offers an unprecedentedly wide coverage of common moving objects in the wild, called GOT-10k. Specifically, GOT-10k is built upon the backbone of WordNet structure [1] and it populates the m…

Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling Open

Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, Joshua B. Tenenbaum · 2016

Computer science Physics

We study the problem of 3D object generation. We propose a novel framework, namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional …

Learning dexterous in-hand manipulation Open

OpenAI, Marcin Andrychowicz, Bowen Baker, Maciek Chociej, Rafał Józefowicz, Bob McGrew, et al. · 2019

Computer science Psychology

We use reinforcement learning (RL) to learn dexterous in-hand manipulation policies that can perform vision-based object reorientation on a physical Shadow Dexterous Hand. The training is performed in a simulated environment in which we ra…

Mask R-CNN Open

Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick · 2017

Computer science Philosophy History

We present a conceptually simple, flexible, and general framework for object instance segmentation. Our approach efficiently detects objects in an image while simultaneously generating a high-quality segmentation mask for each instance. Th…

The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale Open

Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Uijlings, Ivan Krasin, et al. · 2018

Computer science Geography Economics

We present Open Images V4, a dataset of 9.2M images with unified annotations for image classification, object detection and visual relationship detection. The images have a Creative Commons Attribution license that allows to share and adap…

MOT16: A Benchmark for Multi-Object Tracking Open

Anton Milan, Laura Leal-Taixé, Ian Reid, Stefan Roth, Konrad Schindler · 2016

Computer science Psychology Physics

Standardized benchmarks are crucial for the majority of computer vision applications. Although leaderboards and ranking tables should not be over-claimed, benchmarks often provide the most objective measure of performance and are therefore…

Focal Loss for Dense Object Detection Open

Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár · 2017

Computer science

The highest accuracy object detectors to date are based on a two-stage approach popularized by R-CNN, where a classifier is applied to a sparse set of candidate object locations. In contrast, one-stage detectors that are applied over a reg…

Interactive Attention Networks for Aspect-Level Sentiment Classification Open

Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng Wang · 2017

Computer science Biology Economics

Aspect-level sentiment classification aims at identifying the sentiment polarity of specific target in its context. Previous approaches have realized the importance of targets in sentiment classification and developed various methods with …

YOLOv10: Real-Time End-to-End Object Detection Open

Ao Wang, Hui Chen, Lihao Liu, Kai Chen, Zijia Lin, et al. · 2024

Computer science Mathematics

Over the past years, YOLOs have emerged as the predominant paradigm in the field of real-time object detection owing to their effective balance between computational cost and detection performance. Researchers have explored the architectur…

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object Open

Xue Yang, Junchi Yan, Zi‐Ming Feng, Tao He · 2021

Computer science Geography Economics

Rotation detection is a challenging task due to the difficulties of locating the multi-angle objects and separating them effectively from the background. Though considerable progress has been made, for practical settings, there still exist…

A Neural Algorithm of Artistic Style Open

Leon A. Gatys, Alexander S. Ecker, Matthias Bethge · 2016

Computer science Art Psychology

In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown…
