Descriptive
329 papers with code • 1 benchmark • 1 dataset
Most implemented papers
Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books
Books are a rich source of both fine-grained information (what a character, an object, or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story).
Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos.
A Hierarchical Approach for Generating Descriptive Image Paragraphs
Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail.
PL-SLAM: a Stereo SLAM System through the Combination of Points and Line Segments
This paper proposes PL-SLAM, a stereo visual SLAM system that combines both points and line segments to work robustly in a wider variety of scenarios, particularly in those where point features are scarce or not well-distributed in the image.
CLEVRER: CoLlision Events for Video REpresentation and Reasoning
While state-of-the-art video-reasoning models thrive on the perception-based task (descriptive), they perform poorly on the causal tasks (explanatory, predictive, and counterfactual), suggesting that a principled approach to causal reasoning should incorporate the capability of both perceiving complex visual and language inputs and understanding the underlying dynamics and causal relations.
Uninformed Students: Student-Teacher Anomaly Detection with Discriminative Latent Embeddings
Our experiments demonstrate improvements over state-of-the-art methods on a number of real-world datasets, including the recently introduced MVTec Anomaly Detection dataset that was specifically designed to benchmark anomaly segmentation algorithms.
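The scoring idea is compact enough to sketch: students are trained to regress the teacher's dense features on anomaly-free data, so at test time anomalies show up both as large regression errors and as disagreement within the student ensemble. A minimal sketch, assuming pre-extracted dense feature maps; all names and shapes are illustrative, not the paper's API:

```python
import numpy as np

# Per-pixel anomaly score from a teacher feature map and an ensemble of
# student feature maps, each of shape (H, W, D). The students were trained
# to mimic the teacher on anomaly-free data only.
def anomaly_map(teacher_feats, student_feats_list):
    students = np.stack(student_feats_list)      # (S, H, W, D)
    mean_student = students.mean(axis=0)         # (H, W, D)
    # Regression error: students fail to reproduce the teacher on anomalies.
    err = ((mean_student - teacher_feats) ** 2).sum(axis=-1)
    # Predictive variance: students disagree on inputs unseen during training.
    var = students.var(axis=0).sum(axis=-1)
    return err + var                             # (H, W) anomaly scores
```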
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search
In this work we present CLIP-GLaSS, a novel zero-shot framework that generates an image (or a caption) corresponding to a given caption (or image).
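The mechanism is easy to sketch: keep CLIP fixed and search the latent space of a pretrained generator for a latent whose decoded image matches the caption. CLIP-GLaSS itself performs this search with a genetic algorithm; the sketch below substitutes plain gradient ascent for brevity, and `generator`, `clip_model`, and `z_dim` are illustrative stand-ins, not the paper's API:

```python
import torch
import torch.nn.functional as F

def search_latent(generator, clip_model, text_tokens, steps=200, lr=0.05):
    # Latent code to optimize; the generator decodes it into an image.
    z = torch.randn(1, generator.z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    text_emb = clip_model.encode_text(text_tokens).detach()
    for _ in range(steps):
        image = generator(z)                         # (1, 3, H, W)
        img_emb = clip_model.encode_image(image)
        # Maximize cosine similarity between image and caption embeddings.
        loss = -F.cosine_similarity(img_emb, text_emb).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```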
Visual Classification via Description from Large Language Models
By basing classification decisions on these LLM-generated descriptors, we can provide additional cues that encourage the model to rely on the features we want it to use.
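Concretely, the decision rule can be sketched in a few lines: each class is scored by the mean similarity between the image embedding and the embeddings of that class's LLM-generated descriptors ("has wings", "has a long tail", ...). The function name and the assumption of L2-normalized CLIP-style embeddings are illustrative:

```python
import numpy as np

# image_emb: (D,) embedding of the query image.
# descriptor_embs_per_class: {class_name: (K, D) descriptor embeddings}.
# All vectors are assumed L2-normalized, so dot products are cosine sims.
def classify_by_description(image_emb, descriptor_embs_per_class):
    scores = {
        cls: float(np.mean(embs @ image_emb))    # mean descriptor similarity
        for cls, embs in descriptor_embs_per_class.items()
    }
    return max(scores, key=scores.get), scores
```

Because each descriptor contributes its own similarity term, the per-descriptor scores also serve as an explanation of why a class was chosen.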
Music transcription modelling and composition using deep learning
We apply deep learning methods, specifically long short-term memory (LSTM) networks, to music transcription modelling and composition.
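The modelling side reduces to next-token prediction over a symbolic encoding (the paper works with ABC-notation transcriptions); composition is then just sampling from the trained model one token at a time. A minimal sketch, with vocabulary size and dimensions chosen for illustration:

```python
import torch
import torch.nn as nn

class MusicLSTM(nn.Module):
    def __init__(self, vocab_size=100, embed_dim=64, hidden_dim=512, layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, layers, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        x = self.embed(tokens)             # (B, T, E)
        out, state = self.lstm(x, state)   # (B, T, H)
        return self.head(out), state       # next-token logits

# Composition: seed the model with a few tokens, then repeatedly sample
# from the predicted distribution and feed the sample back in.
```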
Picture It In Your Mind: Generating High Level Visual Representations From Textual Descriptions
We implement the search process as a similarity search in a visual feature space, learning to translate a textual query into a visual representation.
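A minimal sketch of this retrieval loop, where `project_text` stands in for the learned text-to-visual mapping and `image_feats` is a precomputed (N, D) matrix of image features; both names are illustrative:

```python
import numpy as np

def search(query_text, project_text, image_feats, k=5):
    # Map the textual query into the visual feature space.
    q = project_text(query_text)                    # (D,)
    q = q / np.linalg.norm(q)
    feats = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sims = feats @ q                                # cosine similarities
    return np.argsort(-sims)[:k]                    # indices of top-k images
```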