Search Results for author: Nadav Borenstein

Found 8 papers, 6 papers with code

What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages

no code implementations • 6 Jun 2024 • Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell

We find that the RLM rank, which corresponds to the size of the linear space spanned by the logits of its conditional distributions, and the expected length of sampled strings are strong and significant predictors of learnability for both RNNs and Transformers.

Language Modelling

Imitation of Life: A Search Engine for Biologically Inspired Design

1 code implementation • 20 Dec 2023 • Hen Emuna, Nadav Borenstein, Xin Qian, Hyeonsu Kang, Joel Chan, Aniket Kittur, Dafna Shahaf

We release data and code; we view BARcode as a step towards addressing the challenges that have historically hindered the practical application of BID to engineering innovation.

Natural Language Understanding

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

2 code implementations • 15 Nov 2023 • Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs.

Fact Checking, Sentence

PHD: Pixel-Based Language Modeling of Historical Documents

1 code implementation • 22 Oct 2023 • Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period.

Language Modelling, Optical Character Recognition (OCR)

Measuring Intersectional Biases in Historical Documents

1 code implementation • 21 May 2023 • Nadav Borenstein, Karolina Stańczak, Thea Rolskov, Natália da Silva Perez, Natacha Klein Käfer, Isabelle Augenstein

We find that there is a trade-off between the stability of the word embeddings and their compatibility with the historical dataset.

Optical Character Recognition (OCR)

Multilingual Event Extraction from Historical Newspaper Adverts

1 code implementation • 18 May 2023 • Nadav Borenstein, Natalia da Silva Perez, Isabelle Augenstein

We find that: 1) even with scarce annotated data, it is possible to achieve surprisingly good results by formulating the problem as an extractive QA task and leveraging existing datasets and models for modern languages; and 2) cross-lingual low-resource learning for historical languages is highly challenging, and machine translation of the historical datasets to the considered target languages is, in practice, often the best-performing solution.

Event Extraction, Machine Translation

Temporally stable video segmentation without video annotations

no code implementations • 17 Oct 2021 • Aharon Azulay, Tavi Halperin, Orestis Vantzos, Nadav Borenstein, Ofir Bibi

Temporally consistent dense video annotations are scarce and hard to collect.

Decoder, Image Segmentation

How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

1 code implementation • ACL 2021 • Chen Shani, Nadav Borenstein, Dafna Shahaf

We construct a dataset containing thousands of funny papers and use it to learn classifiers, combining findings from psychology and linguistics with recent advances in NLP.
