Search Results for author: Nadav Borenstein

Found 8 papers, 6 papers with code

What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages

no code implementations • 6 Jun 2024 • Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell

We find that the RLM rank, which corresponds to the size of the linear space spanned by the logits of its conditional distributions, and the expected length of sampled strings are strong and significant predictors of learnability for both RNNs and Transformers.

Language Modelling

Imitation of Life: A Search Engine for Biologically Inspired Design

1 code implementation • 20 Dec 2023 • Hen Emuna, Nadav Borenstein, Xin Qian, Hyeonsu Kang, Joel Chan, Aniket Kittur, Dafna Shahaf

We release data and code; we view BARcode as a step towards addressing the challenges that have historically hindered the practical application of BID to engineering innovation.

Natural Language Understanding

PHD: Pixel-Based Language Modeling of Historical Documents

1 code implementation • 22 Oct 2023 • Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein

We then pre-train our model, PHD, on a combination of synthetic scans and real historical newspapers from the 1700-1900 period.

Language Modelling, Optical Character Recognition (OCR)

Multilingual Event Extraction from Historical Newspaper Adverts

1 code implementation • 18 May 2023 • Nadav Borenstein, Natalia da Silva Perez, Isabelle Augenstein

We find that: 1) even with scarce annotated data, it is possible to achieve surprisingly good results by formulating the problem as an extractive QA task and leveraging existing datasets and models for modern languages; and 2) cross-lingual low-resource learning for historical languages is highly challenging, and machine translation of the historical datasets to the considered target languages is, in practice, often the best-performing solution.

Event Extraction, Machine Translation

How Did This Get Funded?! Automatically Identifying Quirky Scientific Achievements

1 code implementation • ACL 2021 • Chen Shani, Nadav Borenstein, Dafna Shahaf

We construct a dataset containing thousands of funny papers and use it to learn classifiers, combining findings from psychology and linguistics with recent advances in NLP.
