Search Results for author: Dexun Li

Found 9 papers, 0 papers with code

Meta-Task Planning for Language Agents

no code implementations • 26 May 2024 • Cong Zhang, Derrick Goh Xin Deik, Dexun Li, Hao Zhang, Yong Liu

Effective planning is crucial for the success of LLM agents in real-world tasks, making it a highly pursued topic in the community.

Aligning Crowd Feedback via Distributional Preference Reward Modeling

no code implementations • 15 Feb 2024 • Dexun Li, Cong Zhang, Kuicai Dong, Derrick Goh Xin Deik, Ruiming Tang, Yong Liu

We propose the Distributional Preference Reward Model (DPRM), a simple yet effective framework to align large language models with diverse human preferences.

Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling

no code implementations • 30 Sep 2023 • Dexun Li, Pradeep Varakantham

Unsupervised Environment Design (UED) is a paradigm for automatically generating a curriculum of training environments, enabling agents trained in these environments to develop general capabilities, i.e., achieving good zero-shot transfer performance.

Trajectory Modeling

Diversity Induced Environment Design via Self-Play

no code implementations • 4 Feb 2023 • Dexun Li, Wenjun Li, Pradeep Varakantham

In this paper, we aim to introduce diversity in the Unsupervised Environment Design (UED) framework.

Generalization through Diversity: Improving Unsupervised Environment Design

no code implementations • 19 Jan 2023 • Wenjun Li, Pradeep Varakantham, Dexun Li

Agent decision making using Reinforcement Learning (RL) heavily relies on either a model or simulator of the environment (e.g., moving in an 8x8 maze with three rooms, playing chess on an 8x8 board).

Decision Making, Reinforcement Learning (RL)

Hidden State Approximation in Recurrent Neural Networks Using Continuous Particle Filtering

no code implementations • 18 Dec 2022 • Dexun Li

Using historical data to predict future events has many real-world applications, such as stock price prediction and robot localization.

Decoder, Stock Price Prediction

Towards Soft Fairness in Restless Multi-Armed Bandits

no code implementations • 27 Jul 2022 • Dexun Li, Pradeep Varakantham

To avoid starvation in the executed interventions across individuals/regions/communities, we first provide a soft fairness constraint and then provide an approach to enforce the soft fairness constraint in RMABs.

Fairness, Multi-Armed Bandits

Efficient Resource Allocation with Fairness Constraints in Restless Multi-Armed Bandits

no code implementations • 8 Jun 2022 • Dexun Li, Pradeep Varakantham

In this paper, we are interested in ensuring that RMAB decision making is also fair to different arms while maximizing expected value.

Decision Making, Fairness, +1

CLAIM: Curriculum Learning Policy for Influence Maximization in Unknown Social Networks

no code implementations • 8 Jul 2021 • Dexun Li, Meghna Lowalekar, Pradeep Varakantham

Influence maximization is the problem of finding a small subset of nodes in a network that can maximize the diffusion of information.

reinforcement-learning, Reinforcement Learning (RL)
