Search Results for author: Sirui Zhao

Found 11 papers, 6 papers with code

Exploring User Retrieval Integration towards Large Language Models for Cross-Domain Sequential Recommendation

no code implementations • 5 Jun 2024 • Tingjia Shen, Hao Wang, Jiaqing Zhang, Sirui Zhao, Liangyue Li, Zulong Chen, Defu Lian, Enhong Chen

To this end, we propose a novel framework named URLLM, which aims to improve CDSR performance by simultaneously exploring a user retrieval approach and domain grounding on the LLM.

Contrastive Learning · Language Modelling · +4

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

no code implementations • 31 May 2024 • Chaoyou Fu, Yuhan Dai, Yongdong Luo, Lei Li, Shuhuai Ren, Renrui Zhang, Zihan Wang, Chenyu Zhou, Yunhang Shen, Mengdan Zhang, Peixian Chen, Yanwei Li, Shaohui Lin, Sirui Zhao, Ke Li, Tong Xu, Xiawu Zheng, Enhong Chen, Rongrong Ji, Xing Sun

With Video-MME, we extensively evaluate various state-of-the-art MLLMs, including the GPT-4 series and Gemini 1.5 Pro, as well as open-source image models like InternVL-Chat-V1.5 and video models like LLaVA-NeXT-Video.

Dataset Regeneration for Sequential Recommendation

no code implementations • 28 May 2024 • Mingjia Yin, Hao Wang, Wei Guo, Yong Liu, Suojuan Zhang, Sirui Zhao, Defu Lian, Enhong Chen

The sequential recommender (SR) system is a crucial component of modern recommender systems, as it aims to capture the evolving preferences of users.

Sequential Recommendation

Learning Partially Aligned Item Representation for Cross-Domain Sequential Recommendation

no code implementations • 21 May 2024 • Mingjia Yin, Hao Wang, Wei Guo, Yong Liu, Zhi Li, Sirui Zhao, Defu Lian, Enhong Chen

Cross-domain sequential recommendation (CDSR) aims to uncover and transfer users' sequential preferences across multiple recommendation domains.

Multi-Task Learning · Self-Supervised Learning · +1

APGL4SR: A Generic Framework with Adaptive and Personalized Global Collaborative Information in Sequential Recommendation

1 code implementation • 6 Nov 2023 • Mingjia Yin, Hao Wang, Xiang Xu, Likang Wu, Sirui Zhao, Wei Guo, Yong Liu, Ruiming Tang, Defu Lian, Enhong Chen

To this end, we propose a graph-driven framework, named Adaptive and Personalized Graph Learning for Sequential Recommendation (APGL4SR), that incorporates adaptive and personalized global collaborative information into sequential recommendation systems.

Graph Learning · Multi-Task Learning · +1

Woodpecker: Hallucination Correction for Multimodal Large Language Models

1 code implementation • 24 Oct 2023 • Shukang Yin, Chaoyou Fu, Sirui Zhao, Tong Xu, Hao Wang, Dianbo Sui, Yunhang Shen, Ke Li, Xing Sun, Enhong Chen

Hallucination is a big shadow hanging over the rapidly evolving Multimodal Large Language Models (MLLMs), referring to the phenomenon in which the generated text is inconsistent with the image content.

Hallucination

A Solution to CVPR'2023 AQTC Challenge: Video Alignment for Multi-Step Inference

1 code implementation • 26 Jun 2023 • Chao Zhang, Shiwei Wu, Sirui Zhao, Tong Xu, Enhong Chen

In this paper, we present a solution for enhancing video alignment to improve multi-step inference.

Video Alignment

A Survey on Multimodal Large Language Models

1 code implementation • 23 Jun 2023 • Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, Enhong Chen

Recently, Multimodal Large Language Models (MLLMs), represented by GPT-4V, have become a rising research hotspot; they use powerful Large Language Models (LLMs) as a brain to perform multimodal tasks.

Hallucination · In-Context Learning · +5

AU-aware graph convolutional network for Macro- and Micro-expression spotting

1 code implementation • 16 Mar 2023 • Shukang Yin, Shiwei Wu, Tong Xu, Shifeng Liu, Sirui Zhao, Enhong Chen

Automatic Micro-Expression (ME) spotting in long videos is a crucial step in ME analysis but also a challenging task due to the short duration and low intensity of MEs.

Micro-Expression Spotting

More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

no code implementations • 3 Jan 2023 • Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Hanqing Tao, Hao Wang, Tong Xu, Enhong Chen

To address the problem of ME data scarcity, we construct DFME (Dynamic Facial Micro-expressions), currently the largest spontaneous ME dataset, which includes 7,526 well-labeled ME videos induced from 671 participants and annotated by more than 20 annotators over three years.
