Search Results for author: Weijun Wang

Found 13 papers, 8 papers with code

LoRA-Switch: Boosting the Efficiency of Dynamic LLM Adapters via System-Algorithm Co-design

no code implementations • 28 May 2024 • Rui Kong, Qiyang Li, Xinyu Fang, Qingtian Feng, Qingfeng He, Yazhu Dong, Weijun Wang, Yuanchun Li, Linghe Kong, Yunxin Liu

Recent literature has found that an effective method to customize or further improve large language models (LLMs) is to add dynamic adapters, such as low-rank adapters (LoRA) with Mixture-of-Experts (MoE) structures.

MobileNetV4 - Universal Models for the Mobile Ecosystem

2 code implementations • 16 Apr 2024 • Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard

We present the latest generation of MobileNets, known as MobileNetV4 (MNv4), featuring universally efficient architecture designs for mobile devices.

Neural Architecture Search

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

2 code implementations • 10 Jan 2024 • Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

BiSwift: Bandwidth Orchestrator for Multi-Stream Video Analytics on Edge

no code implementations • 25 Dec 2023 • Lin Sun, Weijun Wang, Tingting Yuan, Liang Mi, Haipeng Dai, Yunxin Liu, XiaoMing Fu

To achieve this goal, we propose BiSwift, a bi-level framework that scales the concurrent real-time video analytics by a novel adaptive hybrid codec integrated with multi-level pipelines, and a global bandwidth controller for multiple video streams.

Fairness Management +3

SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget

no code implementations • 29 Aug 2023 • Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Linghe Kong, Yunxin Liu

Mixture of experts (MoE) is a popular technique to improve the capacity of Large Language Models (LLMs) with conditionally-activated parallel experts.

Object Detection

ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

1 code implementation • NeurIPS 2023 • Shuyang Sun, Weijun Wang, Qihang Yu, Andrew Howard, Philip Torr, Liang-Chieh Chen

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment.

Panoptic Segmentation Segmentation

AccDecoder: Accelerated Decoding for Neural-enhanced Video Analytics

no code implementations • 20 Jan 2023 • Tingting Yuan, Liang Mi, Weijun Wang, Haipeng Dai, XiaoMing Fu

The quality of the video stream is key to neural network-based video analytics.

Decoder Super-Resolution

MOSAIC: Mobile Segmentation via decoding Aggregated Information and encoded Context

1 code implementation • 22 Dec 2021 • Weijun Wang, Andrew Howard

We present a next-generation neural network architecture, MOSAIC, for efficient and accurate semantic image segmentation on mobile devices.

Decoder Image Segmentation +1

Discovering Multi-Hardware Mobile Models via Architecture Search

no code implementations • 18 Aug 2020 • Grace Chu, Okan Arikan, Gabriel Bender, Weijun Wang, Achille Brighton, Pieter-Jan Kindermans, Hanxiao Liu, Berkin Akin, Suyog Gupta, Andrew Howard

Hardware-aware neural architecture designs have been predominantly focusing on optimizing model performance on single hardware and model development complexity, where another important factor, model deployment complexity, has been largely ignored.

Neural Architecture Search

FoodX-251: A Dataset for Fine-grained Food Classification

1 code implementation • 14 Jul 2019 • Parneet Kaur, Karan Sikka, Weijun Wang, Serge Belongie, Ajay Divakaran

Food classification is a challenging problem due to the large number of categories, high visual similarity between different foods, as well as the lack of datasets for training state-of-the-art deep models.

Classification Fine-Grained Visual Categorization +1