Conversation

@PanZezhong1725 (Collaborator)

No description provided.

qinyiqun and others added 6 commits February 11, 2026 11:33
* issue/204 - support graph in server scripts

* issue/208 - adapt to ali ppu

* issue/194 - add quantization and modify configs accordingly

Support NV w8 with 1 batch and 1 TP

Add JSON support

InfiniLM: add quantization layer and global config

Add quant config support in a fairly elegant way

Restructure parts of the code and remove unused code

Follow InfiniCore changes

Remove all model_config; use global_config throughout

Follow the latest InfiniLM code changes

Adjust function parameter order

Rename global config to model config

Refactor: add new API alongside legacy interfaces with deprecation warnings
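
Note: the deprecation pattern named here usually looks like the sketch below. This is a minimal Python illustration, assuming a decorator-based approach; `deprecated`, `load_model`, and `create_model` are hypothetical names, not InfiniLM's actual interfaces.

```python
import warnings
from functools import wraps

def deprecated(replacement: str):
    """Flag a legacy function and point callers at the new API."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{fn.__name__} is deprecated; use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # report the caller's line, not this wrapper
            )
            return fn(*args, **kwargs)
        return wrapper
    return decorator

def create_model(path):
    ...  # stand-in for the new API

@deprecated("create_model")
def load_model(path):
    return create_model(path)  # legacy wrapper delegates to the new API
```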

Add w4 InfiniCore-related content and move Quantization config into InfiniCore
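
For illustration, a JSON-backed quant config along these lines would pair a w8/w4 bit width with a typed loader. The field names and loader below are assumptions for the sketch; per the commits above, the real schema belongs to InfiniCore.

```python
import json
from dataclasses import dataclass

@dataclass
class QuantConfig:
    # Illustrative fields only; not InfiniCore's actual schema.
    weight_bits: int = 8    # 8 for w8, 4 for w4
    group_size: int = 128
    symmetric: bool = True

def load_quant_config(path: str) -> QuantConfig:
    """Read quantization settings from a JSON file into a typed config."""
    with open(path) as f:
        return QuantConfig(**json.load(f))
```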

* issue/175 - qy device support

qy_page_131: add qy device

inference_server.py runs successfully on qy

* Issue/170 - Add HYGON support and improve device type handling.

* Issue/193: feats for deployment

Signed-off-by: Ceng23333 <441651826@qq.com>

* skip sending the eos token in responses

Signed-off-by: Ceng23333 <441651826@qq.com>
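
In effect, the server stops the stream at eos without emitting the token itself; a rough sketch (the helper and its arguments are hypothetical, not the actual server code):

```python
def stream_response(token_ids, eos_token_id, detokenize):
    """Yield decoded text per generated token, never sending eos."""
    for tok in token_ids:
        if tok == eos_token_id:
            break  # end the stream; the eos token itself is skipped
        yield detokenize(tok)
```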

* issue/143 use add_rmsnorm, nt flash attn, nt kv caching
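
For reference, a fused add_rmsnorm computes the residual add and RMS normalization in one pass. The NumPy sketch below only shows the math the fused kernel is assumed to implement; the commit's actual op is a device kernel.

```python
import numpy as np

def add_rmsnorm(x, residual, weight, eps=1e-6):
    """Fused residual-add + RMSNorm, reference semantics only."""
    h = x + residual  # residual add
    rms = np.sqrt(np.mean(h * h, axis=-1, keepdims=True) + eps)
    return weight * h / rms, h  # (normalized output, updated residual)
```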

* issue/204 - support graph in server scripts

* issue/208 - adapt to ali ppu

* rebase main

* issue/216 feat: support static kv cache in server
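
A static KV cache in this sense preallocates K/V buffers up to a maximum sequence length so decode steps never reallocate. A minimal sketch, assuming per-layer buffers; the class and method names are hypothetical:

```python
import numpy as np

class StaticKVCache:
    """Fixed-size K/V buffers allocated once (illustrative only)."""

    def __init__(self, n_layers, max_seq_len, n_kv_heads, head_dim,
                 dtype=np.float16):
        shape = (n_layers, max_seq_len, n_kv_heads, head_dim)
        self.k = np.zeros(shape, dtype=dtype)
        self.v = np.zeros(shape, dtype=dtype)
        self.pos = 0  # number of valid cached positions

    def write_step(self, layer, k_step, v_step):
        # Store one decode step's K/V at the current position.
        self.k[layer, self.pos] = k_step
        self.v[layer, self.pos] = v_step

    def advance(self):
        self.pos += 1  # called once per decode step, after all layers
```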

* fix llm server cache config

* demo131 - resolve mishandled conflicts

* demo131 - further adjust attn and caching logic

* demo131 - resolve merge requirements

---------

Signed-off-by: Ceng23333 <441651826@qq.com>
Co-authored-by: wooway777 <wooway777@gmail.com>
Co-authored-by: xgqdut2016 <kenan_gewei@163.com>
Co-authored-by: gongchensu <zhuyue_134@qq.com>
Co-authored-by: Ceng23333 <441651826@qq.com>
Co-authored-by: PanZezhong <panzezhong@qiyuanlab.com>
Co-authored-by: MaYuhang <2902139028@qq.com>
Issue/221 - resolve cambricon encode plus
issue/219: support vllm bench
Issue/226: add warmup before InfiniLM bench.py generation
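
The warmup fix follows the standard benchmarking pattern: run a few untimed generations first so one-time costs (graph capture, JIT compilation, allocator growth) don't skew the measured runs. A generic sketch; `generate` and the parameters are hypothetical, not bench.py's actual interface:

```python
import time

def timed_generation(generate, prompt, n_warmup=3, n_runs=10):
    """Average per-run latency after discarding warmup iterations."""
    for _ in range(n_warmup):
        generate(prompt)  # warmup: results and timing discarded
    t0 = time.perf_counter()
    for _ in range(n_runs):
        generate(prompt)
    return (time.perf_counter() - t0) / n_runs  # mean seconds per run
```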