ChatGLM2-6B

Github Repo • Twitter • [GLM@ACL 22] [GitHub] • [GLM-130B@ICLR 23] [GitHub]

Join our Slack and WeChat

Introduction

ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B. It retains the smooth conversation flow and low deployment threshold of the first-generation model, while introducing the following new features:

  1. Stronger Performance: Building on the development experience of the first-generation ChatGLM model, we have fully upgraded the base model of ChatGLM2-6B. ChatGLM2-6B uses the hybrid objective function of GLM and has undergone pre-training on 1.4T bilingual (Chinese-English) tokens followed by human preference alignment training. Evaluation results show that, compared to the first-generation model, ChatGLM2-6B achieves substantial improvements on datasets such as MMLU (+23%), CEval (+33%), GSM8K (+571%), and BBH (+60%), making it highly competitive among open-source models of the same size.
  2. Longer Context: Based on the FlashAttention technique, we have extended the context length of the base model from 2K in ChatGLM-6B to 32K, and trained with a context length of 8K during dialogue alignment, allowing for more rounds of dialogue. However, the current version of ChatGLM2-6B has limited ability to understand very long single-turn documents, which we will focus on optimizing in future iterations.
  3. More Efficient Inference: Based on the Multi-Query Attention technique, ChatGLM2-6B offers faster inference and lower GPU memory usage: under the official implementation, inference speed is 42% higher than the first generation, and under INT4 quantization the dialogue length supported by 6 GB of GPU memory increases from 1K to 8K (see the loading sketch after this list).
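
The INT4 figure in item 3 refers to the quantized loading path documented in the upstream ChatGLM2-6B repository. The snippet below is a minimal, unofficial sketch of that path using Hugging Face transformers with the THUDM/chatglm2-6b hub ID; the quantize(4) and chat() helpers come from the model's remote code, so their availability may vary with the model revision.

# Unofficial sketch: load ChatGLM2-6B with INT4 quantization via transformers.
# Assumes the THUDM/chatglm2-6b remote code; see the ModelScope example below
# for the pipeline-based usage this model card documents.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True)
# quantize(4) converts the weights to INT4 so that a ~6 GB GPU can hold longer dialogues
model = AutoModel.from_pretrained("THUDM/chatglm2-6b", trust_remote_code=True).quantize(4).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "你好", history=[])
print(response)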

Dependencies

pip install --upgrade torch
pip install transformers -U
# modelscope >= 1.7.2

關(guān)于更多的使用說明,包括如何運(yùn)行命令行和網(wǎng)頁版本的 DEMO,以及使用模型量化以節(jié)省顯存,請參考我們的 Github Repo。

For more instructions, including how to run the CLI and web demos and how to use model quantization to save GPU memory, please refer to our Github Repo.
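
As a rough illustration of what a command-line demo looks like, the following is a minimal interactive loop built on the ModelScope chat pipeline from the Example Code section below. It is a sketch rather than the official cli_demo.py, and it assumes the returned history is a list of (query, response) pairs, as in the upstream chat API.

# Unofficial sketch of a CLI chat loop on top of the ModelScope pipeline.
from modelscope import Model
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

model = Model.from_pretrained('ZhipuAI/chatglm2-6b', device_map='auto', revision='v1.0.7')
pipe = pipeline(task=Tasks.chat, model=model)

history = []
while True:
    text = input("User: ")
    if text.strip().lower() in {"exit", "quit"}:
        break
    result = pipe({'text': text, 'history': history})
    history = result['history']
    # assumes each history entry is a (query, response) pair, as in model.chat()
    print("ChatGLM2-6B:", history[-1][1])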

Change Log

  • v1.0

Example Code

# Note: the latest model version requires modelscope >= 1.7.2
# pip install modelscope -U

from modelscope.utils.constant import Tasks
from modelscope import Model
from modelscope.pipelines import pipeline

# Load the ChatGLM2-6B weights from ModelScope and build a chat pipeline
model = Model.from_pretrained('ZhipuAI/chatglm2-6b', device_map='auto', revision='v1.0.7')
pipe = pipeline(task=Tasks.chat, model=model)

# First turn: start with an empty history
inputs = {'text': '你好', 'history': []}
result = pipe(inputs)

# Second turn: pass the returned history back in to continue the conversation
inputs = {'text': '介绍下清华大学', 'history': result['history']}
result = pipe(inputs)
print(result)

協(xié)議

The code in this repository is open-sourced under the Apache-2.0 license. Use of the ChatGLM2-6B model weights must follow the Model License.

Citation

If you find our work helpful, please consider citing the following papers. The ChatGLM2-6B paper will be released soon, stay tuned!

@article{zeng2022glm,
  title={GLM-130B: An Open Bilingual Pre-trained Model},
  author={Zeng, Aohan and Liu, Xiao and Du, Zhengxiao and Wang, Zihan and Lai, Hanyu and Ding, Ming and Yang, Zhuoyi and Xu, Yifan and Zheng, Wendi and Xia, Xiao and others},
  journal={arXiv preprint arXiv:2210.02414},
  year={2022}
}
@inproceedings{du2022glm,
  title={GLM: General Language Model Pretraining with Autoregressive Blank Infilling},
  author={Du, Zhengxiao and Qian, Yujie and Liu, Xiao and Ding, Ming and Qiu, Jiezhong and Yang, Zhilin and Tang, Jie},
  booktitle={Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={320--335},
  year={2022}
}