ChatRWKV is like ChatGPT but powered by RWKV (100% RNN) language model, and open source.

ChatRWKV (pronounced as "RwaKuv" (rʌkuv in IPA), from 4 major params: R W K V)

RWKV homepage: https://www.rwkv.com

ChatRWKV is like ChatGPT but powered by my RWKV (100% RNN) language model, which is (as of now) the only RNN that can match transformers in quality and scaling, while being faster and saving VRAM. Training sponsored by Stability AI and EleutherAI :)

Our latest version is RWKV-6 https://arxiv.org/abs/2404.05892 (Preview models: https://huggingface.co/BlinkDL/temp )

RWKV-6 3B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-1

RWKV-6 7B Demo: https://huggingface.co/spaces/BlinkDL/RWKV-Gradio-2

[RWKV-v5 benchmark chart]

RWKV-LM main repo: https://github.com/BlinkDL/RWKV-LM (explanation, fine-tuning, training, etc.)

Chat Demo for developers: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py

RWKV Discord: https://discord.gg/bDSBUMeFpc (7k+ members)

Twitter: https://twitter.com/BlinkDL_AI

Homepage: https://www.rwkv.com/

Raw cutting-edge RWKV weights: https://huggingface.co/BlinkDL

HF-compatible RWKV weights: https://huggingface.co/RWKV

Use v2/convert_model.py to convert a model for a given strategy, for faster loading and lower CPU RAM usage.

Note: RWKV_CUDA_ON=1 will build a CUDA kernel (much faster, and saves VRAM). Here is how to build it (run "pip install ninja" first):

# How to build on Linux: set these, then run v2/chat.py
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

# How to build on Windows:
# Install VS2022 build tools (https://aka.ms/vs/17/release/vs_BuildTools.exe, select "Desktop development with C++").
# Reinstall CUDA 11.7 (install the VC++ extensions), then run v2/chat.py in the "x64 Native Tools Command Prompt".

RWKV pip package: https://pypi.org/project/rwkv/ (please always check for latest version and upgrade)

https://github.com/cgisky1980/ai00_rwkv_server Fastest GPU inference API with Vulkan (good for NVIDIA/AMD/Intel GPUs)

https://github.com/cryscan/web-rwkv backend for ai00_rwkv_server

https://github.com/saharNooby/rwkv.cpp Fast CPU/cuBLAS/CLBlast inference: int4/int8/fp16/fp32

https://github.com/JL-er/RWKV-PEFT LoRA/PiSSA/QLoRA/QPiSSA/state tuning

https://github.com/RWKV/RWKV-infctx-trainer Infctx trainer

World demo script: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_WORLD.py

Raven Q&A demo script: https://github.com/BlinkDL/ChatRWKV/blob/main/v2/benchmark_more.py

[ChatRWKV v2 strategies chart]

RWKV in 150 lines (model, inference, text generation): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py

🔥 RWKV v5 in 250 lines 🔥 (with tokenizer too): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_v5_demo.py

🔥 Building your own RWKV inference engine 🔥: begin with https://github.com/BlinkDL/ChatRWKV/blob/main/src/model_run.py which is easier to understand (used by https://github.com/BlinkDL/ChatRWKV/blob/main/chat.py).

RWKV preprint https://arxiv.org/abs/2305.13048


RWKV v6 illustrated:

[RWKV-v6 architecture illustration]

Cool Community RWKV Projects:

https://github.com/saharNooby/rwkv.cpp fast int4/int8/fp16/fp32 CPU inference using ggml

https://github.com/harrisonvanderbyl/rwkv-cpp-cuda fast Windows/Linux & CUDA/ROCm/Vulkan GPU inference (no need for Python & PyTorch)

https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

https://github.com/josStorer/RWKV-Runner cool GUI

More RWKV projects: https://github.com/search?o=desc&q=rwkv&s=updated&type=Repositories

ChatRWKV v2: with "stream" and "split" strategies, plus INT8 quantization. 3 GB of VRAM is enough to run RWKV 14B :) https://github.com/BlinkDL/ChatRWKV/tree/main/v2

import os
os.environ["RWKV_JIT_ON"] = '1'
os.environ["RWKV_CUDA_ON"] = '0' # if '1' then use CUDA kernel for seq mode (much faster)
from rwkv.model import RWKV                         # pip install rwkv
model = RWKV(model='/fsx/BlinkDL/HF-MODEL/rwkv-4-pile-1b5/RWKV-4-Pile-1B5-20220903-8040', strategy='cuda fp16')

out, state = model.forward([187, 510, 1563, 310, 247], None)   # use 20B_tokenizer.json
print(out.detach().cpu().numpy())                   # get logits
out, state = model.forward([187, 510], None)
out, state = model.forward([1563], state)           # RNN has state (use deepcopy if you want to clone it)
out, state = model.forward([310, 247], state)
print(out.detach().cpu().numpy())                   # same result as above
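In practice you would sample the next token from `out` rather than just print it. A minimal sketch of greedy decoding over the logits (plain Python for illustration, not part of the repo; real code would work on the torch tensor and use temperature/top-p sampling):

```python
import math

def softmax(logits):
    """Convert raw logits to probabilities."""
    m = max(logits)                                  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def greedy_next_token(logits):
    """Pick the highest-probability token id (greedy decoding)."""
    return max(range(len(logits)), key=lambda i: logits[i])

print(greedy_next_token([1.0, 2.0, 0.5]))  # -> 1
```

The chosen token id would then be fed back into model.forward() together with the returned state to continue generation.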

[RWKV eval chart]

Here is https://huggingface.co/BlinkDL/rwkv-4-raven/blob/main/RWKV-4-Raven-14B-v7-Eng-20230404-ctx4096.pth in action: [screenshot]

When you build a RWKV chatbot, always check the text corresponding to the state, in order to prevent bugs.

  1. Never call raw forward() directly. Instead, wrap it in a function that records the text corresponding to the state.

  2. Use the correct chat format (check that your text follows it): Bob: xxxxxxxxxxxxxxxxxx\n\nAlice: xxxxxxxxxxxxx\n\nBob: xxxxxxxxxxxxxxxx\n\nAlice:
  (For v4-raven models, use Bob/Alice. For v4/v5/v6-world models, use User/Assistant.)
  • There must not be any space after the final "Alice:". The generated text will begin with a space, which you can simply strip.
  • You can use \n inside xxxxx, but avoid \n\n. So simply do xxxxx = xxxxx.strip().replace('\r\n','\n').replace('\n\n','\n')
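The format rules above can be captured in a small helper (hypothetical function names, shown here for the Bob/Alice raven format; swap in User/Assistant for world models):

```python
def sanitize(text):
    """Apply the cleanup rule above: normalize \r\n and forbid blank lines inside a message."""
    return text.strip().replace('\r\n', '\n').replace('\n\n', '\n')

def build_prompt(history, bot='Alice'):
    """history: list of (speaker, message) pairs.

    Returns the chat prompt, ending with 'Alice:' and NO trailing space."""
    parts = [f'{speaker}: {sanitize(msg)}' for speaker, msg in history]
    return '\n\n'.join(parts) + f'\n\n{bot}:'

prompt = build_prompt([('Bob', 'Hello!'),
                       ('Alice', 'Hi, how can I help?'),
                       ('Bob', 'Tell me a joke.')])
```

After generation, strip the leading space from the model's reply before appending it to the history.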

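Point 1 above (never calling raw forward() directly) can be sketched as a thin wrapper that keeps the state and its corresponding text together (a hypothetical class, assuming `model` follows the pip package's forward(tokens, state) signature and `tokenizer` has an encode() method):

```python
import copy

class TrackedState:
    """Keep the RWKV state together with the exact text it corresponds to."""

    def __init__(self, model, tokenizer):
        self.model = model
        self.tokenizer = tokenizer
        self.state = None       # RNN state, None before any input
        self.text = ''          # the full text this state has seen

    def feed(self, text):
        """Feed text through the model; self.text always matches self.state."""
        tokens = self.tokenizer.encode(text)
        out, self.state = self.model.forward(tokens, self.state)
        self.text += text
        return out

    def clone(self):
        """Fork the conversation (the RNN state must be deep-copied)."""
        c = TrackedState(self.model, self.tokenizer)
        c.state = copy.deepcopy(self.state)
        c.text = self.text
        return c
```

With this, you can always inspect `self.text` to verify the state really corresponds to a well-formed chat prompt before sampling.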
The latest "Raven"-series Alpaca-style-tuned RWKV 14B & 7B models are very good (almost ChatGPT-like, and good at multi-round chat too). Download: https://huggingface.co/BlinkDL/rwkv-4-raven

Previous old model results: [screenshots]

Chinese models

QQ group: 553456870 (please introduce yourself briefly when joining). If you have R&D skills, join group 325154699 instead.

Chinese tutorials: https://zhuanlan.zhihu.com/p/618011122 https://zhuanlan.zhihu.com/p/616351661

Recommended UI: https://github.com/l15y/wenda

Star History

[Star History chart]
