
Where should a manually downloaded model be placed? #50


Closed
helloxz opened this issue Mar 15, 2023 · 33 comments


helloxz commented Mar 15, 2023

I saw this line in the instructions:

If downloading the checkpoint from the Hugging Face Hub is slow, you can also download it manually from here.

I've downloaded the model manually. Which local folder should I put it in?


yaleimeng commented Mar 15, 2023

You can put it anywhere; just set the local path when loading.
For example:
mypath = "/home/xxxx/public/chatglm-6b"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
model = AutoModel.from_pretrained(mypath, trust_remote_code=True).half().quantize(4).cuda()  # int4 quantization happens here

helloxz (author) commented Mar 15, 2023

That doesn't work; it throws this error:

>>> mypath="D:/apps/ChatGLM-6B/model"
>>> from transformers import AutoTokenizer, AutoModel
>>> tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\Program Files\Python37\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 614, in from_pretrained
    pretrained_model_name_or_path, trust_remote_code=trust_remote_code, **kwargs
  File "D:\Program Files\Python37\lib\site-packages\transformers\models\auto\configuration_auto.py", line 852, in from_pretrained
    config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\transformers\configuration_utils.py", line 565, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Program Files\Python37\lib\site-packages\transformers\configuration_utils.py", line 632, in _get_config_dict
    _commit_hash=commit_hash,
  File "D:\Program Files\Python37\lib\site-packages\transformers\utils\hub.py", line 381, in cached_file
    f"{path_or_repo_id} does not appear to have a file named {full_filename}. Checkout "
OSError: D:/apps/ChatGLM-6B/model does not appear to have a file named config.json. Checkout 'https://huggingface.co/D:/apps/ChatGLM-6B/model/None' for available files.

Am I doing something wrong?


ykallan commented Mar 15, 2023

I'm running into the same problem.

@feixyz10

You should download all the files from here, except the ones you have already downloaded from the Tsinghua cloud.

@Ling-YangHui

I have downloaded all files from huggingface.
However, when I execute

mypath = "G:/chatGLM-6B/model"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)

it returns:

OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: 'C:\\Users\\<My Username>\\.cache\\huggingface\\modules\\transformers_modules\\G:'

It seems something is wrong with how my cache path is handled.

@feixyz10

Path names on Windows need special handling (just Google this). Using the pathlib package, or replacing "/" with "\\", might solve your problem.
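
A minimal sketch of that suggestion (the G: directory is taken from the report above and is hypothetical; whether path normalization alone fixes the cache error is untested):

from pathlib import Path
from transformers import AutoTokenizer

# A raw string avoids backslash-escape surprises; pathlib normalizes separators.
mypath = Path(r"G:\chatGLM-6B\model")
tokenizer = AutoTokenizer.from_pretrained(str(mypath), trust_remote_code=True)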

@yaleimeng

From the error, path parsing is going wrong: the program considers the path it was handed malformed (it's in Windows format, while Linux paths start with / and never contain a colon).
Most of this code is written with Linux in mind; Windows uses different path conventions and a default encoding that isn't UTF-8, so there are many pitfalls.
Try placing the files at the exact path shown in the error; failing that, consider dual-booting or running a Linux virtual machine on Windows (VMs can apparently use the GPU now).


ykallan commented Mar 16, 2023

The problem should be solved now. My current workaround: first call AutoModel with the model name and let it download automatically; once it finishes, copy the model from the ~/.cache/xxx path into the project directory and point the script at it. That starts fine. A model downloaded directly from the cloud drive is missing some files and won't start as-is. @yaleimeng @feixyz10 @helloxz @Ling-YangHui
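
A sketch of the same idea without the manual copy, assuming the huggingface_hub package is installed: snapshot_download fetches the whole repo into the local cache and returns the snapshot path, which can then be passed straight to from_pretrained.

from huggingface_hub import snapshot_download
from transformers import AutoTokenizer, AutoModel

# Download (or reuse) the full repo in the local HF cache and get its path.
local_dir = snapshot_download(repo_id="THUDM/chatglm-6b")
tokenizer = AutoTokenizer.from_pretrained(local_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(local_dir, trust_remote_code=True).half().cuda()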


yaleimeng commented Mar 16, 2023

Oh, we downloaded from Hugging Face, so we never ran into this.
In principle a single archive mirrored elsewhere should be enough; odd that different sources ship different files.

@Liu-Steve

The cloud drive only has the large weight files; just download all the small files from Hugging Face as well. I've verified this works.

@marszhao

Under %USERPROFILE%\.cache\huggingface\hub\models--THUDM--chatglm-6b\snapshots there are one or more directories whose names look like git commit hashes; put the files in the newest one.
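
A small sketch of locating that snapshot directory programmatically (path layout as described above; picking the most recently modified entry is an assumption):

import glob
import os

cache = os.path.expanduser(r"~\.cache\huggingface\hub\models--THUDM--chatglm-6b\snapshots")
# Pick the most recently modified hash-named snapshot directory.
snapshot = max(glob.glob(os.path.join(cache, "*")), key=os.path.getmtime)
print(snapshot)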


nikshe commented Mar 17, 2023

(Quoting @helloxz's config.json traceback above.)

Was this ever resolved?

@jerrylususu

@nikshe Did you download everything? Besides the 8 weight shards, all the small files in the Hugging Face repo must also go into the model directory. This error says the model directory has no config.json.
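
A quick sanity check along those lines (the file list is partial and assumed from this thread; adjust it to the actual repo contents):

import os

mypath = "D:/apps/ChatGLM-6B/model"  # hypothetical local model directory
# Small files mentioned in this thread; the real repo contains more.
required = ["config.json", "pytorch_model.bin.index.json", "ice_text.model"]
missing = [f for f in required if not os.path.exists(os.path.join(mypath, f))]
print("missing files:", missing or "none")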


ttsking commented Mar 18, 2023

Do the 8 .bin files need to be concatenated with cat? I get a "model file not found" error, but after concatenating them it says the read failed:

root@2227e6c2b8b1:/work/chatglm-6b/ChatGLM-6B# python cli_demo.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "cli_demo.py", line 6, in <module>
    model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(8).cuda()
  File "/usr/local/lib/python3.8/dist-packages/transformers/models/auto/auto_factory.py", line 459, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.8/dist-packages/transformers/modeling_utils.py", line 2164, in from_pretrained
    raise EnvironmentError(
OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory THUDM/chatglm-6b.


ttsking commented Mar 18, 2023

Mine is fixed; it turned out pytorch_model.bin.index.json was missing.

@nikohpng

On my machine I have to use an absolute Windows path.

@luieswww

(Quoting @ttsking: "Mine is fixed; it turned out pytorch_model.bin.index.json was missing.")

Could you share the code you use to run the model files locally?


DelaiahZ commented Apr 3, 2023

My model files were complete and the path was correct, yet it still errored. In the end I uninstalled keras and the problem went away. Bizarre!


bash99 commented Apr 4, 2023

For me, inside WSL, both lines have to be changed to absolute paths.


tiejiang8 commented Apr 7, 2023

(Quoting @Ling-YangHui's WinError 123 report above.)

Try mypath = 'G:\\chatGLM-6B\\model', with doubled backslashes '\\'.

@mingyue0094

Here is how I run it locally.

First run, with network access:

  • The model downloads automatically to C:\Users\Administrator\.cache\huggingface\hub\models--THUDM--chatglm-6b-int4

After that it can be loaded offline:

mypath = r'C:\Users\Administrator\.cache\huggingface\hub\models--THUDM--chatglm-6b-int4\snapshots\9163f7e6d9b2e5b4f66d9be8d0288473a8ccd027'

tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
  • Replace 9163f7e6d9b2e5b4f66d9be8d0288473a8ccd027 with whatever hash your own snapshot has.
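
Optionally, offline loading can be enforced so the library never tries to reach the Hub; TRANSFORMERS_OFFLINE is a standard transformers environment variable, but combining it with this snapshot path is my own sketch:

import os
os.environ["TRANSFORMERS_OFFLINE"] = "1"  # must be set before importing transformers

from transformers import AutoTokenizer

mypath = r'C:\Users\Administrator\.cache\huggingface\hub\models--THUDM--chatglm-6b-int4\snapshots\9163f7e6d9b2e5b4f66d9be8d0288473a8ccd027'
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)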


xuji755 commented Apr 11, 2023

My error message:

root@DESKTOP-FMBI0K0:/data/ChatGLM-6b# python3 demo.py
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Traceback (most recent call last):
  File "demo.py", line 5, in <module>
    tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/auto/tokenization_auto.py", line 679, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1813, in from_pretrained
    **kwargs,
  File "/usr/local/lib/python3.7/dist-packages/transformers/tokenization_utils_base.py", line 1958, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/tokenization_chatglm.py", line 205, in __init__
    self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b-int4/tokenization_chatglm.py", line 55, in __init__
    assert vocab_file is not None
AssertionError

My code:

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html
mypath = "/data/chatglm-6b-int4"
tokenizer = AutoTokenizer.from_pretrained(mypath, trust_remote_code=True)
model = AutoModel.from_pretrained(mypath, trust_remote_code=True).half().cuda()
model = model.eval()

I'm on WSL; the downloaded model is in /data/chatglm-6b-int4.

@taraliu23

This change works in cli_demo.py:

local_path = "/home/somepath/somepath/ChatGLM-6B/huggingface_chatglm-6m"
tokenizer = AutoTokenizer.from_pretrained(local_path, trust_remote_code=True)
model = AutoModel.from_pretrained(local_path, trust_remote_code=True).quantize(4).half().cuda()

At first I downloaded the .bin files from the THU cloud and the other files in the chatglm-6b folder from Hugging Face, and an OSError appeared when running cli_demo.py.
Then I used "git clone https://huggingface.co/THUDM/chatglm-6b" instead, and it works.

@xiaoxinxin666666

(Quoting @xuji755's AssertionError traceback and code above.)

Same situation here: I pass the local path, but it still goes through the cache.
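
One likely explanation (an assumption on my part): with trust_remote_code=True, transformers copies the model's custom .py files into ~/.cache/huggingface/modules/transformers_modules/, and a stale copy there can shadow your local files. A sketch of clearing that cache:

import os
import shutil

# Remove the cached copies of custom model code; they are re-copied on the next load.
modules_cache = os.path.expanduser("~/.cache/huggingface/modules/transformers_modules")
shutil.rmtree(modules_cache, ignore_errors=True)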

duzx16 (member) closed this as completed Apr 12, 2023
@linonetwo

After changing it to:

from transformers import AutoModel, AutoTokenizer
import gradio as gr
import mdtex2html

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("E:/model/LanguageModel/ChatGLM/chatglm-6b", trust_remote_code=True).half().cuda()

I get this error:

OSError: [WinError 123] 文件名、目录名或卷标语法不正确。: 'C:\\Users\\Administrator\\.cache\\huggingface\\modules\\transformers_modules\\E:'


linonetwo commented Apr 20, 2023

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("e:\\model\\LanguageModel\\ChatGLM\\chatglm-6b", trust_remote_code=True).half().cuda()

This works.


LeXwDeX commented May 10, 2023

Add/modify the following after the imports in demo.py:

import os
model_path = os.path.join(".", "models")
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
model = model.eval()

Then put the files git-cloned from Hugging Face into the models directory. This works on both Linux and Windows.


12lxr commented Jun 8, 2023

I'm running my own fine-tuned model in HF format, but web_demo never responds and produces no output. How can I fix this?

@daihuangyu

The model weights aren't being loaded from the cache, but why are the model's own .py files still loaded from the cache? How do I fix that?


Louis24 commented Nov 17, 2023

(Quoting @Ling-YangHui's WinError 123 report above.)

Haha, I hit exactly the same thing today. Why does it tack the drive letter onto the end??


XuWink commented Mar 14, 2024

I ran into the same error.


16sa8d4 commented Jul 17, 2024

On Linux, moving the downloaded folder into the project directory worked for me.
