Support local offline environment to read community config and model files #5817

Closed

Conversation

@AlphaHinex (Contributor) commented May 3, 2023

PR types

Function optimization

PR changes

Others

Description

This PR allows a local offline environment to read community config and model files from the local cache, fixing issues like the one below:


I've already downloaded the config and model files into ~/.paddlenlp:

➜  .paddlenlp tree
.
├── datasets
├── models
│   ├── Salesforce
│   │   └── codegen-350M-mono
│   │       ├── added_tokens.json
│   │       ├── config.json
│   │       ├── merges.txt
│   │       ├── model_config.json
│   │       ├── model_state.pdparams
│   │       ├── special_tokens_map.json
│   │       ├── tokenizer_config.json
│   │       └── vocab.json
│   └── embeddings
└── packages

When I then use paddlenlp in an offline environment, I get these exceptions:

>>> from paddlenlp import Taskflow
>>> codegen = Taskflow("code_generation", model="Salesforce/codegen-350M-mono",decode_strategy="greedy_search", repetition_penalty=1.0)
[2023-05-02 08:45:13,182] [    INFO] - Already cached /Users/alphahinex/.paddlenlp/models/Salesforce/codegen-350M-mono/vocab.json
[2023-05-02 08:45:13,182] [    INFO] - Already cached /Users/alphahinex/.paddlenlp/models/Salesforce/codegen-350M-mono/merges.txt
[2023-05-02 08:45:13,182] [    INFO] - Already cached /Users/alphahinex/.paddlenlp/models/Salesforce/codegen-350M-mono/added_tokens.json
[2023-05-02 08:45:13,182] [    INFO] - Already cached /Users/alphahinex/.paddlenlp/models/Salesforce/codegen-350M-mono/special_tokens_map.json
[2023-05-02 08:45:13,182] [    INFO] - Already cached /Users/alphahinex/.paddlenlp/models/Salesforce/codegen-350M-mono/tokenizer_config.json
[... 38 INFO lines of the form "Adding <whitespace token> to the vocabulary" omitted ...]
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x140053a90>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Salesforce/codegen-350M-mono/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x140053a90>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))


During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/taskflow/taskflow.py", line 837, in __init__
    self.task_instance = task_class(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/taskflow/code_generation.py", line 59, in __init__
    self._construct_model(model)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/taskflow/code_generation.py", line 65, in _construct_model
    self._model = CodeGenForCausalLM.from_pretrained(model)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/model_utils.py", line 484, in from_pretrained
    return cls.from_pretrained_v2(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/model_utils.py", line 1320, in from_pretrained_v2
    config, model_kwargs = cls.config_class.from_pretrained(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/configuration_utils.py", line 735, in from_pretrained
    config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/configuration_utils.py", line 761, in get_config_dict
    config_dict, kwargs = cls._get_config_dict(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/configuration_utils.py", line 829, in _get_config_dict
    if url_file_exists(community_url):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/utils/downloader.py", line 440, in url_file_exists
    result = requests.head(url)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 100, in head
    return request("head", url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Salesforce/codegen-350M-mono/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x140053a90>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

The related dependencies in my environment:

$ pip3 list|grep paddle
paddle-bfloat       0.1.7
paddle2onnx         1.0.5
paddlefsl           1.1.0
paddlenlp           2.5.2
paddlepaddle        2.4.2
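
For context, a minimal sketch of the cache-first lookup this PR argues for, assuming the names visible in the quoted diff snippets and traceback below (CONFIG_NAME, cache_dir, url_file_exists, get_path_from_url_with_filelock; the import path of the downloader helpers is inferred from the traceback). The actual change lives in configuration_utils.py and model_utils.py, so treat this only as an illustration of the idea:

import os

from paddlenlp.utils.downloader import get_path_from_url_with_filelock, url_file_exists

CONFIG_NAME = "config.json"  # assumed value; the library defines its own constant


def resolve_community_config(community_url, cache_dir):
    # If the config has already been downloaded into cache_dir, use it directly
    # and skip the HEAD request to the community URL, which fails offline.
    cached_config = os.path.join(cache_dir, CONFIG_NAME)
    if os.path.isfile(cached_config):
        return cached_config
    # Otherwise keep the existing online behaviour: check the URL, then download.
    if url_file_exists(community_url):
        return get_path_from_url_with_filelock(community_url, cache_dir)
    return None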


paddle-bot bot commented May 3, 2023

Thanks for your contribution!


codecov bot commented May 3, 2023

Codecov Report

Merging #5817 (1dc1dd5) into develop (93e78c2) will increase coverage by 0.00%.
The diff coverage is 100.00%.

❗ Current head 1dc1dd5 differs from pull request most recent head 6d5bb3e. Consider uploading reports for the commit 6d5bb3e to get more accurate results

@@           Coverage Diff            @@
##           develop    #5817   +/-   ##
========================================
  Coverage    61.49%   61.49%           
========================================
  Files          489      489           
  Lines        68976    68980    +4     
========================================
+ Hits         42414    42417    +3     
- Misses       26562    26563    +1     
Impacted Files | Coverage Δ
paddlenlp/transformers/configuration_utils.py | 72.74% <100.00%> (+0.13%) ⬆️
paddlenlp/transformers/model_utils.py | 68.27% <100.00%> (-0.08%) ⬇️

@AlphaHinex AlphaHinex marked this pull request as ready for review May 3, 2023 03:15
@sijunhe sijunhe requested a review from wj-Mcat May 4, 2023 03:53
@wj-Mcat (Contributor) left a comment

Thank you very much for the PR, but I don't believe the problem described in the description needs to be fixed by this code change:

  1. The network in your local environment has problems, so some files cannot be downloaded.
  2. If all of your files already exist locally, you can simply pass that local path as the first argument of from_pretrained to load them.

Comment on lines +844 to +845
elif os.path.isfile(os.path.join(cache_dir, CONFIG_NAME)):
resolved_config_file = os.path.join(cache_dir, CONFIG_NAME)
Contributor:

Here, files are resolved primarily from the pretrained_model_name_or_path variable (which may be a file path, a directory path, or a URL); cache_dir is only the cache location for downloaded files and is not used as a path to look files up.

If you want to load files from cache_dir, you can specify the path manually:

from paddlenlp.transformers import AutoModel

# Load the model from the default cache path (by model name)
AutoModel.from_pretrained("bert-base-uncased")

# Load the model from a custom cache path
AutoModel.from_pretrained("cache_dir")

Contributor (Author):

Thank you for the review. In the environment described in the description, without the code changes above, the following succeeds when I am connected to the network:

>>> from paddlenlp.transformers import AutoModel
>>> AutoModel.from_pretrained("Salesforce/codegen-350M-mono")

But when I disconnect from the network and run it again, I hit a problem similar to the one described above (the current changes in this PR do not make this call usable offline either). The stack trace is as follows:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/util/connection.py", line 72, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/socket.py", line 954, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 363, in connect
    self.sock = conn = self._new_conn()
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x143035160>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Salesforce/codegen-350M-mono/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x143035160>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/auto/modeling.py", line 478, in from_pretrained
    return cls._from_pretrained(pretrained_model_name_or_path, task, *model_args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/transformers/auto/modeling.py", line 341, in _from_pretrained
    if url_file_exists(standard_community_url):
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/paddlenlp/utils/downloader.py", line 440, in url_file_exists
    result = requests.head(url)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 100, in head
    return request("head", url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/requests/adapters.py", line 565, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='bj.bcebos.com', port=443): Max retries exceeded with url: /paddlenlp/models/community/Salesforce/codegen-350M-mono/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x143035160>: Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))

I'm not sure whether my understanding of "load the model from the default cache path" and the argument I passed are correct.

Comment on lines +880 to +881
elif os.path.isfile(os.path.join(cache_dir, cls.resource_files_names["model_state"])):
return os.path.join(cache_dir, cls.resource_files_names["model_state"])
Contributor:

The explanation for this part of the code is the same as above.

@AlphaHinex AlphaHinex requested a review from wj-Mcat May 5, 2023 13:05
resolved_vocab_file = get_path_from_url_with_filelock(standard_community_url, cache_dir)
elif os.path.isfile(cached_legacy_config):
resolved_vocab_file = cached_legacy_config
Contributor (Author):

This should also fix the problem mentioned above where AutoModel cannot use an already-downloaded community model in an offline environment.

Loading by the absolute path of the downloaded model does indeed work offline, but downloading a community model with the community/model-name form the first time and then rewriting the code to the /path/to/community/model-name form for offline use is always a bit of a hassle. It would be simpler to check whether cache_dir already contains the downloaded files before downloading a community model and, if so, use them directly; this also avoids network requests such as checking whether the config file exists.
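
A user-level sketch of that cache-first behaviour (the helper name is hypothetical; the PR itself performs the check inside the library): prefer the cached directory when it exists, and only fall back to the name-based lookup, which needs network access, when it does not.

import os

from paddlenlp.transformers import AutoModel


def load_community_model(name, cache_root="~/.paddlenlp/models"):
    # Hypothetical helper: prefer the locally cached copy of a community model.
    cached_dir = os.path.join(os.path.expanduser(cache_root), name)
    if os.path.isdir(cached_dir):
        return AutoModel.from_pretrained(cached_dir)  # works offline
    return AutoModel.from_pretrained(name)  # triggers the online community lookup


model = load_community_model("Salesforce/codegen-350M-mono")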

@AlphaHinex AlphaHinex changed the title Support local offline environment to read config and model files Support local offline environment to read community config and model files May 11, 2023

mymynew commented Jun 30, 2023

@AlphaHinex May I ask where config.json is downloaded from?
https://bj.bcebos.com/paddlenlp/models/community/Salesforce/codegen-350M-mono/config.json
The file is not available at this link. When deploying with paddlenlp 2.4, this error is not raised,
but paddlenlp 2.4 does not seem to trigger the automatic compilation of FasterTransformer, and there is no difference in running time between use_fast=True and use_fast=False.
I suspect paddlenlp 2.4 did not actually enable FasterTransformer; after switching to 2.5.2 for offline deployment, it keeps failing to download config.json, the same error as yours.

@AlphaHinex (Contributor, Author) replied to @mymynew:

As far as I remember, config.json is generated automatically from model_config.json, so there is no need to download it in advance.

If that doesn't work, you can first download the model once in a networked environment to obtain config.json.
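
A minimal prefetch step based on that note (run once while connected; the model name is the one used throughout this thread): loading by name populates the cache, including the generated config.json, so the files are available for later offline use.

from paddlenlp.transformers import AutoModel

# Run once with network access: downloads the model files into the default cache
# under ~/.paddlenlp/models/ and leaves config.json there for offline use.
AutoModel.from_pretrained("Salesforce/codegen-350M-mono")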

@github-actions

This Pull Request is stale because it has been open for 60 days with no activity.
