Rasa NLU for Chinese, a fork from RasaHQ/rasa_nlu.

Please refer to the latest instructions in the official Rasa NLU documentation.

Chinese Blog

Files you should have:

data/total_word_feature_extractor_zh.dat

Trained on a Chinese corpus with the MITIE wordrep tool (training takes 2-3 days).

To train your own model, build the MITIE wordrep tool. Note that a Chinese corpus must be tokenized before it is fed into the tool. A closed-domain corpus that closely matches your use case works best.
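The tokenization step can be sketched as follows. This is a minimal sketch, assuming Jieba as the tokenizer and a corpus with one sentence or paragraph per line. The function names and file paths are hypothetical, and the exact input layout expected by the wordrep tool should be checked against the MITIE documentation.

```python
import io


def jieba_cut(text):
    """Tokenize with Jieba (assumed to be installed; imported lazily)."""
    import jieba
    return jieba.cut(text)


def tokenize_lines(lines, cut=jieba_cut):
    """Yield each non-empty line as a string of space-separated tokens."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Drop tokens that are only whitespace.
        tokens = [tok for tok in cut(line) if tok.strip()]
        yield " ".join(tokens)


def tokenize_file(src_path, dst_path, cut=jieba_cut):
    """Write a tokenized copy of src_path to dst_path (hypothetical paths)."""
    with io.open(src_path, encoding="utf-8") as src, \
            io.open(dst_path, "w", encoding="utf-8") as dst:
        for tokenized in tokenize_lines(src, cut):
            dst.write(tokenized + "\n")
```

Passing the tokenizer as the `cut` argument keeps the sketch independent of Jieba, so another segmenter can be swapped in if it suits the corpus better.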

A model trained on the Chinese Wikipedia dump and Baidu Baike can be downloaded from the Chinese Blog.

data/examples/rasa/demo-rasa_zh.json

Add as many examples as possible.
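New examples can be generated with a short script rather than written by hand. The sketch below assumes the Rasa NLU JSON training format (a top-level "rasa_nlu_data" object holding "common_examples", each with "text", "intent", and "entities" carrying character offsets). The helper names are hypothetical, and the sample sentence reuses the intent and entity names from the parse output shown in this README.

```python
import json


def make_example(text, intent, entities=()):
    """Build one training example; entities are (value, entity_name) pairs."""
    spans = []
    for value, name in entities:
        start = text.find(value)  # uses the first occurrence only
        if start < 0:
            raise ValueError("entity value not found in text: %r" % value)
        spans.append({
            "start": start,
            "end": start + len(value),  # end offset is exclusive
            "value": value,
            "entity": name,
        })
    return {"text": text, "intent": intent, "entities": spans}


def make_training_data(examples):
    """Wrap examples in the top-level structure of the training file."""
    return {"rasa_nlu_data": {"common_examples": list(examples)}}


example = make_example("我发烧了该吃什么药？", "medical", [("发烧", "disease")])
data = make_training_data([example])
# ensure_ascii=False keeps the Chinese text readable in the output file.
serialized = json.dumps(data, ensure_ascii=False, indent=2)
```

Computing offsets from the text avoids off-by-one errors in "start" and "end", which are easy to make when counting Chinese characters manually.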

Usage:

Clone this project, and run

python setup.py install

Modify configuration.

Currently for Chinese we have two pipelines:

Use MITIE+Jieba (sample_configs/config_jieba_mitie.yml):

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_classifier_mitie"

RECOMMENDED: Use MITIE+Jieba+sklearn (sample_configs/config_jieba_mitie_sklearn.yml):

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_mitie"
- name: "intent_classifier_sklearn"

(Optional) Use a Jieba user-defined dictionary or switch the Jieba default dictionary:

You can set "user_dicts" to either a file path or a directory path. (sample_configs/config_jieba_mitie_sklearn_plus_dict_path.yml)

language: "zh"

pipeline:
- name: "nlp_mitie"
  model: "data/total_word_feature_extractor_zh.dat"
- name: "tokenizer_jieba"
  default_dict: "./default_dict.big"
  user_dicts: "./jieba_userdict"
#  user_dicts: "./jieba_userdict/jieba_userdict.txt"
- name: "ner_mitie"
- name: "ner_synonyms"
- name: "intent_entity_featurizer_regex"
- name: "intent_featurizer_mitie"
- name: "intent_classifier_sklearn"
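A Jieba user dictionary is a plain-text file with one entry per line: the word, then optionally a frequency and a part-of-speech tag, separated by spaces. The sketch below writes such a file. The words, frequencies, and function names are made up for illustration, and the output path should be whatever "user_dicts" points to in your configuration.

```python
import io
import os


def format_entry(word, freq=None, tag=None):
    """Format one dictionary line as: word [freq] [tag]."""
    parts = [word]
    if freq is not None:
        parts.append(str(freq))
    if tag is not None:
        parts.append(tag)
    return " ".join(parts)


def write_user_dict(path, entries):
    """Write entries (tuples of word, optional freq, optional tag) to path."""
    directory = os.path.dirname(path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    with io.open(path, "w", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(format_entry(*entry) + "\n")


# Illustrative usage (hypothetical entries):
# write_user_dict("./jieba_userdict/jieba_userdict.txt",
#                 [("感冒药", 10, "n"), ("发烧",)])
```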

Train the model with the command below.

If you specify a project name in the configuration file, the model is saved at models/your_project_name.

Otherwise, the model is saved at models/default.

python -m rasa_nlu.train -c sample_configs/config_jieba_mitie_sklearn.yml --data data/examples/rasa/demo-rasa_zh.json --path models
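After training, you may want to look up the name of the newest model in order to pass it as the "model" value of a parse request. The sketch below is a hypothetical helper, not part of rasa_nlu. It assumes that each model is a directory named model_<timestamp> inside the project directory, which matches the model name model_20170921-170911 used in this README but should be verified against your own models folder.

```python
import os


def latest_model(models_path, project="default"):
    """Return the newest model directory name in a project, or None."""
    project_dir = os.path.join(models_path, project)
    if not os.path.isdir(project_dir):
        return None
    candidates = [
        name for name in os.listdir(project_dir)
        if name.startswith("model_")
        and os.path.isdir(os.path.join(project_dir, name))
    ]
    # Names of the form model_YYYYMMDD-HHMMSS sort chronologically as strings.
    return max(candidates) if candidates else None


# Illustrative usage: latest_model("models") looks inside models/default.
```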

Run the rasa_nlu server:

python -m rasa_nlu.server -c sample_configs/config_jieba_mitie_sklearn.yml --path models

Open a new terminal and query the server with curl, for example:

$ curl -XPOST localhost:5000/parse -d '{"q":"我发烧了该吃什么药？", "project": "rasa_nlu_test", "model": "model_20170921-170911"}' | python -mjson.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   652    0   552  100   100    157     28  0:00:03  0:00:03 --:--:--   157
{
    "entities": [
        {
            "end": 3,
            "entity": "disease",
            "extractor": "ner_mitie",
            "start": 1,
            "value": "发烧"
        }
    ],
    "intent": {
        "confidence": 0.5397186422631861,
        "name": "medical"
    },
    "intent_ranking": [
        {
            "confidence": 0.5397186422631861,
            "name": "medical"
        },
        {
            "confidence": 0.16206323981749196,
            "name": "restaurant_search"
        },
        {
            "confidence": 0.1212448457737397,
            "name": "affirm"
        },
        {
            "confidence": 0.10333600028547868,
            "name": "goodbye"
        },
        {
            "confidence": 0.07363727186010374,
            "name": "greet"
        }
    ],
    "text": "我发烧了该吃什么药？"
}
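The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the server is reachable at localhost:5000 as in the curl example. The function names, the Content-Type header, and the 0.5 confidence threshold are assumptions made for illustration and are not part of rasa_nlu.

```python
import json
from urllib.request import Request, urlopen


def build_parse_request(text, project=None, model=None,
                        url="http://localhost:5000/parse"):
    """Build a POST request with the same JSON body as the curl example."""
    payload = {"q": text}
    if project is not None:
        payload["project"] = project
    if model is not None:
        payload["model"] = model
    body = json.dumps(payload).encode("utf-8")
    # The header is an assumption; the curl example does not set it.
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"})


def summarize(result, threshold=0.5):
    """Reduce a parse result to the intent (if confident) and entity pairs."""
    intent = result.get("intent") or {}
    confidence = intent.get("confidence", 0.0)
    name = intent.get("name") if confidence >= threshold else None
    entities = [(e["entity"], e["value"]) for e in result.get("entities", [])]
    return {"intent": name, "confidence": confidence, "entities": entities}


def parse(text, **kwargs):
    """Send the request; this requires the rasa_nlu server to be running."""
    with urlopen(build_parse_request(text, **kwargs)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Treating low-confidence intents as unknown is a common way to trigger a fallback reply; the right threshold depends on your training data.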

