Top 23 Chinese Open-Source Projects
-
funNLP
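Chinese and English sensitive words, language detection, domestic and international mobile/phone number location and carrier lookup, gender inference from names, mobile number extraction, ID card number extraction, email extraction, Chinese and Japanese name databases, Chinese abbreviation database, character decomposition dictionary, word sentiment values, stop words, reactionary word list, violence and terrorism word list, Traditional/Simplified Chinese conversion, English approximation of Chinese pronunciation, Wang Feng lyrics generator, job title lexicon, synonym lexicon, antonym lexicon, negation word lexicon, car brand lexicon, car parts lexicon, segmentation of continuous English text, various Chinese word vectors, company name list, classical poetry database, IT lexicon, finance lexicon, idiom lexicon, place-name lexicon, historical figures lexicon, poetry lexicon, medical lexicon, food and drink lexicon, legal lexicon, automotive lexicon, animal lexicon, Chinese chat corpus, Chinese rumor data, Baidu Chinese Q&A dataset, collection of sentence similarity matching algorithms, BERT resources, text generation and summarization tools, cocoNLP information extraction tool, regex matching for Chinese domestic phone numbers, Tsinghua University XLORE: Chinese-English cross-lingual encyclopedic knowledge graph, Tsinghua University AI technology report series, natural language generation, the "NLU is too hard" series, automatic couplet data and bot, username blacklist, criminal charge and legal terms with classification model, WeChat official account corpus, cs224n deep learning NLP course, Chinese handwritten character recognition, Chinese NLP corpora/datasets, variable naming tool, word segmentation corpus + code, task-oriented dialogue English dataset, ASR speech datasets + deep-learning-based Chinese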
-
chinese-poetry
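The most comprehensive database of Chinese poetry 🧶: nearly 14,000 poets of the Tang and Song dynasties, close to 55,000 Tang poems and 260,000 Song poems, plus 1,564 ci poets of the Northern and Southern Song with 21,050 ci.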
-
English-level-up-tips
An advanced guide to learning English that might benefit you a lot 🎉. An outrageous English-learning guide and tutorial.
-
Qwen
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
-
Huatuo-Llama-Med-Chinese
Repo for BenTsao [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.
-
awesome-pretrained-chinese-nlp-models
Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models.
-
Chinese-CLIP
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
Project mention: Why is this chinese repo popular ? How do non-chinese reader read it ? | /r/github | 2023-07-13
Qwen: https://github.com/QwenLM/Qwen
I usually see and recommend:
- https://go.dev/talks/2013/bestpractices.slide#1
- https://go.dev/talks/2014/readability.slide#1
- https://github.com/golang/go/wiki/CodeReviewComments
- https://about.sourcegraph.com/blog/go/idiomatic-go
- https://github.com/teivah/100-go-mistakes
Project mention: Baichuan 7B reaches top of LLM leaderboard for it's size (New foundation model 4K tokens) | /r/LocalLLaMA | 2023-06-17
GitHub: baichuan-inc/baichuan-7B: A large-scale 7B pretraining language model developed by BaiChuan-Inc. (github.com)
Could probably whip up a python script real quick with this library: https://github.com/mozillazg/python-pinyin. Probably need some extra logic to deal with heteronyms. Not sure what your goal is.
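One possible shape for the "extra logic to deal with heteronyms" is a phrase-level override that is consulted before falling back to a per-character default reading. The sketch below is only an illustration of that idea: it deliberately does not call python-pinyin, and the tables `CHAR_READINGS` and `PHRASE_READINGS` and the function `to_pinyin` are hypothetical toy data and names. A real script would presumably take its readings from the library instead.

```python
# Toy data (hypothetical): a real script would get readings from a library
# such as python-pinyin instead of these hand-written tables.
CHAR_READINGS = {
    "银": ["yin"],
    "行": ["xing", "hang"],   # heteronym: the first entry is the default
    "人": ["ren"],
    "长": ["chang", "zhang"],  # heteronym
    "城": ["cheng"],
}

# Phrase-level overrides that resolve heteronyms from context.
PHRASE_READINGS = {
    "银行": ["yin", "hang"],
    "行长": ["hang", "zhang"],
}

MAX_PHRASE_LEN = max(len(phrase) for phrase in PHRASE_READINGS)


def to_pinyin(text):
    """Return a list of toneless pinyin syllables, preferring phrase matches."""
    result = []
    i = 0
    while i < len(text):
        # Try the longest known phrase starting at position i.
        for length in range(min(MAX_PHRASE_LEN, len(text) - i), 1, -1):
            chunk = text[i:i + length]
            if chunk in PHRASE_READINGS:
                result.extend(PHRASE_READINGS[chunk])
                i += length
                break
        else:
            # No phrase matched: use the default per-character reading,
            # or pass the character through unchanged if it is unknown.
            ch = text[i]
            result.append(CHAR_READINGS.get(ch, [ch])[0])
            i += 1
    return result


if __name__ == "__main__":
    print(to_pinyin("银行"))  # phrase override picks "hang" for 行
    print(to_pinyin("行人"))  # no phrase entry, so the default "xing" is used
```

The phrase table is checked longest-first so that a context-specific reading wins over the default. Whether this is enough depends on the goal, which the original comment leaves open.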
Huatuo-Llama-Med-Chinese https://github.com/SCIR-HI/Huatuo-Llama-Med-Chinese
Chinese related posts
- What the heck is so great about this model?
- New open-source LLM model Qwen 72B surpasses GPT4 in 4 of 10 benchmarks
- Qwen (通义千问) chat and pretrained large language model by Alibaba Cloud
- Cornucopia-LLaMA-Fin-Chinese: NEW Textual - star count:263.0
- Baichuan, AI from China
- Cornucopia-LLaMA-Fin-Chinese: NEW Textual - star count:221.0
- Why is this chinese repo popular ? How do non-chinese reader read it ?
Index
What are some of the best open-source Chinese projects? This list will help you:
Rank | Project | Stars |
---|---|---|
1 | funNLP | 63,684 |
2 | HowToCook | 60,023 |
3 | chinese-poetry | 46,565 |
4 | English-level-up-tips | 34,903 |
5 | HowToLiveLonger | 29,051 |
6 | awesome-malware-analysis | 11,026 |
7 | Ehviewer_CN_SXJ | 10,888 |
8 | Qwen | 10,691 |
9 | chinese-xinhua | 10,627 |
10 | GPT2-Chinese | 7,342 |
11 | uber_go_guide_cn | 7,324 |
12 | pkuseg-python | 6,382 |
13 | 100-go-mistakes | 6,239 |
14 | Baichuan-7B | 5,629 |
15 | python-pinyin (Chinese character to pinyin conversion tool, Python version) | 4,666 |
16 | security-101-for-saas-startups | 4,576 |
17 | Huatuo-Llama-Med-Chinese | 4,210 |
18 | awesome-pretrained-chinese-nlp-models | 4,172 |
19 | text-classification-cnn-rnn | 4,066 |
20 | Baichuan2 | 3,909 |
21 | Chinese-CLIP | 3,529 |
22 | Baichuan-13B | 2,954 |
23 | ark-pixel-font | 2,945 |