diff --git a/README.md b/README.md
index f04c5ad..94b9259 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@

-🤗 Hugging Face • 🤖 ModelScope • 💬 WeChat• 📜Tech Report
+🤗 Hugging Face • 🤖 ModelScope • 💬 WeChat• 📜Tech Report

@@ -59,7 +59,7 @@
 - 除此之外,我们还公开了训练Skywork-13B模型中使用的评估方法、数据配比研究和训练基础设施调优方案等信息。我们希望这些开源内容能够进一步启发社区对于大型模型预训练的认知,并推动人工智能通用智能(AGI)的实现。
-如果您希望了解更多的信息,如训练方案,评估方法,请参考我们的[技术报告](https://arxiv.org/skywork-tech-report),[Skymath](https://arxiv.org/abs/2310.16713)论文,[SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf)论文。
+如果您希望了解更多的信息,如训练方案,评估方法,请参考我们的[技术报告](https://github.com/SkyworkAI/Skywork/blob/main/Skywork_13b_tech_report.pdf),[Skymath](https://arxiv.org/abs/2310.16713)论文,[SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf)论文。
 # 🔥 更新信息
 * 2023.10.30 我们开源了**Skywork-13B-Base** 和 **Skywork-13B-Math** 以及对应模型的量化模型。我们开源了**Skywork/Skypile-150B**数据集,该数据集包含根据中文网页清洗的超过**150亿**高质量中文token,硬盘大小大约600GB,是已知目前最大的开源中文数据集。
@@ -1253,9 +1253,8 @@ The community usage of Skywork model requires [Skywork Community License](https:
 如果您觉得我们的工作对您有帮助,欢迎引用我们的论文~
 ```
 @article{skyworktechreport,
- title={},
- author={},
- journal={arXiv preprint arXiv:},
+ title={Skywork: A More Open Bilingual Foundation Model},
+ author={Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu,Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, Yongyi Peng, Xiaojuan Liang, Shuicheng Yan, Han Fang, Yahui Zhou},
  year={2023}
 }
 ```
diff --git a/README_EN.md b/README_EN.md
index 19ecfe9..6eadd7f 100644
--- a/README_EN.md
+++ b/README_EN.md
@@ -8,10 +8,9 @@

-🤗 Hugging Face • 🤖 ModelScope • 💬 WeChat• 📜Tech Report
+🤗 Hugging Face • 🤖 ModelScope • 💬 WeChat• 📜Tech Report

-
 [![GitHub Stars](https://img.shields.io/github/stars/SkyworkAI/Skywork)](https://github.com/SkyworkAI/Skywork/stargazers)
@@ -51,7 +50,7 @@ Our open-source Skywork series models can be used for commercial purposes, but y
 - In addition, we have also disclosed the evaluation methods, data distribution studies, and training infrastructure optimization plans used in training the Skywork-13B model. We hope that these open-source materials can further inspire the community's understanding of large-scale model pre-training and drive the realization of Artificial General Intelligence (AGI).
-If you are interested in more training and evaluation details, please refer to our [technical report](https://arxiv.org/skywork-tech-report), [Skymath]((https://arxiv.org/skywork-tech-report)) paper and [SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf) paper.
+If you are interested in more training and evaluation details, please refer to our [technical report](https://github.com/SkyworkAI/Skywork/blob/main/Skywork_13b_tech_report.pdf), [Skymath]((https://arxiv.org/skywork-tech-report)) paper and [SkyworkMM](https://github.com/will-singularity/Skywork-MM/blob/main/skywork_mm.pdf) paper.
 # News and Updates
 * 2023.10.30 We release the **Skywork-13B-Base** and **Skywork-13B-Math** models, as well as quantized versions of each model to support deployment and inference on consumer-grade GPUs. We open-source the Skywork/Skypile-150B dataset. This dataset contains over 150 billion high-quality tokens cleaned from Chinese web pages, making it the largest open-source Chinese dataset currently known.
@@ -1249,9 +1248,8 @@ The community usage of Skywork model requires [Skywork Community License](https:
 If you find our work helpful, please feel free to cite our paper~
 ```
 @article{skyworktechreport,
- title={},
- author={},
- journal={arXiv preprint arXiv:},
+ title={Skywork: A More Open Bilingual Foundation Model},
+ author={Tianwen Wei, Liang Zhao, Lichang Zhang, Bo Zhu, Lijie Wang, Haihua Yang, Biye Li, Cheng Cheng, Weiwei Lü, Rui Hu,Chenxia Li, Liu Yang, Xilin Luo, Xuejie Wu, Lunan Liu, Wenjun Cheng, Peng Cheng, Jianhao Zhang, Xiaoyu Zhang, Lei Lin, Xiaokun Wang, Yutuan Ma, Chuanhai Dong, Yanqi Sun, Yifu Chen, Yongyi Peng, Xiaojuan Liang, Shuicheng Yan, Han Fang, Yahui Zhou},
  year={2023}
 }
 ```
diff --git a/Skywork_13b_tech_report.pdf b/Skywork_13b_tech_report.pdf
new file mode 100644
index 0000000..5e7d7e5
Binary files /dev/null and b/Skywork_13b_tech_report.pdf differ