Add Chinese readme

This commit is contained in:
ZW-ZHANG 2023-01-19 21:20:56 +08:00
parent be42534b56
commit feb4f903ba
2 changed files with 127 additions and 43 deletions


@ -1,81 +1,78 @@
[English Introduction](README_en.md)
# NAS-Bench-Graph
This repository provides the official codes and all evaluated architectures for NAS-Bench-Graph, a tailored benchmark for graph neural architecture search.
NAS-Bench-Graph is the first benchmark designed for graph neural architecture search, filling the gap of tabular architecture benchmarks on graph data. It was designed and developed by the Multimedia and Network Laboratory at Tsinghua University; the developers include Ph.D. student Yijian Qin, postdoctoral researcher Ziwei Zhang, and Ph.D. student Zeyang Zhang, advised by Prof. Wenwu Zhu and Assistant Prof. Xin Wang.
Yijian Qin, Ziwei Zhang, Xin Wang, Zeyang Zhang, Wenwu Zhu, [NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search](https://openreview.net/pdf?id=bBff294gqLp) (NeurIPS 2022)
## Install from PyPI
You can directly install our benchmark by `pip install nas_bench_graph`
## Usage
First, read the benchmark of a certain dataset by specifying its name. The nine supported datasets are: cora, citeseer, pubmed, cs, physics, photo, computers, arxiv, and proteins. For example, for the Cora dataset:
```
from nas_bench_graph import lightread
bench = lightread('cora')
```
The data is stored as a `dict` in Python.
Then, an architecture needs to be specified by its macro space and operations.
We consider the macro space as a directed acyclic graph (DAG) and constrain each intermediate node of the DAG to have exactly one input node. Therefore, the macro space can be specified by a list of integers, indicating the input node index for each computing node (0 for the raw input, 1 for the first computing node, etc.). Then, the operations can be specified by a list of strings with the same length. For example, we provide the code to specify the architecture in the following figure:
![arch](https://user-images.githubusercontent.com/17705534/173767528-eda1bc64-f4d8-4da1-a0e9-8470f55ccc6a.png)
```
from nas_bench_graph import Arch
arch = Arch([0, 1, 2, 1], ['gcn', 'gin', 'fc', 'cheb'])
# 0 means the initial computing node is connected to the raw input node
# 1 means the next computing node is connected to the first computing node
# 2 means the next computing node is connected to the second computing node
# 1 means another computing node is connected to the first computing node
```
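To make the index convention concrete, here is a small standalone check (not part of the `nas_bench_graph` API; the helper name is ours): entry i of the macro-space list gives the input of computing node i+1 and may only reference node 0 (the raw input) or an earlier computing node.

```python
def is_valid_macro(links):
    """Return True if a macro-space list describes a valid DAG:
    entry i (0-indexed) is the input of computing node i+1 and may
    only reference node 0 (raw input) or an earlier computing node."""
    return all(0 <= inp <= i for i, inp in enumerate(links))

print(is_valid_macro([0, 1, 2, 1]))  # True: the architecture above
print(is_valid_macro([0, 1, 1, 2]))  # True: the same DAG, listed in another order
print(is_valid_macro([0, 3, 1, 1]))  # False: a node cannot read a later node
```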
Notice that we assume all leaf nodes (i.e., nodes without descendants) are connected to the output, so there is no need to specify the output node.
Besides, since the graph is considered up to isomorphism, the lists can be specified in any order, e.g., the following code specifies the same architecture:
```
arch = Arch([0, 1, 1, 2], ['gcn', 'cheb', 'gin', 'fc'])
```
The benchmark data can be obtained by a look-up table. In this repository, we only provide the validation and test performance, the latency, and the number of parameters, as follows:
```
info = bench[arch.valid_hash()]
info['valid_perf'] # validation performance
info['perf'] # test performance
info['latency'] # latency
info['para'] # number of parameters
```
For the complete benchmark, please download from https://figshare.com/articles/dataset/NAS-bench-Graph/20070371, which contains the training/validation/testing performance at each epoch. Since we run each dataset with three random seeds, each dataset has 3 files, e.g.,
```
from nas_bench_graph import read
bench = read('cora0.bench') # the other two seeds correspond to cora1.bench and cora2.bench
```
The full metrics for any epoch can be obtained as follows:
```
info = bench[arch.valid_hash()]
epoch = 50
info['dur'][epoch][0] # training performance
info['dur'][epoch][1] # validation performance
info['dur'][epoch][2] # testing performance
info['dur'][epoch][3] # training loss
info['dur'][epoch][4] # validation loss
info['dur'][epoch][5] # testing loss
info['dur'][epoch][6] # best performance
```
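With these per-epoch records, early-stopping-style selection is a simple scan: pick the epoch with the best validation performance (index 1) and report the test performance (index 2) at that epoch. A minimal sketch with mock records shaped like `info['dur']` (the numbers are illustrative, not real benchmark values):

```python
# Mock per-epoch records in the documented layout:
# [train_perf, valid_perf, test_perf, train_loss, valid_loss, test_loss, best_perf]
dur = [
    [0.70, 0.65, 0.63, 0.90, 1.00, 1.10, 0.63],
    [0.85, 0.78, 0.76, 0.50, 0.60, 0.70, 0.76],
    [0.95, 0.74, 0.72, 0.20, 0.50, 0.60, 0.76],
]

# Epoch with the best validation performance, and its test performance
best_epoch = max(range(len(dur)), key=lambda e: dur[e][1])
print(best_epoch, dur[best_epoch][2])  # 1 0.76
```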
## Example usage of NNI and AutoGL
NAS-Bench-Graph can be used together with other graph neural architecture search libraries such as NNI (Neural Network Intelligence) and AutoGL.
For the usage of [AutoGL](https://github.com/THUMNLab/AutoGL) (also available at https://gitlink.org.cn/THUMNLab/AutoGL), please refer to the [agnn branch](https://github.com/THUMNLab/AutoGL/tree/agnn) and the AutoGL [tutorial](http://mn.cs.tsinghua.edu.cn/AutoGL/docfile/tutorial_cn/t_nas_bench_graph.html) (AutoGL v0.4 or later is required).
You can also refer to `runnni.py` to use the benchmark together with [NNI](https://github.com/microsoft/nni/).
## Citation
If you find that NAS-Bench-Graph helps your research, please consider citing it:
```
@inproceedings{qin2022nas,
title = {NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search},

README_en.md Normal file

@ -0,0 +1,87 @@
# NAS-Bench-Graph
This repository provides the official codes and all evaluated architectures for NAS-Bench-Graph, a tailored benchmark for graph neural architecture search.
Yijian Qin, Ziwei Zhang, Xin Wang, Zeyang Zhang, Wenwu Zhu, [NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search](https://openreview.net/pdf?id=bBff294gqLp) (NeurIPS 2022)
## Install from PyPI
You can directly install our benchmark by `pip install nas_bench_graph`
## Usage
First, read the benchmark of a certain dataset by specifying the name. The nine supported datasets are: cora, citeseer, pubmed, cs, physics, photo, computers, arxiv, and proteins. For example, for the Cora dataset:
```
from nas_bench_graph import lightread
bench = lightread('cora')
```
The data is stored as a `dict` in Python.
Then, an architecture needs to be specified by its macro space and operations.
We consider the macro space as a directed acyclic graph (DAG) and constrain each intermediate node of the DAG to have exactly one input node. Therefore, the macro space can be specified by a list of integers, indicating the input node index for each computing node (0 for the raw input, 1 for the first computing node, etc.). Then, the operations can be specified by a list of strings with the same length. For example, we provide the code to specify the architecture in the following figure:
![arch](https://user-images.githubusercontent.com/17705534/173767528-eda1bc64-f4d8-4da1-a0e9-8470f55ccc6a.png)
```
from nas_bench_graph import Arch
arch = Arch([0, 1, 2, 1], ['gcn', 'gin', 'fc', 'cheb'])
# 0 means the initial computing node is connected to the input node
# 1 means the next computing node is connected to the first computing node
# 2 means the next computing node is connected to the second computing node
# 1 means there is another computing node connected to the first computing node
```
Notice that we assume all leaf nodes (i.e., nodes without descendants) are connected to the output, so there is no need to specify the output node.
Besides, the list can be specified in any order, e.g., the following code specifies the same architecture:
```
arch = Arch([0, 1, 1, 2], ['gcn', 'cheb', 'gin', 'fc'])
```
The benchmark data can be obtained by a look-up table. In this repository, we only provide the validation and test performance, the latency, and the number of parameters as follows:
```
info = bench[arch.valid_hash()]
info['valid_perf'] # validation performance
info['perf'] # test performance
info['latency'] # latency
info['para'] # number of parameters
```
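Because the benchmark behaves like a plain Python `dict` keyed by architecture hash, model selection reduces to a dictionary scan. A sketch over mock entries in the format above (the hashes and values are made up for illustration):

```python
# Mock benchmark entries in the documented format
bench = {
    "hash_a": {"valid_perf": 0.81, "perf": 0.79, "latency": 0.004, "para": 120000},
    "hash_b": {"valid_perf": 0.84, "perf": 0.82, "latency": 0.006, "para": 250000},
    "hash_c": {"valid_perf": 0.80, "perf": 0.83, "latency": 0.003, "para": 90000},
}

# Select by validation performance and only then report test performance,
# so the test set is never used for model selection
best_hash = max(bench, key=lambda h: bench[h]["valid_perf"])
print(best_hash, bench[best_hash]["perf"])  # hash_b 0.82
```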
For the complete benchmark, please download from https://figshare.com/articles/dataset/NAS-bench-Graph/20070371, which contains the training/validation/testing performance at each epoch. Since we run each dataset with three random seeds, each dataset has 3 files, e.g.,
```
from nas_bench_graph import read
bench = read('cora0.bench') # cora1.bench and cora2.bench
```
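Since each dataset ships three seed files, a common pattern is to average a metric over the seeds. A sketch with mock per-seed benchmarks standing in for the three `read(...)` results (the values are illustrative):

```python
# Stand-ins for read('cora0.bench'), read('cora1.bench'), read('cora2.bench')
benches = [
    {"hash_a": {"perf": 0.80}},
    {"hash_a": {"perf": 0.82}},
    {"hash_a": {"perf": 0.78}},
]

# Mean test performance of one architecture across the three seeds
perfs = [b["hash_a"]["perf"] for b in benches]
mean_perf = sum(perfs) / len(perfs)
print(round(mean_perf, 3))  # 0.8
```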
The full metrics for any epoch can be obtained as follows:
```
info = bench[arch.valid_hash()]
epoch = 50
info['dur'][epoch][0] # training performance
info['dur'][epoch][1] # validation performance
info['dur'][epoch][2] # testing performance
info['dur'][epoch][3] # training loss
info['dur'][epoch][4] # validation loss
info['dur'][epoch][5] # testing loss
info['dur'][epoch][6] # best performance
```
## Example usage of NNI and AutoGL
NAS-Bench-Graph can be used together with other libraries such as AutoGL and NNI.
For the usage of [AutoGL](https://github.com/THUMNLab/AutoGL), please refer to the [agnn branch](https://github.com/THUMNLab/AutoGL/tree/agnn).
You can also refer to `runnni.py` to use the benchmark together with [NNI](https://github.com/microsoft/nni/).
## Citation
If you find that NAS-Bench-Graph helps your research, please consider citing it:
```
@inproceedings{qin2022nas,
title = {NAS-Bench-Graph: Benchmarking Graph Neural Architecture Search},
author = {Qin, Yijian and Zhang, Ziwei and Wang, Xin and Zhang, Zeyang and Zhu, Wenwu},
booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2022}
}
```