forked from Open-CT/opendata
842 lines
44 KiB
Plaintext
{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Computational Thinking Data Processing"
],
"metadata": {}
},
{
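"cell_type": "markdown",
"source": [
"Data processing requirements:\r\n",
"### 1. Average response time for each question (result data)\r\n",
"### 2. Answer codes for each question: how many distinct codes there are, what they are, and how many students gave each (result data)\r\n",
"3. Average time spent on each operation step of each question (process data)\r\n",
"### 4. Accuracy. No reference answer codes have been provided yet, so the questions themselves can be explored to help derive them (result data)\r\n",
"5. Key nodes: use the data to explore how many key steps students pass through between the initial state and the final state, and how many types of key-node codes each key step has. The key nodes should be identified from the data itself (process data)\r\n",
"6. For each answer code of each question, the types of student responses behind it: for example, among responses that are all correct, how many clusters they form, what characterizes each cluster, and which operation paths students took to reach the final state.\r\n",
"### 7. Accuracy for each question (result data)\r\n",
"\r\n",
"<font color=\"red\">Note: requirements 4 and 7 are identical, so they are analyzed together.</font>"
],
"metadata": {}
},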
{
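"cell_type": "markdown",
"source": [
"## 0. Data loading and preprocessing\r\n",
"First, read the data and process it as a pandas DataFrame.\r\n",
"To keep the notebook tidy, simplify processing, and keep the tooling extensible, the processing tools are wrapped in a Python class named `data_analysis` and called from Jupyter. The class lives in `main.py`, in the same folder as this notebook."
],
"metadata": {}
},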
{
"cell_type": "code",
"execution_count": 34,
"source": [
"# Import the required libraries first\r\n",
"import pandas as pd\r\n",
"import json\r\n",
"import numpy as np\r\n",
"import ast\r\n",
"from datetime import datetime\r\n",
"import plotly.graph_objs as go\r\n",
"from plotly.offline import plot\r\n",
"import plotly.offline as offline\r\n",
"from pandas.core.indexes import interval\r\n",
"import plotly.figure_factory as ff\r\n",
"pyolt=plot\r\n",
"import plotly.express as px\r\n",
"import math\r\n",
"import re"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 2,
"source": [
"# Import the analysis tool from main.py\r\n",
"from main import data_analysis"
],
"outputs": [],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 3,
"source": [
"# Read the Excel file into a pandas DataFrame.\r\n",
"# The DataFrame is then passed in to initialize the data processing tool. The Excel file has not been preprocessed:\r\n",
"# timed-out records have not been removed and students are not yet split by school, hence the name 'df_all'.\r\n",
"df_all = pd.read_excel('./data/ticket_user_mianyang.xlsx') \r\n",
"# The DataFrame is not processed directly; it is used to initialize a class instance, which avoids a lot of duplicated code.\r\n",
"# To process a subset later (by school or by rows), just create another instance.\r\n",
"# Here one instance is created for all rows and named 'df_all_entity'.\r\n",
"# The 'name' argument makes it easy to tell which DataFrame caused a problem while debugging.\r\n",
"df_all_entity = data_analysis(df = df_all, name = 'all')\r\n",
"# This step is slow: it takes about 32 seconds."
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"init complete\n"
]
}
],
"metadata": {}
},
{
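"cell_type": "markdown",
"source": [
"### Initialization complete\r\n",
"The initialized instance has the following attributes:\r\n",
"self.df\r\n",
" The raw table data, without any processing\r\n",
"self.count_df_list\r\n",
" A list of DataFrames holding a first-pass grouping and count of the answer codes. It contains 22 DataFrames, one per question, each counting how often every distinct answer code occurs. Note that this is only a preliminary count: some codes are equivalent at a more abstract level, and they are merged later when accuracy is computed."
],
"metadata": {}
},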
{
"cell_type": "code",
"execution_count": 4,
"source": [
"print(\"Number of questions:\",len(df_all_entity.count_df_list))\r\n",
"print(\"Showing the first of the 22 DataFrames (index 0):\")\r\n",
"df_all_entity.count_df_list[0]"
],
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Number of questions: 22\n",
"Showing the first of the 22 DataFrames (index 0):\n"
]
},
{
|
||
"output_type": "execute_result",
|
||
"data": {
|
||
"text/plain": [
|
||
" 0\n",
|
||
"ans_str \n",
|
||
"None 1\n",
|
||
"[0, 180, 'SmoothScroll'] 1\n",
|
||
"[0, 2, 3, [0, 1, 3, 2]] 46\n",
|
||
"[0, 2, 3, [0, 2, 3, 4]] 91\n",
|
||
"[0, 2, 3, [0, 2, 4, 3]] 87\n",
|
||
"[0, 2, 3, [0, 3, 1, 2]] 121\n",
|
||
"[0, 2, 3, [0, 3, 2, 1]] 62\n",
|
||
"[0, 2, 3, [0, 4, 2, 1]] 47\n",
|
||
"[0, 2, 3, [1, 2, 3, 4]] 1082\n",
|
||
"[0, 2, 3, [1, 2, 4, 3]] 29\n",
|
||
"[0, 2, 3, [1, 4, 2, 3]] 35\n",
|
||
"[0, 2, 3, [1, 4, 3, 2]] 21\n",
|
||
"[0, 2, 3, [2, 4, 6, 5]] 1\n",
|
||
"[0, 2, 3, [3, 0, 1, 2]] 158\n",
|
||
"[0, 2, 3, [3, 0, 2, 1]] 28\n",
|
||
"[0, 2, 3, [3, 2, 0, 1]] 11\n",
|
||
"[0, 2, 3, [3, 2, 1, 0]] 9\n",
|
||
"[0, 2, 3, [4, 0, 2, 1]] 32\n",
|
||
"[0, 2, 3, [4, 1, 2, 3]] 34\n",
|
||
"[0, 2, 3, [4, 1, 3, 2]] 24\n",
|
||
"[0, 2, 3, [4, 2, 0, 1]] 34\n",
|
||
"[0, 2, 3, [4, 2, 1, 0]] 10\n",
|
||
"[0, 2, 3, [4, 3, 1, 0]] 12\n",
|
||
"[0, 2, 4, [0, 1, 2, 3, 4]] 2285\n",
|
||
"[0, 2, 4, [0, 1, 2, 4, 3]] 14\n",
|
||
"[0, 2, 4, [0, 1, 4, 2, 3]] 10\n",
|
||
"[0, 2, 4, [0, 1, 4, 3, 2]] 16\n",
|
||
"[0, 2, 4, [0, 4, 1, 2, 3]] 48\n",
|
||
"[0, 2, 4, [0, 4, 1, 3, 2]] 24\n",
|
||
"[0, 2, 4, [0, 4, 3, 1, 2]] 22\n",
|
||
"[0, 2, 4, [0, 4, 3, 2, 1]] 60\n",
|
||
"[0, 2, 4, [4, 0, 1, 2, 3]] 189\n",
|
||
"[0, 2, 4, [4, 0, 1, 3, 2]] 26\n",
|
||
"[0, 2, 4, [4, 0, 3, 1, 2]] 13\n",
|
||
"[0, 2, 4, [4, 0, 3, 2, 1]] 12\n",
|
||
"[0, 2, 4, [4, 3, 0, 1, 2]] 10\n",
|
||
"[0, 2, 4, [4, 3, 0, 2, 1]] 5\n",
|
||
"[0, 2, 4, [4, 3, 2, 0, 1]] 6\n",
|
||
"[0, 2, 4, [4, 3, 2, 1, 0]] 88\n",
|
||
"[0, 2, 5, [1, 2, 3, 4, 5, 6]] 1\n",
|
||
"[0, 2, 5, [1, 2, 6, 5, 3, 4]] 1\n",
|
||
"[0, 2, 5, [5, 4, 3, 0, 1, 2]] 1\n",
|
||
"[1, 2, 2, [1, 3, 2]] 5152\n",
|
||
"[1, 2, 2, [2, 0, 1]] 15067\n",
|
||
"[1, 2, 2, [2, 1, 0]] 4725\n",
|
||
"[1, 2, 2, [3, 1, 0]] 1713\n",
|
||
"[True, True] 12"
|
||
],
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>0</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ans_str</th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>None</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 180, 'SmoothScroll']</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 1, 3, 2]]</th>\n",
|
||
" <td>46</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 2, 3, 4]]</th>\n",
|
||
" <td>91</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 2, 4, 3]]</th>\n",
|
||
" <td>87</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 3, 1, 2]]</th>\n",
|
||
" <td>121</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 3, 2, 1]]</th>\n",
|
||
" <td>62</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [0, 4, 2, 1]]</th>\n",
|
||
" <td>47</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [1, 2, 3, 4]]</th>\n",
|
||
" <td>1082</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [1, 2, 4, 3]]</th>\n",
|
||
" <td>29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [1, 4, 2, 3]]</th>\n",
|
||
" <td>35</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [1, 4, 3, 2]]</th>\n",
|
||
" <td>21</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [2, 4, 6, 5]]</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [3, 0, 1, 2]]</th>\n",
|
||
" <td>158</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [3, 0, 2, 1]]</th>\n",
|
||
" <td>28</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [3, 2, 0, 1]]</th>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [3, 2, 1, 0]]</th>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 0, 2, 1]]</th>\n",
|
||
" <td>32</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 1, 2, 3]]</th>\n",
|
||
" <td>34</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 1, 3, 2]]</th>\n",
|
||
" <td>24</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 2, 0, 1]]</th>\n",
|
||
" <td>34</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 2, 1, 0]]</th>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 3, [4, 3, 1, 0]]</th>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 1, 2, 3, 4]]</th>\n",
|
||
" <td>2285</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 1, 2, 4, 3]]</th>\n",
|
||
" <td>14</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 1, 4, 2, 3]]</th>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 1, 4, 3, 2]]</th>\n",
|
||
" <td>16</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 4, 1, 2, 3]]</th>\n",
|
||
" <td>48</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 4, 1, 3, 2]]</th>\n",
|
||
" <td>24</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 4, 3, 1, 2]]</th>\n",
|
||
" <td>22</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [0, 4, 3, 2, 1]]</th>\n",
|
||
" <td>60</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 0, 1, 2, 3]]</th>\n",
|
||
" <td>189</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 0, 1, 3, 2]]</th>\n",
|
||
" <td>26</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 0, 3, 1, 2]]</th>\n",
|
||
" <td>13</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 0, 3, 2, 1]]</th>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 3, 0, 1, 2]]</th>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 3, 0, 2, 1]]</th>\n",
|
||
" <td>5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 3, 2, 0, 1]]</th>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 4, [4, 3, 2, 1, 0]]</th>\n",
|
||
" <td>88</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 5, [1, 2, 3, 4, 5, 6]]</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 5, [1, 2, 6, 5, 3, 4]]</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[0, 2, 5, [5, 4, 3, 0, 1, 2]]</th>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[1, 2, 2, [1, 3, 2]]</th>\n",
|
||
" <td>5152</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[1, 2, 2, [2, 0, 1]]</th>\n",
|
||
" <td>15067</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[1, 2, 2, [2, 1, 0]]</th>\n",
|
||
" <td>4725</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[1, 2, 2, [3, 1, 0]]</th>\n",
|
||
" <td>1713</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>[True, True]</th>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"execution_count": 4
|
||
}
|
||
],
|
||
"metadata": {}
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"source": [
"## Answer code categories and accuracy analysis for each question"
|
||
],
|
||
"metadata": {}
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"source": [
|
||
"accuracy_list = []\r\n",
|
||
"addition_list = []\r\n",
|
||
"for i, df in enumerate(df_all_entity.count_df_list):\r\n",
|
||
" print(i)\r\n",
|
||
" additional_infor_df = pd.DataFrame({'list':[ast.literal_eval(index) for index in df.index]})\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'count', list( df.iloc[:, 0]))\r\n",
|
||
" additional_infor_df.to_excel('./output/all'+'/' +str(i) + '_count.xlsx')\r\n",
|
||
" if i in [0, 1]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) != 4:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['0' if l==None else str(l[0]) for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [2,3]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) == 0:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != str:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
"\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)!=0 and l[0]=='00' else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num) \r\n",
|
||
" elif i in [5]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) == 0:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != str:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
" \r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==2 and l[0]+l[1]=='B_AC_A' else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [6]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) == 0:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != str:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
"\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==5 and l[0]+l[1]+l[2]+l[3]+l[4]=='B_AC_AG_FD_BE_B' else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [7]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) != 4:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != list or type(additional_infor_df.iloc[j,0][1]) != list or type(additional_infor_df.iloc[j,0][2]) != list or type(additional_infor_df.iloc[j,0][3]) != list:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
"\r\n",
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==4 and [[0,1],[0,2],[1,2]] in l and [[0,4],[0,5],[1,5]] in l and [[0,6],[0,7],[1,7]] in l and [[0,10],[0,11],[1,11]] in l else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [8] :\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) != 5:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != list or type(additional_infor_df.iloc[j,0][1]) != list or type(additional_infor_df.iloc[j,0][2]) != list or type(additional_infor_df.iloc[j,0][3]) != list or type(additional_infor_df.iloc[j,0][4]) != list:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
"\r\n",
|
||
" verify_list = [[[1, 0], [2, 0], [3, 0], [3, 1]], [[0, 1], [1, 1], [2, 1], [2, 2]], [[3, 2], [4, 2], [5, 2], [5, 3]], [[2, 3], [3, 3], [4, 3], [4, 4]], [[0, 4], [1, 4], [2, 4], [2, 5]]]\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==5 and verify_list[0] in l and verify_list[1] in l and verify_list[2] in l and verify_list[3] in l and verify_list[4] in l else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [9]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) == 0:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != str:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" \r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
" for row in range(len(additional_infor_df)):\r\n",
|
||
" list_temp = additional_infor_df.loc[row,'list']\r\n",
|
||
" if list_temp!=None and len(list_temp)==2:\r\n",
|
||
" if list_temp[0][0:2] > list_temp[0][-2:]:\r\n",
|
||
" list_temp[0] = list_temp[0][-2:] + '_' + list_temp[0][0:2]\r\n",
|
||
" if list_temp[1][0:2] > list_temp[1][-2:]:\r\n",
|
||
" list_temp[1] = list_temp[1][-2:] + '_' + list_temp[1][0:2]\r\n",
|
||
" if list_temp[0] > list_temp[1]:\r\n",
|
||
" additional_infor_df._set_value(row,'list', str([list_temp[1], list_temp[0]]))\r\n",
|
||
" else:\r\n",
|
||
" additional_infor_df._set_value(row,'list', str(list_temp))\r\n",
|
||
" else:\r\n",
|
||
" additional_infor_df._set_value(row,'list', str(list_temp))\r\n",
|
||
" grouped = additional_infor_df.groupby('list')['count'].sum()\r\n",
|
||
" additional_infor_df = pd.DataFrame({'list':[ast.literal_eval(index) for index in grouped.index]})\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'count', list( grouped.iloc[:]))\r\n",
|
||
" verify_list = [['02_09', '05_06'],['02_05', '06_09'],['02_06', '05_09']]\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==2 and l in verify_list else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
|
||
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().iloc[1]/df_all_entity.row_num)\r\n",
|
||
" elif i in [10]:\r\n",
|
||
" drop_index_list = []\r\n",
|
||
" for j in range(len(additional_infor_df)):\r\n",
|
||
" if additional_infor_df.iloc[j,0]== None:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif len(additional_infor_df.iloc[j,0]) == 0:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" elif type(additional_infor_df.iloc[j,0][0]) != str:\r\n",
|
||
" drop_index_list.append(j)\r\n",
|
||
" additional_infor_df = additional_infor_df.drop(drop_index_list)\r\n",
|
||
" additional_infor_df = additional_infor_df.reset_index(drop=True)\r\n",
|
||
" for row in range(len(additional_infor_df)):\r\n",
|
||
" list_temp = additional_infor_df.loc[row, 'list']\r\n",
|
||
" \r\n",
|
||
" if list_temp!=None:\r\n",
|
||
" list_temp = [[int(rebuild_i) for rebuild_i in re.findall(r\"\\d+\", rebuild)] for rebuild in list_temp]\r\n",
|
||
" for list_mem in list_temp:\r\n",
|
||
" list_mem.sort()\r\n",
|
||
" list_temp.sort()\r\n",
|
||
" additional_infor_df._set_value(row,'list', str([str(list_str[0]) + '_' + str(list_str[1]) for list_str in list_temp]))\r\n",
|
||
" else:\r\n",
|
||
" additional_infor_df._set_value(row,'list', str(list_temp))\r\n",
|
||
" grouped = additional_infor_df.groupby('list')['count'].sum()\r\n",
|
||
" additional_infor_df = pd.DataFrame({'list':[ast.literal_eval(index) for index in grouped.index]}) \r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'count', list( grouped.iloc[:]))\r\n",
|
||
" additional_infor_df.insert(len(additional_infor_df.columns), 'success', ['1' if l!= None and len(l)==3 and len(set([re.findall(r\"\\d+\",l[0])[0], re.findall(r\"\\d+\",l[0])[1], re.findall(r\"\\d+\",l[1])[0], re.findall(r\"\\d+\",l[1])[1], re.findall(r\"\\d+\",l[2])[0],re.findall(r\"\\d+\",l[2])[1]]))==6 and re.findall(r\"\\d+\",l[0])[0] in verify_list and re.findall(r\"\\d+\",l[0])[1] in verify_list and re.findall(r\"\\d+\",l[1])[0] in verify_list and re.findall(r\"\\d+\",l[1])[1] in verify_list and re.findall(r\"\\d+\",l[2])[0] in verify_list and re.findall(r\"\\d+\",l[2])[1] in verify_list else '0' for l in additional_infor_df.iloc[:,0] ])\r\n",
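" # verify_list is not redefined in this branch, so the value left over from question 9 is used and no row can match.\r\n",
" # .get('1', 0) records an accuracy of 0 instead of raising IndexError when the '1' group is missing.\r\n",
" accuracy_list.append(additional_infor_df.groupby('success')['count'].sum().get('1', 0)/df_all_entity.row_num)\r\n",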
|
||
"\r\n",
|
||
" addition_list.append(additional_infor_df)\r\n",
|
||
" additional_infor_df.to_excel('./output/all_acc'+'/' +str(i) + '_count.xlsx')"
|
||
],
|
||
"outputs": [
|
||
{
|
||
"output_type": "stream",
|
||
"name": "stdout",
|
||
"text": [
|
||
"0\n",
|
||
"1\n",
|
||
"2\n",
|
||
"3\n",
|
||
"4\n",
|
||
"5\n",
|
||
"6\n",
|
||
"7\n",
|
||
"8\n",
|
||
"9\n",
|
||
"10\n"
|
||
]
|
||
},
|
||
{
|
||
"output_type": "error",
|
||
"ename": "IndexError",
|
||
"evalue": "single positional indexer is out-of-bounds",
|
||
"traceback": [
|
||
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
|
||
"\u001b[1;31mIndexError\u001b[0m Traceback (most recent call last)",
|
||
"\u001b[1;32m<ipython-input-35-1fd2baedb513>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m 145\u001b[0m \u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0minsert\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcolumns\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'count'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mlist\u001b[0m\u001b[1;33m(\u001b[0m \u001b[0mgrouped\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0miloc\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 146\u001b[0m \u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0minsert\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcolumns\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;34m'success'\u001b[0m\u001b[1;33m,\u001b[0m \u001b[1;33m[\u001b[0m\u001b[1;34m'1'\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0ml\u001b[0m\u001b[1;33m!=\u001b[0m \u001b[1;32mNone\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m==\u001b[0m\u001b[1;36m3\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mset\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m[\u001b[0m\u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m 
\u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m==\u001b[0m\u001b[1;36m6\u001b[0m \u001b[1;32mand\u001b[0m 
\u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32mand\u001b[0m \u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32mand\u001b[0m 
\u001b[0mre\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mfindall\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34mr\"\\d+\"\u001b[0m\u001b[1;33m,\u001b[0m\u001b[0ml\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m2\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mverify_list\u001b[0m \u001b[1;32melse\u001b[0m \u001b[1;34m'0'\u001b[0m \u001b[1;32mfor\u001b[0m \u001b[0ml\u001b[0m \u001b[1;32min\u001b[0m \u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0miloc\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m,\u001b[0m\u001b[1;36m0\u001b[0m\u001b[1;33m]\u001b[0m \u001b[1;33m]\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 147\u001b[1;33m \u001b[0maccuracy_list\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0madditional_infor_df\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mgroupby\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m'success'\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;34m'count'\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0msum\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0miloc\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;36m1\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m/\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mrow_num\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 148\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 149\u001b[0m \u001b[0maddition_list\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0madditional_infor_df\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
|
||
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[1;34m(self, key)\u001b[0m\n\u001b[0;32m 893\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 894\u001b[0m \u001b[0mmaybe_callable\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mcom\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mapply_if_callable\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 895\u001b[1;33m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mmaybe_callable\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 896\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 897\u001b[0m \u001b[1;32mdef\u001b[0m \u001b[0m_is_scalar_access\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[1;33m:\u001b[0m \u001b[0mTuple\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
|
||
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m 1499\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1500\u001b[0m \u001b[1;31m# validate the location\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1501\u001b[1;33m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_validate_integer\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 1502\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1503\u001b[0m \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_ixs\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[1;33m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;32m~\\anaconda3\\lib\\site-packages\\pandas\\core\\indexing.py\u001b[0m in \u001b[0;36m_validate_integer\u001b[1;34m(self, key, axis)\u001b[0m\n\u001b[0;32m 1442\u001b[0m \u001b[0mlen_axis\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mlen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_get_axis\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0maxis\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1443\u001b[0m \u001b[1;32mif\u001b[0m \u001b[0mkey\u001b[0m \u001b[1;33m>=\u001b[0m \u001b[0mlen_axis\u001b[0m \u001b[1;32mor\u001b[0m \u001b[0mkey\u001b[0m \u001b[1;33m<\u001b[0m \u001b[1;33m-\u001b[0m\u001b[0mlen_axis\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m-> 1444\u001b[1;33m \u001b[1;32mraise\u001b[0m \u001b[0mIndexError\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;34m\"single positional indexer is out-of-bounds\"\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 1445\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 1446\u001b[0m \u001b[1;31m# -------------------------------------------------------------------\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mIndexError\u001b[0m: single positional indexer is out-of-bounds"
]
}
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 21,
"source": [
"addition_list[5]\r\n"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" list count success\n",
"0 [A_B, A_C] 38 0\n",
"1 [A_B, C_A] 472 0\n",
"2 [A_B, C_B] 379 0\n",
"3 [A_B] 116 0\n",
"4 [A_C, B_A] 16 0\n",
"5 [A_C, B_C] 28 0\n",
"6 [A_C] 45 0\n",
"7 [B_A, B_C] 13 0\n",
"8 [B_A, C_A, G_F, D_B, E_B] 1 0\n",
"9 [B_A, C_A] 26189 1\n",
"10 [B_A, C_B] 133 0\n",
"11 [B_A] 70 0\n",
"12 [B_C, A_B] 15 0\n",
"13 [B_C] 25 0\n",
"14 [C_A, B_C] 7 0\n",
"15 [C_A, C_B] 248 0\n",
"16 [C_A] 173 0\n",
"17 [C_B, A_C] 11 0\n",
"18 [C_B] 2037 0"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>list</th>\n",
" <th>count</th>\n",
" <th>success</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>[A_B, A_C]</td>\n",
" <td>38</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>[A_B, C_A]</td>\n",
" <td>472</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>[A_B, C_B]</td>\n",
" <td>379</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>[A_B]</td>\n",
" <td>116</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>[A_C, B_A]</td>\n",
" <td>16</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>5</th>\n",
" <td>[A_C, B_C]</td>\n",
" <td>28</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>6</th>\n",
" <td>[A_C]</td>\n",
" <td>45</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>7</th>\n",
" <td>[B_A, B_C]</td>\n",
" <td>13</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8</th>\n",
" <td>[B_A, C_A, G_F, D_B, E_B]</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9</th>\n",
" <td>[B_A, C_A]</td>\n",
" <td>26189</td>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>10</th>\n",
" <td>[B_A, C_B]</td>\n",
" <td>133</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>11</th>\n",
" <td>[B_A]</td>\n",
" <td>70</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>12</th>\n",
" <td>[B_C, A_B]</td>\n",
" <td>15</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>13</th>\n",
" <td>[B_C]</td>\n",
" <td>25</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>14</th>\n",
" <td>[C_A, B_C]</td>\n",
" <td>7</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>15</th>\n",
" <td>[C_A, C_B]</td>\n",
" <td>248</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>16</th>\n",
" <td>[C_A]</td>\n",
" <td>173</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>17</th>\n",
" <td>[C_B, A_C]</td>\n",
" <td>11</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>18</th>\n",
" <td>[C_B]</td>\n",
" <td>2037</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"metadata": {},
"execution_count": 21
}
],
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 27,
"source": [
"accuracy_list"
],
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"[0.8438429882874328,\n",
" 0.703102247546692,\n",
" 0.8526432415321304,\n",
" 0.7790756568534346,\n",
" 0.8290281734726179,\n",
" 0.7917695473251029,\n",
" 0.8182969294080406,\n",
" 0.7529597974042419,\n",
" 0.79335232668566]"
]
},
"metadata": {},
"execution_count": 27
}
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"# Issues found\r\n",
"1"
],
"metadata": {}
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.8.8 64-bit ('base': conda)"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
},
"interpreter": {
"hash": "be3a0175ef9952a30e10c7aa3f2137d621db0c1ee36e8101671841bcc4797871"
}
},
"nbformat": 4,
"nbformat_minor": 5
}