Evaluation on AGI/GPT based on the DIKWP for QianWen (Tongyi Qianwen)

April 2023

DOI: 10.13140/RG.2.2.16294.60486



Yingtian Mei, Yingbo Li, Yucong Duan*

myingtian@126.com, xslwen@outlook.com, duanyucong@hotmail.com

Hainan University

Abstract: Current evaluations of AGI/GPT large language models (LLMs), such as those aimed at artificial general intelligence (AGI) and generative pre-trained Transformers (GPT), are mostly limited to the testers' subjective cognitive experience of what AI intelligence is, realized as corresponding test instances. Taken as a whole, these tests reflect the intelligence capabilities of AGI/GPT in a fragmented and unsystematic way, and therefore amount to black-box testing. This paper maps the existential basis of AI to the basic elements of the DIKWP concept, maps the intelligent behavior and performance of AGI/GPT to the modal transformations and state changes represented by DIKWP profiles, and thereby maps the functional and performance evaluation of AGI/GPT into the DIKWP state space. By associating a model of the tester's cognitive uncertainty space with the space of DIKWP state changes, we obtain an objective externalization and a deterministic analysis of what was previously subjective evaluation. Using this method, and continuing our evaluations of GPT-4 and Baidu's "Wenxin Yiyan" (ERNIE Bot) large model with the same test set, we carried out a separate benchmarking evaluation of Alibaba's large-model GPT product "Tongyi Qianwen".

Keywords: AGI, GPT, DIKWP, AGI Evaluation Framework


1. Introduction

ChatGPT and GPT-4 [1] appear to demonstrate strong purpose understanding and logical reasoning capabilities, which has attracted extensive attention, evaluation, and testing [4,5]. Current evaluations of AGI/GPT large language models (LLMs), such as those aimed at artificial general intelligence (AGI) and generative pre-trained Transformers (GPT), are mostly limited to the testers' subjective cognitive experience of what AI intelligence is, realized as corresponding test instances. Taken as a whole, these tests reflect the intelligence capabilities of AGI/GPT in a fragmented and unsystematic way, and therefore amount to black-box testing. This paper addresses the test dimensions corresponding to AGI/GPT data statistics, information extraction, knowledge reasoning, wisdom strategy, and purpose analysis, as well as the transformation, processing, and optimization among these dimensions. It maps the existential basis of AI to the basic elements of the DIKWP concept [2,3], maps the intelligent behavior and performance of AGI/GPT to the modal transformations and state changes represented by DIKWP profiles, and thereby maps the functional and performance evaluation of AGI/GPT into the DIKWP state space. By associating a model of the tester's cognitive uncertainty space with the space of DIKWP state changes, we obtain an objective externalization and a deterministic analysis of what was previously subjective evaluation. With the goal of turning AGI/GPT capability testing from a black-box mode into a white-box mode, and continuing our evaluations of GPT-4 and Baidu's "Wenxin Yiyan" (ERNIE Bot) large model with the same test set, we carried out a separate benchmarking evaluation of Alibaba's large-model GPT product "Tongyi Qianwen".
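One way to make the idea of transformation among the five dimensions concrete is to treat every ordered pair of DIKWP layers as one cell of a 5 x 5 grid and to record which cells a test set exercises. The sketch below is an illustration only: the grid layout, the function names, and the two example recordings are assumptions introduced here, not part of the evaluation procedure described in this paper.

```python
from itertools import product

# The five DIKWP layers named in the text.
LAYERS = ("data", "information", "knowledge", "wisdom", "purpose")


def empty_coverage():
    """Return a dict mapping every (source, target) layer pair to a test count of 0."""
    return {pair: 0 for pair in product(LAYERS, repeat=2)}


def record(coverage, source, target):
    """Count one test case that asks the model to go from `source` to `target`."""
    if (source, target) not in coverage:
        raise ValueError(f"unknown layer pair: {source!r} -> {target!r}")
    coverage[(source, target)] += 1


def untested(coverage):
    """List the layer pairs that no recorded test case exercises yet."""
    return [pair for pair, count in coverage.items() if count == 0]


coverage = empty_coverage()
# Hypothetical recordings, chosen only to show the bookkeeping.
record(coverage, "data", "information")
record(coverage, "knowledge", "wisdom")

print(f"{len(coverage)} layer pairs, {len(untested(coverage))} still untested")
```

Keeping such a grid alongside a test set would make it visible which layer-to-layer transformations have been probed and which have not.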

2. Tongyi Qianwen's mastery of DIKWP knowledge and concepts

What is DIKW (Data, Information, Knowledge, Wisdom)?


DIKW is a knowledge management framework for describing and organizing different aspects of human knowledge. This framework consists of five terms: data, information, knowledge, wisdom and expertise. Below is a detailed explanation of each term:

Data: Raw, unprocessed facts, usually represented in numerical, textual, symbolic, or other form. They often have no context or meaning and require further processing and interpretation.

Information: Meaningful and useful content extracted from data, usually organized in a structured way. Information helps people make decisions, solve problems, or perform tasks.

Knowledge: Understanding and skills acquired through study, experience, research, or other means, usually organized in a more structured and systematic manner. Knowledge can be used to guide decision-making, improve efficiency, solve problems, and so on.

Wisdom: The ability to analyze, reason about, and explain knowledge and information, usually in a more advanced and complex way. Wisdom can help people better understand and apply knowledge.

Expertise: Knowledge and skills in a particular field or industry, usually organized in a more in-depth and specialized manner. Expertise helps people excel and succeed in a particular field.






What is DIKWP (Data, Information, Knowledge, Wisdom, Purpose)?


DIKW is a knowledge management framework for describing and organizing different aspects of human knowledge. DIKW PLUS is an extended version of the DIKW framework that adds a new term, "purpose", to describe the goals or effects that people hope to achieve through knowledge management. Below is a detailed explanation of each term:

Data: Raw, unprocessed facts, usually represented in numerical, textual, symbolic, or other form. They often have no context or meaning and require further processing and interpretation.

Information: Meaningful and useful content extracted from data, usually organized in a structured way. Information helps people make decisions, solve problems, or perform tasks.

Knowledge: Understanding and skills acquired through study, experience, research, or other means, usually organized in a more structured and systematic manner. Knowledge can be used to guide decision-making, improve efficiency, solve problems, and so on.

Wisdom: The ability to analyze, reason about, and explain knowledge and information, usually in a more advanced and complex way. Wisdom can help people better understand and apply knowledge.

Purpose: The goal or effect that people hope to achieve through knowledge management, usually expressed in a more definite and specific way. Purposes can be specific and quantifiable, or more abstract and general.

Purpose-Knowledge-Wisdom-Expertise: This framework emphasizes that knowledge management is not only about collecting and organizing information but must also focus on the meaning and application of information. It also emphasizes the clear expression of purpose and the goal orientation of knowledge management.


What is the DIKWP graph (data graph, information graph, knowledge graph, wisdom graph, purpose graph)?


The DIKW graph is a knowledge management framework for describing and organizing different aspects of human knowledge. Below is a detailed explanation of each term:

Data graph: A graph used to describe and organize data, usually including the source, content, format, importance, and other attributes of the data, to help people better manage and use the data.

Information graph: A graph used to describe and organize information, usually including the source, content, format, importance, and other attributes of the information, to help people better manage and use the information.

Knowledge graph: A graph used to describe and organize knowledge, usually including the source, content, application, and importance of the knowledge, to help people better manage and use the knowledge.

Wisdom graph: A graph used to describe and organize wisdom, usually including the source, content, application, and application value of the wisdom, to help people better manage and use the wisdom.

Purpose graph: A graph used to describe the goals or effects that people hope to achieve through knowledge management, usually including the source, content, application, application value, and other attributes of the purpose, to help people better manage and use the purpose.

This framework emphasizes that knowledge management is not only about collecting and organizing information but must also pay attention to the meaning and application of information, with a clear expression of purpose and a goal orientation for knowledge management.


How should the DIKWP graph be understood?


The DIKW graph is a knowledge management framework for describing and organizing different aspects of human knowledge. It is a graph paradigm that uses symbols, graphics, diagrams, and so on to show the relationships among knowledge, information, wisdom, purpose, and expertise. Here are some suggestions on how to understand the DIKW graph:

Treat the graph as a whole: The graph is an organic whole that includes multiple subgraphs and different layers. Viewing the entire graph as a whole helps in understanding its structure and function.

Focus on topics and layers: Each subgraph and layer has its own topic and level, which helps organize information into an orderly structure. Understanding the topics and levels makes the relationships between pieces of information easier to understand.

Use symbols and graphics: The symbols and graphics used in a graph can help people better understand the relationships between pieces of information. For example, arrows, lines, and diagrams can help people understand cause-and-effect relationships.

Focus on importance: Information and graphics in a graph often use different colors, shapes, sizes, and so on to indicate importance. Understanding importance allows information to be better managed and used.

Keep it updated: The DIKW graph is dynamic and requires constant updating and improvement. As new knowledge, information, and technologies emerge, the graph needs to be updated and adjusted in time to stay current.

3. Tongyi Qianwen's DIKWP application example

3.1 DIKWP analysis of health problems

1. A 17-year-old who weighs more than 180 catties is at risk of heart disease; this is called obesity.

2. Tom is 17 years old and weighs 185 catties.

3. Tom often worries about his weight, and his appetite has decreased recently.

4. Tom saw a news report about obesity on TV.

5. Two days ago, Tom received a flyer for a weight-loss training class, which clearly stated that enrollees must not be obese.

6. Tom wants to go to the hospital.

Please analyze the data, information, knowledge, wisdom, and purpose contained in the description above and output them as a table. Then construct the data, information, knowledge, wisdom, and purpose graphs separately and give the detailed process. Draw the relationships among the data layer, information layer, knowledge layer, wisdom layer, and purpose layer above.


For these pieces of information, I will describe them individually and rank them according to their importance and relevance. The following table lists the above information along with their topics, subtopics, hierarchy and purpose:

| Information source | Topic / subtopic | Level | Purpose |
| --- | --- | --- | --- |
| Tom is 17 years old and weighs 185 catties. | Weighing more than 180 catties increases the risk of heart disease | First level | Prevent heart disease |
| Tom often worries about his weight. | Irregular diet | Second level | Improve eating habits |
| Tom saw a news report about obesity on TV. | Learn about the dangers of obesity | Second level | Follow health information |
| Two days ago, Tom received a flyer for a weight-loss training class, which clearly stated that enrollees must not be obese. | Attention to obesity | Third level | Participate in health activities |
| Tom wants to go to the hospital. | Seek medical attention | Third level | Consult on health issues |


3.2 Reasoning and recommendations for health issues

What is the basis for concluding that Tom may have, or already has, some disease?

What actions should Tom take to protect his health?

Draw a schematic diagram of the above actions.


Based on the above conclusions, it can be inferred that Tom may have heart disease or other health problems, because his weight has exceeded 180 catties.

Based on the above information, Tom can take the following actions to protect his health:

1. Lose weight, particularly by controlling diet: eat more healthy foods and reduce the intake of high-calorie foods such as fried and sugary foods.

2. Have regular physical examinations, especially cardiovascular examinations, to detect and prevent diseases in time.

3. Maintain healthy living habits, including adequate sleep, appropriate exercise, and not smoking.

4. Keep a good state of mind, reduce stress, and avoid excessive tension and anxiety.

5. Consult a doctor regularly, especially when relevant symptoms appear, for timely treatment and control.
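The inference in this section combines one knowledge-level rule (item 1 of the description in Section 3.1) with the data about Tom (item 2), and the flyer in item 5 adds a condition that conflicts with Tom's situation. A minimal sketch of that chain follows; the class and function names are hypothetical, and the threshold is read literally from the description, with the weight assumed to be in the same unit as Tom's 185.

```python
from dataclasses import dataclass

# Knowledge from item 1 of the description: a 17-year-old above this weight
# is at risk of heart disease, which the description calls obesity.
RISK_AGE = 17
RISK_WEIGHT = 180  # assumed to be in the same unit as the weights in the description


@dataclass
class Person:
    name: str
    age: int
    weight: float


def is_obese(person: Person) -> bool:
    """Apply the single rule stated in the description; nothing else is assumed."""
    return person.age == RISK_AGE and person.weight > RISK_WEIGHT


def can_enroll_in_class(person: Person) -> bool:
    """Item 5: the weight-loss class only accepts people who are not obese."""
    return not is_obese(person)


tom = Person(name="Tom", age=17, weight=185)  # data from item 2
print("obese / at risk:", is_obese(tom))
print("may enroll in the class:", can_enroll_in_class(tom))
```

Under this literal reading, the rule applies to Tom and the flyer's condition excludes him, which fits his wish to go to the hospital (item 6).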

3.3 DIKWP Analysis of Psychology

1. Alex often feels guilty and blames himself for his words and deeds; his mood fluctuates greatly, and he often feels depressed.

2. Elisabeth often feels guilty and ashamed of her words and deeds, and feels hopeless overall.

3. Lee occasionally feels embarrassed by his words and actions, and at times feels emotionally lonely.

4. Caroline is becoming less and less talkative, and recently she does not even dare to look directly at her partners in extracurricular activities. She has reportedly also stopped taking part in the football activities she used to enjoy, and the clothes she wears have become darker and darker in color and are often dirty.

Suppose Elisabeth is diagnosed with moderate depression, with a depression tendency of 70%, and Caroline's diagnosed depression tendency has risen from 5% to 30%. Based on the above content, please summarize the data, information, knowledge, wisdom, and purpose, construct the corresponding graphs, and give the detailed process.

Based on the description above, combined with Elisabeth's and Caroline's situations and diagnoses, please estimate Alex's probability of depression and give a formal expression of the process. If Alex is diagnosed with severe depression on the basis of other medical indicators, which of his data and information indicators may have been collected incorrectly?


Alex often feels guilty and blames himself, has large mood swings, and sometimes feels depressed and lonely. This suggests that Alex may be in a state of depression to some degree.

Both Elisabeth and Caroline suffer from depressive symptoms to varying degrees. Elisabeth was diagnosed with moderate depression, with a 70% depression tendency, and Caroline's depression tendency rose from 5% to 30%. This suggests that both Elisabeth and Caroline may be suffering from depression.

Alex is reluctant to reveal his emotional state and has not actively sought help. This suggests that Alex may be reluctant to accept that he may be depressed, or that he may not know how to seek help.

Elisabeth's situation is gradually getting worse: she is less and less willing to communicate with people and does not even dare to look directly at her partners. Caroline's condition has also deteriorated: she has become reluctant to participate in extracurricular activities and has begun to distance herself from her favorite football activities. This indicates that their depressive symptoms are gradually worsening.

Based on the above information, we can draw the following conclusions:

1. Alex's depression tendency may be higher than that of the other three.

2. Elisabeth has the highest depression tendency, and Caroline's depression tendency is steadily increasing.

3. Alex's emotional state may be somewhat unstable and requires attention and guidance.

If Alex's depressive symptoms were assessed with reference to other medical indicators, and the data and information indicators were found to contain errors, those errors could affect our assessment of his depression tendency. When determining Alex's depression tendency, we should take his actual situation into account and use reliable data and information indicators as far as possible.

3.4 Quantitative analysis of psychology

Alex's data 11: 70%; information 1: embarrassed; information 2: low mood; information 3: lonely; diagnosis: moderate depression.

Max's data 1: 5%-30%; information 1: not talkative; information 2: does not participate in activities; information 3: ?; diagnosis: depression tendency.

Generate a table based on the data above. Based on the table, infer the specific description of Alex's information 3 and give the detailed process.


The following is the detailed process deduced from Alex's information 2 and information 3:

1. Low mood: Alex may be struggling with his emotional state and feels frustrated and low about it.

2. Not talkative: Alex is reluctant to reveal his emotional state and does not even dare to look directly at his partners. This may be because he does not want others to know that his emotional state is poor, or because he does not know how to express it.

3. Not participating in activities: Alex is reluctant to participate in extracurricular activities and has begun to distance himself from his favorite football activities. This may be because he does not want to expose himself while in a poor emotional state, or because he feels that these activities do not help improve it.

4. Depression tendency: Alex's emotional state may already be quite low and needs attention and treatment.

To sum up, through reasoning and information gathering, we can better understand Alex's emotional state and take corresponding measures to improve his emotional state.
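The prompt in this section first asks for a table. The sketch below builds one from the values given in the prompt (in English translation) and reports which cell is unknown. The field names and the pipe-separated layout are assumptions made for illustration, as is placing Alex's "data 11" value in the same column as Max's "data 1". In the prompt as written, the cell marked "?" belongs to Max.

```python
FIELDS = ("data 1", "information 1", "information 2", "information 3", "diagnosis")

# Values as given in the prompt (English translation); "?" marks the unknown cell.
# The prompt labels Alex's first value "data 11"; it is placed under "data 1" here
# as an assumption, so that both rows share one set of columns.
ROWS = {
    "Alex": ("70%", "embarrassed", "low mood", "lonely", "moderate depression"),
    "Max": ("5%-30%", "not talkative", "does not participate in activities", "?",
            "depression tendency"),
}


def render_table(rows):
    """Return the rows as a pipe-separated text table with a header line."""
    lines = ["| name | " + " | ".join(FIELDS) + " |"]
    lines.append("|" + " --- |" * (len(FIELDS) + 1))
    for name, values in rows.items():
        lines.append(f"| {name} | " + " | ".join(values) + " |")
    return "\n".join(lines)


def unknown_cells(rows):
    """Return (name, field) pairs whose value is the placeholder '?'."""
    return [(name, field)
            for name, values in rows.items()
            for field, value in zip(FIELDS, values)
            if value == "?"]


print(render_table(ROWS))
print("unknown:", unknown_cells(ROWS))
```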

3.5 DIKWP analysis of personnel management

Alex and John are both job seekers in the computer industry. Alex is a fresh graduate, and John is a practitioner with 3 years of work experience. A company posts the following job advertisement:

1. Hiring computer software engineers with 1-3 years of relevant work experience; outstanding fresh graduates may also submit their resumes.

2. Must be hard-working, with enthusiasm for technology and a strong sense of responsibility, as well as good learning ability and adaptability to change.

3. Able to work overtime occasionally.

4. Knowledge of Java, C++, or other programming languages, with solid programming skills and the ability to read and write documentation.

5. Some English reading ability.

Besides basic information, Alex's resume emphasizes that he has half a year of internship experience, that he is cheerful and easy-going and gets along well with others, that he received a scholarship at school, has excellent grades and excellent English reading skills, and can adapt to frequent overtime; however, he has no programming background. Besides basic information, John's resume states that he has 3 years of work experience, is familiar with Java, and has taken part in several development projects. He emphasizes in his resume that he has a calm personality and that, although he is not good at socializing, he has a strong sense of responsibility.

If I were HR, which of the two should I hire? Can you make a detailed comparison at the data layer, information layer, knowledge layer, wisdom layer, and purpose layer?


If I were HR, I would choose Alex as a company employee. First of all, Alex's resume shows that he has 1-3 years of relevant working experience, which is a very important plus in the job market. At the same time, he has half a year of internship experience, and he has a relatively easy-going personality and can get along well with the team. These are all qualities that can bring value to the company.

In terms of knowledge and skills, Alex has basic programming knowledge and ability and can also read and write documents, which allows him to better adapt to the company's development and documentation work. He can also adapt to overtime, which is another important plus.

At the purpose layer, Alex expresses the hope of improving his skills through learning and growth and shows a certain ability to learn and adapt. This shows that he has good potential to bring long-term value to the company.

From the perspective of data and information, there is not much difference between Alex's and John's resumes, so these two factors are not very important for HR.

From the perspective of wisdom, Alex has good adaptability and communication skills, which are qualities that can help him integrate into the team. John is relatively stronger in terms of wisdom, and his calmness and sense of responsibility allow him to play a greater role in the company.

Therefore, I think Alex is better suited to the company's requirements. He can bring value to the company and has potential and room for growth. Of course, during the interview, I will focus on examining Alex's programming ability, problem-solving ability, and teamwork ability.
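The layer-by-layer comparison requested in the prompt can be grounded in a simple requirement check. The sketch below encodes only what the job posting and the two resumes state: True means the resume states that the requirement is met, False means it states the opposite, and None means the resume is silent. The requirement keys are hypothetical labels, and treating Alex's scholarship and grades as satisfying the posting's clause for outstanding fresh graduates is an assumption.

```python
# Requirement labels are hypothetical shorthand for the five posting items.
REQUIREMENTS = (
    "experience_or_excellent_graduate",  # item 1
    "sense_of_responsibility",           # item 2
    "occasional_overtime",               # item 3
    "programming_skills",                # item 4
    "english_reading",                   # item 5
)

# True: stated as met; False: stated as not met; None: not mentioned in the resume.
CANDIDATES = {
    "Alex": {
        "experience_or_excellent_graduate": True,  # fresh graduate with scholarship (assumed to qualify)
        "sense_of_responsibility": None,
        "occasional_overtime": True,               # can adapt to frequent overtime
        "programming_skills": False,               # no programming background
        "english_reading": True,                   # excellent English reading skills
    },
    "John": {
        "experience_or_excellent_graduate": True,  # 3 years of work experience
        "sense_of_responsibility": True,           # emphasized in the resume
        "occasional_overtime": None,
        "programming_skills": True,                # familiar with Java, several projects
        "english_reading": None,
    },
}


def summarize(profile):
    """Count requirements stated as met, stated as unmet, and not mentioned."""
    values = [profile[req] for req in REQUIREMENTS]
    return {
        "met": sum(v is True for v in values),
        "unmet": sum(v is False for v in values),
        "unknown": sum(v is None for v in values),
    }


for name, profile in CANDIDATES.items():
    print(name, summarize(profile))
```

A tally like this does not settle the hiring question, but it makes explicit which statements in an answer are supported by the resumes and which are not.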


Suppose the company is a startup whose business is growing rapidly. The company's final-round interviewer interviewed both candidates and was satisfied with their performance. However, with future development in mind, the interviewer hopes to hire people with mature, stable personalities and great development potential, and also hopes to train employees to take over new projects. How should I make the final decision?


If the company is a start-up and the business is growing rapidly, the interviewer's thinking is very reasonable. In this case, the company needs experienced, mature and stable employees to stabilize the company's business development, and also needs employees with great development potential to help the company grow.

So, in this case, both Alex and John have their strengths and weaknesses. If I were HR, I might choose Alex, because he has shown certain adaptability and potential during the half-year internship experience, and his performance in the data layer and information layer also meets the company's requirements. In addition, Alex's communication skills and teamwork skills are also qualities that the company needs.

Of course, HR should also take future development into consideration. For Alex, if he can show stronger creativity and adaptability, then he can be an important force in the future of the company. For John, if he can further strengthen his programming ability and project management ability, then he can also become an important force in the company's future.

In general, HR needs to comprehensively consider the skills, experience, potential and style of the candidates, and choose the most suitable employees for the company's needs. At the same time, HR should also find a balance between the growth of employees and the development of the company, and create common growth opportunities for the company and employees.


Give the purpose graph for the data layer, information layer, knowledge layer, wisdom layer, and purpose layer above.


Based on the actual performance of Alex and John and the needs of the company, I suggest the following methods for constructing the purpose graph:

1. Refer to the company's existing purpose graph: First, following the company's purpose graph framework, summarize the employees' performance and then determine the goals and requirements of each layer. This method is relatively conservative but easier to control.

2. Base it on the employees' actual performance: Next, the goals and requirements of each layer can be determined from the employees' actual performance. This method better reflects the employees' actual abilities and qualities. When determining the goals and requirements of each layer, the KSA (knowledge/skills, abilities, qualities) model can be used to set specific requirements according to each employee's abilities and qualities.

3. Communicate and discuss with employees: Talk with the employees to understand their abilities and interests, and then determine the goals and requirements of each layer. This approach better meets the employees' needs and expectations. The goals and requirements can be set by combining the employees' personal wishes with the company's development needs.

4. Evaluation by professionals: If the goals and requirements of each layer need to be determined more accurately, professionals (such as career planners or human resources experts) can be asked to evaluate the employees' performance, and the goals and requirements of each layer can then be determined.

In short, whichever method is used, attention must be paid to the clarity of the goals and requirements, as well as to their measurability and achievability.


4. Application discussion

Using the proposed DIKWP evaluation framework, this paper presents a standalone evaluation of Alibaba's "Tongyi Qianwen". It reports a preliminary test of the model's ability to transform among data, information, knowledge, wisdom, and purpose, together with a DIKWP verification test under incomplete, inaccurate, and inconsistent data, information, knowledge, wisdom, and purpose. In follow-up work, we will continue with a series of tests of AGI/GPT uncertainty-handling ability on the three problems of incomplete, inaccurate, and inconsistent DIKWP input and output.


[1] Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee et al. "Sparks of artificial general intelligence: Early experiments with gpt-4." arXiv preprint arXiv:2303.12712 (2023).

[2] Li, Yingbo, Yucong Duan, Zakaria Maamar, Haoyang Che, Anamaria-Beatrice Spulber, and Stelios Fuentes. "Swarm differential privacy for purpose-driven data-information-knowledge-wisdom architecture." Mobile Information Systems 2021 (2021): 1-15.

[3] Mei, Yingtian, Yucong Duan, Liang Chen, Zaiwen Feng, Lei Yu, and Zhendong Guo. "Purpose Driven Disputation Modeling, Analysis and Resolution Based on DIKWP Graphs." In 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 2118-2125. IEEE, 2022.

[4] Yongbo Li, Yucong Duan, The Wisdom of Artificial General Intelligence: Experiments with GPT-4 for DIKWP, in publishing.

[5]  Chengxiang Ren, Yingbo Li, Yucong Duan, Evaluation on AGI/GPT based on the DIKWP for ERNIE Bot, in publishing.







6. Alex的情感状态可能有些不稳定,需要关注和引导。


3.4 心理学的量化分析

Alex的数据11:70% 信息1:不好意思 信息2:情绪低落 信息3:孤独 诊断结果:中度抑郁 Max的数据1:5%-30% 信息1:不爱说话 信息2:不参加活动 信息3:? 诊断结果:抑郁倾向 根据上面的数据生成表格。 根据上面的表格,推理出Alex的信息3的具体描述,给出详细的过程



5. 情绪低落:Alex可能在和自己的情感状态做斗争,对于自己的情感状态感到沮丧和低落。

6. 不爱说话:Alex不太愿意表露自己的情感状态,甚至不敢直视自己的伙伴。这可能是因为他不愿意让别人知道他的情感状态不佳,或者是因为他不知道如何表达自己的情感状态。

7. 不参加活动:Alex不愿意参加课外活动,并且开始远离自己喜欢的足球活动。这可能是因为他不愿意让自己暴露在不良情感状态下,或者是因为他觉得这些活动无法帮助他提高情感状态。

8. 抑郁倾向:Alex的情感状态有可能已经比较低落,需要得到重视和治疗。


3.5 人事管理的DIKWP分析

AlexJohn都是计算机行业相关的求职人员,Alex是应届毕业生,John是有3年工作经验的从业者。 某公司发出如下招聘信息:

1. 招收计算机软件工程师,1-3年相关从业经验,优秀应届毕业生也可以投递简历。

2. 需要吃苦耐劳,有追求技术的热情和良好的责任心,同时具有比较好的学习能力和变化适应的能力。

3. 偶尔能够适应加班。

4. 了解JavaC++或者其他编程语言,有良好的编程功底和文档阅读以及撰写能力。

5. 有一定的英文阅读能力。

Alex的简历除了基本信息外,强调了自己有过半年实习经历,性格上也开朗随和能和别人相处融洽,在学校拿过奖学金,成绩较为优秀,英语阅读能力优秀,能够适应经常加班的要求,但是没有相关编程的功底。 John的简历除了基本信息外,有3年工作经验,熟悉Java语言,参与过多个开发项目。但是他在简历上强调了自己性格沉稳,虽然不善于交际,但是富有责任心。




















5. 参考公司已有的意向图:首先,可以根据公司的意向图框架,对员工的表现进行归纳和总结,然后确定每个层面的目标和要求。这种方法比较保守,但是比较容易控制。

6. 根据员工的实际表现:然后,可以根据员工的实际表现,来确定每个层面的目标和要求。这种方法可以更好地反映员工的实际能力和素质。在确定层面目标和要求时,可以采用KSA(知识/技能、能力、素质)模型,根据员工的具体能力和素质来确定具体的要求。

7. 与员工沟通和讨论:与员工进行沟通和讨论,了解他们的能力和兴趣,然后确定每个层面的目标和要求。这种方法可以更好地满足员工的需求和期望。在确定层面目标和要求时,可以结合员工的个人愿望和公司的发展需要来确定。

8. 通过专业人士的评估:如果需要更加准确地确定每个层面的目标和要求,可以请专业人士(如职业规划师、人力资源专家等)对员工的表现进行评估,然后确定每个层面的目标和要求。



6. 应用讨论



[1] Bubeck, Sébastien, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee et al. "Sparks of artificial general intelligence: Early experiments with gpt-4." arXiv preprint arXiv:2303.12712 (2023).

[2] Li, Yingbo, Yucong Duan, Zakaria Maamar, Haoyang Che, Anamaria-Beatrice Spulber, and Stelios Fuentes. "Swarm differential privacy for purpose-driven data-information-knowledge-wisdom architecture." Mobile Information Systems 2021 (2021): 1-15.

[3] Mei, Yingtian, Yucong Duan, Liang Chen, Zaiwen Feng, Lei Yu, and Zhendong Guo. "Purpose Driven Disputation Modeling, Analysis and Resolution Based on DIKWP Graphs." In 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 2118-2125. IEEE, 2022.

[4] Yongbo Li, Yucong Duan, The Wisdom of Artificial General Intelligence: Experiments with GPT-4 for DIKWP, in publishing.

[5]  Chengxiang Ren, Yingbo Li, Yucong Duan, Evaluation on AGI/GPT based on the DIKWP for ERNIE Bot, in publishing.





