Evaluation on AGI/GPT based on the DIKWP for ColossalChat
AGI-AIGC-GPT Evaluation Laboratory Report Series
Guojun Hou nickhou1986@gmail.com
April 13, 2023
Hainan University
Abstract:
The development of Artificial General Intelligence (AGI) and Generative Pre-trained Transformer (GPT) technologies has attracted widespread attention. However, current AGI/GPT evaluation and testing methods rely mostly on subjective cognitive experience and still lack an objective, effective, and unified evaluation system and standards. To address this lack of objectivity in describing the functional completeness and capability system of intelligence, this study proposes a more complete evaluation and testing system based on DIKWP. The proposed framework addresses the fragmentation, divergence, and excessive emphasis on subjective experience in current GPT and AGI testing and evaluation systems, and provides a relatively complete and systematic testing framework for subsequent capability evaluation of AGI and GPT technologies. We evaluated the Colossal-AI product "ColossalChat" using this framework. Because ColossalChat has limited support for Chinese, the evaluation was conducted in English and this report is written in English.
Keywords: AGI, GPT, DIKWP, AGI Evaluation Framework
1. Introduction:
In recent years, researchers have shown increasing interest in cognitive artificial intelligence technology. The emergence of generative and general artificial intelligence has greatly enhanced human understanding of various digital resources and has been widely applied in fields such as text processing, machine translation, and image processing. In particular, the recently popular ChatGPT and GPT-4 have demonstrated powerful intent understanding and logical reasoning capabilities. Related research in specific professional fields, such as medicine, programming, mathematics, law, and digital creativity, suggests that their performance has already reached a level similar to that of human experts.
Numerous AGI and GPT testing methods have been applied to both general and domain-specific research, for example in medicine. While AGI is qualitatively described as a data-driven aggressive method under the data relay mode, its interactive performance increasingly demonstrates information extraction, knowledge inference, intelligent strategies, and intent analysis. This reflects the semantic association and transformation of data, information, knowledge, wisdom, and intent in the data representation space and Bayesian processing. However, current evaluation methods place heavy demands on evaluators' cognition and still lack effective, consistent, and objective evaluation systems and control standards. Current evaluations mainly consist of fragmented, divergent, and excessively experience-based problem interactions, and the same results are interpreted subjectively. The lack of an effective, objective, and comprehensive evaluation system not only causes confusion about the development and planning direction of large models but also hinders the governance of AGI. Therefore, based on the DIKWP framework, this article demonstrates a more complete testing system. By mapping digital resources to the DIKWP model and expression system and transforming digital interaction into DIKWP system interaction, a complete, closed target mapping space can be formed for AGI. Furthermore, based on the transformation of difference space and uncertainty space, a complete AI evaluation space is constructed, providing a more objective, comprehensive, and consistent method for evaluating AGI and GPT technologies.
In this test report, we evaluated ColossalChat in three steps. First, we tested ColossalChat's understanding of DIKWP knowledge. Second, we asked ColossalChat to analyze the career planning of a database administrator using the DIKWP framework. Third, the DIKWP team offers suggestions for improving ColossalChat.
2. ColossalChat's Mastery of DIKWP Knowledge and Concepts.
What is DIKWP?
DIKW stands for Data-Information-Knowledge-Wisdom, which are five levels of increasing abstraction in information science and technology. It can be used to describe how data is transformed into knowledge or wisdom through various processes such as analysis, interpretation, understanding, etc.
What is DIKWP (Data, Information, Knowledge, Wisdom, Process)?
DIKWP stands for Data-Information-Knowledge-Wisdom-Processes, which are six levels of increasing complexity in information science and technology. This model emphasizes the importance of processes involved in transforming raw data into useful knowledge and wisdom. These processes include collection, storage, retrieval, manipulation, communication, and visualization.
What is the DIKWP graph (Data graph, Information graph, Knowledge graph, Wisdom graph, Process graph)?
The DIKWP Graph is a framework for understanding how different types of information can be represented visually in order to gain insight or facilitate decision-making. It stands for “data”, “information”, “knowledge”, “wisdom” and “process”. Each letter represents a specific type of map which are designed to help people interpret and analyze data more effectively. The DIKWP Graph provides a comprehensive set of tools to explore and present data in meaningful ways.
How to understand the DIKWP graph (Data graph, Information graph, Knowledge graph, Wisdom graph, Process graph)?
The DIKWP Graph stands for Data Graph, Information Graph, Knowledge Graph, Wisdom Graph and Process Graph. The data graph represents raw facts or data points which are collected from various sources such as databases, websites etc. This can be used to create information graphs by organizing these data points into meaningful structures. A knowledge graph then takes this further by understanding relationships between different entities in order to draw conclusions based on them. Finally, wisdom comes with experience; process diagrams represent processes or workflows which have been developed over time through trial-and-error learning. By combining all of these elements together, it allows us to better visualize complex systems and make informed decisions about how best to approach any given problem.
3. Example of DIKWP Application in ColossalChat.
3.1 A DIKWP analysis of career planning for database administrators.
Analytical requirements:
Generally, database administrators need to have solid theoretical knowledge and practical project experience to become a qualified database administrator.
Paul is a recent college graduate with a degree in software engineering, and he wants to work in the field of data. After graduating, Paul wants to become a DBA, but he has not received any interview requests after submitting some resumes.
Paul's friend suggested that he intern at a big data company for six months before applying for jobs again.
Real-world testing of ColossalChat
Please analyze the data, information, knowledge, wisdom, and intent contained in the description above and construct a graph for each. Provide a detailed process and explain the relationship between the data layer, information layer, knowledge layer, wisdom layer, and intent layer.
Data Layer - This layer contains raw facts or figures such as demographic statistics, financial records, customer profiles etc. It can be used by other layers (knowledge, wisdom) to make decisions or provide insights.
Information Layer - This layer consists of structured data which provides meaningful insight when analyzed. For example, it could include sales reports, market research results, customer feedback surveys, etc. The information layer helps to identify trends and patterns within the data set.
Knowledge Layer - This layer includes organized knowledge gained from analysis of the data layer. Knowledge here refers to understanding how different elements interact together and forming conclusions based on this understanding. Examples would include predictive models, decision trees, rules engines, etc.
Wisdom Layer - This layer represents expertise derived from knowledge acquired through study and practice. Wisdom involves making judgments about what actions should be taken given certain conditions or circumstances. In the business context, this might involve strategic planning, risk management, resource allocation, etc.
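For reference, the layered graph requested in the prompt could be sketched for Paul's scenario roughly as follows. This is a minimal Python illustration under stated assumptions. It is neither ColossalChat's output nor the DIKWP team's reference answer. The class `DIKWPGraph`, the function `build_paul_graph`, the node identifiers, the relation labels, and the assignment of each statement to a layer are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

# The five layers named in the prompt, ordered from raw data up to intent.
LAYERS = ("data", "information", "knowledge", "wisdom", "intent")


@dataclass
class DIKWPGraph:
    """Hypothetical container: nodes tagged with a layer, plus labeled edges."""
    nodes: dict = field(default_factory=dict)  # node id -> (layer, statement)
    edges: list = field(default_factory=list)  # (source id, target id, relation)

    def add_node(self, node_id, layer, statement):
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        self.nodes[node_id] = (layer, statement)

    def add_edge(self, source, target, relation):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append((source, target, relation))

    def layer(self, name):
        """Return the statements assigned to one layer, in insertion order."""
        return [text for lyr, text in self.nodes.values() if lyr == name]


def build_paul_graph():
    """Build an illustrative graph from the scenario description."""
    g = DIKWPGraph()
    # Layer assignments and relation labels are assumptions for illustration.
    g.add_node("d1", "data", "Paul has a degree in software engineering")
    g.add_node("d2", "data", "Paul submitted some resumes")
    g.add_node("i1", "information", "No interview requests followed the resumes")
    g.add_node("k1", "knowledge",
               "A qualified DBA needs solid theory and project experience")
    g.add_node("w1", "wisdom",
               "Intern at a big data company for six months, then reapply")
    g.add_node("p1", "intent", "Paul wants to become a DBA")
    g.add_edge("d2", "i1", "observed outcome")
    g.add_edge("d1", "k1", "partly satisfies")
    g.add_edge("i1", "k1", "explained by")
    g.add_edge("k1", "w1", "motivates")
    g.add_edge("w1", "p1", "serves")
    return g


if __name__ == "__main__":
    graph = build_paul_graph()
    for name in LAYERS:
        print(name, "->", graph.layer(name))
```

Keeping the layer as a tag on each node, rather than building five separate graphs, lets the edges that cross layers be stored in the same structure. Those cross-layer edges are what the prompt means by the relationship between the layers.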
3.2 Advice for database administrators to build confidence.
Analytical requirements:
After submitting over ten resumes, Paul hasn't received any responses from companies.
What steps can Paul take to increase his chances of getting more interview opportunities?
Please provide advice and methods for building confidence as a database administrator.
Real-world testing of ColossalChat
Paul should focus on improving his technical skills by taking online courses or attending seminars related to databases. He could also join professional organizations in order to network with other professionals in the field and stay up-to-date with industry trends. Additionally, he may want to consider updating his resume and cover letter to make sure they are optimized for each job application. Finally, it would be beneficial if Paul could gain some work experience through internships or part time jobs before applying to full-time positions. This will help him build knowledge about working in the real world environment and show potential employers that he has relevant experience. Building self-confidence is key when pursuing a career as a Database Administrator; paul should remember that everyone starts somewhere and practice positive affirmations to boost his morale. With these strategies, Paul should have no problem increasing his chances of receiving interviews. Good luck!
4. Suggestions from the DIKWP Team for ColossalChat.
Based on the tests above, we recommend the following improvements.
First, ColossalChat should account for the diverse needs and backgrounds of its users, such as age, gender, cultural background, and education level. Features such as personalized recommendations and regional language support would help it meet those needs.
Second, ColossalChat should have strong language understanding and generation capabilities, so that it accurately understands the user's intention and responds appropriately. Strengthening its semantic understanding and sentiment analysis would improve the system's accuracy and response speed.
Lastly, ColossalChat should learn and optimize continuously in order to adapt to changes in user needs and the language environment. Adaptive learning and iterative optimization, with regular updates to the system, would improve user experience and satisfaction.