This is my recent invited talk given to young entrepreneurs on the LLM and ChatGPT ecosystem.
Natural Language Processing (NLP) is the crown jewel of AI. AI is mainly divided into perceptual intelligence and cognitive intelligence, and the leap from perceptual intelligence to cognitive intelligence is mainly reflected in the ability to complete NLP tasks. Human language is the carrier of human knowledge, and mastering language is a gateway to entering human cognitive intelligence. For thousands of years, eliminating language barriers has always been a dream of mankind. Babel in the Bible refers to the tower that mankind wished to build to overcome barriers of human languages, but it was considered to be impossible to build. We NLP practitioners have also been pursuing this dream, hoping to get closer to the final goal of overcoming the language barrier.
However, on November 30, 2022, remember this day, with the official launch of the ChatGPT model by the American artificial intelligence company OpenAI, the Tower of Babel was officially completed! It not only successfully eliminated the language barriers for mankind but also established a bridge between humans and machines. In no time did we all realize that a ChatGPT tsunami had swept across the world.
Why is ChatGPT judged to be the Tower of Babel? Because its language performance is actually more “native” than that of native speakers: native speakers inevitably have slips of the tongue from time to time, but a large generative language model like ChatGPT rarely makes such mistakes and seems to be always in line with language habits. On the input side, it can understand any human language. On the output side, it can speak fluently. What is most shocking is that in its language performance we can observe what is called the “Chain of Thought” (CoT) behind its responses, with certain logical reasoning abilities, giving people the impression of being clear and organized. Behind the input and output is the so-called LLM (large language model, GPT in particular), which is like a bottomless black hole to users. Inside are actually many layers of neural networks, whose parameters and internal multidimensional vector representations house a ton of knowledge.
Let’s take a look at how the LLM behind ChatGPT is developed. There are already tons of technical introductions on this topic, so we will only briefly describe the underlying principles. Its basis is GPT-3, or more precisely, its latest generation (GPT-3.5), to which text-davinci-003 belongs. This model is first of all extremely large in scale, and its size is believed to have made miracles happen. With hundreds of billions of tokens as training data, it forms a model with over a hundred billion parameters (175 billion in the published GPT-3). Research has shown that generic large models will exhibit an “emergence” of certain skills once they reach a certain scale, and these emerging skills can perform well in various multi-task scenarios with minimal prompting. Previously, this phenomenon was generally attributed to the “transformation of quantity into quality”, and it was basically treated as a mystery in philosophical terms. It is like saying that everything is attributed to God’s favor.
In my understanding, it is not that mysterious, but a reasonably natural result: the emergence of multi-task skills has to be based on, and can only be observed in, a super-large model trained on super-large data. Otherwise, there is not sufficient room for the model to tune itself toward human preferences. Large language models are learned from text sequences, and their greatest feature is their ability to over-generate, giving many possibilities for subsequent sequences like “chain reactions”, but only a small percentage of these possibilities are desirable and beneficial. Many generations may be shallow, empty, or even toxic. ChatGPT’s breakthrough lies in the meticulous final fine-tuning process: with reinforcement learning at its core, it found an effective method to stay aligned with human preferences. This is like having a huge basin with numerous children bathing inside, and now you want to pour out the bathwater without pouring out the children. It is almost impossible. But if you can afford to lose some, the result is that the water is poured out, with some good children still inside the basin to help the case. The premise of doing this is that the basin must be large. Only super-large models trained on super-large data can achieve this with sufficient abilities left for numerous tasks. For example, what proportion of a normal raw language corpus consists of parallel translated text or question-and-answer pairs? It is a tiny, tiny fraction, and when the data size is small, it is hard to learn translation or question-answering skills from sequence-based learning. Only with super-large data and models can that small proportion, multiplied by a huge number of tokens, create the necessary conditions and soil for implicit learning of such skills. In a base model with almost infinite generation possibilities, if enough work is not done in a later stage, the probability of generating useless responses is high.
Therefore, “aligning with human preferences” becomes the ultimate goal of fine-tuning. In this process, many children are also poured out, which is called the “alignment tax” in the literature. But it doesn’t really matter: people can’t see the lost treasures, and as long as they see good results, it’s fine. Large models have enough redundancy and can survive filtering and pruning at all levels. In fact, it is not the large model itself that creates miracles; rather, the large model prepares a warm bed for miracles to happen.
What makes ChatGPT different from previous large models is that it has carefully planned for reinforcement learning from human feedback. For a generic open system, humans cannot really pinpoint where it is right or wrong, but at least they can say whether the response is good/useful or bad/no-value. Using this type of feedback to reinforce the learning and to fine-tune the large model, ChatGPT suddenly becomes very human-like. Human-machine interaction has changed from humans accommodating machines and having to write code, to machines accommodating humans and understanding human language. This is a huge transformation.
Reinforcement learning is a relatively difficult type of learning compared with supervised learning approaches because it involves a long chain, and the ultimate goal is not defined explicitly and directly, but indirectly, based on the final outcomes. The idea behind training is to suppress the high-probability poor responses in the original model and bring out the low-probability gems hidden in it: the target of reinforcement is any child that conforms to human expectations, not one specific child singled out as the optimization target. In any case, there is no unique answer format in this world, and there is usually no gold standard for a generation. What we have is the fuzzy feedback given by humans based on preferences: this answer is good, that one is nonsense; this one is correct, that one is discriminatory. A typical method that can make good use of this terminal feedback is reinforcement learning. Once this feedback loop is established, the model can be continuously strengthened and iterated, and its performance will naturally improve. So, after some meticulous learning from human feedback, on November 30, 2022, the curtain was lifted, and this was the moment when humans witnessed the miracle.
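To make the idea of suppressing high-probability poor responses and bringing out low-probability gems concrete, here is a minimal toy sketch. It is not OpenAI's training pipeline: it assumes a fixed set of three hypothetical candidate responses, made-up starting scores, and a simple good/bad reward. It applies the exact policy-gradient update for a softmax distribution. A real system would work over open-ended generations and, as usually described, would rely on a learned reward model rather than hand-assigned scores.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def align(logits, rewards, steps=500, lr=0.5):
    """Repeatedly apply the exact (expected) policy-gradient update.

    For a softmax policy, d E[reward] / d logit_j = p_j * (r_j - E[reward]),
    so responses rated above average gain probability and the rest lose it.
    """
    logits = list(logits)
    for _ in range(steps):
        probs = softmax(logits)
        expected = sum(p * r for p, r in zip(probs, rewards))
        logits = [
            logit + lr * p * (r - expected)
            for logit, p, r in zip(logits, probs, rewards)
        ]
    return softmax(logits)


# Hypothetical candidates and scores, for illustration only.
candidates = ["empty filler", "toxic reply", "helpful answer"]
base_logits = [2.0, 1.0, -1.0]  # the base model favors the poor replies
feedback = [-1.0, -1.0, 1.0]    # fuzzy human verdicts: bad, bad, good

before = softmax(base_logits)
after = align(base_logits, feedback)
for name, p_before, p_after in zip(candidates, before, after):
    print(f"{name:15s} before={p_before:.3f} after={p_after:.3f}")
```

Note that the feedback never says what the ideal answer looks like. It only rates outcomes, and the update shifts probability mass toward whatever was rated well.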
To be honest, I have been engaged in NLP for my whole life, and I never thought I would see such a miracle in my lifetime. It has been three months since ChatGPT was created, and it still feels like a dream. Sometimes I stare at the ChatGPT icon and ask myself, is this the language gateway to the new ecological universe? I have to say that all the signs indicate that ChatGPT has unlimited potential for NLP.
Let’s take a step back and review the contemporary history of the golden decade of artificial intelligence.
Ten years ago, in the ImageNet competition, deep learning overwhelmingly crushed all other machine learning performances in the image field, triggering a landmark neural network revolution. Deep neural networks rely on supervised learning of big data. Since then, we have known that as long as the data is large enough and labeled, deep learning can handle it. After sweeping through image, speech, and machine translation, it encountered the stumbling block of NLP because many NLP tasks do not have large-scale language data with labels.
Five years ago, the NLP field saw the emergence of large language models (LLMs) represented by BERT and GPT. LLMs can directly “eat” language without the need for annotations, which is called self-supervised learning in academia. LLMs mark the arrival of the second revolution, which pushed NLP to the center of AI and made it the core engine of cognitive intelligence. AI finally overcame the dependence on labeled data, which had been the knowledge bottleneck for NLP, leaping from perception to cognition.
Three months ago, ChatGPT was born, creating an almost perfect human-machine natural language interface. From then on, machines began to accommodate humans, using natural language to interact, rather than humans accommodating machines, using computer language. This is a groundbreaking change.
From the emergence of LLMs to the advent of ChatGPT, the technology has truly externalized both its linguistic talent and its knowledge potential, allowing ordinary people to experience it. Looking back, human-machine interaction and its related applications have been explored for many years, but before ChatGPT came out, the problem had never really been solved. When the GPT-3 model was launched two years ago, the skilled players among us already knew how capable it was. As long as you give it a few examples, it can follow them to accomplish various NLP tasks, which is known as few-shot learning. It does not require major modifications to the large model or large-scale labeled data. With just a few examples, GPT-3’s potential can be unleashed, which is already amazing as it overcomes the knowledge bottleneck of supervised learning. However, the basic limitation was that these amazing capabilities of LLMs were mostly known only within a small circle of players, and a language bridge was needed for a true breakthrough. ChatGPT has come forward with its biggest feature, zero-shot learning, which means that not a single labeled sample is needed, and you can directly tell it what to do. After five years of supervised learning and five years of self-supervised learning in the deep neural network revolution, the final result has been delivered, and the ChatGPT Tower of Babel has been fully constructed, marking the pinnacle of the golden decade of AI. ChatGPT has since been like a tsunami, stirring up the world and causing a sensation all over.
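The difference between few-shot and zero-shot use comes down to what goes into the prompt. The sketch below only builds the prompt text and does not call any model. The function names, the “Input:/Output:” layout, and the sentiment examples are illustrative assumptions, not a format prescribed by GPT-3 or ChatGPT.

```python
def few_shot_prompt(task, examples, query):
    """Build a prompt that teaches the task through worked examples."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final item is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def zero_shot_prompt(instruction, query):
    """Build a prompt that states the task directly, with no examples."""
    return f"{instruction}\n\n{query}"


# Hypothetical sentiment task, for illustration only.
examples = [
    ("The battery lasts all day.", "positive"),
    ("The screen cracked within a week.", "negative"),
]
few = few_shot_prompt(
    "Classify the sentiment of each review.",
    examples,
    "Shipping was slow but the product works well.",
)
zero = zero_shot_prompt(
    "Is the following review positive or negative? Answer in one word.",
    "Shipping was slow but the product works well.",
)
print(few)
print("---")
print(zero)
```

In the few-shot case the user has to supply labeled samples and a consistent layout. In the zero-shot case a plain instruction is enough, which is what lowers the bar for ordinary users.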
Looking at the history of AI from a broader perspective, 30 years ago, the main approach to NLP tasks was through symbolic logic. Symbolic routes and machine learning are the two paths that have alternated in dominance in AI history every 20-30 years, like a pendulum. But in the past 30 years, machine learning has been on the rise as the mainstream, with the deep learning revolution in the last 10 years. The pendulum shows no sign of swinging back. We practitioners have been on a long journey of the symbolic rule system. It is not in the mainstream, rarely even mentioned by anyone, but it has not been lacking in its own innovation with its own differentiated advantages. It is worth noting that the symbolic parser has eventually embraced data-driven empiricism and relies on a pipeline of multiple modules to ultimately deal with the hierarchy of language structures. We call this deep parsing. Similar to LLM, deep parsing consists of many levels (around 50-100 levels) of bottom-up processing. It also first digests the language but parses incoming sentence sequences into internal symbolic graph structures, rather than LLM’s vector representations. Although deep parsing and deep learning take different representation schemes, both empower downstream NLP tasks, one with structures and the latter with vectors, both greatly improving the efficiency of downstream NLP tasks. Of course, LLM is still the stronger player because it not only masters syntax structures but also performs exceptionally well in discourse and computational styles, the former involving long-distance discourse relationships and the latter capturing subtle differences in language expressions. Discourse and computational style pose a significant challenge to parsers that primarily focus on sentence structures.
There have always been two main lines in AI. In addition to machine learning, there is traditional symbolic logic, which rises to the philosophical height of rationalism versus empiricism. These two paths have waxed and waned over the past 30 years, with machine learning on the rise and symbolic logic disappearing from the mainstream stage, although the industry has never given up on its use. The transparency and interpretability of symbolic logic translate directly into the convenience of engineering fixed-point error correction, which contrasts with LLM’s black-box-like internal vectors. LLM can use retraining to macroscopically improve, or use fine-tuning or few shots to induce. LLM cannot do pinpoint correction or debugging like in surgery. LLM’s lack of interpretability also often causes user concerns and confusion in practical applications. Perhaps one day in the future, the two paths will converge at a point where a new AI revolution will occur.
From the perspective of AGI, we see that almost all models before LLMs were specialized, and the narrower the task, the better the performance. One exception is the parser, which is in essence the “symbolic foundation model” of the pre-LLM era, empowering downstream NLP tasks with structures, just as an LLM does with vectors. From a more general perspective, the emergence of LLMs represents a breakthrough in the development of artificial intelligence towards AGI, or Artificial General Intelligence. AGI has long been a controversial goal, and many scholars, including myself, have doubted or even mocked its feasibility. However, with the advent of LLMs five years ago, AGI became more scientifically viable, rather than just a utopia. OpenAI, which champions AGI, has become the shining star in this field, having delivered a long list of influential general models that include the GPT series for NLP, Codex for code writing and debugging (eventually used for GitHub Copilot, the coding assistant from Microsoft’s GitHub), and DALL-E for image generation.
With ChatGPT as the pinnacle, large models have taken over all NLP tasks simply by using natural language as instructions, not only those defined by the NLP community but also many user-defined tasks. Its NLP tasks are completely open. Tasks related to language and knowledge can be attempted in any language, and often the results are immediate and magical at the same time. Someone has listed 49 task scenarios that it can handle, but it can actually do much more than that. In addition, new scenarios are being discovered all the time. This is an unprecedented phenomenon in the history of AI, which the industry calls “skill emergence”.
We can examine why it is so capable and knowledgeable. Overall, human systematic knowledge is largely expressed in language. Human knowledge is mainly carried in the form of text (written language), and mathematical formulas can be seen as an extension of written language. From a linguistic perspective, human knowledge can be divided into linguistic knowledge and knowledge beyond linguistics. Linguistic knowledge includes lexical knowledge, syntax, morphology, discourse, style, etc. Knowledge beyond linguistics is a much broader circle with a much wider boundary. Large language models have not yet mastered human knowledge as a whole, and it seems that they have managed to capture only some knowledge floating on top of the sea of human knowledge. As for ChatGPT, it can be said that it has mastered almost all of the linguistic knowledge, but only about 20% of human knowledge in general, including common sense, basic logic, and encyclopedic knowledge. More serious research is needed to quantify this properly, but as a ballpark estimate, it feels like about 20% of the knowledge has been learned, and the remaining 80% is still out of reach. However, the Pareto principle, namely the 80-20 rule, applies here, which means that mastering the 20% of knowledge floating on top in effect covers 80% of the scenarios. Still, since there is an 80% knowledge gap, it pretends from time to time to know things it doesn’t. Even so, LLMs can still reshape the ecosystem and the world if we learn to use their strengths and to handle their weaknesses wisely.
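The 80-20 intuition can be illustrated with a small calculation. The sketch below assumes that how often each piece of knowledge is needed follows a Zipf-like distribution, with frequency proportional to 1/rank. That distribution, the item count, and the function name are assumptions made for illustration, not measurements of ChatGPT or of human knowledge.

```python
def zipf_coverage(n_items, top_fraction, exponent=1.0):
    """Share of total demand covered by the most frequent items.

    Assumes demand for the item at a given rank is proportional to
    1 / rank**exponent (a Zipf-like, long-tailed distribution).
    """
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    top_k = round(n_items * top_fraction)
    return sum(weights[:top_k]) / sum(weights)


# Hypothetical setting: 1000 knowledge items, the model knows the top 20%.
coverage = zipf_coverage(1000, 0.2)
print(f"Top 20% of items cover {coverage:.0%} of demand")
```

Under this assumed distribution the top 20% of items covers a little under 80% of demand. The uncovered share sits in the long tail, where each item is rarely needed but there are very many of them.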
How do we judge whether it has learned and how well it has performed a task? In any NLP task, there is a quality assurance (QA) protocol to follow, which requires at minimum a test set of annotated samples. Currently, ChatGPT uses zero-shot learning (i.e. zero samples), where a random task is assigned to it and once it is done, it moves to a new task, so there is no chance for building a persistent test set. So its performance on result quality cannot be quantified directly. In such cases when the internal testing protocol is missing or no longer applicable, external methods must be used to evaluate the data quality indirectly, such as customer surveys or using my previous company Netbase’s social listening service to collect customer feedback online. All the external signs indicate that customer satisfaction seems to be over 80%, and in most task attempts, customer needs are met fairly well, at times with nice surprises and miracle-like performance. Another relatively objective external indicator is user stickiness and growth of user accounts. ChatGPT has set unprecedented records in this regard, with tens of millions of users in just a few weeks. ChatGPT’s customer growth rate exceeds everyone’s imagination.
In conclusion, ChatGPT represents a major breakthrough in the field of natural language processing and artificial intelligence. As a large language model, it has revolutionized the way we approach NLP tasks and has demonstrated remarkable versatility and capability. However, it is important to keep in mind that ChatGPT is not perfect and there is still much work to be done in terms of improving its performance and addressing its limitations.
Despite these challenges, ChatGPT has already had a profound impact on the field of AI and is poised to continue shaping the future of technology in significant ways. As AI continues to evolve and advance, it is likely that we will see more breakthroughs of LLMs that push the boundaries of what is possible and help us achieve even greater levels of understanding and innovation.
Over the last three months, there has been no end of online forums, discussions, and talks about ChatGPT, and there is still no sign of aesthetic fatigue. Recently, the former head of Y Combinator China Dr. Lu Qi came to Silicon Valley to give a passionate speech, which added fuel to the fire. He compared ChatGPT’s revolution to Web-1. As we all know, the iconic brand that represented the first Internet boom was the Netscape browser. Although Netscape did not grow to a large company, it was the internet revolution it started that created giants like Yahoo, Google, and Amazon. A similar revolution occurred in China, giving rise to world-class companies such as Baidu, Tencent, and Alibaba. Lu Qi believes that we are right now in such an era. He said that the roadmap is so clear, and the trend is so obvious that he has absolutely no doubt in his mind. Overall, I largely agree with his view of technological trends and landscape.
ChatGPT marks the emergence of a new era. Some people say that this is the “iPhone moment” or “Android moment” in the history of contemporary information technology and will lead to a brand-new ecosystem. I feel that Lu Qi’s comparison is more comprehensive, as ChatGPT is like the “Netscape browser” that initiated the first Internet revolution. Regardless of the comparison, it is a game-changer.
However, it is essential to note that ChatGPT also has its shortcomings and challenges. One issue that everyone has noticed is the so-called hallucination problem: fabricating details and distorting facts. Although ChatGPT has conquered any form of human language, it has only scraped the tip of the iceberg of cognitive intelligence. Is it possible for LLMs to solve this problem completely? In my opinion, the LLM route alone will not solve cognitive intelligence. As mentioned earlier, ChatGPT has only covered about 20% of human knowledge. Even if LLMs continue to expand by several orders of magnitude in sequence-based learning, in my estimate they can at best reach 40%-50%. The remaining 50% is a deep sea that can hardly be fathomed. The long tail of knowledge is a combinatorial explosion, way beyond the reach of sequence-based language learning. The annoying behavior is that for any knowledge beyond its ken, an LLM will not hesitate to fabricate fake details that appear genuine. This is a severe problem. The inaccuracy on such long-tail knowledge is an inevitable problem for application services based on LLMs.
Moreover, there are many other issues that need to be overcome. For example, when a large model empowers downstream scenarios, how can customer privacy and security be protected during the process of calling the large model? This problem has not yet been solved, but it is believed that better solutions will develop in time. The supplier of large models will surely pay special attention to this issue and provide solutions for their ecosystem’s development.
Another issue is complex reasoning ability. From conversations with ChatGPT, we observe that it already has basic reasoning ability. The source of this ability is very interesting. It mainly benefits from self-supervised learning on a massive computer code base. GPT-3.5, on which ChatGPT is based, has been trained not only on human natural language but also on the massive amount of open source code written in various computer languages and available on GitHub, and much of that code has corresponding natural language explanations (comments) too. Since computer code is by nature more logical than natural language, this has helped ChatGPT to organize its responses and speak more coherently. This was said to be a nice surprise that the developers themselves had not anticipated. However, it currently still has shortcomings in complex logical reasoning. Fortunately, complex reasoning ability is different from the boundless knowledge network. It is a relatively closed logical set, and it is believed that it can be solved in the not-too-distant future (perhaps GPT-4 might already be able to handle it?).
Lastly, let’s talk about the progress of multimodal learning. LLM, as the basic model, has been validated in NLP multi-tasking and has performed exceptionally well. After the breakthrough in NLP, the framework for empowering downstream tasks with a basic model began to radiate toward other modalities. This direction of research is very active in the academic field of multimodal learning. Everything is still ongoing. Currently, the level of multimodal learning in practice is still in the stage of prompt engineering. What is lacking is a natural language interface. People who play with prompts in large models for image and music generation already know the huge potential and effectiveness of the basic model. It is very similar to the situation when we played with few-shot prompts in the GPT-3 playground before ChatGPT was born. It can be foreseen that in near future, a smooth natural language interface will emerge, and users will be able to describe the art they desire, whether it is a painting or a song. The work of aligning with human taste is also ongoing. It is predicted that a natural language to image (NL2img) model like “ChatDalle”, similar to ChatGPT, will implement the desired natural language interface. The same trend is bound to happen in natural language to music (NL2music). We are in an exciting new era of AIGC (AI-generated content) for art creation.
Another predictable picture is that based on the trend of multimodal LLM, there will eventually be a unified large model that integrates various modalities and their associated knowledge. The breakthrough of this model barrier will provide critical support for entrepreneurs to utilize LLMs to empower downstream applications in various scenarios. As we all know, whether it is finance, law, or medicine, each major vertical has its accumulated long-standing structured symbolic knowledge base, including the domain ontology and other databases. How to connect to the domain’s symbolic resources involves breaking the domain barrier. It is expected that this barrier will be largely solved in the next two to three years.
The direct impact of the ChatGPT tsunami is that the NLP ecosystem is facing a reshuffle, and every existing information product or service must be re-examined in the context of LLM.
When we first discussed ChatGPT’s impact on IT services, the first thing that came to our mind was how to combine ChatGPT with search technology, and whether it could re-invent search.
Search is traceable: every returned result can be traced back to its source, so it involves no information fusion. ChatGPT is untraceable and excels at information fusion: in essence, ChatGPT has no possibility of plagiarism. Every sentence it spits out is a novel sequence based on its digested information sources. Clearly, traditional search and ChatGPT each have their own advantages and disadvantages. Search is the king of information services, ubiquitous, with a very stable business model. Since the rise of search in the Web 1.0 era, the form and mode of search have basically not changed for more than 20 years. In fact, new technologies and entrepreneurs have been trying to challenge search continuously over the years, and the venture capital industry has also been watching for potential search disruptors that may become the “next Google”, but the status of search has always been unshakable, at least until now. But this time is different. Microsoft has an exclusive license to the model code behind ChatGPT and has boldly launched the so-called “new Bing”. Google, which has dominated the space for so long, has to mobilize urgently and confront it head-on. A live drama of search+LLM is unfolding, telling us that although there are still many difficulties to overcome in integrating these two technologies, the trend is unstoppable, and reshaping a new ecology of search is imperative.
In addition to search, those finely polished, special-purpose information products and services now face the fate of being re-examined and reformed, including chat, virtual assistants, grammar correction, machine translation, summarization, knowledge Q&A, etc. The representative services in these areas (Siri, Grammarly, etc.) used to have high technological barriers, which have suddenly been lowered. Many of these products do not face an immediate catastrophic crisis, thanks to years of polishing and user inertia, and some may still exist for a long time, but they are all on a downhill road. This is a revolutionary victory of general AI over traditional AI. It is something we would not have believed feasible before. We used to be so skeptical of the general approach, waiting to laugh at those who advocated AGI, such as OpenAI, which has since managed to launch a series of impressive LLMs (GPT series, Codex, DALL-E) including ChatGPT.
Look at Siri, which Apple introduced in 2011, more than 11 years ago. That is longer than the entire golden decade of the deep learning revolution, but Siri has only recently managed to offer 2-round or 3-round conversations. Amazon’s popular product, Alexa, is the same. It has been polished for several years and has accumulated so much user data. Now, with the advent of ChatGPT, what will Apple and Amazon do? They must embrace LLMs.
Next is the commonly seen e-commerce customer service. As we all know, Alibaba and JD.com’s online after-sales customer service has been polished to perfection. Because after-sales service issues are relatively concentrated, the problem set is not large while the data are large, accumulated over the years. However, customer service is not only limited to post-sales. In order to handle customer service smoothly, LLM cannot be ignored.
Moving on to education, it’s clear that the ChatGPT model has the potential to revolutionize all education products and services. Anyone developing educational applications will need to reconsider how to build them on top of large models. Education itself deals with language, regardless of whether it is related to arts or science. Although the current large model is not particularly strong in science and engineering (yet), this knowledge gap will be filled to varying degrees soon. ChatGPT is sure to disrupt education, while also providing the largest opportunity for modernizing education. Language learning and computer programming education are obvious areas for ChatGPT to shine, as the model itself is a language model. Although its programming abilities are not yet at the level of professional engineers, it is proficient enough in common code formats to assist with programming and with the learning of programming. In fact, GitHub Copilot, which is powered by OpenAI Codex, has already become an auxiliary tool for more and more programmers.
Stepping back, we are also facing a huge risk, such as fake news. If one wants to promote a company or product, one can now use ChatGPT to generate all kinds of promotional posts that sound convincing. In the future, those online reviews and comments will also be obscured by fake news, as the cost of creating fake news approaches zero. Without proper precautions, all of this could place humanity in a world where truth and falsehood are indistinguishable. All along, we have been talking about the benefits of LLM and how it can empower new ecosystems for productivity explosion. We expect that in the next five to ten years, new international IT giants like a new Google or New Alibaba will emerge under this new ecosystem, leading to a major transformation in the technology ecosystem. But the danger of LLM misuse is equally great. Is mankind ready for it? Clearly not. Of course, this is another topic, and we will leave it there for now.
With LLM (ChatGPT in particular), there are more product forms and services waiting for entrepreneurs to explore.
Regarding this topic, we need to emphasize the unprecedented entrepreneurial conditions brought by ChatGPT. ChatGPT itself has become a testing ground for products. It is a playground with an infinitely low bar that everyone can play in. The low bar is due to the paradigm shift in human-machine interfaces mentioned earlier. For the first time in AI history, machines began to cater to humans, rather than humans catering to machines. Human language, rather than computer code, became the tool for human-machine interaction. The significance of this change for the new ecology of NLP is difficult to overemphasize. In fact, this provides conditions for “mass entrepreneurship”.
Those who have started AI businesses should all have this experience. The most basic condition for a startup team to have a chance of success is that the product manager and the technical leader can work closely together and communicate effectively. The product leader, relying on their market intuition and understanding of customer needs, strives to find the best market entry angle for technology to be transformed into a service and form a product design plan. The feasibility of this design plan needs to be endorsed and then developed by the technical leader. However, often due to different professional backgrounds and knowledge structures, the situation where the product manager and the technical leader talk past each other is not uncommon. Once this situation arises, the startup company is basically doomed to fail.
ChatGPT fundamentally eliminates the problem of talking past each other. Previously, only the technical leader and programmers could verify the feasibility of a plan. Now product managers, CXOs, engineers, data analysts, and users with different backgrounds and expertise all have a unified platform, ChatGPT, on which they can illustrate product ideas. Everyone can simulate services on it. Not only has the communication barrier between humans and machines been overcome, but so has the communication barrier between different teams. The emergence of such a shared platform is a precondition for a product explosion and mass entrepreneurship.
In the United States, hundreds of startups are now exploring ideas for downstream products and services built on ChatGPT or the LLMs behind it. While the upstream big models are still progressing rapidly, downstream work is already in active development. Countless ordinary people are sharing their stories online, showing how they can earn 5,000 dollars using ChatGPT in just two or three hours. This kind of sharing means that the entrepreneurial enthusiasm of grassroots people has been mobilized. It seems that everyone can use this opportunity to find an entrepreneurial angle. Summarizing these grassroots ideas may also lead to new tracks that can be standardized and scaled to meet market demands.
A big model like ChatGPT ultimately plays an operating-system-level role. Every AI-related information product and service, especially those related to language and knowledge, cannot do without it. When Intel dominated the market, the famous slogan was “Intel Inside”. In the future, it will be “Chat-Inside”, or more accurately, “Chat-In&Out”. Why in and out? When a big model like ChatGPT empowers products, it is both the waiter and the chef. As the waiter, it takes your order, interacts with you, and understands your needs; as the chef, it does the cooking and delivers the service. This requires both language talent and knowledge skills. This is what we call the LLM expert workbench, which may be the biggest new ecological form in the next five years and may open countless doors for entrepreneurship. The basic service form is online information services in various industries, whether online education, online lawyers, online consultants, online finance, or online tourism. All are aimed at significantly improving service efficiency. With ChatGPT, you only need to hire one expert to do the work that previously required ten. The end result is a productivity explosion.
In conclusion, the wave of mass entrepreneurship is coming, and ChatGPT has brought unprecedented entrepreneurial conditions. It has become a testing ground for products with an infinitely low bar that everyone can play in. The emergence of this technology has eliminated communication barriers between humans and machines and between teams, leading to new tracks that can be standardized and scaled to meet unmet market needs. ChatGPT as an operating-system-level platform may give rise to the biggest new ecological form in the next five years, the LLM expert workbench, which opens doors for entrepreneurship and will lead to a productivity explosion.
At this point, the application ecosystem seems very clear. The principle is that experts must be the final filter before results are delivered (human judge as final filter). This is the basic setup, but experts may also provide input prompts to steer the LLM toward better results.
For almost every application scenario, there is an expert workbench waiting to be built. This includes supplementing existing products or services, such as every segment of online education and online doctors, lawyers, and financial consultants, as well as exploring business scenarios that nobody has thought of before. This is a visible transformation or reshuffling of the ecosystem around efficient expert advice (expert-in-loop services).
Speaking of workbenches, e-commerce giants have built relatively large customer service workbenches, which were introduced when neither fully automated nor fully manual solutions could meet user needs and satisfaction. Now, with LLMs, this form can be extended to all online service sectors. The productivity explosion that this can bring about is beyond imagination.
The design concept of “Human as Judge” has been validated for several years in low-code platforms (such as RPA platforms, parser-enabled information extraction platforms, etc.) for its effectiveness and efficiency. Here, we are talking about a completely new form, where humans only need to act as judges to complete the service. It is now entirely possible to create online information service workbenches tailored to various segments or scenarios, with experts sitting in the background. Specifically, the expert’s role is only to make the decision based on their knowledge and experience, especially at the final “go or no-go” moment. Being a judge is much more efficient than being an athlete.
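As a rough illustration of the “Human as Judge” setup described above, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: `run_workbench`, `fake_llm`, `approve_short`, and `manual_answer` are hypothetical names, the LLM is replaced by a toy function, and the expert's “go or no-go” decision is simulated by a trivial rule. It is not how any particular workbench product is built.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Draft:
    """A machine-generated draft waiting for the expert's decision."""
    request: str
    text: str


def run_workbench(
    requests: List[str],
    generate: Callable[[str], str],            # hypothetical stand-in for an LLM call
    expert_approves: Callable[[Draft], bool],  # the expert's "go or no-go" decision
    expert_fallback: Callable[[str], str],     # expert does the work when the draft is rejected
) -> List[str]:
    """Deliver one answer per request; nothing is delivered without the expert's decision."""
    delivered = []
    for request in requests:
        draft = Draft(request, generate(request))
        if expert_approves(draft):
            delivered.append(draft.text)                # "go": deliver the draft as is
        else:
            delivered.append(expert_fallback(request))  # "no-go": expert takes over
    return delivered


# Toy stand-ins (assumptions for illustration only).
def fake_llm(request: str) -> str:
    return f"Draft answer to: {request}"


def approve_short(draft: Draft) -> bool:
    # Pretend the expert trusts drafts only for short, simple requests.
    return len(draft.request) < 20


def manual_answer(request: str) -> str:
    return f"Expert answer to: {request}"


results = run_workbench(
    ["reset password", "dispute a complex cross-border invoice"],
    fake_llm,
    approve_short,
    manual_answer,
)
for line in results:
    print(line)
```

In this sketch the expert never writes anything for approved drafts, which is the sense in which being a judge is cheaper than being an athlete; the fallback branch is where the expert's own time is still spent.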
https://liweinlp.com/?p=10320
It is worth emphasizing that ChatGPT brings something new as an enabling information technology: it serves at both the backend and the frontend, and it performs well in both high-level and low-level tasks. Chat is just the surface of ChatGPT, which in essence is a human-machine interface; at its core is the ability to complete various NLP tasks. With both the interface and the core ability, downstream products or services can be built around it. In the Intel era, computer brand advertisements were remembered for “Intel Inside”; in the future, the new ecology should be called “Chat-In&Out”, meaning an ecology in which the LLM empowers not only the human-machine interaction but also the professional service itself, with experts providing only the final check. In this form, the experts are behind the scenes. To put it another way, the LLM is both waiter and chef, but an expert needs to review the food and take responsibility before it is served, to ensure service quality (as with online doctors, lawyers, consultants, etc.).
In such an ecosystem, the next five years will be a period of explosive growth for online services. Fortunately, the three-year pandemic has greatly promoted the grassroots awareness of online services, helping to cultivate user online habits and develop the market.
While LLMs are powerful in terms of breadth of knowledge, they also have limitations in terms of precision. The key challenge in building an expert-in-loop service is to overcome the precision bottleneck of the LLM. The goal is to raise precision to a level where it does not significantly impact the efficiency of the expert’s work. If at least 1/4 of the results generated by the LLM can match the level of an expert’s manual research, then the efficiency of the expert-in-loop service can be ensured. This is a feasible expectation, and current solutions are not far from meeting this threshold. With this in mind, we conclude that the door to entrepreneurship in the new ecology of LLM has indeed been opened.
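To make the 1/4 threshold concrete, here is a back-of-the-envelope sketch in Python. The cost model is an assumption, not something stated in the talk: the expert spends a fixed review time on every draft and, whenever the draft is not usable, does the task manually from scratch. The function names and the numbers (60 minutes of manual work, 5 minutes of review) are hypothetical.

```python
def expected_minutes_per_task(p_usable: float, review_minutes: float, manual_minutes: float) -> float:
    """Expected expert time per task under the assumed model:
    every draft is reviewed; unusable drafts are redone manually."""
    if not 0.0 <= p_usable <= 1.0:
        raise ValueError("p_usable must be between 0 and 1")
    return review_minutes + (1.0 - p_usable) * manual_minutes


def speedup(p_usable: float, review_minutes: float, manual_minutes: float) -> float:
    """How many times faster the expert is with the workbench than working alone."""
    return manual_minutes / expected_minutes_per_task(p_usable, review_minutes, manual_minutes)


def break_even_share(review_minutes: float, manual_minutes: float) -> float:
    """Share of usable drafts at which the workbench costs the same as fully manual work."""
    return review_minutes / manual_minutes


# Hypothetical numbers: 60 minutes of manual work, 5 minutes to judge a draft.
print(expected_minutes_per_task(0.25, 5, 60))  # 50.0 minutes instead of 60
print(speedup(0.25, 5, 60))
print(break_even_share(5, 60))
```

Under this assumed model, a 1/4 share of usable drafts pays off whenever judging a draft takes less than a quarter of the time needed to do the task manually. How large the gain is depends heavily on the review time and on how much of a rejected draft can still be reused, neither of which is modeled here.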