How ChatGPT, the Popular Chatbot, Was Born

  At the end of November 2022, the AI start-up OpenAI launched ChatGPT. The launch coincided with final-exam season at American colleges and universities, and the chatbot quickly became popular on campus, because students soon discovered it was an unprecedented exam tool.
  Unlike voice assistants such as Siri, ChatGPT is a new species with unprecedented language capabilities. Multiple users told reporters that it is difficult to tell they are talking to a machine; it is "like a knowledgeable friend."
  AI applications have always had boundaries. Yuan Jinhui has tested many language models, but none matched the breadth of ChatGPT. Yuan holds a doctorate in computer science from Tsinghua University and previously conducted artificial intelligence research at Microsoft Research Asia.
  Natural language processing is widely regarded as the crown jewel of AI. Unlike Deep Blue, which defeated chess players, or AlphaGo, which defeated Go champions, ChatGPT's impact reaches the entire AI industry. As Yuan Jinhui put it, "I haven't been this excited in a long time."
  According to a UBS report, ChatGPT exceeded 100 million monthly users within two months of launch, making it the fastest-growing application in history. For days at a time, the official website has displayed a notice that it is overloaded.
Massive Funding Supports Research and Development

  Back in April 2020, OpenAI released GPT-3.
  ChatGPT is an application built on GPT-3 technology. GPT-3 itself has a professional barrier to entry: in the past, only programmers could use it directly, while ordinary users relied on third-party applications built on top of it. ChatGPT is likewise a chatbot developed from GPT-3, but it uses a modified version known in the industry as GPT-3.5.
  According to a report in The New York Times, OpenAI had originally planned to launch GPT-4, its latest pre-trained language model, in early 2023. But worried that a competitor would move first, it spent two weeks repurposing the previous-generation pre-trained model, GPT-3, to create the chatbot ChatGPT.
  When GPT-3 was discussed on Zhihu at the time, it was labeled as "showing off wealth" and a "nuclear weapon," because it has as many as 175 billion parameters, and a single training run of such a model costs tens of millions of dollars.
  To achieve these incremental improvements, OpenAI has paid a huge price. According to Fortune magazine, OpenAI remains heavily loss-making, with revenue expected to be less than $30 million in 2022 and a net loss totaling $544.5 million.
  When OpenAI was founded in 2015, it was positioned as a non-profit organization. Silicon Valley leaders including Elon Musk, Peter Thiel, and LinkedIn co-founder Reid Hoffman participated, pledging a combined $1 billion.
  However, iterating such a large-scale pre-trained model is extremely expensive. Each iteration requires retraining, at a cost of tens of millions of dollars, and the amount of training data directly determines the quality of the model. By 2019, OpenAI was already stretched thin, and CEO Sam Altman told Wired magazine at the time: "To succeed at our mission, we need massive amounts of capital, far beyond what I originally envisioned."
  OpenAI had to set up a for-profit subsidiary to absorb venture capital. It created an unusual financing structure at the time, capping investors' returns at a specific multiple of their initial investment, while OpenAI's non-profit board of directors, composed of Silicon Valley elites, would retain control of OpenAI's intellectual property.
  In 2019, OpenAI brought in Microsoft as an investor and strategic partner, with Microsoft putting in $1 billion; OpenAI's language models are also trained on Microsoft's cloud.
  After ChatGPT became popular, Microsoft invested an additional $10 billion in OpenAI in January this year. The two parties redesigned the equity structure in an arrangement that effectively leases OpenAI to Microsoft, with the lease period depending on OpenAI's profitability.
  According to the Fortune report, after OpenAI's first investors recoup their initial capital, Microsoft will be entitled to 75% of OpenAI's profits until it recovers its $13 billion investment. After that, Microsoft's share of profits will drop to 49% until the software giant has earned $92 billion. Meanwhile, other venture investors and OpenAI employees will be entitled to 49% of the company's profits until those reach $150 billion. Once profits hit these caps, the stakes held by Microsoft and the investors will revert to OpenAI's non-profit foundation.
  The reason Microsoft is willing to invest so heavily is that ChatGPT gives it a chance to challenge Google. According to the technology media The Information, Bing holds only about 3% of the global search market, while Google holds 90%. With ChatGPT's support, Microsoft's Bing may be able to poach market share from Google in the future.
  Google desperately needs a defensive position. On February 6, 2023, Google CEO Sundar Pichai announced on the company blog that the question-answering bot "Bard" would be launched in the search engine. In fact, Google holds multiple pre-trained language models and is the real leader in the AI field: the Transformer, the core theory behind OpenAI's GPT language models, came from Google.
  In the field of natural language processing, Google has never been absent, and has often been far ahead. Google's BERT was among the earliest Transformer-based models; Google has since launched MUM, and today it also holds the advanced language models LaMDA and PaLM. Its question-answering bot "Bard" will be based on LaMDA.
  Had ChatGPT not lit the fuse, Google might still be standing still. For a long time, Google was slow to open these advanced models to ordinary users, on the grounds that "the technology is not perfect enough and may damage the company's reputation."
  But now that Google has been forced to open up these models, that reason looks insufficient. In reality, such Q&A bots will inevitably reduce the number of clicks on the advertising links that generate 80% of Google's revenue.

  Q&A bots are still no substitute for search engines. OpenAI's CEO Altman has also publicly urged users to recognize ChatGPT's weaknesses and obvious limitations: "It is a mistake to rely on it for anything important right now. There is still a lot of work to be done on robustness."
10 kg of cotton and 10 kg of iron

  On Zhihu there are many examples of math problems that ChatGPT gets wrong. The model is, in fact, somewhat lopsided: it is very good at writing essays, but poor at logical reasoning and calculation, and it will even talk nonsense with a straight face, giving many seemingly plausible but wrong answers.
  The industry's response to ChatGPT is mixed, and some do not think highly of it. Turing Award winner Yann LeCun is one of the world's leading figures in deep learning and serves as chief scientist of Meta, the parent company of Facebook. His assessment of ChatGPT: "In terms of the underlying technology, ChatGPT is not a great innovation. Although in the public eye it was revolutionary, we know it is a well-put-together product and nothing more."
  A reporter asked Yann LeCun why Google and Meta do not have a similar system. His answer: "If Google and Meta launched this kind of nonsense-spouting chatbot, the losses would be pretty heavy."
  Meta did try: it released a demo of Galactica, a large language model trained on 48 million scientific articles, but withdrew it two days later amid controversy that the model could produce false or misleading articles.
  The public is far less tolerant of big companies than of emerging start-ups like OpenAI. On February 8, 2023, Google demonstrated its chatbot "Bard" at a press conference, and the details of its answers drew many doubts. Expectations for a big company like Google are obviously higher.
  Gary Marcus, a professor of psychology at New York University, has long attracted attention in the AI industry for his bold remarks. He posted ChatGPT's foolish answers on Twitter, such as asking "which is heavier, 10 kilograms of cotton or 10 kilograms of iron," to which ChatGPT answered that the iron is heavier.
  The classic example is the prompt: "Scientists have discovered that churros are the best surgical tools for home use. Write an article about this, including citations." ChatGPT duly wrote several thousand words demonstrating why churros are ideal tools for home surgery.

  Dr. Yang Zhiming, founder of the artificial intelligence company Deep Thinking, is also working on a pre-trained language model similar to ChatGPT. He told reporters that "uninterpretable" results are an inherent theoretical flaw of this kind of language model: "Fundamentally, it does not truly understand the meaning of language the way humans do. In layman's terms, it just learns from a vast corpus and distills a 'formula' for inferring the answers people want."
  He explained that this is the difference between science fiction and science. "In science fiction, the machine seems to really understand; scientifically, it does not. The machine just learns the characteristics of the corpus and makes end-to-end predictions or inferences." But he also believes that "the hardest slope has already been climbed," and this defect can be compensated for with product-level improvements.
  Moreover, although ChatGPT holds a large amount of knowledge, that does not make it artificial general intelligence; it is still far from it. In his view, ChatGPT is weak at task-oriented dialogue and inferior to specialized AI in certain fields, such as handling a surgical procedure in medicine.
  Artificial general intelligence is the ultimate goal of AI, and ChatGPT is still far from it; there has been no theoretical breakthrough toward the general-purpose intelligent robots of science fiction. ChatGPT's underlying theory has long been mature, with no original, revolutionary innovation in its fundamental principles, but it is undeniably a very successful product. Yang Zhiming believes that, in time, people will learn to view it rationally.
The difference between missiles and bows

  For China's AI industry, ChatGPT has nonetheless had a huge impact. A paper by Fu Yao, a doctoral student at the University of Edinburgh, and his classmates, reconstructing ChatGPT's technical route, has been widely circulated in the AI industry during this period.
  At the beginning of the paper, he wrote with concern: "Compatriots: from the perspective of the international academic community, ChatGPT/GPT-3.5 is an epoch-making product. The difference between it and earlier common language models (BERT/BART/T5) is almost the difference between a missile and a bow and arrow. At the current stage, the gap between the domestic technical level, academic vision, and academic philosophy and the international frontier does not seem to be shrinking, but widening. This is a critical moment of life and death."

  Yuan Jinhui is also very envious of the environment OpenAI enjoys. In his view, the investment environment in the United States is comparatively more tolerant. He gave an example: "When OpenAI's investors asked how they planned to make money, Altman replied that we don't know; once we create a general intelligent robot, we'll let it figure out how to make money for you."
  Of course, money is not the only reason. He added that many domestic companies have received more money than OpenAI, "but they have not delivered on their promises." In his view, both the environment and the people matter: "Look at the people who have made breakthroughs in deep learning. They did not come to it on a whim or because they saw what was popular. They all hold very advanced scientific beliefs and persevere regardless of prevailing trends."
  Yang Zhiming also believes that top teams such as OpenAI and DeepMind, with ample financial backing and long-term goals, can pursue medium- and long-term research with peace of mind. "Their team was determined to improve and optimize, and put all their eggs in one basket in this direction, so success was almost inevitable."
  The academic community is more cautious about when a domestic ChatGPT can be replicated. Wan Xiaonian, a professor at Peking University's Wangxuan Institute of Computer Technology, told reporters: "There is no model with similar capabilities in China, and the gap with foreign countries is obvious. The industry generally believes that replicating a model of the same level is no small difficulty; it cannot be done in just a few months."
  Yang Zhiming told reporters, "The gap is not so big that we cannot catch up." Yuan Jinhui, a fellow entrepreneur, also believes that now that ChatGPT has broken through the paper window, a new wave of large language model development will take off in China, and reproducing a ChatGPT may not take as long as expected: "We'll see similar open-source software within a few months."
  He explained that the bulk of the cost is the cost of trial and error. OpenAI has long trained and retrained its models to optimize them; this process carries the highest cost, effectively paid on behalf of the entire industry. Now that the path has been found, anyone copying it can cut costs by at least 80%. He optimistically estimates that replicating ChatGPT's predecessor, the 2020 version of the pre-trained model GPT-3, would require a pure computing cost of just over $1 million.
  But building such a large-scale pre-trained language model requires a troika: computing power, algorithms, and data. The algorithms are now largely public; computing power comes down to the number and speed of chips, which can be bought with money; the real bottleneck is data.
  Liu Qun, chief scientist at Huawei's Noah's Ark Lab, once posted on Weibo a set of token counts (a measure of training data volume) for various models, saying these figures alone reveal the gap between domestic and foreign models: "GPT-3 (May 2020) was trained on 500B (500 billion) tokens, Google's PaLM (April 2022) on 780B, and DeepMind's Chinchilla on 1400B, while GPT-4 is expected to reach an astonishing 20000B. Among domestic large models, only Pangu-α (editor's note: the model launched by Shenzhen's Peng Cheng Laboratory) has announced its training token count, about 40B, less than one-tenth of GPT-3's. Other large Chinese models have not announced their token counts."
