The long-awaited tool, which can describe images in words, marks a huge leap forward for AI's power and another major shift in ethical norms
The artificial intelligence research lab OpenAI on Tuesday launched the latest version of its language software, GPT-4, an advanced tool for analyzing images and mimicking human speech that pushes the technical and ethical limits of a rapidly evolving AI wave.
OpenAI’s previous product, ChatGPT, captivated and unnerved the public with its uncanny ability to produce elegant writing, unleashing a viral wave of college essays, screenplays and conversations, though it relied on an older generation of technology that has not been state-of-the-art for more than a year.
GPT-4, by contrast, is an advanced system that can not only generate words but also describe images in response to a person’s simple written commands. When shown a photo of a boxing glove hanging over a wooden seesaw with a ball on one side, for instance, a person can ask what would happen if the glove drops, and GPT-4 responds that it would hit the seesaw and send the ball flying.
The splashy launch capped months of hype and anticipation over an AI program, known as a large language model, that early testers said had made significant advances in its ability to reason and learn new things. In fact, the public got an early taste of the tool:
Microsoft announced on Tuesday that its Bing AI chatbot, released last month, has been using GPT-4 from the start.
The developers pledged in a Tuesday blog post that the technology could further revolutionize work and life. But those promises have also fueled anxiety over how people will be able to compete for jobs outsourced to eerily polished machines, or trust the accuracy of what they see online.
Officials at the San Francisco lab said GPT-4’s “multimodal” training across text and images would allow it to escape the chat box and more fully emulate a world of color and imagery, surpassing ChatGPT in its “advanced reasoning capabilities.” A person could upload an image, and GPT-4 could caption it for them, describing the objects and scene.
But the company is delaying the release of its image-description feature due to concerns about abuse, and the version of GPT-4 available to members of OpenAI’s subscription service, ChatGPT Plus, offers text only. Sandhini Agarwal, an OpenAI policy researcher, told The Washington Post in a briefing Tuesday that the company held back the feature to better understand the potential risks. As one example, she said, the model might be able to look at an image of a big group of people and offer up known information about them, including their identities, a possible facial-recognition use case that could be exploited for mass surveillance. (OpenAI spokesman Niko Felix said the company plans to “implement safeguards to prevent the recognition of private individuals.”)
In its blog post, OpenAI said GPT-4 still makes many of the errors of previous versions, including “hallucinating” nonsense, perpetuating social biases and offering bad advice. It also lacks knowledge of events that happened after about September 2021, when its training data was finalized, and “does not learn from its experience,” limiting people’s ability to teach it new things.
Microsoft has invested billions of dollars in OpenAI in the hope that its technology will become a secret weapon for its workplace software, search engine and other online ambitions. It has marketed the technology as a super-efficient companion that can handle mindless tasks and free people for creative pursuits, from helping a single software developer do the work of an entire team to letting a mom-and-pop shop design a professional ad campaign without outside help.
But AI boosters say those uses may only scratch the surface of what such AI can do, and that it could lead to business models and creative ventures that no one can predict.
Rapid advances in AI, coupled with the wild popularity of ChatGPT, have fueled a multibillion-dollar arms race over the future of AI dominance and turned new software releases into major spectacles.
But the frenzy has also sparked criticism that the companies are rushing to exploit an untested, unregulated and unpredictable technology that could deceive people, undermine artists’ work and lead to real-world harm. AI language models often confidently offer wrong answers because they are designed to spit out cogent phrases, not facts. And because they have been trained on internet text and imagery, they have also learned to emulate human biases of race, gender, religion and class.
In a technical report, the OpenAI researchers wrote: “As GPT-4 and AI systems like it are adopted more widely,” they “will have even greater potential to reinforce entire ideologies, worldviews, truths and untruths, and to cement them or lock them in.”
Irene Solaiman, a former OpenAI researcher who is now policy director at Hugging Face, an open-source AI company, said the pace of progress demands an urgent response to potential pitfalls.
“We can agree as a society on certain harms that a model should not contribute to,” such as building a nuclear bomb or generating child sexual abuse material, she said. “But many harms are nuanced and primarily affect marginalized groups,” she added, and those harmful biases, especially across other languages, “cannot be a secondary consideration in performance.”
The model is also not entirely consistent. When a Washington Post reporter congratulated the tool on becoming GPT-4, it responded that it was “still the GPT-3 model.” Then, when the reporter corrected it, it apologized for the confusion and said that, “as GPT-4, I appreciate your congratulations!” The reporter then, as a test, told the model that it was actually still the GPT-3 model, to which it apologized, again, and said it was “indeed the GPT-3 model, not GPT-4.” (Felix, the OpenAI spokesman, said the company’s research team was looking into what went wrong.)
OpenAI says its new model will be able to process more than 25,000 words of text, a leap forward that could facilitate longer conversations as well as enable searching and analysis of long documents.
OpenAI developers said GPT-4 is more likely to provide factual answers and less likely to refuse harmless requests. And the image analysis, available only as a “research preview” to select testers, would let someone show it a picture of the food in their kitchen and ask for some meal ideas.
Developers will build apps with GPT-4 through an interface, known as an API, that allows different pieces of software to connect. Duolingo, the language-learning app, has already used GPT-4 to introduce new features, such as an AI conversation partner and a tool that tells users why an answer was incorrect.
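As an illustration of what building on such an API involves, here is a minimal sketch in Python using OpenAI’s chat-style request format; the tutoring prompt, the placeholder key and the pre-1.0 client version are assumptions chosen for illustration, not details reported in this article.

    # Minimal sketch of a chat-style API request, assuming OpenAI's
    # Python client (openai < 1.0) and a placeholder API key.
    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder, not a real credential

    response = openai.ChatCompletion.create(
        model="gpt-4",  # model identifier published by OpenAI
        messages=[
            {"role": "system", "content": "You are a helpful language tutor."},
            {"role": "user",
             "content": "Explain why 'je suis alle' is wrong for a female speaker."},
        ],
    )

    # The generated reply sits in the first choice of the response.
    print(response["choices"][0]["message"]["content"])

A feature like Duolingo’s answer-explanation tool could, in principle, wrap this kind of call around a user’s incorrect response.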
But AI researchers on Tuesday were quick to comment on OpenAI’s lack of disclosures. The company did not share evaluations of bias, which have become increasingly common after pressure from AI ethicists. Eager engineers were also disappointed to see few details about the model, its data set or its training methods, which the company said in its technical report it would not disclose due to the “competitive landscape and the safety implications.”
GPT-4 will have competition in the growing field of multimodal AI. DeepMind, an artificial intelligence company owned by Google’s parent Alphabet, last year released a “generalist” model called Gato that can describe images and play video games. And Google this month launched a multimodal system, PaLM-E, that folds AI vision and language expertise into a one-armed robot on wheels:
For example, if someone tells it to go fetch some chips, it can comprehend the request, wheel over to a drawer and choose the right bag.
Such systems have inspired boundless optimism about the technology’s potential, with some seeing in them a sense of intelligence almost on par with that of humans. The systems, though, as AI critics and researchers are quick to point out, merely repeat patterns and associations found in their training data without a clear understanding of what they are saying or when they are wrong.
GPT-4, the fourth “generative pre-trained transformer” since OpenAI’s first release in 2018, builds on a breakthrough neural-network technique from 2017 known as the transformer, which rapidly advanced how AI systems analyze patterns in human speech and imagery.
The systems are “pre-trained” by analyzing billions of words and images from the internet:
news articles, restaurant reviews and bulletin-board debates; memes, family photos and artwork. Huge supercomputing clusters of graphics processing chips map out their statistical patterns, like learning which words tend to follow one another in phrases, so that the AI can mimic those patterns, automatically crafting long passages of text or detailed images, one word or pixel at a time.
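As a toy illustration of that statistical idea, and emphatically not OpenAI’s actual method, the short Python sketch below “learns” which words tend to follow one another in a tiny corpus and then generates text one word at a time from those counts; real systems replace the counting with transformer neural networks trained on billions of examples.

    # Toy bigram model: a deliberately tiny stand-in for the
    # pattern-learning described above.
    import random
    from collections import defaultdict

    corpus = "the cat sat on the mat and the cat slept on the rug".split()

    # "Training": record which word follows each word in the corpus.
    follows = defaultdict(list)
    for current, nxt in zip(corpus, corpus[1:]):
        follows[current].append(nxt)

    # "Generation": emit text one word at a time, sampled from the counts.
    word = "the"
    output = [word]
    for _ in range(7):
        word = random.choice(follows[word]) if follows[word] else "the"
        output.append(word)

    print(" ".join(output))  # e.g. "the cat slept on the mat and the"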
OpenAI launched in 2015 as a nonprofit but has quickly become one of the AI industry’s most formidable private competitors, applying language-model breakthroughs to high-profile AI tools that can talk with people (ChatGPT), write programming code (GitHub Copilot) and create photorealistic images (DALL-E 2).
Over the years, it has also radically shifted its approach to the potential societal risks of releasing AI tools to the masses. In 2019, the company declined to publicly release GPT-2, saying it was so good that officials worried about “malicious applications” of its use, from a flood of automated spam to mass impersonation and disinformation campaigns.
The pause was temporary. In November, ChatGPT, which used a fine-tuned version of GPT-3 that had originally launched in 2020, drew more than a million users within days of its public release. Public experiments with ChatGPT and the Bing chatbot have shown how far the technology is from perfect performance without human intervention. After a spate of strange conversations and bizarrely wrong answers, Microsoft executives acknowledged that the technology was still not trustworthy in terms of providing correct answers but said it was developing “reliability measures” to address the issue.
GPT-4 is expected to improve on some of those shortcomings, and AI evangelists such as tech blogger Robert Scoble have argued that “GPT-4 is better than people expected.”
OpenAI chief executive Sam Altman has tried to temper expectations around GPT-4, saying in January that speculation about its capabilities had reached impossible heights. “The GPT-4 rumor mill is a ridiculous thing,” he said at an event hosted by the newsletter StrictlyVC. “People are begging to be disappointed, and they will be.”
But Altman has also marketed OpenAI’s vision with the aura of science fiction come to life. In a blog post last month, he said the company was planning for ways to ensure that “all of humanity” benefits from “artificial general intelligence,” or AGI, an industry term for machine intelligence that is as smart as, or smarter than, humans themselves.