Meta’s AI team has built a massive new language model that shares both the remarkable abilities and the serious flaws of OpenAI’s pioneering neural network GPT-3. In an unprecedented move for Big Tech, the company is giving the model away to researchers, together with details about how it was built and trained.
Meta’s move marks the first time that a fully trained large language model has been made available to any researcher who wants to study it. The news has been welcomed by many who worry about the way this powerful technology is being built by small teams behind closed doors.
“We strongly believe that the ability for others to scrutinize your work is an important part of research. We really invite that collaboration,” says Joelle Pineau, a longtime advocate for transparency in the development of technology, who is now managing director at Meta AI.
“I applaud the transparency here,” says Emily M. Bender, a computational linguist at the University of Washington and a frequent critic of the way language models are developed and deployed.
Large language models, powerful systems that can generate paragraphs of text and mimic human conversation, have become one of the hottest trends in AI in recent years. But they have deep flaws, parroting disinformation, prejudice, and toxic language.
In theory, putting more people to work on the problem should help. Yet because language models require vast amounts of data and computing power to train, they have so far remained projects for rich tech firms. The wider research community, including ethicists and social scientists worried about their misuse, has had to watch from the sidelines.
Meta AI says it wants to change that. “Many of us have been university researchers,” says Pineau. “We know the gap that exists between universities and industry in terms of the ability to build these models. Making this one available to researchers was a no-brainer.” She hopes that others will pore over their work and pull it apart or build on it. Breakthroughs come faster when more people are involved, she says.
Meta is making its model, called Open Pretrained Transformer (OPT), available for non-commercial use. It is also releasing the code and a logbook that documents the training process. The logbook contains daily notes from team members about the training data: how and when it was added to the model, what worked, and what didn’t. In more than 100 pages of notes, the researchers logged every bug, crash, and reboot in a three-month training run from October 2021 to January 2022.
OPT is the same size as GPT-3, with 175 billion parameters (the values in a neural network that are adjusted during training). Pineau says this was deliberate: the team built OPT to match GPT-3 both in its accuracy on language tasks and in its toxicity. OpenAI offers GPT-3 as a paid service but has not released the model itself or its code. The aim, Pineau says, was to give researchers a similar language model to study.
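For researchers who want to experiment, here is a minimal sketch of how one of the released OPT checkpoints could be loaded for text generation. It assumes the smaller variants (such as facebook/opt-125m) are distributed through the Hugging Face Hub and that the transformers and torch libraries are installed; access to the full 175-billion-parameter model is granted separately on request, so the checkpoint name here stands in for whichever size a researcher can obtain.

```python
# Sketch: loading a small OPT checkpoint and generating text.
# Assumes the facebook/opt-125m weights are available on the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Encode a prompt and sample a short continuation.
inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern should apply to the larger checkpoints, subject to the memory they require; part of the point of the release is that researchers can probe exactly this kind of sampled output for the accuracy and toxicity Pineau describes.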
OpenAI declined to comment on Meta’s announcement.
Google, which is exploring the use of large language models in its search products, has also been criticized for a lack of transparency. The company sparked controversy in 2020 when it forced out leading members of its AI ethics team after they produced a study that highlighted problems with the technology.