The end seems nigh for bad pun writers. That, at least, is the official conclusion from Tech Monitor on ‘pun papa biden,’ a new tool built using an open-source large language model (LLM) and designed to generate reassuringly groan-inducing dad jokes in the dulcet tones of the 46th President of the United States. “Did you hear about the guy that was stuck in a tree?” the open-source model asks our brave reporter. A pause, as our humble publication girds itself for the punchline. “He’s still on that branch.”
Clearly, real comedians need not fear for their jobs. But despite its humorous limitations, ‘pun papa biden’ is one of a growing number of impressive off-the-wall tools built using open-source LLMs, models that have displayed immense improvements in power and sophistication in recent months. Keerthana Gopalakrishnan, a software developer based in California and the brains behind the latest AI presidential pun generator, says she was surprised by just how capable these models have become.

These soaring abilities have left the open-source community at an existential crossroads. While pun generation is harmless enough, the same openness that enables playful experiments also lowers the bar for misuse of the underlying models. Many argue, despite these risks, that open-source models are a necessary counterweight to the global dominance of companies like Meta and Google.

That, at least, is the dream embraced by much of the community. Open-source software “has long been the backbone of AI,” says generative AI expert Henry Ajder. The principle of taking code and publishing it for all the world to see and tinker with has remained more or less unquestioned among the AI research community, and has been credited with supercharging the technology’s development. Even so, says Ajder, while most developers have good intentions in sharing their source code, they’re also unintentionally supplying bad actors “with the foundations that can be used to build some pretty disturbing and unpleasant toolsets.”
OpenAI agrees. Despite its name, the company is now a closed-source operation, meaning that the code behind the popular ChatGPT and GPT-4 cannot be copied or modified. What’s more, the firm seems to regret its earlier enthusiasm for releasing its models into the wilds of GitHub. “We were wrong,” OpenAI co-founder Ilya Sutskever told The Verge. “If you believe, as we do, that at some point AI, AGI [Artificial General Intelligence], is going to be extremely, unbelievably potent, then it just does not make sense to be open-source.”
Detractors argue that the company’s rejection of its old ideals might be a convenient way to bolster its coffers — a marketing tactic that imbues a sense of mystery and power in a technology that many coders outside its corporate walls seem perfectly capable of honing without worrying about unleashing a superintelligence. Others, meanwhile, have profound ethical objections to closed-source toolsets. They warn that AI is an extremely powerful tool which, if reserved to just a few large companies, has the potential to hypercharge global inequality.
This isn’t just a theoretical proposition. Open-source LLMs currently enable researchers and small-scale organisations to experiment at a fraction of the cost associated with their closed-source cousins. They also enable developers around the globe to better understand this all-important technology. Gopalakrishnan agrees. “I think it’s important to lower the barrier to entry for experimentation,” she says. “There are a lot of people interested in this technology who really want to innovate.”
What’s behind the open-source AI boom?
Developers got a big boost from Meta’s powerful LLaMA, which leaked online on March 3rd, just one week after its launch. This was the first time that a major firm’s proprietary LLM had leaked to the public, thus making it effectively open-source. Although licensing regulations prevented LLaMA — and its derivatives — from being used for commercial purposes, it still helped developers accelerate their understanding and experimentation. Numerous LLaMA-inspired models were soon released, including Stanford’s Alpaca, which added a layer of instruction-tuning to the model.
A key accelerator in the development of open-source LLMs has been the popular adoption of LoRA, which stands for Low-Rank Adaptation. This technique allows developers to fine-tune a model at a fraction of the usual cost and time, essentially enabling researchers to personalise an LLM on ordinary hardware in just a few hours. Gopalakrishnan used LoRA to train ‘pun papa biden’ in less than fifteen hours while at a hackathon in California.
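A rough back-of-the-envelope sketch shows why LoRA is so cheap: rather than updating a full weight matrix during fine-tuning, it learns two small low-rank matrices whose product approximates the update. The layer size and rank below are illustrative figures, not details of Gopalakrishnan’s project.

```python
# Why LoRA fine-tuning is cheap: instead of updating a full d x k weight
# matrix W, LoRA learns two small matrices B (d x r) and A (r x k) and
# applies W + B @ A, so only r * (d + k) parameters are trained.

def full_finetune_params(d: int, k: int) -> int:
    """Trainable parameters when updating the whole d x k weight matrix."""
    return d * k

def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters for a rank-r LoRA update (B: d x r, A: r x k)."""
    return r * (d + k)

if __name__ == "__main__":
    d = k = 4096   # a plausible hidden size for a LLaMA-scale layer
    r = 8          # a commonly used LoRA rank
    full = full_finetune_params(d, k)
    lora = lora_params(d, k, r)
    print(f"full: {full:,}  lora: {lora:,}  ratio: {full // lora}x")
    # For this single 4096 x 4096 layer: 16,777,216 vs 65,536 parameters,
    # a 256x reduction, which is what puts fine-tuning within reach of
    # ordinary hardware.
```

Applied across every adapted layer of a model, this reduction is what makes a weekend hackathon project like a presidential pun generator feasible.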
LoRA is also stackable, meaning that improvements made by one developer can be layered on top of another’s work rather than each project starting from scratch. This compounding pace of progress has not gone unnoticed inside the big AI labs. A leaked document, whose author was identified as a Google researcher, argued that neither Google nor OpenAI was positioned to win the race in AI: a third faction was quietly outcompeting them both. That faction was, the author quickly clarified, open-source AI.

The economics lend the claim some weight. It cost more than $100m to train GPT-4, according to OpenAI CEO Sam Altman. Researchers at UC Berkeley, meanwhile, released Koala in early April, an open-source ChatGPT-equivalent based on LLaMA and trained exclusively on freely-available data. On public cloud-computing platforms, the researchers estimate that training Koala would typically cost under $100. Through ChatGPT, OpenAI lowered the barrier to using LLMs. Open-source development, meanwhile, has lowered the barrier to fine-tuning and personalising them.