Ask the artificial intelligence system created by German startup Aleph Alpha about its “Lieblingssportteam” (favorite sports team) in German, and it riffs about Bayern Munich and former midfielder Toni Kroos. Quiz the neural network on its “equipo deportivo favorito,” and it replies in Spanish about Atlético Madrid and its long-ago European Cup win. In English, it’s the San Francisco 49ers.
Answering a question never seen, matching language to culture, and peppering answers with backup facts have until recently been beyond the ken of neural networks, the statistical prediction engines that are a pillar of artificial intelligence (AI). Aleph Alpha’s approach, and others like it, represent a shift in AI away from “supervised” systems, which are taught through labeled examples to complete tasks such as identifying cars and pedestrians or finding disloyal customers. This new breed of “self-supervised learning” networks can find hidden patterns in data without being told in advance what they’re seeking, and can apply knowledge from one field to another.
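To make the distinction concrete, here is a minimal sketch of how a self-supervised system can manufacture its own training examples from raw text: each word becomes the "label" for the words that precede it, so no human annotation is needed. This is an illustration only, not Aleph Alpha's method; the function name `next_word_pairs`, the `context_size` parameter, and the sample sentence are all hypothetical.

```python
def next_word_pairs(text, context_size=3):
    """Turn raw text into (context, target) training pairs.

    The target for each example is simply the next word in the text,
    so the data labels itself (hypothetical helper for illustration).
    """
    words = text.split()
    pairs = []
    for i in range(context_size, len(words)):
        context = tuple(words[i - context_size:i])  # the preceding words
        target = words[i]                           # the word to predict
        pairs.append((context, target))
    return pairs


# Unlabeled sample text; no one has tagged it with the "right answers".
sample = "the team won the cup in extra time"
for context, target in next_word_pairs(sample):
    print(context, "->", target)
```

A supervised system, by contrast, would need a person to attach a label such as "pedestrian" or "car" to every example before training could begin.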
The results can be uncanny. OpenAI’s GPT-3 can write lengthy, convincing prose; Israel’s AI21 Labs’ Jurassic-1 Jumbo suggests ideas for blog posts on tourism or electric cars. Facebook uses a language-understanding system to find and filter hate speech. Aleph Alpha is fine-tuning its general AI model with specialized data in fields such as finance, automotive, agriculture, and pharmaceuticals.
“What can you do with these models beyond writing cool text that seems like a human has written it?” says Aleph Alpha CEO and founder Jonas Andrulis. The serial entrepreneur sold a prior company to Apple, stayed three years in R&D management, then built his current venture in Heidelberg. “These models will free us from the burden of banal office work, or government busywork like writing reports that no one reads. It’s like a capable assistant—or an unlimited number of smart interns.”
Self-supervised systems turn traditional software development on its head: Instead of tackling a specific problem in a narrow field, the new AI architects first build their self-learning models, let them ingest content from the internet and private datasets, and then discover what problems to solve. Practical applications are starting to emerge.
For white-collar office workers, for example, Aleph Alpha is teaming up with workflow automation software maker Bardeen to explore how users could enter free-text commands in different languages to generate useful code without knowing how to program.
As a measure of the field’s progress, just two years ago the state-of-the-art neural network, a language-understanding system called BERT, held 345 million parameters. Aleph Alpha, which closed a €23 million ($27 million) funding round in July, is training a 13-billion-parameter AI model on Oracle Cloud Infrastructure (OCI), using hundreds of Nvidia’s most powerful graphics processing units connected by high-speed networking. A second Aleph Alpha model holds 200 billion parameters.
Cloud computing, such as OCI, is removing a big development constraint. “Artificial general intelligence is limited by computing power, and it’s limited by training the systems,” says Hendrik Brandis, cofounder and partner at EarlyBird Venture Capital in Munich, which led Aleph