It has been a while since this blog has been active – almost nine years, to be precise. Things are changing. My work has changed and I am no longer working for a big corporation. And the world around us is changing, with potentially the biggest impact coming from Artificial Intelligence and the possibilities it brings. I recently read the excellent scenario called AI 2027 – and what only a couple of years ago would not have left a lasting impression on me, except maybe as “a nice science fiction scenario”, has now hit me differently.
Interesting connections to research in AI today
The reason was not how elaborate it is, how well the research was executed or how immersive the scenario is – even though all of that is certainly the case! What got me thinking deeply was how much it connects to findings in AI research that are being published right now. Let me give you some examples that struck me:
- Empathy, and AIs becoming trusted advisors to humans. This is described in great detail in the later stages of the scenario – and is surely science fiction, right? Well, what is already happening today is that GPT-4.5 passes the Turing test better than humans do (see source) – and a publication has been released claiming that ChatGPT is more empathetic than humans (see source). This is happening – today.
- How little we know about how our AIs function. This is one of the main problems described in the scenario – and if you look at recent research by Anthropic, who have been trying to better understand how their own model works here (with impressive findings), it is amazing how little we know about the inner workings of our AIs – yet we already trust them with a wide variety of tasks today.
- And last but not least: Anthropic’s work on alignment faking (source). The scenario describes alignment faking in a lot of detail, and you might wonder whether it is really a big problem in practice. The mentioned research by Anthropic shows that it is: continuing to train a model to “patch” harmful behavior does not necessarily mean the model stops the harmful plotting – it may instead just get better at faking the response its trainers expect. This is not science fiction; the research is based on experiments carried out in 2024 using Anthropic’s Claude 3 Opus large language model. Once again – today’s research on today’s models.
Interesting connections, aren’t they?
Lots of further questions
At least for me, the scenario also raised a couple of interesting questions worth exploring further. A seasoned AI researcher may know the answers to at least some of them, but I still need to figure them out for myself. Here are some:
- Do OpenAI, Anthropic or Google indeed prioritize creating AIs that can speed up AI research? Is there any literature that supports this claim? It would make sense, in a way, as it would extend their lead over the competition. But companies don’t always do what makes the most sense in the long term – especially venture-funded ones like OpenAI, whose investors expect to earn their money back, or Google, which has a stock price and profit expectations around search to meet – every single quarter.
- What is the state of alignment research on AIs? The field has gotten rather quiet – but I am also not following it closely enough to notice advances. Geoffrey Hinton (called “the godfather of AI” by some) says in one of his interviews (watch minute 15:00 to 17:00) that right now, 99% of the resources in AI companies are spent on making the AI better, while less than 1% of the research goes into alignment. The interview is from 2023 – has this changed since then?
- Apparently, we are in the middle of an AI hype cycle right now, judging by the funding being poured into AI companies. Will this hype be followed by another AI winter – as has happened several times in the past? Or is this cycle different? If yes – how and why? And if it is the same and an AI winter is coming – what will be the factor limiting AI advances this time?
- There is a lot of talk in the scenario about Artificial General Intelligence (AGI). It is also often mentioned elsewhere. But how exactly is AGI even defined? And how will we know when it has arrived? Are there tests beyond the obvious ones like the Turing test mentioned above – which even current models have mostly passed? Indications? Or will it be a case of “we know it when we see it”?
Let me end by saying that I do NOT believe our world will go the way described in the scenario. I am an optimist by nature, so I do believe AI can and will be used to advance mankind rather than dispose of it. However, it is important to look at the dangers that new technology brings, think about them, do the research – and make sure we remain in control. Thinking about the future is what the scenario is trying to stimulate – and it certainly did so for me.