Improving the safety of advanced AI systems


As the capabilities of AI systems ramp up, so do concerns from some researchers about the negative repercussions they can have. Image: DIgilife - stock.adobe.com

Federation researchers have won funding from a UK-based charitable initiative to support their work in improving the safety of artificial intelligence (AI) systems.

Professor Peter Vamplew and Associate Professor Cameron Foale from the Centre for Smart Analytics (CSA) have received the funding from Founders Pledge to support their research into the applications of multi-objective reinforcement learning to improve the safety of advanced AI systems.

This work will be carried out in collaboration with members of the Australian Responsible Autonomous Agents Collective, a research group of academics and students from Federation University, Deakin University and the University of New South Wales.

Multi-objective reinforcement learning refers to training an AI system to balance several objectives at once, finding a solution or compromise when the factors influencing the outcome are in conflict.

"If you are trying to train a self-driving car, for example, you want the car to get to a destination and you want it to get there quickly, but you want it to get there safely, you don't want it to break road rules, and you want the drive to be comfortable while minimising fuel consumption," Professor Vamplew said.

"There are several objectives you might care about, and a multi-objective system treats each of those as an independent reward that the AI agent could care about as it tries to find a suitable trade-off across all those various factors.

"If you build a system and only optimise one thing, you risk overlooking a lot of other things that are important. For instance, the self-driving car can learn to get to its destination quickly, but it might drive on the wrong side of the road, it might speed and bounce off other cars and so on."

The funding, worth $127,000, will allow the team to employ a research assistant to work on multi-objective methods, beginning with "very small problems" to lay the groundwork for larger and more complex issues later.

"Our goal is to advance knowledge and develop techniques that will be incorporated into real-world AI. The AI that makes the real world is generally built by big tech, so for us, it's a case of developing something viable, publishing and getting the ideas out to industry," Professor Vamplew said.

"One of the core ideas we're exploring is this idea of what's called a low-impact agent. This is where there is a secondary objective to a task that needs to be considered.

"If I'm training an agent to vacuum the floor in a room, I don't want it throwing all the furniture out the window because it makes it easier to do the vacuuming. We have a system that encourages these agents to not move things or to not make changes, but if they do, we want them to put things back where they found them," he said.

"In this case, it could be moving a chair to vacuum under it but then putting it back where it was. The thinking there is the less impact you have on the environment, the less chance there is doing something negative."

Professor Vamplew began looking into safety issues in 2018, when AI systems and their capabilities first entered the mainstream. As the capabilities of AI systems ramped up, so did concerns from some researchers about the negative repercussions these could have.

Professor Vamplew says the more recent breakthroughs in AI capability are both exciting and terrifying. The technology is poised to deliver breakthroughs in medicine and health while, he hopes, remaining aligned with human needs and ethical values. On the flip side, it is being used in combat and is feared by many for its potential to replace humans in jobs.

"We are going to see massive breakthroughs in medicine enabled by this AI over the next few years, and that's the sort of AI we're incredibly excited about," he said.

"But the idea of just rolling out these large language models with basically no concern about the potential negative outcomes scares me. There has been a rush to be first to market for the big tech companies, and at times, this is getting in the way of ensuring these systems are not going to be dangerous." Professor Peter Vamplew

Professor Vamplew's research has also been accepted for the upcoming International Conference on Machine Learning (ICML), one of the world's premier AI conferences, which received about 10,000 submissions, of which roughly one quarter were accepted. The paper, a collaboration with researchers from Switzerland and the USA, argues for the need to ensure that advanced AI systems do not reduce human agency. It was awarded 'spotlight' status, an accolade given to only the top 3.5 per cent of ICML submissions.

Professor Vamplew was also appointed to the Future of Life Institute's Artificial Intelligence Existential Safety Community in 2022.

The Future of Life Institute is a philanthropic organisation aimed at diminishing risks to the future of humanity and has particular interests in nuclear proliferation, environmental issues, and AI safety. The AI Existential Safety Community consists of AI researchers, mostly from the world's top 50 ranked universities. Professor Vamplew and his collaborator Associate Professor Richard Dazeley were the first Australians to be admitted to the group.

"It is important to note that it is not just safety, we're also concerned about things like ethics and fairness, lack of bias and transparency and so on as well. It all goes together."

Related reading:

The AI revolution — balancing power and regulation

Safety in the spotlight as AI booms

