Associate Professor Peter Vamplew

Research Adviser

Information Technology Group B



Mt Helen Campus, Online


Associate Professor Peter Vamplew’s information technology expertise focuses on artificial intelligence, particularly reinforcement learning, neural networks and evolutionary computation.

Dr Vamplew is currently researching variations on reinforcement learning algorithms for multi-objective problems, which contribute to the explainability and safety of autonomous AI systems. Peter’s research has been published widely in highly ranked international journals.

Peter leads the Federation Learning Agents Group, which focuses on reinforcement learning and related topics. He is an Associate Editor for the journal Neurocomputing, and a grant reviewer for the Australian Research Council, the Flanders Research Foundation and the Dutch Research Council.

Peter has been Associate Professor in Information Technology at Federation University Australia since 2014, and was Senior Lecturer at the University of Ballarat (now Federation University) from 2005. Previously he was a lecturer within the computing discipline at the University of Tasmania from 1991 to 2005, where he received his PhD.

Field of Research

  • Reinforcement learning
  • Fairness, accountability, transparency, trust and
  • Autonomous agents and multiagent systems

A Brief Guide to Multi-Objective Reinforcement Learning and Planning JAAMAS track

  • Conference Proceedings

A conceptual framework for externally-influenced agents: an assisted reinforcement learning review

AI apology: interactive multi-objective reinforcement learning for human-aligned AI

A NetHack Learning Environment Language Wrapper for Autonomous Agents

Elastic step DDPG: Multi-step reinforcement learning for improved sample efficiency

Explainable reinforcement learning for broad-XAI: a conceptual framework and survey

Human engagement providing evaluative and informative advice for interactive reinforcement learning

Persistent rule-based interactive reinforcement learning

Scalar Reward is Not Enough JAAMAS Track

  • Conference Proceedings

An online scalarization multi-objective reinforcement learning algorithm: TOPSIS Q-learning

A practical guide to multi-objective reinforcement learning and planning

Discrete-to-deep reinforcement learning methods

Neural networks are effective function approximators, but hard to train in the reinforcement...

Evaluating Human-like Explanations for Robot Actions in Reinforcement Learning Scenarios

Scalar reward is not enough: a response to Silver, Singh, Precup and Sutton (2021)

An evaluation methodology for interactive reinforcement learning with simulated users

Interactive reinforcement learning methods utilise an external information source to evaluate...

A Prioritized objective actor-critic method for deep reinforcement learning

An increasing number of complex problems have naturally posed significant challenges in...

Explainable robotic systems: understanding goal-driven actions in a reinforcement learning scenario

Language Representations for Generalization in Reinforcement Learning

  • Conference Proceedings

Levels of explainable artificial intelligence for human-aligned conversational explanations

Over the last few years there has been rapid research growth into eXplainable Artificial...

Potential-based multiobjective reinforcement learning approaches to low-impact agents for AI safety

The concept of impact-minimisation has previously been proposed as an approach to addressing the...

Reanimating Historic Malware Samples

  • Book Chapters

The impact of environmental stochasticity on value-based multiobjective reinforcement learning

A common approach to address multiobjective problems using reinforcement learning methods is to...

A multi-objective deep reinforcement learning framework

This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL)...

API Based Discrimination of Ransomware and Benign Cryptographic Programs

Ransomware is a widespread class of malware that encrypts files in a victim’s computer and...

Discrete-to-Deep Supervised Policy Learning: An effective training method for neural reinforcement learning

  • Conference Proceedings

Enhancing Model Performance for Fraud Detection by Feature Engineering and Compact Unified Expressions

The performance of machine learning models can be improved in a variety of ways including...

Function Similarity Using Family Context

Finding changed and similar functions between a pair of binaries is an important problem in...

Griefing in MMORPGs

Hybrid intrusion detection system based on the stacking ensemble of C5 decision tree classifier and one class support vector machine

Cyberattacks are becoming increasingly sophisticated, necessitating the efficient intrusion...

Identifying cross-version function similarity using contextual features

The identification of similar functions in malware assists analysis by supporting the exclusion...

Motivational Factors of Australian Mobile Gamers

Mobile games are a fast growing industry, overtaking all other video game platforms with year on...

Reanimating historic malware samples

An Empirical Study of Reward Structures for Actor-Critic Reinforcement Learning in Air Combat Manoeuvring Simulation

Reinforcement learning techniques for solving complex problems are resource-intensive and take a...

A novel ensemble of hybrid intrusion detection system for detecting internet of things attacks

The Internet of Things (IoT) has been rapidly evolving towards making a greater impact on...

Categorical features transformation with compact one-hot encoder for fraud detection in distributed environment

Fraud detection for online banking is an important research area, but one of the challenges is...

Evolved similarity techniques in Malware Analysis

Malware authors are known to reuse existing code; this development process results in software...

Integrating Biological Heuristics and Gene Expression Data for Gene Regulatory Network Inference

Gene Regulatory Networks (GRNs) offer enhanced insight into the biological functions and...

Memory-Based Explainable Reinforcement Learning

Reinforcement learning (RL) is a learning approach based on behavioral psychology used by...

Survey of intrusion detection systems: techniques, datasets and challenges

Cyber-attacks are becoming more sophisticated and thereby presenting increasing challenges in...

An anomaly intrusion detection system using C5 decision tree classifier

Due to the increase in intrusion activities over the internet, many intrusion detection systems are...

Human-aligned artificial intelligence is a multiobjective problem

As the capabilities of artificial intelligence (AI) systems improve, it becomes important to...

Non-functional regression: A new challenge for neural networks

This work identifies an important, previously unaddressed issue for regression based on neural...

Participant observation of griefing in a journey through the World of Warcraft

Through the ethnographic method of participant observation in World of Warcraft, this paper aims...

  • Journals

Rapid anomaly detection using integrated prudence analysis (IPA)

Integrated Prudence Analysis has been proposed as a method to maximize the accuracy of rule based...

SoniFight: Software to Provide Additional Sonification Cues to Video Games for Visually Impaired Players

SoniFight is utility software designed to provide additional sonification cues to video games,...

An agile group aware process beyond CRISP-DM: A hospital data mining case study

The CRISP-DM methodology is commonly used in data analytics exercises within an organisation to...

A taxonomy of griefer type by motivation in massively multiplayer online role-playing games

There is an anti-social phenomenon known as griefing that occurs in online games. Griefing refers...

Evaluating accuracy in prudence analysis for cyber security

Conventional Knowledge-Based Systems (KBS) have no way of detecting or signalling when their...

Softmax exploration strategies for multiobjective reinforcement learning

Despite growing interest over recent years in applying reinforcement learning to multiobjective...

Special issue on multi-objective reinforcement learning

Steering approaches to Pareto-optimal multiobjective reinforcement learning

For reinforcement learning tasks with multiple objectives, it may be advantageous to learn...

A Heuristic Gene Regulatory Networks Model for Cardiac Function and Pathology

Genome-wide association studies (GWAS) and next-generation sequencing (NGS) have led to an...

  • Conference Proceedings

Caliko: An Inverse Kinematics Software Library Implementation of the FABRIK Algorithm

The Caliko library is an implementation of the FABRIK (Forward And Backward Reaching Inverse...

Generating Synthetic Datasets for Experimental Validation of Fraud Detection

Frauds are dramatically increasing every year, resulting in billions of dollars in losses around...

  • Conference Proceedings

Patient admission prediction using a pruned fuzzy min-max neural network with rule extraction

A useful patient admission prediction model that helps the emergency department of a hospital...

Reinforcement learning of pareto-optimal multiobjective policies using steering

Griefers versus the Griefed - what motivates them to play Massively Multiplayer Online Role-Playing Games?

‘Griefing’ is a term used to describe when a player within a multiplayer online environment...

A Survey of Multi-Objective Sequential Decision-Making

Sequential decision-making problems with multiple objectives arise naturally in practice and pose...

Ganking, corpse camping and ninja looting from the perception of the MMORPG community: Acceptable behavior or unacceptable griefing?

Prudent fraud detection in internet banking

An empirical comparison of two common multiobjective reinforcement learning algorithms

In this paper we provide empirical data of the performance of the two most commonly used...

Applications of machine learning for linguistic analysis of texts


The process of sleep stage identification is a labour-intensive task that involves the...

Optimization and matrix constructions for classification of data

Max-plus algebras and more general semirings have many useful applications and have been actively...

  • Journals

RM and RDM, a Preliminary Evaluation of Two Prudent RDR Techniques

Taming the Devil: A game based approach to teaching immunology

  • Conference Proceedings

Using psycholinguistic features for profiling first language of authors

This study empirically evaluates the effectiveness of different feature types for the...

Visualising the value of water

  • Book Chapters

Empirical evaluation methods for multiobjective reinforcement learning algorithms

While a number of algorithms for multiobjective reinforcement learning have been proposed, and a...

Reinforcement learning approach to AIBO robot's decision making process in Robosoccer's goal keeper problem

Automated Opinion Detection: Implications of the Level of Agreement Between Human Raters

The ability to agree with the TREC Blog06 opinion assessments was measured for seven human...

Automatic sleep stage identification: difficulties and possible solutions

  • Conference Proceedings

The Ballarat Incremental Knowledge Engine

Ripple Down Rules (RDR) is a maturing collection of methodologies for the incremental development...

WINDSCREEN: A climate change visualisation tool for water allocation decisions

  • Conference Proceedings

A polynomial ring construction for the classification of data

Applying Clustering and Ensemble Clustering Approaches to Phishing Profiling

  • Conference Proceedings

Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks

Footy, flows and farms: a visualisation tool for determining community water allocation preferences

  • Conference Proceedings

Incorporating Expert Advice into Reinforcement Learning Using Constructive Neural Networks

This paper presents and investigates a novel approach to using expert advice to speed up the...

Inference of Gene Expression Networks using Memetic Gene Expression Programming

  • Conference Proceedings

MRF model based unsupervised color textured image segmentation using multidimensional spatially variant finite mixture model

We investigate and propose a novel approach to implement an unsupervised color image segmentation...

Unsupervised Segmentation of Industrial Images using Markov Random Field Model

We propose a novel approach to investigate and implement unsupervised image content understanding...

Weblogs for market research: Finding more relevant opinion documents using system fusion

On the limitations of scalarisation for multi-objective reinforcement learning of Pareto Fronts

Multiobjective reinforcement learning (MORL) extends RL to problems with multiple conflicting...

System fusion for opinion detection in weblogs

  • Conference Proceedings

Unsupervised Color Textured Image Segmentation Using Cluster Ensembles and MRF Model

We propose a novel approach to implement robust unsupervised color image content understanding...

Using Stereotypes to Improve Early-Match Poker Play

Weblogs for market research: improving opinion detection using system fusion

Portal-based Sound Propagation for First-Person Computer Games

First-person computer games are a popular modern video game genre. A new method is proposed, the...

  • Conference Proceedings

Using Corpus Analysis to Inform Research into Opinion Detection in Blogs

Opinion detection research relies on labeled documents for training data, either by assumptions...

  • Conference Proceedings

An efficient approach to unbounded bi-objective archives: Introducing the Mak_Tree algorithm

Given the prominence of elite archiving in contemporary multiobjective optimisation research and...

  • Conference Proceedings

An efficient data structure for unbounded bi-objective archives: Introducing the mak_tree

  • Conference Proceedings

Enhanced temporal difference learning using compiled eligibility traces

Eligibility traces have been shown to substantially improve the convergence speed of temporal...

More effective web search using bigrams and trigrams

This paper investigates the effectiveness of quoted bigrams and trigrams as query terms to target...

  • Journals

Accelerating real-valued genetic algorithms using mutation-with-momentum

  • Conference Proceedings

An anti-plagiarism editor for software development courses

  • Conference Proceedings

Concurrent Q-learning: Reinforcement learning for dynamic goals and environments

  • Journals

Global versus local constructive function approximation for on-line reinforcement learning

  • Conference Proceedings

On-line reinforcement learning using cascade constructive neural networks

  • Conference Proceedings

The combative accretion model: Multiobjective optimisation without explicit pareto ranking

Contemporary evolutionary multiobjective optimisation techniques are becoming increasingly...

A language for platform independent communication and storage in multiobjective optimisation

  • Conference Proceedings

Generalised algorithms for redirected walking in virtual environments

  • Conference Proceedings

Learning place cells from sonar data

  • Conference Proceedings

Lego™ Mindstorms™ robots as a platform for teaching reinforcement learning

  • Conference Proceedings

PoD can mutate: A simple dynamic directed mutation approach for genetic algorithms

  • Conference Proceedings

Reducing the time complexity of goal-independent reinforcement learning

  • Conference Proceedings

Refining search queries from examples using boolean expressions and latent semantic analysis

  • Conference Proceedings

Adaptive response function neurons

  • Conference Proceedings

A simplified artificial life model for multiobjective optimisation: A preliminary report

  • Conference Proceedings

Concurrent Q-learning for autonomous mapping and navigation

  • Conference Proceedings

A supervised neural network based on the cerebellum

  • Journals