Herke van Hoof

Assistant professor
AMLab
Informatics Institute
University of Amsterdam
Science Park, Lab 42, L4.05

 

I am an assistant professor at the University of Amsterdam in the Netherlands. My group works on various aspects of reinforcement learning in structured domains. Reinforcement learning is a very general framework, but that generality usually comes at the cost of low data efficiency. To address this, we investigate topics such as using (symbolic) prior knowledge, encoding inductive biases in the policy structure, and transferring knowledge between tasks. We are also interested in applying reinforcement learning to domains with structured states or actions, for example learning heuristics for combinatorial problem solving.


Selected Publications

  1. ICAPS
    Planning with a Learned Policy Basis to Optimally Solve Complex Tasks
    Kuric, D., Infante, G., Gómez, V., Jonsson, A., and van Hoof, H.
    In International Conference on Automated Planning and Scheduling Jul 2024
  2. TMLR
    Reusable Options through Gradient-based Meta Learning
    Kuric, David, and van Hoof, Herke
    Transactions on Machine Learning Research Jul 2023
  3. ICLR
    Multi-Agent MDP Homomorphic Networks
    van der Pol, Elise, van Hoof, Herke, Oliehoek, Frans, and Welling, Max
    In International Conference on Learning Representations Jul 2022
  4. IJCAI
    Value Refinement Network (VRN)
    Wöhlke, Jan, Schmitt, Felix, and van Hoof, Herke
    In International Joint Conference on Artificial Intelligence Jul 2022
  5. IJCAI
    Leveraging Class Abstraction for Commonsense Reinforcement Learning via Residual Policy Gradient Methods
    Höpner, Niklas, Tiddi, Ilaria, and van Hoof, Herke
    In International Joint Conference on Artificial Intelligence Jul 2022
  6. ICLR
    Estimating Gradients for Discrete Random Variables by Sampling without Replacement
    Kool, Wouter, van Hoof, Herke, and Welling, Max
    In International Conference on Learning Representations Jul 2020
  7. ICML
    Addressing Function Approximation Error in Actor-Critic Methods
    Fujimoto, S., van Hoof, H., and Meger, D.
    In International Conference on Machine Learning Jul 2018
  8. ICML
    An Inference-Based Policy Gradient Method for Learning Options
    Smith, M., van Hoof, H., and Pineau, J.
    In International Conference on Machine Learning Jul 2018
  9. JMLR
    Non-parametric Policy Search with Limited Information Loss
    Van Hoof, H., Neumann, G., and Peters, J.
    Journal of Machine Learning Research Jul 2017